An Analysis of Asynchronous Data

KIM CHATALL
NIKLAS JOHANSSON

Master of Science Thesis Stockholm, Sweden 2013


An Analysis of Asynchronous Data

KIM CHATALL
NIKLAS JOHANSSON

Master’s Thesis in Mathematical Statistics (30 ECTS credits)

Master Programme in Mathematics (120 credits)

Supervisor at Handelsbanken was Fredrik Bohlin
Supervisor at KTH was Boualem Djehiche
Examiner was Boualem Djehiche

TRITA-MAT-E 2013:21
ISRN-KTH/MAT/E--13/21--SE

Royal Institute of Technology
School of Engineering Sciences
KTH SCI
SE-100 44 Stockholm, Sweden
URL: www.kth.se/sci


Abstract

Risk analysis and financial decision making require true and appropriate estimates of correlations today and of how they are expected to evolve in the future. If a portfolio consists of assets traded in markets with different trading hours, the correlation could potentially be underestimated. This is due to asynchronous data: there exists an asynchronicity between the asset time series in the portfolio. The purpose of this paper is twofold. First, we suggest a modified synchronization model of Burns, Engle and Mezrich (1998) which replaces the first-order vector moving average with a first-order vector autoregressive process.

Second, we study the time-varying dynamics and forecast the conditional variance-covariance and correlation through a DCC model. The performance of the DCC model is compared to the industry-standard RiskMetrics Exponentially Weighted Moving Average (EWMA) model. The analysis shows that the covariance of the DCC model is slightly lower than that of the RiskMetrics EWMA model. Our conclusion is that the DCC model is simple and powerful and therefore a promising tool. It provides good insight into how correlations are likely to evolve over a short-run time horizon.


Acknowledgements

We would like to express our deep gratitude to Professor Boualem Djehiche for his patient guidance, enthusiastic encouragement and useful critiques of this paper. Moreover, we also want to thank him for introducing stochastic calculus early in our engineering studies - this inspired us to go for advanced studies in mathematics. We would also like to thank Fredrik Bohlin, Quantitative Analyst at Handelsbanken Capital Markets, for his advice, assistance and thoughtful comments during the process. Our grateful thanks are also extended to Mattias Lundahl, Vice President at Goldman Sachs, for his assistance in finishing this paper. We would also like to extend our thanks to all the Quants at the Model Development Group at Handelsbanken Capital Markets for insightful discussions and for offering us a work place during the entire process. Finally, we wish to thank our parents and girlfriends for their support and encouragement throughout our studies.


Contents

1 Introduction
1.1 Background
1.2 Implication of Asynchronous Data
1.3 Fundamentals of Correlations
1.4 Fundamentals of Volatility

2 Theory
2.1 The Univariate GARCH Model
2.1.1 The Univariate GARCH(1,1) Model
2.2 The Multivariate Dynamic Conditional Correlation
2.2.1 Step 1: DE-GARCHING
2.2.2 Step 2: Estimating the Quasi-Correlation
2.2.3 Step 3: Rescaling the Quasi-Correlation
2.2.4 Estimation of the DCC Model

3 Synchronization
3.1 The Data Set
3.2 Synchronization of the Data
3.2.1 Estimation of the A-matrix

4 Results
4.1 Synchronization
4.1.1 Estimating the A-matrix
4.2 Numerical Evaluation of the DCC Model
4.2.1 Estimating the Quasi-Correlations
4.2.2 Forecasting the Quasi-Correlations
4.2.3 Parameters
4.2.4 Volatility and Correlation over Time

5 Conclusion

A An Economic Model of Correlations

B Alternative Models for Forecasting
B.1 Forecasting and Modeling Volatility
B.2 Constant Conditional Correlation
B.3 Orthogonal GARCH

C Asynchronous and Synchronous Log-returns


Chapter 1

Introduction

1.1 Background

The dynamics of daily correlations play an important role in several applications in finance and economics. They result from correlations between risk premiums, dividend news events or expected returns. RiskMetrics uses correlation to calculate Value-at-Risk at short horizons. Erb, Harvey and Viskanta (1994) present examples of how time-varying correlation forecasts can influence optimal portfolio weights. Kroner and Ng (1998) present how hedging ratios are affected by time-varying covariance matrices.

Burns, Engle and Mezrich (1998) illustrate how a term structure of correlation can be constructed from a multivariate GARCH model on a daily basis. The correlation term structure can be applied when pricing derivative products whose payoffs depend on the values of more than one asset. During turbulent market conditions, one wishes to value international portfolios in real time. To calculate the correct portfolio value, one needs, among other things, correct correlation estimates. For example, standard portfolio theory claims that the tangency portfolio is the only efficient stock portfolio. On the other hand, it has been observed that an investment in the global minimum variance portfolio (GMVP) frequently yields better out-of-sample results than an investment in the tangency portfolio (Kempf and Memmel, 2006). The problem can be seen as the minimization problem

$$\min_{\omega_t} \ \omega_t' H_t \omega_t \quad \text{s.t.} \quad \sum_{i=1}^{N} \omega_{i,t} = 1, \qquad (1.1.0.1)$$

where $\omega_t$ is the vector of portfolio weights and $H_t$ is the variance-covariance matrix of the assets. When the weights have been determined, the variance $\sigma_t^2 = \omega_t' H_t \omega_t$ at time $t$ can be computed. The most important property of the GMVP is its uniqueness, which means that "the correct" covariance is associated with an improved performance: the portfolio with the most accurate covariance has the smallest variance at time $t$ (Sheppard, 2003). This problem will be analyzed and put into perspective later in the paper.
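As a concrete illustration of the GMVP problem (1.1.0.1), the sketch below computes the minimizing weights in closed form, $\omega_t = H_t^{-1}\mathbf{1} / (\mathbf{1}' H_t^{-1}\mathbf{1})$, for an assumed covariance matrix; the numbers are hypothetical and only meant to show the mechanics.

```python
import numpy as np

def gmvp_weights(H):
    """Global minimum variance portfolio weights for a covariance matrix H.

    Solves min_w w' H w subject to sum(w) = 1; the closed-form solution is
    w = H^{-1} 1 / (1' H^{-1} 1).
    """
    ones = np.ones(H.shape[0])
    w = np.linalg.solve(H, ones)   # H^{-1} 1 without forming the inverse explicitly
    return w / w.sum()

# Hypothetical covariance matrix for two assets (illustrative numbers only)
H_t = np.array([[0.04, 0.01],
                [0.01, 0.09]])
w_t = gmvp_weights(H_t)
portfolio_variance = w_t @ H_t @ w_t   # sigma_t^2 = w_t' H_t w_t
print(w_t, portfolio_variance)
```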

The idea of modeling and forecasting volatilities and correlations through univariate time series was first introduced by Engle (1982). Ever since that first paper, several attempts have been made to model multivariate GARCH models, such as Engle et al. (1984), Bollerslev et al. (1988; 1994), Engle and Mezrich (1996), Bauwens et al. (2006), and Silvennoinen and Terasvirta (2008). Bollerslev (1990) introduced a class of multivariate GARCH models called constant conditional correlation (CCC). The main assumption in this model is that the conditional correlations between all assets are time invariant (B.2.0.6). However, correlations tend to vary in time and the CCC model cannot incorporate this fact. Another attempt has been made by Alexander and Barbosa (2008), known as the Orthogonal GARCH (OGARCH). OGARCH assumes that every diagonal conditional variance is a univariate GARCH model (B.3.0.22). Engle (2002a) generalized the CCC model to the dynamic conditional correlation (DCC) model. It has the same structure as the CCC model, except that it allows the correlations to vary over time instead of being constant.

If the portfolio consists of assets traded in markets with different trading hours, the correlation could potentially be underestimated. This is due to asynchronous data: there exists an asynchronicity between the asset time series in the portfolio. Consequently, a common method to lessen the impact of asynchronous data is to use weekly or monthly data. However, weekly and monthly data are unable to capture daily correlation dynamics. Burns et al. (1998) and RiskMetrics (1996; 2006) proposed a variety of approaches for treating the issue and calculating a synchronized correlation on a daily basis from a data set containing non-synchronized assets. Therefore Burns et al. (1998) and RiskMetrics (1996; 2006) will form the foundation for the synchronization process presented in this paper. The purpose of this paper is twofold. First, we suggest a method for synchronizing the returns. Second, we study the daily dynamics along with forecasting of the covariance, the conditional variance and the correlation. The results are then tested in the global minimum variance portfolio problem.

1.2 Implication of Asynchronous Data

The difference in trading hours between the world's stock exchanges plays a vital part when calculating correlations and asset prices. Typically, prices are measured from one point in time to the same point 24 hours later. In some cases, stock exchanges in different markets are not open at the same time. Due to different trading hours, news that influences the prices of the assets in the open exchange will also affect the prices in the closed exchange. This is reflected in the opening price and therefore attributed to the following daily return. If returns are measured over distinguishable periods, the correlations may be understated due to asynchronous returns. If assets are traded in markets with different trading hours, the correlation between them will be influenced. For instance, the correlation between the Japanese market and the U.S. market measured on daily closing prices is significantly lower than when simultaneous returns are measured, as the markets have at most a partial overlap during the day. Thus, news events influencing the Japanese market will influence the assets traded on the U.S. market the day after (Burns et al., 1998), (Scholes and Williams, 1977) and (Lo and MacKinlay, 1990a). If the closing times differ by several hours, the effects can be serious.

For instance, it is important for hedging strategies and Value-at-Risk measures to have correct values, or estimates of these, at any given point in time, in order to know the value of the assets. Thus, if prices are not measured at the same time for all assets in a portfolio, systematic errors can occur.

1.3 Fundamentals of Correlations

To achieve an understanding of correlations between assets and why they change, it is necessary to glance at the economics behind movements in asset prices. Investors hold assets in anticipation of payments to be made in the future. Thus, the value of an asset is related to forecasts of future payments; changes in prices are a function of changing forecasts of future payments. The changes in forecasts of future payments we simply call news. This is the foundation of the basic model for changes in asset prices (Samuelson, 1965).

Hence, the return of an asset, as well as its volatility and its correlation with other assets, depends on news. The values of all assets are influenced by news to a greater or lesser extent. In equities, news tends to affect some equity prices more than others because their lines of business are different. Thus, correlations between companies' returns tend to depend on their businesses. Naturally, if a company changes its business model, its correlations with other companies are likely to change. This is essential to why correlations change over time.

1.4 Fundamentals of Volatility

When observing correlations between assets, it is relevant to get a solid understanding of the expected volatility that might occur among them. Modeling and forecasting volatility has attracted much attention in recent years, largely driven by its importance in asset-pricing models and risk measurement. There are certain patterns that financial time series exhibit which are essential for correct model specification, estimation and forecasting. Some of these patterns are briefly described below.

Fat Tails is when the distribution of asset returns exhibits fatter tails than those of a normal distribution, that is, excess kurtosis.

Volatility Clustering is the clustering of periods of volatility where large movements are followed by further large movements. This is an indication of shock persistence. Corresponding Box-Ljung statistics show significant correlations which exist at extended lag lengths.

Leverage Effects is when volatility increases due to a fall in asset prices.

Long Memory occurs in high frequency data, when volatility is highly persistent and there is evidence of near unit root behavior in the conditional variance process.

Co-Movements in Volatility have been observed in financial time series across different markets (currencies); namely, big movements in one currency are matched by big movements in another. This suggests the importance of multivariate models for modeling cross-correlations between different markets.

Investors are interested in modeling volatility in asset returns because volatility is essential in risk measurement and investors want a premium for bearing risk. To illustrate this, the daily percentage change in the US stock market has periods of high and low volatility. High volatilities were observed during the financial crisis in 2008 and low volatilities in the middle of the 1990s when there was a consolidation in the market. It has been observed that large changes in volatility tend to be followed by further large changes, and small changes tend to be followed by further small changes, and this is true for either sign. Consequently, there exists some sort of correlation between the magnitudes of the fluctuations. This phenomenon, when a series of data goes through periods of high and low volatility, is called volatility clustering. A more quantitative view of this fact is that while asset returns themselves are uncorrelated, the absolute returns or their squares display a positive, significant and slowly decaying autocorrelation function. Because the volatility appears in clusters, the variance of the daily returns can be forecasted even though the daily returns themselves are difficult to forecast. This is illustrated in the sketch below.
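The following sketch illustrates the stylized fact above on simulated data: the returns of a GARCH(1,1)-type process are close to uncorrelated, while their squares show a positive, slowly decaying autocorrelation. The parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation function of a series for lags 1..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-k], x[k:]) / denom for k in range(1, max_lag + 1)])

# Simulate a GARCH(1,1)-type return series to mimic volatility clustering
rng = np.random.default_rng(0)
omega, delta, gamma = 1e-6, 0.08, 0.90   # hypothetical parameters
T = 5000
h = np.empty(T)
r = np.empty(T)
h[0] = omega / (1 - delta - gamma)       # start at the unconditional variance
for t in range(T):
    r[t] = np.sqrt(h[t]) * rng.standard_normal()
    if t + 1 < T:
        h[t + 1] = omega + delta * r[t]**2 + gamma * h[t]

print("ACF of returns, lags 1-5:        ", acf(r, 5).round(3))
print("ACF of squared returns, lags 1-5:", acf(r**2, 5).round(3))
```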


Chapter 2

Theory

2.1 The Univariate GARCH Model

Before we introduce the multivariate Dynamic Conditional Correlation (DCC) model, we need a fundamental understanding of how the univariate GARCH model functions, as it plays an essential role in the study of the DCC model of Engle and Sheppard (2001): the DCC model is a linear combination of the individual GARCH models and, more interestingly, the correlation matrix of the DCC model originates from the GARCH models. Suppose that we have the following return process

$$r_t = \mu_t + \xi_t, \qquad (2.1.0.1)$$

where the conditional expectation is $\mu_t = E[r_t \mid \mathcal{F}_{t-1}]$, $\xi_t$ is the conditional error, and $\mathcal{F}_{t-1} = \sigma(\{r_s : s \le t-1\})$ is the sigma field generated by the values of $\{r_t\}$ up to time $t-1$. Furthermore, assume that the conditional error is the conditional standard deviation of the return times an i.i.d. normally distributed zero-mean, unit-variance stochastic variable. Hence,

$$\xi_t \mid \mathcal{F}_{t-1} = \sqrt{h_t}\, z_t \sim N(0, h_t), \quad \text{where } z_t \sim N(0, 1). \qquad (2.1.0.2)$$

Note that $h_t$ and $\xi_t$ are assumed to be independent of time $t$. Finally, assume that $\mu_t = 0$, which gives us

$$r_t = \sqrt{h_t}\, z_t \quad \text{and} \quad r_t \mid \mathcal{F}_{t-1} \sim N(0, h_t). \qquad (2.1.0.3)$$

If $\mu_t \neq 0$ the process could be either ARMA filtered or demeaned. However, for $\mu_t = 0$ the variances of the returns and of the errors coincide, so $\xi_t$ is an innovation process. Bollerslev (1986) stated the GARCH(p,q) process, which in general consists of three terms: the weighted long-run variance $\omega$, the autoregressive term $\sum_{i=1}^{p} \gamma_i h_{t-i}$ (the sum of the previous lagged variances times the assigned weight for each lagged variance), and the moving average term $\sum_{i=1}^{q} \delta_i \xi_{t-i}^2$ (the sum of the previous lags of squared innovations times the assigned weight for each lagged squared innovation). Hence, the process can be written as

$$h_t = \omega + \sum_{i=1}^{p} \gamma_i h_{t-i} + \sum_{i=1}^{q} \delta_i \xi_{t-i}^2, \qquad (2.1.0.4)$$

where $p \ge 0$, $q > 0$, $\omega \ge 0$, and $\gamma_j, \delta_j \ge 0$ for $j = 1, 2, \ldots$

One drawback with this model is that the innovations in the moving average term are raised to the power of two, so the model does not capture asymmetry of the errors. Glosten, Jagannathan and Runkle (1993) developed an extension of the GARCH model, GJR-GARCH, which embraces the asymmetry effect; it will not be analyzed further in this report. However, by definition the variance process is non-negative, which implies that the process $\{h_t\}_{t \ge 0}$ must be non-negative valued. Further and more detailed constraints on the GARCH(p,q) model can be found in Nelson and Cao (1992).

2.1.1 The Univariate GARCH(1,1) Model

One of the most common and popular applications of the generalized GARCH(p,q) model is the simple GARCH(1,1) model. This process has the following dynamics

$$h_t = \omega + \delta \xi_{t-1}^2 + \gamma h_{t-1}, \qquad \omega \ge 0,\ \gamma \ge 0,\ \delta \ge 0. \qquad (2.1.1.1)$$

The intuition behind this model is similar to the GARCH(p,q), except that the squared innovation and the variance term only contribute with one lag each. If we successively apply backward recursion of $h_t$ all the way up to time $t - T$, we get the following expression

$$\begin{aligned} h_t &= \omega\left(1 + \gamma + \gamma^2 + \cdots + \gamma^{T-1}\right) + \delta \sum_{k=1}^{T} \gamma^{k-1} \xi_{t-k}^2 + \gamma^T h_{t-T} \\ &= \omega \frac{1 - \gamma^T}{1 - \gamma} + \delta \sum_{k=1}^{T} \gamma^{k-1} \xi_{t-k}^2 + \gamma^T h_{t-T}. \end{aligned} \qquad (2.1.1.2)$$

Letting $T$ approach infinity, and since $\gamma \in (0, 1)$, we get the following limit

$$\lim_{T \to \infty} h_t = \frac{\omega}{1 - \gamma} + \delta \sum_{k=1}^{\infty} \gamma^{k-1} \xi_{t-k}^2. \qquad (2.1.1.3)$$

This shows that the current variance is an Exponentially Weighted Moving Average (EWMA) of the past squared innovations. There are nevertheless substantial differences between the GARCH(1,1) and the EWMA model: in the GARCH process the parameters must be estimated, and a mean-reverting behavior is incorporated in the model.


It is convenient to work with as few parameters as possible, and Engle and Mezrich (1996) introduced a method called "variance targeting" which makes the computation slightly easier. The idea behind variance targeting is as follows. Denote the unconditional variance by $\bar{h}$; then equation (2.1.1.1) can be re-written in terms of the unconditional variance

$$\begin{aligned} h_t - \bar{h} &= \omega - \bar{h} + \delta(\xi_{t-1}^2 - \bar{h}) + \gamma(h_{t-1} - \bar{h}) + \gamma\bar{h} + \delta\bar{h} \\ h_t &= \omega - (1 - \delta - \gamma)\bar{h} + (1 - \delta - \gamma)\bar{h} + \delta \xi_{t-1}^2 + \gamma h_{t-1}. \end{aligned} \qquad (2.1.1.4)$$

If we let $\omega = (1 - \delta - \gamma)\bar{h}$, the above equation becomes

$$h_t = (1 - \delta - \gamma)\bar{h} + \delta \xi_{t-1}^2 + \gamma h_{t-1}. \qquad (2.1.1.5)$$

The advantage of this model is that the unconditional variance can be expressed as $\bar{h} = \frac{\omega}{1 - \delta - \gamma}$, which makes the computation easier. Engle and Mezrich (1996) call this "variance targeting", as it forces the variance to take on a particular and plausible value. Such a moment condition is particularly attractive since it will be consistent regardless of whether the model (2.1.1.1) is correctly specified. Clearly, this holds only under the assumptions that $\gamma + \delta < 1$ and $\omega > 0$, $\delta > 0$, $\gamma > 0$. In order to ensure that the conditional variance $h_t$ remains non-negative with probability one, the conditions $\omega \ge 0$, $\delta \ge 0$ and $\gamma \ge 0$ are sufficient in this case. Another important feature that preserves the non-negativeness of the conditional variance $h_t$ is when the process is stationary. The GARCH(1,1) is weakly stationary, which means that neither the mean nor the autocovariance of the process depends on the time $t$, with expected value and covariance according to

$$E(r_t) = 0, \qquad \mathrm{Cov}(r_t, r_{t-s}) = \frac{\omega}{1 - \delta - \gamma} \quad \text{if and only if } \gamma + \delta < 1. \qquad (2.1.1.6)$$

Hence, the inequality constraints for the GARCH(1,1) when using variance targeting are $\omega > 0$, $\delta > 0$, $\gamma > 0$ and $\gamma + \delta < 1$, under which the process is covariance stationary.
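To make the recursion concrete, the sketch below computes a conditional variance path from equation (2.1.1.5) under variance targeting, where only $(\delta, \gamma)$ are free and $\omega$ is implied by the sample variance. The returns and parameter values are simulated and hypothetical; this is an illustration of the mechanics, not of the estimation procedure used later in the thesis.

```python
import numpy as np

def garch11_variance_targeting(returns, delta, gamma):
    """Conditional variance path h_t from equation (2.1.1.5).

    Variance targeting: omega = (1 - delta - gamma) * hbar, where hbar is the
    sample (unconditional) variance, so only (delta, gamma) remain free.
    Assumes delta > 0, gamma > 0 and delta + gamma < 1.
    """
    r = np.asarray(returns, dtype=float)
    hbar = r.var()
    omega = (1.0 - delta - gamma) * hbar
    h = np.empty_like(r)
    h[0] = hbar                                   # initialise at the long-run variance
    for t in range(1, len(r)):
        h[t] = omega + delta * r[t - 1]**2 + gamma * h[t - 1]
    return h

# Hypothetical daily log-returns and parameter values, for illustration only
rng = np.random.default_rng(1)
r = 0.01 * rng.standard_normal(1000)
h = garch11_variance_targeting(r, delta=0.05, gamma=0.90)
print(h[:5])
```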

2.2 The Multivariate Dynamic Conditional Correlation

The Multivariate Dynamic Conditional Correlation (DCC) model is an extension of the univariate GARCH model given in the previous section. Instead of the one-dimensional case, suppose that we have a portfolio consisting of $n$ assets. Let $r_t = (r_{1,t}, r_{2,t}, \ldots, r_{n,t})'$ be an $n$-dimensional column vector of asset returns at time $t$, such that $r_t$ is normally distributed with $E[r_t \mid \mathcal{F}_{t-1}] = 0$ and covariance matrix $H_t = E[r_t r_t' \mid \mathcal{F}_{t-1}]$, where $\mathcal{F}_{t-1}$ is the complete set of asset returns up to time $t-1$. For example, $r_t$ could be the returns of stocks in the S&P 500 equity index. Then we get that

$$r_t = H_t^{1/2} z_t, \qquad r_t \mid \mathcal{F}_{t-1} \sim N(0, H_t), \qquad (2.2.0.7)$$

where $z_t = (z_{1,t}, z_{2,t}, \ldots, z_{n,t})' \sim N(0, I_n)$ and $I_n$ is the identity matrix of order $n$. One way to obtain $H_t^{1/2}$ is by applying a Cholesky decomposition of $H_t$. Furthermore, the conditional covariance matrix in the DCC model, developed by Engle (2002), is decomposed into a relation between the estimated univariate GARCH variances ($D_t$) and the conditional correlation matrix ($R_t$):

$$H_t = D_t R_t D_t. \qquad (2.2.0.8)$$

Clearly, $H_t$ and $R_t$ are positive definite when there are no linear dependencies in the returns. We need to ensure that all correlation and covariance matrices are positive definite, and thus that all variances are non-negative. These matrices are in fact stochastic processes and need to be positive definite with probability one, so all past covariance matrices must also be positive definite. If not, there exist linear combinations of $r_t$ that give negative or zero variances. Furthermore, $D_t$ is a diagonal matrix of the estimated univariate GARCH variances, i.e.

$$D_t = \begin{pmatrix} \sqrt{h_{1,t}} & 0 & \cdots & 0 \\ 0 & \sqrt{h_{2,t}} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{h_{n,t}} \end{pmatrix}. \qquad (2.2.0.9)$$

The elements in $D_t$ are specified as in Section 2.1, but any GARCH(p,q) process with normally distributed errors which fulfills the stationarity requirements and the non-negativity conditions will work. Moreover, $R_t$ is defined as

$$R_t = \begin{pmatrix} 1 & q_{12,t} & q_{13,t} & \cdots & q_{1n,t} \\ q_{21,t} & 1 & q_{23,t} & \cdots & q_{2n,t} \\ q_{31,t} & q_{32,t} & 1 & \cdots & q_{3n,t} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ q_{n1,t} & q_{n2,t} & q_{n3,t} & \cdots & 1 \end{pmatrix} \qquad (2.2.0.10)$$

(2.2.0.10)

and is the conditional correlation matrix of the standardized residuals t = Dt−1rt ∼ N (0, Rt). Before we go further and analyze Rt, we should take a step back and evaluate the covariance matrix Ht. We know by definition that the covariance matrix is positive definite. Further we know from (2.2.0.8) that Ht is

(19)

on quadratic form based on Rt. Then it follows that Rt must be positive definite in order to ensure that Ht is positive definite. Hence, by the definition of conditional correlation matrix, all elements in Rt must satisfy the requirement that they are less or equal to one. To guarantee that these requirements are met, Rt is decomposed to Rt= Q∗−1t QtQ∗−1t where Q∗−1t ensures that all elements in Qt fulfills the requirement

|qij| ≤ 1. Note that Qtis positive definite.

$$Q_t^{*-1} = \begin{pmatrix} \frac{1}{\sqrt{q_{11,t}}} & 0 & \cdots & 0 \\ 0 & \frac{1}{\sqrt{q_{22,t}}} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{\sqrt{q_{nn,t}}} \end{pmatrix}. \qquad (2.2.0.11)$$

Let us assume that $Q_t$ follows the dynamics

$$Q_t = \Omega + \alpha\, \epsilon_{t-1}\epsilon_{t-1}' + \beta Q_{t-1}, \qquad \Omega = (1 - \alpha - \beta)\bar{R}, \qquad \bar{R} = \mathrm{Cov}(\epsilon_t) = E(\epsilon_t \epsilon_t'), \qquad (2.2.0.12)$$

where $\alpha, \beta$ are scalars. The proposed structure of $Q_t$ might be considered complicated, but if we compare it with the structure derived in Section 2.1.1, where the GARCH(1,1) model is derived, things make sense. Notice that the structure of $Q_t$ is nearly identical to the GARCH(1,1) case with variance targeting. In particular, this dynamical structure is called "mean reverting", an analogue of the "Scalar GARCH". However, one drawback of this model is that all correlations are assumed to follow the same structure.

Engle and Sheppard (2002) extended the simple model to a more general structure, called the DCC(P, Q) model. The correlation structure of this model is defined as,

$$Q_t = \Big(1 - \sum_{i=1}^{P} \alpha_i - \sum_{j=1}^{Q} \beta_j\Big)\bar{R} + \sum_{i=1}^{P} \alpha_i\, \epsilon_{t-i}\epsilon_{t-i}' + \sum_{j=1}^{Q} \beta_j Q_{t-j}. \qquad (2.2.0.13)$$

In the remainder of this paper we only consider the DCC(1,1) model. For more information about alternative procedures, see Engle and Sheppard (2002).

The specification and estimation of the DCC model contain three general steps. The first step is to "DE-GARCH" the data, which means that the volatilities must be estimated in order to construct standardized residuals (or volatility-adjusted returns). Secondly, we use these standardized residuals to estimate the quasi-correlation matrix $Q_t$. The third step is to rescale the quasi-correlation matrix so that it becomes a valid correlation matrix, since the quasi-correlation is an approximation of the true correlation: some elements in the quasi-correlation matrix may not belong to the defined region of correlation, $[-1, 1]$, which in theory is not possible. Therefore, we need to adjust these mishaps after the first estimation is completed.

2.2.1 Step 1: DE-GARCHING

The first step is to construct the standardized residuals, or volatility-adjusted returns. Recall that for the DCC model we have

$$H_t = D_t R_t D_t, \qquad (2.2.1.1)$$
$$D_t^2 = \mathrm{diag}[H_t]. \qquad (2.2.1.2)$$

We know from the previous section that the conditional correlation matrix is the covariance matrix of the standardized residuals, given by

$$R_t = \mathrm{Cov}(D_t^{-1} r_t \mid \mathcal{F}_{t-1}) = E[\epsilon_t \epsilon_t' \mid \mathcal{F}_{t-1}]. \qquad (2.2.1.3)$$

All the information we need to estimate the conditional correlation is captured in these standardized residuals. But estimating $H_t$ directly is difficult, so it is convenient to divide the estimation procedure into two operations. First, we estimate the diagonal elements and then use these estimates to determine the elements not belonging to the diagonal. The diagonal elements of $D_t$ are the expected standard deviations of each asset with respect to the complete set of information $\mathcal{F}_{t-1}$. Hence,

$$H_{i,i,t} = E[r_{i,t}^2 \mid \mathcal{F}_{t-1}]. \qquad (2.2.1.4)$$

The issue that has gained a lot of attention over the years is finding an appropriate model to estimate the conditional variance. Bollerslev (1986) provides a short answer and argues that the variance of a random variable, conditioned on its past information, may be represented by a simple GARCH model. Therefore, we consider the standard GARCH(1,1) in this case, defined as

$$H_{i,i,t} = \omega_i + \alpha_i r_{i,t-1}^2 + \beta_i H_{i,i,t-1}. \qquad (2.2.1.5)$$

Thus, every univariate process in a multivariate portfolio of assets can be estimated using the above model to get its conditional variance, so the standardized residuals are

$$\epsilon_{i,t} = \frac{r_{i,t}}{\sqrt{H_{i,i,t}}}. \qquad (2.2.1.6)$$
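A minimal sketch of this DE-GARCHING step is given below: for each asset we run the GARCH(1,1) recursion (2.2.1.5) and divide the returns by the resulting conditional standard deviations, as in (2.2.1.6). The GARCH parameters are assumed to be already estimated (here they are simply hypothetical values).

```python
import numpy as np

def degarch(returns, params):
    """Step 1 (DE-GARCHING): volatility-adjusted returns for each asset.

    returns : (T, n) array of asset returns
    params  : list of assumed, already-estimated GARCH(1,1) parameters,
              params[i] = (omega_i, alpha_i, beta_i) for asset i
    Returns eps with eps[t, i] = r[t, i] / sqrt(H_ii_t), cf. (2.2.1.6).
    """
    r = np.asarray(returns, dtype=float)
    T, n = r.shape
    eps = np.empty_like(r)
    for i in range(n):
        omega, alpha, beta = params[i]
        h = np.empty(T)
        h[0] = r[:, i].var()                     # start at the sample variance
        for t in range(1, T):
            h[t] = omega + alpha * r[t - 1, i]**2 + beta * h[t - 1]
        eps[:, i] = r[:, i] / np.sqrt(h)
    return eps

# Illustration with simulated returns and hypothetical parameters
rng = np.random.default_rng(2)
r = 0.01 * rng.standard_normal((500, 3))
params = [(1e-6, 0.05, 0.90)] * 3
eps = degarch(r, params)
print(eps.std(axis=0))                           # roughly one for each asset
```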

2.2.2 Step 2: Estimating the Quasi-Correlation

In this step we are going to calculate the quasi-correlation matrix $Q_t$ by using the mean-reverting model (2.2.0.12), which embraces the assumption that most correlation changes are temporary and mean-reverting. This specification gives us the dynamics of the quasi-correlation process between assets $i$ and $j$ in the mean-reverting model, specified as

$$Q_{i,j,t} = \omega_{i,j} + \alpha\, \epsilon_{i,t-1}\epsilon_{j,t-1} + \beta Q_{i,j,t-1}. \qquad (2.2.2.1)$$

In matrix notation, we can write the above process simply as

$$Q_t = \Omega + \alpha\, \epsilon_{t-1}\epsilon_{t-1}' + \beta Q_{t-1}. \qquad (2.2.2.2)$$

Correspondingly, there are two unknown parameters in the dynamical part, $(\alpha, \beta)$, and $\frac{1}{2}N(N-1)$ unknowns in the intercept matrix. However, there is a simple estimator available for the parameters in the intercept matrix, called "correlation targeting" (compare with variance targeting). This simple estimator essentially amounts to using an estimate of the unconditional correlations among the volatility-adjusted returns (Engle, 2009). More explicitly, using

$$\hat{\Omega} = (1 - \alpha - \beta)\bar{R}, \qquad (2.2.2.3)$$

where $\bar{R} = \frac{1}{T}\sum_{t=1}^{T} \epsilon_t \epsilon_t'$, decreases the number of unknown parameters to two. This is something we need to consider and take into account when we evaluate the properties of the estimator. Now, combining (2.2.2.3) and (2.2.2.2) gives us the dynamics of the mean-reverting DCC model:

$$Q_t = \bar{R} + \alpha(\epsilon_{t-1}\epsilon_{t-1}' - \bar{R}) + \beta(Q_{t-1} - \bar{R}). \qquad (2.2.2.4)$$

Accordingly, we can assure that $Q_t$ is positive definite (PD) if the initial value $Q_1$ is PD and if

$$\alpha > 0, \qquad \beta > 0, \qquad \alpha + \beta < 1, \qquad (1 - \alpha - \beta) > 0. \qquad (2.2.2.5)$$

An alternative way of seeing this is that each subsequent $Q_t$ is a sum of positive definite or positive semi-definite matrices, so $Q_t$ must be PD. How does this model behave? As we already know, the off-diagonal elements of $Q_t$ evolve over time in response to new information in the returns. If the returns move in the same direction (either up or down), the correlation will rise and remain above its average level for a while. However, as time goes by, the correlation will fall back to its long-run level as the information decays. Conversely, if the returns move in opposite directions relative to each other, the correlation will (temporarily) fall below its unconditional value. The speed of adjustment is controlled by the parameters $(\alpha, \beta)$, which we need to estimate from the data set. Notice that this is a rather weak specification, as only $\alpha$ and $\beta$ are used regardless of the size of the system. A sketch of this recursion is given below.
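The following sketch computes the quasi-correlation matrices from the mean-reverting recursion (2.2.2.4), given standardized residuals from Step 1 and assumed values of $\alpha$ and $\beta$; it is an illustration of the dynamics, not of the estimation.

```python
import numpy as np

def quasi_correlations(eps, alpha, beta):
    """Step 2: quasi-correlation matrices Q_t from equation (2.2.2.4).

    eps         : (T, n) array of standardized residuals from DE-GARCHING
    alpha, beta : assumed DCC parameters with alpha, beta > 0 and alpha + beta < 1
    """
    eps = np.asarray(eps, dtype=float)
    T, n = eps.shape
    R_bar = eps.T @ eps / T                    # correlation target (unconditional)
    Q = np.empty((T, n, n))
    Q[0] = R_bar                               # initialise at the long-run level
    for t in range(1, T):
        e = eps[t - 1][:, None]                # column vector eps_{t-1}
        Q[t] = R_bar + alpha * (e @ e.T - R_bar) + beta * (Q[t - 1] - R_bar)
    return Q

# Example usage with hypothetical parameter values
rng = np.random.default_rng(4)
eps = rng.standard_normal((500, 3))
Q = quasi_correlations(eps, alpha=0.03, beta=0.95)
```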


2.2.3 Step 3: Rescaling the Quasi-Correlation

The matrix $Q_t$ is only an approximation of the correlation matrix: its diagonal elements are not exactly one for every observation, and its elements may fall outside the defined interval $[-1, 1]$. Therefore, we cannot ensure that $Q_t$ is a correlation matrix. This problem can be solved by rescaling the matrix. We can simply estimate the correlation as

$$\rho_{i,j,t} = \frac{Q_{i,j,t}}{\sqrt{Q_{i,i,t}\, Q_{j,j,t}}}. \qquad (2.2.3.1)$$

While the expected values of $Q_{i,i,t}$ and $Q_{j,j,t}$ are one, they are not estimated to be exactly one at every point in time. We denote this operation rescaling, and in matrix form it reads

$$R_t = \mathrm{diag}(Q_t)^{-1/2}\, Q_t\, \mathrm{diag}(Q_t)^{-1/2}. \qquad (2.2.3.2)$$

This will introduce nonlinearity into the estimator: in general $Q_t$ is linear in the cross products and squares of the data, but after rescaling this is no longer the case for $R_t$. Thus, $R_t$ will not be an unbiased estimator of the correlation, and the forecasts are biased as well. This is true for all multivariate GARCH methods, due to the obvious fact that correlations are bounded while the data are not.
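A short sketch of this rescaling step, applied to a single quasi-correlation matrix produced by the Step 2 sketch above:

```python
import numpy as np

def rescale(Q):
    """Step 3: rescale a quasi-correlation matrix Q_t into a valid correlation
    matrix R_t = diag(Q_t)^{-1/2} Q_t diag(Q_t)^{-1/2}, cf. (2.2.3.2)."""
    d = 1.0 / np.sqrt(np.diag(Q))
    return Q * np.outer(d, d)      # element-wise: q_ij / sqrt(q_ii * q_jj)

# Example: R_t = rescale(Q[t]) for a Q[t] from the Step 2 sketch
```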

2.2.4 Estimation of the DCC Model

To estimate the DCC model, we make an assumption about the distribution of the data being used. Once this is done, the problem can be restated as a maximum likelihood problem. We will assume that the data have a multivariate normal distribution with a given covariance structure and mean. Moreover, the estimator will be a quasi maximum likelihood estimator: it will be inefficient but consistent, since the covariance and mean models can be accurate while the distributional assumption is inaccurate. Recall that if $r_t \mid \mathcal{F}_{t-1} \sim N(0, D_t R_t D_t)$, then

$$\begin{aligned} D_t^2 &= \mathrm{diag}(H_t) \\ H_{i,i,t} &= \omega_i + \alpha_i r_{i,t-1}^2 + \beta_i H_{i,i,t-1} \\ \epsilon_t &= D_t^{-1} r_t \\ R_t &= \mathrm{diag}(Q_t)^{-1/2}\, Q_t\, \mathrm{diag}(Q_t)^{-1/2} \\ Q_t &= \Omega + \alpha\, \epsilon_{t-1}\epsilon_{t-1}' + \beta Q_{t-1}, \end{aligned} \qquad (2.2.4.1)$$

where $(\alpha_i, \beta_i)$ are positive for all $i$ and sum to less than unity. In order to estimate the parameters $\theta = (\phi, \varphi) = (\omega_1, \delta_1, \gamma_1, \ldots, \omega_n, \delta_n, \gamma_n, \alpha, \beta)$ of $H_t$ for the data set $r_t = (r_{1,t}, \ldots, r_{n,t})$, we can set up the following log-likelihood equation (since the errors are assumed to be multivariate normally distributed):

$$\begin{aligned} L(r_t, \theta) &= -\frac{1}{2}\sum_{t=1}^{T}\Big(n\log(2\pi) + \log|H_t| + r_t' H_t^{-1} r_t\Big) \\ &= -\frac{1}{2}\sum_{t=1}^{T}\Big(n\log(2\pi) + \log|D_t R_t D_t| + r_t' D_t^{-1} R_t^{-1} D_t^{-1} r_t\Big) \\ &= -\frac{1}{2}\sum_{t=1}^{T}\Big(n\log(2\pi) + 2\log|D_t| + \log|R_t| + \epsilon_t' R_t^{-1} \epsilon_t\Big) \\ &= -\frac{1}{2}\sum_{t=1}^{T}\Big(n\log(2\pi) + 2\log|D_t| + r_t' D_t^{-2} r_t - \epsilon_t' \epsilon_t + \log|R_t| + \epsilon_t' R_t^{-1} \epsilon_t\Big). \end{aligned} \qquad (2.2.4.2)$$

The above function can be maximized with respect to each parameter $\theta$ in the model. In particular, the first three terms contain the returns and the variance parameters, and the remaining parts contain the correlation parameters as well as the volatility-adjusted returns. Hence, we can split the function into two separate parts, namely

$$\begin{aligned} L(r_t, \theta) &= L_1(r_t, \phi) + L_2(\epsilon_t, \varphi) \\ &= -\frac{1}{2}\sum_{t=1}^{T}\Big(n\log(2\pi) + 2\log|D_t| + r_t' D_t^{-2} r_t\Big) - \frac{1}{2}\sum_{t=1}^{T}\Big(-\epsilon_t' \epsilon_t + \log|R_t| + \epsilon_t' R_t^{-1} \epsilon_t\Big). \end{aligned}$$

Two-step estimation of the parameters

To estimate the parameters of the variance matrix $H_t$ we use a two-step estimation method according to Engle (2009). First, we maximize the variance part of the log-likelihood function, where we treat each time series as independent and simply compute the univariate GARCH model for each time series. Hence, the $R_t$ matrix is replaced with the identity matrix $I_n$ and the variance part then becomes

$$\begin{aligned} L_1(r_t, \phi) &= -\frac{1}{2}\sum_{t=1}^{T}\Big(n\log(2\pi) + 2\log|D_t| + \log|I_n| + r_t' D_t^{-1} I_n D_t^{-1} r_t\Big) \\ &= -\frac{1}{2}\sum_{t=1}^{T}\Big(n\log(2\pi) + 2\log|D_t| + r_t' D_t^{-1} D_t^{-1} r_t\Big) \\ &= -\frac{1}{2}\sum_{t=1}^{T}\sum_{i=1}^{n}\Big(\log(2\pi) + \log(h_{i,t}) + \frac{r_{i,t}^2}{h_{i,t}}\Big) \\ &= -\frac{1}{2}\sum_{i=1}^{n}\Bigg(T\log(2\pi) + \sum_{t=1}^{T}\Big(\log(h_{i,t}) + \frac{r_{i,t}^2}{h_{i,t}}\Big)\Bigg). \end{aligned}$$

This equation helps us to estimate the parameters $\hat{\phi} = (\hat{\omega}_1, \hat{\delta}_1, \hat{\gamma}_1, \ldots, \hat{\omega}_n, \hat{\delta}_n, \hat{\gamma}_n)$ for each univariate GARCH process of $r_t$. Further, as $h_{i,t}$ is estimated for $t \in [1, T]$, all elements in $D_t$ are estimated over the same period. The second step involves estimating the parameters of the correlation part, i.e. $\varphi = (\alpha, \beta)$, conditioned on the previously estimated parameters $\hat{\phi}$ from the first step. We have from the log-likelihood equation

$$\begin{aligned} L_2(r_t, \varphi \mid \hat{\phi}) &= -\frac{1}{2}\sum_{t=1}^{T}\Big(n\log(2\pi) + 2\log|D_t| + \log|R_t| + \epsilon_t' R_t^{-1} \epsilon_t\Big) \\ &\approx -\frac{1}{2}\sum_{t=1}^{T}\Big(\log|R_t| + \epsilon_t' R_t^{-1} \epsilon_t\Big). \end{aligned} \qquad (2.2.4.5)$$

The approximation arises because the first two terms, $n\log(2\pi) + 2\log|D_t|$, are constant with respect to the correlation parameters and we are only interested in optimizing the remaining parts, which involve the $R_t$ matrix. The residuals $\epsilon_t$ are calculated according to (2.2.4.1) and the covariance matrix is then estimated by $\bar{R} = \frac{1}{T}\sum_{t=1}^{T} \epsilon_t \epsilon_t'$. Recall the specification of the correlation matrix below; together with the imposed restriction on $\Omega$, it gives the dynamics of the correlation

$$Q_t = \Omega + \alpha\, \epsilon_{t-1}\epsilon_{t-1}' + \beta Q_{t-1}, \qquad \Omega = \bar{R}(1 - \alpha - \beta). \qquad (2.2.4.6)$$
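To make the second estimation step concrete, the sketch below maximizes the approximate second-step log-likelihood (2.2.4.5) over $(\alpha, \beta)$, given standardized residuals from the first step and using correlation targeting for $\bar{R}$. It relies on a generic numerical optimizer; the simulated residuals and starting values are purely illustrative.

```python
import numpy as np
from scipy.optimize import minimize

def neg_L2(params, eps):
    """Negative second-step log-likelihood (2.2.4.5) as a function of (alpha, beta)."""
    alpha, beta = params
    if alpha <= 0 or beta <= 0 or alpha + beta >= 1:
        return np.inf                          # enforce the DCC constraints
    T, n = eps.shape
    R_bar = eps.T @ eps / T                    # correlation targeting
    Q = R_bar.copy()
    ll = 0.0
    for t in range(T):
        d = 1.0 / np.sqrt(np.diag(Q))
        R = Q * np.outer(d, d)                 # rescaled correlation matrix R_t
        e = eps[t]
        ll += np.log(np.linalg.det(R)) + e @ np.linalg.solve(R, e)
        Q = R_bar + alpha * (np.outer(e, e) - R_bar) + beta * (Q - R_bar)
    return 0.5 * ll

# Simulated standardized residuals standing in for the DE-GARCHED data
rng = np.random.default_rng(3)
eps = rng.standard_normal((500, 3))
res = minimize(neg_L2, x0=np.array([0.05, 0.90]), args=(eps,), method="Nelder-Mead")
alpha_hat, beta_hat = res.x
print(alpha_hat, beta_hat)
```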


Chapter 3

Synchronization

3.1 The Data Set

The data set that we consider in this report includes equity index returns from global markets around the world. In particular, we consider daily log-returns, as they capture the daily compounding and the new information available from the market (A.0.4.1). The goal is to study and shed light on the asynchronous properties of these global equity indices, especially when there are no common trading hours. There are, however, some markets with partial overlaps, such as the U.K. and the U.S., where both markets are open at the same time. From now on, let us consider a global portfolio G of six equity indices, namely the S&P500 (U.S.), FTSE100 (U.K.), OMXS30 (Sweden), DAX (Germany), HSCE (Hong Kong) and NIKKEI (Japan), for the period February 2006 to March 2013. They were selected due to their size and trading hours, which are given in the table below.

Table 3.1: Trading Hours

Stock Exchange(Index) Open(UTC) Close(UTC) Lunch

Japan (NIKKEI 225) 00:00 06:00 02:30-03:50

Hong Kong (HSCE) 01:20 08:00 04:00-05:00

Germany (DAX30) 07:00 21:00 No

UK (FTSE100) 08:00 16:30 No

Sweden (OMXS30) 08:00 16:30 No

U.S. (S&P500) 14:30 21:00 No

One concern that needs to be addressed before the evaluation of the portfolio G is the treatment of missing data points. The portfolio may contain gaps where no closing price (no return) has been observed. The reason for this is mostly bank holidays, which create an asymmetry between the markets. Occasionally it has also happened that something catastrophic occurred in a country, causing its markets to close temporarily. To solve this gap issue and fill the missing data points in our time series, we apply a classic linear interpolation method (Meijering, 2002); a sketch of this step is given below. Naturally, there are other methods for filling gaps in a time series, such as the 'spline' method (a smooth, piecewise-defined polynomial function), which are not considered in this paper. As mentioned above, the log-returns capture the daily information available among the markets, but they also exhibit another beneficial characteristic. By plotting the time series for each index, we see that the log-returns exhibit the sought stationarity property, as they fluctuate around a common mean (Brockwell and Davis, 1991). It is often desirable to deal with a stationary data set, primarily because this trait allows us to assume that models are independent of a particular starting point. The time series and log-returns are shown in Appendix C.
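A minimal sketch of the gap-filling step, assuming the missing closing prices are marked as NaN. The interpolation here is done on the log-price series, which amounts to assuming a constant log-return across the gap; interpolating the raw prices directly would be a close alternative.

```python
import numpy as np

def fill_gaps(prices):
    """Fill missing closing prices (NaNs, e.g. bank holidays) by linear
    interpolation on the log-price series."""
    p = np.log(np.asarray(prices, dtype=float))
    idx = np.arange(len(p))
    known = ~np.isnan(p)
    p_filled = np.interp(idx, idx[known], p[known])
    return np.exp(p_filled)

# Hypothetical index closing prices with one missing observation
prices = [100.0, 101.5, np.nan, 103.1, 102.4]
print(fill_gaps(prices))
```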

3.2 Synchronization of the Data

The synchronization technique presented in this section is influenced by the approach proposed by Burns, Engle and Mezrich (1998). For example, consider a sub-portfolio of G containing only the U.S. market and the U.K. market. To measure the value of this portfolio when the market is closed in the U.S., we need to estimate the value of the corresponding shares traded in the U.K. Assume that the market in the U.S. falls by 5 percent after the market close in the U.K.; then it is inappropriate to value the portfolio at the U.S. closing time.

More generally, for the portfolio G, let us denote by $S_{t,j}$ the continuously measured price of an equity index $j$, such that $S_{t,j} \in G$. Let $t$, for $t \in \mathbb{N}$, be the daily local time in the U.S., so that $S_{1,1}$ is the price of the U.S. equity index at 16:00 on the first day of trading. The closing time in the U.S. at 16:00 corresponds to 22:00 in the U.K., as the U.K. closes 4 hours before the U.S. Denote the observed closing price of the equity index in the U.K. on the first trading day by $S_{0.83,2}$ (Burns, Engle and Mezrich, 1998).

The observed data are captured at the closing times of the markets and have the structure $S_{t_j,j}$, where $t_j = t_1 - c_j$ $(0 \le c_j \le 1)$, $j = 1, \ldots, 6$. We have to synchronize to $t_1$, the closing time in the U.S. of equity index $j = 1$, where $t_1 \in \{1, 2, \ldots, T\}$. Our intention is to produce synchronized prices $S_{t,j}^{s}$, where $t \in \{1, 2, \ldots, T\}$, for all equity indices. The synchronized prices $S_{t,j}^{s}$ are defined as

$$\log(S_{t,j}^{s}) = E[\log(S_{t,j}) \mid \mathcal{F}_t], \quad \text{where } \mathcal{F}_t = \{S_{t_j,j};\ t_j \le t,\ j = 1, \ldots, 6\}, \qquad (3.2.0.1)$$

where the logarithms are consistent with continuously compounded returns. Thus, the synchronized log prices are the best predicted log prices at $t$ given the complete information $\mathcal{F}_t$, where $\mathcal{F}_t$ contains all recorded prices up to time $t$. Moreover, the complete information contains only the prices $S_{t_j,j}$ with $t_j \le t$. A strict relationship $t_j < t$ is often observed when equity index $j$ is trading in a market with a different closing time than the first equity index in the U.S. For the equity index in the U.S., the closing price $S_t$ observed at time $t \in \mathbb{N}$ has as its conditional expectation, given $\mathcal{F}_t$, the observed price itself. In the case when a market closes before $t$, its past closing prices and closing prices from other markets can be useful for predicting $S_t$ at time $t$. For simplicity, assume that, given $\mathcal{F}_t$, the predicted log prices at $t$ and at the following closing time $t_j + 1$ are the same. Thus, future changes from predictions at time $t$ to predicted returns at $t_j + 1$ are not predictable:

$$\log(S_{t,j}^{s}) = E[\log(S_{t,j}) \mid \mathcal{F}_t] = E[\log(S_{t_j+1,j}) \mid \mathcal{F}_t], \qquad t_j \le t < t_j + 1 \ (t \in \mathbb{N}). \qquad (3.2.0.2)$$
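The following sketch shows one way the definition (3.2.0.2) could be evaluated in practice, under the assumption (suggested by the modification described in the abstract) that the asynchronous log-return vector follows a first-order VAR, $r_t = A r_{t-1} + e_t$, with a coefficient matrix $A$ of the kind estimated in Section 3.2.1 (not shown in this excerpt). All numbers and the structure of $A$ are hypothetical; the point is only that the synchronized log price adds the component of the next close-to-close return that is predictable from today's information set.

```python
import numpy as np

def synchronize_log_prices(log_close, returns, A):
    """Sketch of definition (3.2.0.2) under an assumed VAR(1) model r_t = A r_{t-1} + e_t.

    log_close : (n,) last observed log closing prices log(S_{t_j, j})
    returns   : (n,) asynchronous log-returns observed on day t
    A         : (n, n) assumed VAR(1) coefficient matrix (cf. the A-matrix of Section 3.2.1)

    Under the VAR(1) assumption E[r_{t+1} | F_t] = A r_t, so the synchronized
    log price is log(S^s_{t,j}) = log(S_{t_j, j}) + (A r_t)_j.
    """
    return np.asarray(log_close, dtype=float) + np.asarray(A) @ np.asarray(returns, dtype=float)

# Hypothetical two-market example (illustrative numbers only)
log_close = np.log([1500.0, 1200.0])     # a late-closing and an early-closing index
r_t = np.array([0.010, -0.004])          # today's observed asynchronous returns
A = np.array([[0.00, 0.00],              # late-closing index: no adjustment assumed
              [0.35, 0.05]])             # early-closing index reacts to late-market news
print(np.exp(synchronize_log_prices(log_close, r_t, A)))
```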
