Exploring the Factors of the Credit Default Swap Spread in Different Business Sectors

DEGREE PROJECT IN TECHNOLOGY, FIRST CYCLE, 15 CREDITS

STOCKHOLM, SWEDEN 2017

KRISTOFER ENGMAN BETTY ÅLANDER

KTH ROYAL INSTITUTE OF TECHNOLOGY

Degree Projects in Applied Mathematics and Industrial Economics
Degree Programme in Industrial Engineering and Management
KTH Royal Institute of Technology, 2017

Supervisors at SEB: Morten Karlsmark, Salla Franzén
Supervisors at KTH: Pierre Nyquist, Hans Lööf
Examiner at KTH: Henrik Hult

TRITA-MAT-K 2017:05 ISRN-KTH/MAT/K--17/05--SE

Royal Institute of Technology School of Engineering Sciences KTH SCI

Abstract

In this study, we investigate the effect of market factors on credit default swap spreads aggregated by specific business sectors. The market factors include commodity spot prices, foreign exchange spot prices, equity index prices and interest swap rates. Using linear regression modelling, we find that many of the factors are correlated to the credit default swap spreads. To examine the collective effect of the factors on the credit default swap spread, we produce linear models using best subsets regression.

The empirical results suggest that many of the factors are significant in explaining the credit default swap spread. Our models show significance of regression at the 99% level, and most variables have correlations that are consistent with previous research. Notably, we find that the factors show different levels of significance for each of the sectors. Based on this investigation, we conclude that there do in fact exist relationships between the market factors and the credit default swap spread changes, and that these relationships are business sector specific.

The Effect of Market Factors on the Credit Default Swap Spread in Different Business Sectors

Summary

In this study, we examine the effect of market factors on credit default swap spreads aggregated by selected business sectors. The market factors included in the study are commodity spot prices, foreign exchange spot rates, equity index prices and interest swap rates. Through linear regression modelling, we find that many of the factors are correlated with the credit default swap spread. To examine the joint effect of the factors on the credit default swap spread, we construct linear models by testing all possible permutations of the variables.

The empirical results suggest that many of the factors are significant in their ability to explain the credit default swap spread. The regression models are significant at the 99% level, and the majority of the variables show correlations that reflect earlier research in the field. In particular, we see that the factors show different levels of significance for the different business sectors. We therefore conclude that there is a relationship between the market factors and the credit default swap spread, and that these relationships are business sector specific.

Acknowledgements

We would like to thank our supervisors at the Royal Institute of Technology (KTH), Dr. Pierre Nyquist at the Department of Mathematics and Prof. Hans Lööf at the Department of Industrial Economics and Management for the assistance before and during the study.

We would also like to thank Skandinaviska Enskilda Banken AB (SEB), and especially Dr. Morten Karlsmark and Dr. Salla Franzén, for the help with choosing a subject, providing us with data and for the valuable feedback.

1 Introduction
1.1 Background
1.2 Research Question
1.3 Goal and Purpose
1.4 Scope and Limitations
2 Theory
2.1 Counterparty Risk Theory
2.1.1 Lending Risk vs. Counterparty Risk
2.1.2 Pre-settlement Risk and Settlement Risk
2.1.3 Components of Counterparty Risk and Wrong-Way Risk
2.1.4 The Credit Default Swap
2.1.5 Pricing Credit Default Swaps
2.2 Mathematical Theory
2.2.1 Multiple Linear Regression
2.2.2 Underlying Assumptions
2.2.3 Method of Least Squares
2.2.4 t-test and F-test
2.2.5 Confidence Intervals of Estimated Regression Coefficients
2.2.6 Heteroscedasticity
2.2.7 Multicollinearity Diagnostics
2.2.8 Basic Statistical Measures
2.2.9 Model Evaluation Methods
2.2.10 Variable Selection Techniques
2.2.11 Transformations
2.3 Literature Review
3 Methodology
3.1 Data
3.1.1 Response Variables
3.1.2 Predictor Variables
3.1.3 Data Structuring
3.2 Model Analysis
3.2.1 Structure of Regression Model
3.2.2 Residual Analysis
3.2.3 Variable Selection
3.2.4 Reduced Model Summary
4 Results
4.1 Residual Analysis
4.1.1 Normal Q-Q Plot and Heteroscedasticity
4.1.2 Multicollinearity
4.2 Variable Selection
4.3 Model Choice and Resulting Models
5 Discussion
5.1 Discussion of Residual Analysis
5.2 Discussion of Final Models
5.2.1 Commodities
5.2.2 Foreign Exchange Spot Rates
5.2.3 Equity Indices
5.2.4 Interest Swap Rates
5.3 Evaluation of Chosen Methods
5.4 Conclusion
5.4.1 Further Studies
References
6 Appendix
6.1 Table 5: Predictor Variables
6.2 Table 6: Variance Inflation Factors
6.3 Figure 3: Residual analyses after transformation
6.4 Figure 4: Residual analyses before transformation
6.5 Figure 5: Correlation plot for the Basic Materials sector
6.6 Figure 6: Correlation plot for the Energy sector
6.7 Figure 7: Correlation plot for the Financials sector
6.8 Table 7: Models for the Basic Materials sector after best subsets regression
6.9 Table 8: Models for the Energy sector after best subsets regression
6.10 Table 9: Models for the Financials sector after best subsets regression

1 Introduction

1.1 Background

At the turn of the century, credit derivatives became increasingly popular in the derivatives market. The market for credit derivatives grew from a total notional principal of $800 billion in 2000 to a high point of $50 trillion before the financial crisis of 2007, after which it decreased and subsequently stabilised [1]. Credit derivatives are an integral part of managing counterparty credit risk, which is the risk resulting from the failure of a counterparty to fulfill its contractual obligations in a financial contract. Historically, counterparty risk was generally managed by performing trades with counterparties regarded as having solid finances [2]. Counterparty risk has always been integral for risk management, but until the financial crisis, its importance was obscured by the myth of the "too big to fail" institutions.

In light of the financial distress caused by the financial crisis, the resulting general decrease in credit quality on the market, as well as new regulations, the interest in counterparty risk skyrocketed [2]. A key method of mitigating counterparty risk is through hedging with a credit derivative known as the credit default swap (CDS). Therefore, it is of great interest to determine the factors that affect the CDS spread for monitoring purposes.

The most well-known models for pricing these kinds of derivatives are those of Black, Scholes, and Merton, known as the structural models [3, 4]. Such models use the underlying assumption that a firm defaults when the value of the firm's assets drops below a certain threshold compared to its level of debt. Previous studies have proposed several empirical models of the CDS spread using linear regression with predictors based on economic theory. However, these studies were primarily concentrated on debt-related variables such as the firm's leverage ratio. Since the value of a firm's assets also plays a central role in the occurrence of firm default, we will in our study introduce asset-related variables, in addition to proxies for the market condition, to our models of the CDS spread.

1.2 Research Question

In this study, we formulate and analyse several models for the CDS spreads using multiple linear regression. We regress the CDS spreads of firms in different sectors against common market factors to see how well they correlate and to see if we can find a model that adequately describes the CDS spread. Thus, the research questions we want to answer with this study are:

Is there a correlation between business sector specific credit default swap spreads and common market factors?

and, if a correlation is found,

Can the credit default swap spread be modelled, using these factors, with linear regression modelling?

Following the results of the study, we will also look at how the models compare to previous research on the determinants of CDS spreads, and how they can be interpreted from a financial perspective.

1.3 Goal and Purpose

The purpose of this study is to test if our set of market factors can describe the CDS spread of specific business sectors, and to find a model using these factors. Because of the stochastic nature of the financial market, we expect the explanatory and predictive ability of the model to be moderate.

As stated in the Background, the CDS is a popular credit derivative, and following the financial crisis the value of the risk mitigation it provides has become all the more evident. The CDS is a relatively recent phenomenon, and empirical research on CDS modelling was still in its early stages in 2007 [5]. It would be of value to determine novel models of CDS spreads that financial institutions could use to predict how changes in market factors affect the CDS spread.

CDS spreads are also used as an implicit measure of the probability of default. One use of the probability of default is to monitor possible wrong-way risk. Since financial institutions perform transactions of large volumes, the credit risk associated with wrong-way risk is important to consider. An improved modelling of the probability of default through CDS spreads would therefore be valuable. By looking at the correlation between market factors and the CDS spread, the model can assist in monitoring wrong-way risk for companies in specific sectors.

Additionally, these models could be used in synergy with existing risk management models when making investment decisions, providing a short-term market-based estimation of the expected loss of the transaction. If our models indicate that an investment with a counterparty can be considered safe, the financial institution could increase the volume that it trades with said counterparty. This could be a profitable investment opportunity.

1.4 Scope and Limitations

To limit the size of this study, we have posed some limitations on its scope, primarily relating to the data that will be analysed.

The study will include analysis of CDS spreads corresponding to firms in the following business sectors: Basic Materials, Energy and Financials. These were chosen based on the empirical assumption that different predictors should be significant for each of these sectors, and based on the availability of data. To facilitate the calculations, we will consolidate the firms by business sector when we analyse the spreads and thus look at the CDS spread for a business sector as a whole instead of for a specific firm. This will be explained further in later sections.

The predictor variables used for the study are: commodity spot prices, equity indices, foreign exchange spot rates, and interest swap rates. These were chosen to represent a sample of the financial market, and because of the availability of data. In this study, we did not account for possible endogeneity arising from the chosen predictors.

Furthermore, we have limited the span of time from which the data is collected. The data period is from 2008-01-02 to 2017-03-15. This study will not account for the effect of time series in the data. However, the choice of looking at the weekly relative changes in the data will reduce the effect of time series. This will be explained further in the Methodology section.

2 Theory

In this section, we present the theory that underpins the study's relevance as well as its quantitative part. We first present the area of counterparty risk, followed by a short introduction to the mathematical methods used in the study. Finally, we present a short literature review on the subject.

2.1 Counterparty Risk Theory

Counterparty credit risk, commonly referred to as counterparty risk, is the risk that a counterparty of a financial contract cannot fulfill its contractual obligations due to insolvency. Counterparty risk is often likened to lending risk; however, there is a substantial difference in the risk environment between these transaction types. In the coming subsections, counterparty risk as a concept will be presented, followed by methods for mitigating said risk. Unless otherwise specified, the theory in this section is taken from the literature written by Jon Gregory and John Hull [1, 2].

2.1.1 Lending Risk vs. Counterparty Risk

In lending risk, the exposure at risk at any time during the lending period can generally be thought of as constant. This is because market variables such as interest rates will only create moderate levels of uncertainty over the amount owed. Moreover, only the bondholder (the lender) will take on credit risk. If the bond defaults, the issuer of a bond (the borrower) does not face a loss.

Similarly to lending risk, the cause of loss in counterparty risk is the counterparty being unable or unwilling to meet the contractual obligations. In this case, however, the exposure during the contract period is uncertain. This is because the underlying in most cases is a financial derivative, which generally exhibits substantial volatility. Furthermore, since a derivative can have a negative value, the value of the contract can be both positive and negative. Thus, counterparty risk is bilateral, meaning that each counterparty holds risk against the other.

2.1.2 Pre-settlement Risk and Settlement Risk

Generally, we divide the counterparty risk of a transaction into two parts: pre-settlement risk and settlement risk. The pre-settlement risk is the risk that a counterparty will default during the lifetime of the transaction, prior to the settlement period or expiration of the transaction, while settlement risk is the risk of default during the settlement period. Usually, counterparty risk refers to the pre-settlement risk; however, settlement risk is also important to consider.

Settlement risk is characterised by a very large potential exposure (value at risk in an investment), which could even amount to 100% of the notional of the investment [2]. The probability of default of the counterparty during the settlement period, however, is very low in most cases. Pre-settlement risk generally has a much lower potential exposure, but there is a substantially higher probability that the counterparty will default since the time horizon is considerably longer. The balance between settlement risk and pre-settlement risk depends on the nature of the underlying derivative that is traded. In later sections, counterparty risk refers to the pre-settlement risk.

2.1.3 Components of Counterparty Risk and Wrong-Way Risk

The main components of counterparty risk are the Expected Loss (EL), Probability of Default (PD), Exposure At Default (EAD), and Loss Given Default (LGD). They are related as follows:

\[ \mathrm{EL} = \mathrm{PD} \cdot \mathrm{EAD} \cdot \mathrm{LGD} \]

The PD is the probability that the counterparty will default during the contract period, the EAD is the value of the underlying asset in an investment at the time of the default, and the LGD is equal to one minus the recovery rate, where the recovery rate is the fraction of the notional that will be recovered at default [6]. The LGD is generally set at 60%, but can vary depending on, for example, the region in which the counterparty operates [1]. For more information on the estimation of these parameters, we refer to The Basel II Risk Parameters: Estimation, Validation, and Stress Testing by Engelmann & Rauhmeier [6].
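As a small numerical illustration of the relation above, the following sketch computes the expected loss for one hypothetical exposure. The function name and all input values are our own illustrative assumptions and are not taken from the data used in the study.

```python
def expected_loss(pd_: float, ead: float, lgd: float) -> float:
    """Expected loss EL = PD * EAD * LGD."""
    if not 0.0 <= pd_ <= 1.0:
        raise ValueError("PD must lie in [0, 1]")
    if not 0.0 <= lgd <= 1.0:
        raise ValueError("LGD must lie in [0, 1]")
    return pd_ * ead * lgd


# Hypothetical inputs, chosen only for illustration.
pd_example = 0.02            # 2% probability of default over the contract period
ead_example = 1_000_000.0    # exposure at default, in currency units
recovery_rate = 0.40         # assumed recovery rate
lgd_example = 1.0 - recovery_rate  # LGD is one minus the recovery rate

el_example = expected_loss(pd_example, ead_example, lgd_example)
print(f"Expected loss: {el_example:,.2f}")
```

With these assumed inputs, the LGD equals the 60% level mentioned above.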

The components are also related to capital requirements imposed on financial institutions that are exposed to credit risk by the Basel Accords. The capital requirement is defined as [7]:

\[ K = \mathrm{LGD} \cdot \left[ \Phi\big(f(\mathrm{PD}, R)\big) - \mathrm{PD} \right] \cdot C, \]

where \(R\) is a correlation coefficient for the asset, \(\Phi\) is the standard normal cumulative distribution function, \(f(\mathrm{PD}, R)\) is a function depending on PD and \(R\), and \(C\) is the full maturity adjustment as a function of PD and the maturity of the asset.

Throughout the duration of a financial contract, the PD and the EAD will vary to some extent. The PD may for example vary due to a change in credit quality of the firm, and the EAD will change according to market conditions [2]. Generally these components are thought to be independent, such that the probability of the PD and EAD values simultaneously increasing is very low. However, this may not be the case, as was illustrated by the market events in 2007 and onwards [2]. The case when the PD and the EAD are increasing at the same time is known as wrong-way risk (WWR), while an increase in EAD and decrease in PD is known as right-way risk (RWR). Thus, an increase in the EAD can be both beneficial and detrimental depending on the PD.

A simple example of WWR is buying a put option on a stock where the underlying in question consists of assets that are highly correlated to those of the counterparty. The put option’s value will increase if the stock goes down, in which case the counterparty’s credit quality will likely be deteriorating.

2.1.4 The Credit Default Swap

A natural step for firms in the financial market is to attempt to mitigate counterparty risk. This can be done through a number of methods such as netting, collateral, and hedging. Hedging is the process of neutralising the risk of a transaction by taking a position that offsets the risk associated with the transaction, for example by using the underlying of the transaction that is hedged. However, a wide range of derivatives can be used for hedging, and one of note is the credit default swap (CDS).

The CDS is the most common type of credit derivative, and provides a way for companies to trade credit risk in the same way that they trade market risk: through asset-backed derivatives [1]. It is a contract that provides an insurance for a transaction against the risk of the default of a specific firm. In the context of the CDS, the counterparty is known as the reference entity, and the default of said counterparty is called a credit event.

The buyer of a CDS gains the right to sell bonds issued by the reference entity for their face value when a credit event occurs, and the seller of a CDS is then obliged to purchase these bonds for their face value. The face value of the bonds that can be sold is known as the CDS’s notional principal. In return, the buyer of a CDS will make periodic payments, normally quarterly, until the end of the insurance period or until a credit event occurs. The total amount that the buyer of the CDS pays per year, as a percent of the notional, is called the CDS spread.
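To make the payment structure concrete, the sketch below converts a CDS spread into the periodic premium paid by the protection buyer. It assumes equal payments within the year and ignores day-count conventions and accrued premiums; the function name and the example figures are hypothetical.

```python
def periodic_cds_payment(notional: float, spread_bps: float,
                         payments_per_year: int = 4) -> float:
    """Premium per payment date for a CDS quoted as an annual spread.

    The spread is given in basis points (1 bp = 0.01% of the notional).
    Assumes equal payments and ignores day-count conventions.
    """
    annual_payment = notional * spread_bps / 10_000.0
    return annual_payment / payments_per_year


# Hypothetical contract: notional principal of 10 million, spread of 120 bps.
quarterly_payment = periodic_cds_payment(10_000_000.0, 120.0)
print(f"Quarterly premium: {quarterly_payment:,.2f}")
```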

2.1.5 Pricing Credit Default Swaps

The most well-known mathematical approaches for pricing the CDS are structural and reduced form models.

The structural models are based on Fischer Black, Myron Scholes, and Robert Merton's assumption that a firm defaults when the value of its assets falls below a certain level [3, 4]. The weakness of structural models is that some of the variables needed in the models are difficult to estimate with adequate precision [8].

A later development is the concept of reduced form models, based on the research by, for example, Jarrow & Turnbull, where the credit risk is instead determined by the probability of default modelled as a stochastic process (known as the hazard rate) with a set recovery rate in case of default [9]. The reduced form models of Darrell Duffie are also frequently used in the literature [10].

A third method of modelling credit risk is through empirical modelling [5, 8]. Empirical modelling is a form of ex post modelling, using historical data to determine a correlation or to predict expected future outcomes, commonly through mathematical modelling using some form of regression analysis. A selection of these studies will be presented in the later section named Literature Review.

In this study, empirical modelling through regression analysis will be performed using the Black-Scholes-Merton assumption of a correlation between the firm’s probability of default and the value of its assets compared to its level of debt.

2.2 Mathematical Theory

This section will describe the mathematical theory that is used in the quantitative part of the study. If nothing else is specified, the theory in this section is collected from the literature written by Douglas Montgomery et al., and Trevor Hastie et al. [11, 12].

2.2.1 Multiple Linear Regression

Regression analysis is a commonly used method within mathematical statistics for investigating and modelling relationships between variables. Regression analysis has numerous applications in many different fields. The main goal is to find the best linear model with respect to a set of observations of the chosen variables. Ordinary least squares is the most commonly used method to obtain this model.

Generally, the inferences made using a regression model are:

– Identifying the relative effects of the predictor variables
– Prediction and/or estimation
– Selection of an appropriate set of variables for the model

The multiple linear regression model is defined as the relationship between a response variable \(y\) and multiple predictors \(\{x_1, x_2, \ldots, x_k\}\), and can be expressed as follows:

\[ y = X\beta + \varepsilon, \]

where

\[
y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad
X = \begin{pmatrix}
1 & x_{11} & x_{12} & \cdots & x_{1k} \\
1 & x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_{n1} & x_{n2} & \cdots & x_{nk}
\end{pmatrix}, \quad
\beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{pmatrix}
\quad \text{and} \quad
\varepsilon = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}.
\]

The value \(k\) represents the number of predictors, and \(n\) represents the number of observations, such that \(x_{ij}\) is the \(i\):th observation of predictor \(x_j\), and \(y_i\) is the \(i\):th observation of the response variable. The vector \(\beta\) consists of the parameters \(\beta_j\), for \(j = 0, 1, \ldots, k\), which are unknown regression coefficients that will be estimated by performing multiple linear regression. Thus, we have \(p = k + 1\) coefficients: one for the intercept, \(\beta_0\), and one for each of the \(k\) predictor variables, \(\{\beta_1, \ldots, \beta_k\}\). The vector \(\varepsilon\) contains the error terms for all \(n\) observations.
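The matrix form above can be sketched in code as follows. This is a minimal illustration on simulated data, assuming NumPy is available; the sample size, coefficient values and noise level are arbitrary choices and do not correspond to the data used in the study.

```python
import numpy as np

rng = np.random.default_rng(0)

n, k = 50, 3      # n observations, k predictors
p = k + 1         # number of coefficients, including the intercept

# Simulated predictor observations; row i holds x_i1, ..., x_ik.
predictors = rng.normal(size=(n, k))

# Design matrix X: a leading column of ones for the intercept beta_0.
X = np.column_stack([np.ones(n), predictors])

# Arbitrary "true" coefficients (beta_0, ..., beta_k) and error terms.
beta = np.array([0.5, 1.0, -2.0, 0.3])
eps = rng.normal(scale=0.1, size=n)

# The model y = X beta + eps.
y = X @ beta + eps

print(X.shape, y.shape)
```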

2.2.2 Underlying Assumptions

To apply multiple linear regression, some underlying assumptions have to be made [11, 13, 14]:

– The relationship between the response, \(y\), and the predictors, \(\{x_1, x_2, \ldots, x_k\}\), is linear.
– The error term has zero mean, \(E(\varepsilon) = 0\).
– The model is homoscedastic, which means that the error term has constant variance, \(\mathrm{Var}(\varepsilon) = \sigma^2\).
– The errors are uncorrelated.
– The errors are normally distributed.

If the aforementioned assumptions are not fulfilled, the regression model may be ill-conditioned and might present misleading results.

2.2.3 Method of Least Squares

To estimate the regression coefficients \(\beta\), we wish to find the vector of least squares estimators, denoted \(\hat{\beta}\). This is done by finding the solution to the following optimisation problem:

\[ \min_{\beta} S(\beta) = \varepsilon'\varepsilon = (y - X\beta)'(y - X\beta), \]

where \(y\) are the observations of the response and \(X\) are the observations of the predictors. This has the optimal solution:

\[ \hat{\beta} = (X'X)^{-1}X'y. \]

The derived least squares estimator \(\hat{\beta}\) has the following properties:

– \(\hat{\beta}\) is an unbiased estimator of \(\beta\), \(E(\hat{\beta}) = \beta\).
– \(\mathrm{Var}(\hat{\beta}_j) = \sigma^2 (X'X)^{-1}_{jj}\), for \(j = 1, \ldots, p\).
– \(\mathrm{Cov}(\hat{\beta}_i, \hat{\beta}_j) = \sigma^2 (X'X)^{-1}_{ij}\), where \(i \neq j\).
– \(\hat{\beta}\) is the best linear unbiased estimator of \(\beta\). This follows from the Gauss-Markov theorem.
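The closed-form solution and the covariance properties listed above can be sketched as follows. This is an illustration on simulated data, assuming NumPy; the unknown \(\sigma^2\) is replaced by the estimate \(SS_{Res}/(n - p)\) introduced in section 2.2.4, and the result is compared with NumPy's own least squares routine as a sanity check. The function name `least_squares` is our own.

```python
import numpy as np


def least_squares(X, y):
    """Return beta_hat = (X'X)^{-1} X'y and the estimated covariance matrix."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta_hat = xtx_inv @ X.T @ y
    residuals = y - X @ beta_hat
    sigma2_hat = residuals @ residuals / (n - p)   # estimate of sigma^2
    cov_beta_hat = sigma2_hat * xtx_inv            # Var on diagonal, Cov off it
    return beta_hat, cov_beta_hat


# Simulated data with arbitrary coefficients (illustration only).
rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.3, size=n)

beta_hat, cov_beta_hat = least_squares(X, y)

# Reference solution from NumPy's least squares solver.
beta_ref, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)
```

Forming the explicit inverse mirrors the formula in the text; for ill-conditioned problems a solver such as `np.linalg.lstsq` is numerically preferable.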

2.2.4 t-test and F-test

The t-test can be used to test the significance of a regression coefficient for a specific regression model. The hypotheses for testing the significance of the predictor \(x_j\) corresponding to the regression coefficient \(\beta_j\) are:

\[ H_0: \beta_j = 0, \qquad H_1: \beta_j \neq 0. \]

If the null hypothesis, \(H_0\), is rejected, we can with a predetermined confidence level conclude that the regression coefficient corresponding to predictor \(x_j\) is non-zero and that the predictor should be included in the model. This implies that the predictor is significant for the regression model on the predetermined confidence level.

The t-test statistic for predictor \(x_j\) is defined as:

\[ t_j = \frac{\hat{\beta}_j}{\sqrt{\hat{\sigma}^2 (X'X)^{-1}_{jj}}}, \]

where \(\hat{\sigma}^2\) is an estimate of the variance, commonly defined as:

\[ \hat{\sigma}^2 = \frac{SS_{Res}}{n - p} = MS_{Res}. \]

\(SS_{Res}\) will be explained further in section 2.2.8.

The null hypothesis is rejected if:

\[ |t_j| > t_{\alpha/2,\,n-k-1}, \]

where \(\alpha\) is the chosen significance level for the t-test.

The t-test is equivalent to the partial F-test. The relationship between a t-test with \(v\) degrees of freedom and an F-test with 1 degree of freedom in the numerator and \(v\) degrees of freedom in the denominator is [11]:

\[ t_j^2 = F_j. \]
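The relationship between the t-statistic and the partial F-statistic can be checked numerically. The sketch below, on simulated data and assuming NumPy, computes \(t_j\) from the formula above and \(F_j\) by comparing the residual sums of squares of the full model and the model without predictor \(x_j\). The helper `fit` and the chosen column index are our own illustrative choices.

```python
import numpy as np


def fit(X, y):
    """Least squares fit; returns the coefficients and the residual sum of squares."""
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    residuals = y - X @ beta_hat
    return beta_hat, float(residuals @ residuals)


# Simulated data (illustration only); the second predictor has a true coefficient of zero.
rng = np.random.default_rng(1)
n, k = 80, 3
p = k + 1
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, 0.0, -1.0]) + rng.normal(size=n)

beta_hat, ss_res = fit(X, y)
sigma2_hat = ss_res / (n - p)                 # MS_Res of the full model
C = np.linalg.inv(X.T @ X)
t_stats = beta_hat / np.sqrt(sigma2_hat * np.diag(C))

# Partial F-test for column j: drop that column and refit.
j = 2
_, ss_res_reduced = fit(np.delete(X, j, axis=1), y)
F_j = (ss_res_reduced - ss_res) / sigma2_hat  # 1 numerator degree of freedom

print(t_stats[j] ** 2, F_j)
```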

2.2.5 Confidence Intervals of Estimated Regression Coefficients

Another way to examine the hypotheses presented above is by constructing confidence intervals. These are created for each regression coefficient with a predetermined significance level \(\alpha\). We can say that the true value of the coefficient will be within the confidence interval with a confidence level of \(1 - \alpha\). If zero is not included in the confidence interval, then we can reject the null hypothesis, \(H_0\), with the predetermined confidence level. A confidence interval for the regression coefficient \(\beta_j\) is defined as:

\[ I_{\beta_j} = \hat{\beta}_j \pm t_{\alpha/2,\,n-p} \sqrt{\hat{\sigma}^2 (X'X)^{-1}_{jj}}. \]
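A possible implementation of these intervals is sketched below, assuming NumPy and SciPy are available (`scipy.stats.t.ppf` supplies the quantile of the t-distribution). The data are simulated and the function name is hypothetical.

```python
import numpy as np
from scipy import stats


def coefficient_intervals(X, y, alpha=0.05):
    """Two-sided (1 - alpha) confidence intervals for each regression coefficient."""
    n, p = X.shape
    C = np.linalg.inv(X.T @ X)
    beta_hat = C @ X.T @ y
    residuals = y - X @ beta_hat
    sigma2_hat = residuals @ residuals / (n - p)
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, n - p)   # t_{alpha/2, n-p}
    half_width = t_crit * np.sqrt(sigma2_hat * np.diag(C))
    return np.column_stack([beta_hat - half_width, beta_hat + half_width])


# Simulated data (illustration only).
rng = np.random.default_rng(2)
n = 120
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([0.5, 1.5, 0.0]) + rng.normal(size=n)

ci_95 = coefficient_intervals(X, y, alpha=0.05)
ci_99 = coefficient_intervals(X, y, alpha=0.01)

# A coefficient is significant at level alpha if zero lies outside its interval.
significant_95 = (ci_95[:, 0] > 0) | (ci_95[:, 1] < 0)
print(ci_95)
```

A smaller \(\alpha\) gives a wider interval, so each 99% interval contains the corresponding 95% interval.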

2.2.6 Heteroscedasticity

In many cases, the assumption of constant variance of the error terms is not fulfilled, and this phenomenon is known as heteroscedasticity. To alleviate the effects of heteroscedasticity, a transformation is usually performed. If the transformation does not reduce the heteroscedasticity to an adequate level, it is also possible to perform a weighted least squares estimation, where the weights are the inverse of the error variances, so that the weighted error terms have unit variance. Another possibility is to use robust regression methods, for example White's robust standard errors [15].

2.2.7 Multicollinearity Diagnostics

In multiple regression modelling, there are several phenomena that can cause degradation in the resulting model’s adequacy. An important assumption is that there is a linear relationship between the response variable and the predictors. However, in some cases there may also exist near-linear dependencies between the predictors, also known as multicollinearity.

Generally, some multicollinearity is almost always present between predictors, and we can assume that the inferences presented in section 2.2.1 can be made with only some degree of uncertainty. However, inferences from models with severe multicollinearity may produce misleading results. With severe multicollinearity, our least squares estimated coefficients will have a very large variance, and the Euclidean distance between our estimators and their true values can be very large.

To visualise the near-linear dependencies between the predictors, a correlation plot can be useful. The correlations in this plot correspond to the correlations found when performing a simple linear regression where one variable is chosen as the response and another as the sole predictor.

Variance Inflation Factor

To identify multicollinearity, the variance inflation factors (VIF) can be examined. This method examines the diagonal elements of the matrix \(C = (X'X)^{-1}\). The VIF value associated with predictor \(x_j\) is denoted \(C_{jj}\) and is defined as:

\[ \mathrm{VIF}_j = C_{jj} = (X'X)^{-1}_{jj} = (1 - R_j^2)^{-1}. \]

Here, \(R_j^2\) is the coefficient of determination when we regress predictor \(x_j\) against the remaining predictors. The coefficient of determination will be explained further in section 2.2.9. An \(R_j^2\) near unity implies that there exists a near-linear relationship between the predictor \(x_j\) and the remaining predictors. A \(\mathrm{VIF}_j\) value exceeding 10 implies that predictor \(x_j\) is multicollinear, and many \(\mathrm{VIF}_j\) values larger than 10 indicate strong multicollinearity in the data set [11]. Since \(C_{jj} = (X'X)^{-1}_{jj}\), we can see from the definition of confidence intervals in section 2.2.5 that a high level of multicollinearity will result in wide confidence intervals for the regression coefficients, and thus decreases the usability of the model.
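The two expressions for the VIF can be compared numerically. The sketch below, assuming NumPy, computes \((1 - R_j^2)^{-1}\) by regressing each predictor on the others, and compares the result with the diagonal of the inverse correlation matrix of the predictors. Using the correlation matrix corresponds to centring the predictors and scaling them to unit length, which is our assumption about the scaling under which the identity holds. The data are simulated so that the first two predictors are nearly collinear.

```python
import numpy as np


def variance_inflation_factors(predictors):
    """VIF_j = 1 / (1 - R_j^2), regressing predictor j on the remaining predictors."""
    n, k = predictors.shape
    vifs = np.empty(k)
    for j in range(k):
        target = predictors[:, j]
        others = np.column_stack([np.ones(n), np.delete(predictors, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        residuals = target - others @ coef
        ss_res = residuals @ residuals
        ss_t = np.sum((target - target.mean()) ** 2)
        r2_j = 1.0 - ss_res / ss_t
        vifs[j] = 1.0 / (1.0 - r2_j)
    return vifs


# Simulated predictors (illustration only): x2 is x1 plus a small disturbance.
rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)
predictors = np.column_stack([x1, x2, x3])

vifs = variance_inflation_factors(predictors)

# Diagonal of the inverse correlation matrix of the predictors.
vifs_from_corr = np.diag(np.linalg.inv(np.corrcoef(predictors, rowvar=False)))
print(vifs)
```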

Condition Number

Another way of identifying multicollinearity is to perform an eigensystem analysis. This method examines the eigenvalues of the matrix \(X'X\). If there exists a near-linear relationship between the columns of \(X\), one or more of the eigenvalues will be small. One way of quantifying the multicollinearity from the above phenomenon is through calculating the condition number, which is the ratio of the largest to the smallest eigenvalue of the matrix:

\[ \kappa = \frac{\lambda_{max}}{\lambda_{min}}. \]

A \(\kappa\) that is less than 100 implies that there is no serious multicollinearity in the data, a \(\kappa\) between 100 and 1000 implies multicollinearity, and a \(\kappa\) larger than 1000 indicates severe multicollinearity [11].
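A minimal sketch of the condition number calculation is given below, assuming NumPy. The two small matrices are hypothetical examples: one with orthogonal columns and one with two nearly identical columns. In practice the columns would typically be scaled before the thresholds above are applied; the sketch does not perform any scaling.

```python
import numpy as np


def condition_number(X):
    """Ratio of the largest to the smallest eigenvalue of X'X."""
    eigenvalues = np.linalg.eigvalsh(X.T @ X)   # X'X is symmetric
    return float(eigenvalues.max() / eigenvalues.min())


# Orthogonal columns: all eigenvalues of X'X are equal.
X_orthogonal = np.eye(3)

# Two nearly identical columns: one eigenvalue of X'X is close to zero.
x1 = np.array([1.0, 2.0, 3.0, 4.0])
x2 = x1 + np.array([0.001, -0.001, 0.001, -0.001])
X_collinear = np.column_stack([x1, x2])

kappa_orthogonal = condition_number(X_orthogonal)
kappa_collinear = condition_number(X_collinear)
print(kappa_orthogonal, kappa_collinear)
```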

2.2.8 Basic Statistical Measures

To examine the explainability of a regression model, three measures are used: Regression, Residual, and Total Sum of Squares.

Regression Sum of Squares

The regression sum of squares measures the variation in the observed data and quantifies the amount of variability in the observations accounted for by the model. The regression sum of squares is defined as:

\[ SS_R = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \]

where \(\hat{y}_i\) is the fitted value of the response and \(\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i\) is the mean value of the response.

Residual Sum of Squares

The residual sum of squares measures the variation in the error terms. It is an indication of how much the data differs from the estimated regression model. The residual sum of squares is defined as:

\[ SS_{Res} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2. \]

Total Sum of Squares

The total sum of squares, which is the sum of the regression and residual sum of squares, measures the total variability in the observations and is defined as:

\[ SS_T = SS_R + SS_{Res} = \sum_{i=1}^{n} (y_i - \bar{y})^2. \]
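The decomposition of the total sum of squares can be verified numerically. The sketch below, assuming NumPy and using simulated data, fits a model with an intercept and computes the three sums of squares from their definitions; the identity \(SS_T = SS_R + SS_{Res}\) relies on the intercept being included in the model.

```python
import numpy as np

# Simulated data (illustration only).
rng = np.random.default_rng(4)
n = 60
x = rng.normal(size=(n, 2))
X = np.column_stack([np.ones(n), x])   # intercept column included
y = 1.0 + 0.8 * x[:, 0] - 0.5 * x[:, 1] + rng.normal(scale=0.5, size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat                   # fitted values
y_bar = y.mean()                       # mean of the response

ss_r = float(np.sum((y_hat - y_bar) ** 2))   # regression sum of squares
ss_res = float(np.sum((y - y_hat) ** 2))     # residual sum of squares
ss_t = float(np.sum((y - y_bar) ** 2))       # total sum of squares

print(ss_t, ss_r + ss_res)
```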

2.2.9 Model Evaluation Methods

There exist multiple methods for comparing and evaluating different regression models. The measures used in this study are: the coefficient of determination (\(R^2\)), adjusted \(R^2\), the PRESS statistic, the Akaike information criterion, the Bayesian information criterion, and Mallows' \(C_p\).

Coefficient of Determination

The coefficient of determination, commonly referred to as $R^2$, is a measure of the predictors' ability to explain the variance in the response, and shows if the model replicates the observations adequately. Since $SS_T$ is always larger than or equal to $SS_R$, the measure is always within the interval $0 \le R^2 \le 1$. A value close to unity indicates that nearly all the variance in the response variable can be explained by the predictors.

Consequently, models with large values of $R^2$ are generally desired. However, by adding more predictors, it is always possible to obtain a larger $R^2$. Therefore the measure should be used with care. $R^2$ is defined as:

$$R^2 = \frac{SS_R}{SS_T} = 1 - \frac{SS_{Res}}{SS_T}.$$

Adjusted R²

Adjusted $R^2$, commonly denoted $R^2_{Adj}$, is a measure that adjusts the $R^2$ for its degrees of freedom. Thus, the $R^2_{Adj}$ only increases if the mean square error decreases, which is not always the case when adding more predictors. This makes this statistic a more attractive model selection measure than $R^2$. $R^2_{Adj}$ is defined as:

$$R^2_{Adj} = 1 - \frac{SS_{Res}/(n - p)}{SS_T/(n - 1)}.$$

PRESS Statistic

The PRESS statistic measures how well the regression model will do when predicting new data points. This statistic is the sum of squares of the ordinary residuals adjusted for the observation's distance from the centroid of the x-space through the diagonal elements of the hat matrix $H$, where $h_{ii} = x_i'(X'X)^{-1}x_i$. If we are seeking a model with high predictive ability, the model with the smallest PRESS statistic should be chosen.

The PRESS statistic is defined as:

$$PRESS = \sum_{i=1}^{n} \left( \frac{e_i}{1 - h_{ii}} \right)^2,$$

where $e_i = y_i - \hat{y}_i$ is the $i$-th ordinary residual.
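The PRESS statistic can also be read as a leave-one-out prediction error: each scaled residual equals the error made when the model is refitted without that observation. The sketch below checks this for a simple linear regression with one predictor, where the hat diagonal has the closed form $h_{ii} = 1/n + (x_i - \bar{x})^2 / S_{xx}$. The data and function names are hypothetical and chosen only for illustration.

```python
# Minimal sketch: PRESS from the hat-matrix diagonal versus explicit
# leave-one-out refits, for a simple regression y = b0 + b1 * x.
# Data and names are illustrative assumptions, not from the study.


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def press_via_hat(x, y):
    """PRESS using e_i / (1 - h_ii) with the closed-form hat diagonal."""
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b0, b1 = fit_simple_ols(x, y)
    total = 0.0
    for xi, yi in zip(x, y):
        h_ii = 1.0 / n + (xi - x_bar) ** 2 / s_xx
        e_i = yi - (b0 + b1 * xi)
        total += (e_i / (1.0 - h_ii)) ** 2
    return total


def press_via_refit(x, y):
    """PRESS by refitting the model n times, leaving one observation out."""
    total = 0.0
    for i in range(len(x)):
        b0, b1 = fit_simple_ols(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        total += (y[i] - (b0 + b1 * x[i])) ** 2
    return total


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3]
print(press_via_hat(x, y), press_via_refit(x, y))
```

The two routes agree up to floating-point error, but the hat-matrix form needs only one fit, which matters when many candidate models are compared.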

Akaike Information Criterion and Bayesian Information Criterion

The Akaike information criterion, or AIC, measures the quality of a model based on a set of data. It makes a trade-off between the model's complexity and how well the model fits the data. Similarly, the Bayesian information criterion, or BIC, measures the quality of a model and penalises model complexity. However, BIC penalises model complexity more heavily than AIC. Hence, AIC tends to prefer larger models than BIC. Small values of AIC and BIC are desirable.

For the ordinary least squares case, AIC and BIC are defined as:

$$AIC = n\ln\left(\frac{SS_{Res}}{n}\right) + 2p, \qquad BIC = n\ln\left(\frac{SS_{Res}}{n}\right) + p\ln(n).$$

Mallows's $C_p$ Statistic

Mallows's $C_p$ measures the precision and bias of different models. Small values of $C_p$ are desirable, since this indicates that the model has small variance and therefore is more accurate when performing regression. The measure takes into account the issue of overfitting since it penalises models with a large number of predictors. Mallows's $C_p$ is defined as:

$$C_p = \frac{SS_{Res}(p)}{\hat{\sigma}^2} - n + 2p.$$
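To make the comparison between the criteria concrete, the following sketch evaluates adjusted R², AIC, BIC and Mallows's C_p for two hypothetical nested models. It uses the standard least-squares forms n ln(SS_Res/n) + 2p and n ln(SS_Res/n) + p ln(n), and it assumes that p counts all estimated parameters including the intercept and that σ̂² is taken as the residual mean square of the largest model. All numbers are invented for illustration.

```python
import math

# Minimal sketch of the model evaluation measures in this section.
# Assumptions (not stated by the study): p counts the intercept, and
# sigma2_hat is the residual mean square of the full model.


def adjusted_r_squared(ss_res, ss_t, n, p):
    return 1.0 - (ss_res / (n - p)) / (ss_t / (n - 1))


def aic(ss_res, n, p):
    return n * math.log(ss_res / n) + 2 * p


def bic(ss_res, n, p):
    return n * math.log(ss_res / n) + p * math.log(n)


def mallows_cp(ss_res_p, sigma2_hat, n, p):
    return ss_res_p / sigma2_hat - n + 2 * p


# Hypothetical numbers: 100 observations, a small model with 3 parameters
# and a full model with 6 parameters that fits only slightly better.
n, ss_t = 100, 50.0
small = {"p": 3, "ss_res": 20.0}
full = {"p": 6, "ss_res": 19.5}
sigma2_hat = full["ss_res"] / (n - full["p"])

for label, model in (("small", small), ("full", full)):
    p, ss_res = model["p"], model["ss_res"]
    print(
        label,
        round(adjusted_r_squared(ss_res, ss_t, n, p), 4),
        round(aic(ss_res, n, p), 2),
        round(bic(ss_res, n, p), 2),
        round(mallows_cp(ss_res, sigma2_hat, n, p), 2),
    )
```

With these numbers both AIC and BIC favour the small model, and the BIC gap is wider because ln(100) exceeds 2. By construction, C_p of the full model equals its number of parameters.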

2.2.10 Variable Selection Techniques

In situations where there is a large number of candidate predictors for the regression model, it is desirable to find the best subset of these variables. Methods that can facilitate this selection of variables are known as best subsets regression methods. These find the best possible subset of the candidate variables with respect to a chosen statistical measure among those presented in section 2.2.9.

All Possible Regressions

All possible regressions, also known as the exhaustive subset selection method, is a method that tests all the possible subsets of the candidate variables and finds the best one according to one of the statistical measures mentioned above. It starts with the null model, including only the intercept term $\beta_0$, and successively adds predictors until all different combinations for the models have been tested. Since we test all the possible models, the result will represent the best subset according to the chosen statistic.

The drawback of using this method is that it requires a large amount of computational power. In the case of k candidate predictors, the algorithm will perform $2^k$ estimations. With today's computational power, it is possible to perform the exhaustive method for up to approximately 30 candidate predictors [11].
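The exhaustive procedure can be sketched in a few lines. The example below enumerates every subset of three synthetic predictors, fits each model by ordinary least squares through the normal equations, and keeps the subset with the smallest BIC. It is an illustrative sketch only: the data are simulated, the helper names are hypothetical, and it is not the R routine used in the study.

```python
import math
import random
from itertools import combinations

# Minimal sketch of all possible regressions scored by BIC.
# Everything here (data, names, solver) is an illustrative assumption.


def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z


def residual_ss(columns, y):
    """SS_Res of an OLS fit of y on an intercept plus the given columns."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
               for row, yi in zip(design, y))


def best_subset_by_bic(predictors, y):
    """Fit all 2^k subsets and return (best subset, number of models fitted)."""
    n = len(y)
    names = sorted(predictors)
    best_score, best_subset, tested = None, None, 0
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            tested += 1
            ss_res = residual_ss([predictors[name] for name in subset], y)
            p = size + 1  # predictors plus the intercept
            score = n * math.log(ss_res / n) + p * math.log(n)
            if best_score is None or score < best_score:
                best_score, best_subset = score, subset
    return best_subset, tested


rng = random.Random(0)
n_obs = 40
predictors = {name: [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
              for name in ("x1", "x2", "x3")}
# The simulated response depends on x1 and x2 only; x3 is pure noise.
y = [2.0 * a - 3.0 * b + rng.gauss(0.0, 0.5)
     for a, b in zip(predictors["x1"], predictors["x2"])]

chosen, tested = best_subset_by_bic(predictors, y)
print(chosen, tested)
```

With three candidates only 2³ = 8 models are fitted; with 25 candidates the same loop would fit 2²⁵ = 33 554 432 models, which is why dedicated branch-and-bound implementations are normally used instead of a naive loop.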

Forward Stepwise Selection

For models that have more than 30 candidate predictors, the forward stepwise selection method can be used. The method starts from a model with no predictors and first adds the predictor with the largest simple correlation with the response variable, which is also the predictor that produces the largest F statistic. The algorithm thereafter successively adds the predictor with the largest correlation with the response after adjusting for the effect of the previously added predictors. This continues until the model contains a predetermined number of variables, or until all model sizes have been tested with respect to a statistical measure [11].

For larger models, the exhaustive and the forward approaches are generally used in unison. Firstly, one performs forward stepwise selection to find the 30 best predictors for the model. Secondly, an exhaustive approach is performed on the predictors selected from the forward approach to find the best possible subset using one of the statistical measures presented earlier [11].
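The forward step can be sketched without any matrix algebra by repeatedly residualising the response and the remaining predictors on the predictor just added, so that the next choice is the one with the largest partial correlation. The code below is a hedged illustration on invented data with hypothetical function names; it stops at a predetermined model size, as described above.

```python
# Minimal sketch of forward selection by largest (partial) correlation.
# Data and names are illustrative assumptions, not the study's procedure.


def center(v):
    mean = sum(v) / len(v)
    return [vi - mean for vi in v]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def sweep_out(target, direction):
    """Residual of `target` after least-squares projection on `direction`."""
    coef = dot(target, direction) / dot(direction, direction)
    return [ti - coef * di for ti, di in zip(target, direction)]


def forward_selection(predictors, y, max_size):
    """Return predictor names in the order they enter the model."""
    resid_y = center(y)  # centring accounts for the intercept
    remaining = {name: center(col) for name, col in predictors.items()}
    chosen = []
    while remaining and len(chosen) < max_size:
        def squared_partial_corr(name):
            col = remaining[name]
            return dot(resid_y, col) ** 2 / (dot(col, col) * dot(resid_y, resid_y))

        best = max(remaining, key=squared_partial_corr)
        direction = remaining.pop(best)
        chosen.append(best)
        # Adjust the response and the other candidates for the new predictor.
        resid_y = sweep_out(resid_y, direction)
        remaining = {name: sweep_out(col, direction)
                     for name, col in remaining.items()}
    return chosen


predictors = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
    "x3": [1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0],
}
noise = [0.1, -0.2, 0.05, 0.0, -0.1, 0.15, -0.05, 0.05]
y = [3.0 * a + 2.0 * b + e
     for a, b, e in zip(predictors["x1"], predictors["x2"], noise)]

order = forward_selection(predictors, y, max_size=2)
print(order)
```

Unlike the exhaustive search, this greedy procedure fits on the order of k models per step rather than 2^k in total, at the cost of possibly missing the best subset.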

2.2.11 Transformations

In the case of non-normality in the data used in the linear regression model, transformations of the data can be performed. Sometimes, we can use experience or theoretical models to determine the best transformation. In many cases, however, it needs to be determined analytically. For financial data, the logarithmic transform is commonly used. The logarithmic transform is also useful since it decreases the level of heteroscedasticity for data that is not normally distributed [15].

When using the logarithmic transformation, the model takes the form:

$$\ln y = \ln\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \ln\varepsilon,$$

which corresponds to the following assumed relationship between the response and the predictor variables:

$$y = \beta_0 e^{\beta_1 x_1 + \cdots + \beta_k x_k}\,\varepsilon.$$
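The correspondence between the two forms can be illustrated with a single predictor. In the sketch below, noise-free data are generated from the multiplicative model with hypothetical parameters β₀ = 1.5 and β₁ = 0.4, and an ordinary least-squares fit of ln y on x recovers them; the intercept of the log model is ln β₀. Names and numbers are assumptions made for the example.

```python
import math

# Minimal sketch: a linear fit on ln(y) recovers the parameters of
# y = beta0 * exp(beta1 * x). The data are noise-free and hypothetical.


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


beta0, beta1 = 1.5, 0.4
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [beta0 * math.exp(beta1 * xi) for xi in x]

log_y = [math.log(yi) for yi in y]
intercept, slope = fit_simple_ols(x, log_y)
beta0_hat = math.exp(intercept)  # back-transform the intercept
print(beta0_hat, slope)
```

With noisy data the recovered values would only be estimates, and the error term enters multiplicatively on the original scale.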

2.3 Literature Review

Previous research on CDS spreads has examined a number of determinants of the spreads, chosen on the basis of economic theory. This section presents a selection of the previous studies performed in the area, and some notable results.

Merton's model and recent extensions predict a negative relationship between the risk-free rate and the bond spread. This is confirmed by studies performed by Longstaff & Schwartz, Duffee, and Skinner & Townend [16, 17, 18]. Empirical studies on credit risk modelling have generally used zero-coupon rates extracted from government bonds as their proxy for the default-free interest rate; however, since 1998 financial markets have moved away from using these. Today, the default-free rate is widely proxied by swap and repo contracts, and this has also been done in more recent studies [8]. Notably, Blanco et al., Houweling & Vorst, and Hull et al. used the swap rate as a proxy for the risk-free rate, and found that it more closely matches the CDS market's use of the risk-free rate than the treasury rate does [8, 19, 20].

Aunon-Nerin et al. performed a thorough analysis of the economic theory on pricing models of the CDS and identified possibly interesting determinants for the spread, including credit rating, yield curves, stock prices, interest rates and leverage [21]. They found that the identified factors have large explanatory power for the CDS spreads. Notably, they confirmed the negative correlation between local interest rates and CDS spreads, and also found that US risk-free rates are significant in explaining CDS spreads even in different countries. Furthermore, they showed a negative correlation between the stock price and the CDS spread. The most important factor for the CDS spread suggested by their model is the credit risk rating of the company, which is expected from the risk-neutral pricing models [1].

Many studies have examined the same variables that were presented by Aunon-Nerin et al., and performed more thorough empirical modelling of default risk premia. Houweling & Vorst used reduced-form models to test their pricing performance [8]. Collin-Dufresne et al., Campbell & Taksler, Fabozzi et al., and Ericsson et al. performed linear regression analysis on the relationship between credit spreads and key variables suggested by economic theory [22, 23, 24]. However, most of these looked at credit spreads between corporate bond yields and a benchmark risk-free rate, while our study examines credit spreads as proxied by CDS spreads. Using CDS spreads has the advantage of avoiding noise arising from an inadequate model of the risk-free yield curve, but should still produce similar results [24].

Fabozzi et al. also performed linear regression using similar variables to the above studies, but used CDS spreads as the credit spread. They showed that there is a significant correlation between CDS spreads and the business sector in which the firm operates [5].

In a later study, Longstaff et al. examined the lead-lag relationship between stock returns and CDS spread changes, and found that the stock market leads the CDS market [25]. Norden & Weber found in their study that there is a significant negative correlation between stock returns and CDS spread changes, and also confirmed the lead-lag relationship presented by Longstaff et al. [26].


3 Methodology

In this section, we will describe the methodology that was used to answer the research questions of the study. A presentation will be made of the data used in the study, followed by the mathematical methodology used to obtain our results.

3.1 Data

This section will present the response and the predictor variables used in the study and how they are structured for the analysis. The data used in this study was provided by SEB.

3.1.1 Response Variables

The response variables used in the study were five-year CDS spreads for firms active in a number of business sectors and countries. The data were collected over approximately seven and a half years, between 2009-09-21 and 2017-03-14, and include a total of approximately one million observations of the CDS spreads. We chose to only look at investment grade companies, leaving out the high yield companies, since investment grade companies have the most liquid CDS contracts. Investment grade companies have a credit rating of "BBB" and up, and high yield companies have a credit rating below "BBB". The weekly relative change in the CDS spreads was used in the analysis, as presented in section 3.1.3.

The CDS spreads were from firms in the following business sectors: Basic Materials, Energy and Financials. See Table 1 and Table 2 below for a more detailed presentation of the data included in each sector. For each sector, and for each viable observation date, we calculated the arithmetic mean of the changes in CDS spreads of the firms to find a sector specific CDS spread. On each data set of sector specific CDS spreads, we performed a separate regression analysis using the same predictor variables.

Table 1: The geographical distribution of the companies included in the CDS data for each sector

| Region \ Sector | Basic Materials | Energy | Financials |
|---|---|---|---|
| Europe | 21 | 8 | 72 |
| North America | 1 | 1 | 13 |
| East Europe | 0 | 2 | 5 |
| Latin America | 0 | 1 | 0 |
| Total | 22 | 12 | 90 |


Table 2: The credit rating distribution of the companies included in the CDS data for each sector. We only use the CDS spreads for investment grade companies.

| Rating | Basic Materials | Energy | Financials |
|---|---|---|---|
| AAA | 0 | 0 | 0 |
| AA | 0 | 2 | 17 |
| A | 3 | 2 | 36 |
| BBB | 6 | 5 | 17 |
| BB | 8 | 1 | 9 |
| B | 1 | 0 | 4 |
| CCC | 1 | 0 | 0 |
| NA | 3 | 2 | 7 |
| Total | 22 | 12 | 90 |

3.1.2 Predictor Variables

The predictors used in the model were daily quotes of a number of variables: commodity spot prices, FX spot rates, equity indices and interest swap rates. The data set consisted of roughly 89 000 observations of the variables during a period between 2008-01-02 and 2017-03-15. For this study, the weekly relative change for these variables was used, as presented in section 3.1.3. The predictors used in the study were:

1. Commodity Spot Prices

The chosen commodity spot prices were a selection of some of the most liquid commodities on the market that were deemed to possibly be interesting for the selected sectors. The commodity spot prices are quoted in either US dollars or Euro, but primarily in US dollars. We looked at the spot prices for:

– Corn bushels
– Wheat bushels
– Troy ounces of gold
– Barrels of Brent crude oil (ICE)
– Barrels of WTI crude oil (Nymex)
– Metric tonnes of aluminium
– Metric tonnes of copper
– Metric tonnes of lead
– Metric tonnes of nickel
– Metric tonnes of zinc
– MWh of electricity
– Pounds of coffee
– Pounds of sugar no. 11

2. Foreign Exchange (FX) Spot Rates

An FX spot rate is the rate at which two currencies are exchanged at a point in time. The rates are named 'foreign/domestic', and are quoted as the spot price of the foreign currency in the domestic currency. For example, the EUR/USD FX spot rate is the price of one Euro in US dollars.

We looked at the FX spot rates for:

– EUR/USD
– GBP/USD
– USD/NOK
– USD/SEK

3. Equity Indices

An equity index is a portfolio of stocks in a market, typically the ones with the largest market share. They are intended to represent a proxy for the condition of a localised stock market, and are quoted in the local currency.

The equity indices we looked at were:

– DAX index (30 most traded stocks on the Deutsche Börse, Germany)
– MSCI world index (Morgan Stanley Capital International World Index)
– OBX index (25 most traded stocks on the Oslo Stock Exchange, Norway)
– OMX index (30 most traded stocks on the Stockholm Stock Exchange, Sweden)
– S&P 500 index (Standard & Poor's index of the 500 largest American companies)
– Euro Stoxx 50 index (Stock index of the 50 largest and most liquid Eurozone companies)

4. Interest Swap Rates

The interest swap rates are zero-coupon rates calculated from swap quotes. These swaps exchange a fixed interest rate for a floating interest rate. The floating interest rate is the LIBOR, plus a risk premium. The interest swap rate is widely considered to be a proxy for the risk-free rate [8]. The maturity of the LIBOR used for each currency is given in parentheses. The data is quoted in percent.

We looked at the 1, 5, and 10 year swaps for:

– Danish krone (6 months)
– Euro (6 months)
– Norwegian krone (6 months)
– Swedish krona (3 months)
– US dollar (3 months)


In total, we had 38 predictors. However, because of missing data and high multicollinearity, we removed the observations of zinc, the MSCI World index, the Euro Stoxx 50 index, as well as the 1y and 5y interest swap rates for all currencies. This reduced the total number of predictors to 25. The final predictors are presented in Table 5 of the appendix.

3.1.3 Data Structuring

To examine the impact of the predictors on the response, we looked at the weekly changes in the observations. We used the relative change, such that we examined the percent change in the variables per week instead of the absolute changes. To enable a logarithmic transform of the data, we added one to the relative change of every observation such that the values were centered around unity. For the set of n observations $\{Z_1, Z_2, \dots, Z_i, Z_{i+1}, \dots, Z_n\}$, the relative change centered around unity is:

$$\Delta Z_i = \frac{Z_{i+1} - Z_i}{Z_i} + 1 = \frac{Z_{i+1}}{Z_i}.$$

Some previous studies used the absolute change for some variables, such as interest rates. However, in this study we chose to only look at the relative changes.
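The relative change centered around unity is simply the ratio of consecutive observations, which is positive whenever the quotes are positive and can therefore be log-transformed. A minimal sketch, using a made-up weekly series and a hypothetical helper name:

```python
import math

# Minimal sketch of the relative change centred around unity.
# The weekly quotes below are invented for illustration.


def relative_change_around_unity(series):
    """Return Z[i+1] / Z[i] for each pair of consecutive observations."""
    return [nxt / cur for cur, nxt in zip(series, series[1:])]


weekly_spread = [100.0, 105.0, 94.5, 94.5]
changes = relative_change_around_unity(weekly_spread)

# Same quantity written as "relative change plus one".
changes_alt = [(nxt - cur) / cur + 1.0
               for cur, nxt in zip(weekly_spread, weekly_spread[1:])]

# Positive ratios can be log-transformed; an unchanged week maps to zero.
log_changes = [math.log(c) for c in changes]
print(changes, log_changes)
```

A series of n observations yields n − 1 changes, which is one reason why the number of usable observations is smaller than the number of matched quotes.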

After matching the dates of the response variables and the predictor variables, we had a total of 1 939 observations over a time span between 2009-09-21 and 2017-03-10. For these observations, we calculated the weekly relative change according to the above formula, resulting in a total of 388 usable observations. For each of these observations, we had one response variable and the above presented 25 predictors.

3.2 Model Analysis

Using the data presented in section 3.1, we performed a thorough regression analysis.

The analysis was performed using the programming language R in the statistical software RStudio.

3.2.1 Structure of Regression Model

For the regression analysis, we assumed that the regression model was structured as a multiple linear regression model. To facilitate comparison with previous research, we also performed a logarithmic transform of the response variable, such that our model became:

$$y'_{sector} = \beta'_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_{24} x_{24} + \beta_{25} x_{25} + \varepsilon',$$

where $y'_{sector}$ represents the logarithmically transformed weekly changes in the CDS spread for the selected business sector, $\beta'_0 = \ln(\beta_0)$, and $\varepsilon' = \ln(\varepsilon)$.

The predictors, $\{x_1, x_2, \dots, x_{24}, x_{25}\}$, are presented in Table 5 of the appendix. As presented in section 3.1.3, the data points used in the study were the weekly relative changes for each observation of the variables.

Since we wanted to look at sector specific correlations, we created one model for each of the selected business sectors.

3.2.2 Residual Analysis

To inspect the normality assumption for the linear regression model, we performed a residual analysis of the models. Firstly, we calculated the R-student residuals of the regression model and plotted these against the estimated values for the response, $\hat{y}$. Then, we produced a quantile-quantile normality plot (also known as the Q-Q plot). These two plots were subsequently inspected for any violations of the normality assumption and for heteroscedasticity.

Furthermore, we tested the models for multicollinearity. This was done by calculating the models' $VIF_j$ and condition number, $\kappa$. In order to facilitate the discussion of these correlations, we produced a plot of the simple correlations for the predictors and the response.
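As an illustration of the VIF calculation, the sketch below assumes the usual definition $VIF_j = 1/(1 - R_j^2)$, where $R_j^2$ is obtained by regressing predictor j on all the other predictors. The regression is carried out by successive orthogonalisation on a tiny invented data set; the function names are hypothetical and this is not the R routine used in the study.

```python
# Minimal sketch of variance inflation factors via residualisation.
# Assumes VIF_j = 1 / (1 - R_j^2); data and names are illustrative only.
# The sketch does not handle perfectly collinear predictors.


def center(v):
    mean = sum(v) / len(v)
    return [vi - mean for vi in v]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def sweep_out(target, direction):
    """Residual of `target` after least-squares projection on `direction`."""
    coef = dot(target, direction) / dot(direction, direction)
    return [ti - coef * di for ti, di in zip(target, direction)]


def residual_after(target, others):
    """Residual of a centred column after regressing it on centred `others`."""
    basis = []
    for col in others:
        for b in basis:
            col = sweep_out(col, b)  # make the new direction orthogonal
        basis.append(col)
    for b in basis:
        target = sweep_out(target, b)
    return target


def vif(predictors):
    centred = {name: center(col) for name, col in predictors.items()}
    result = {}
    for name, col in centred.items():
        others = [c for other, c in centred.items() if other != name]
        resid = residual_after(col, others)
        # 1 / (1 - R_j^2) equals total variation over residual variation.
        result[name] = dot(col, col) / dot(resid, resid)
    return result


predictors = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x2": [2.0, 1.0, 4.0, 3.0, 5.0],   # correlation 0.8 with x1
    "x3": [1.0, -1.0, -1.0, 1.0, 0.0],  # uncorrelated with x1 and x2
}
vifs = vif(predictors)
print(vifs)
```

In this toy example x1 and x2 share a correlation of 0.8 and obtain a VIF of 1/(1 − 0.64), while the uncorrelated x3 obtains a VIF of one, far below the commonly used limit of 10.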

3.2.3 Variable Selection

For each sector’s regression model we performed variable selection using best subset selection regression.

Since we had already reduced the number of variables to fewer than 30 in section 3.1, the unified forward-exhaustive approach presented in section 2.2.10 was not necessary. We therefore immediately performed an exhaustive all possible regressions procedure to find the optimal model from our set of 25 predictor variables. From the exhaustive selection we obtained three models, each being the optimal solution with respect to the $R^2_{adj}$, BIC or Mallows's $C_p$ statistic, respectively.

Thus, we obtained a total of nine models from the best subsets regression, three for each sector.

3.2.4 Reduced Model Summary

Finally, we chose the three best models, one for each sector, based on the PRESS statistic. The PRESS statistic was used since it is a common model selection statistic, like the others presented in section 2.2.9. Since the other statistics had already been utilized to find the nine candidate models, we used the PRESS statistic for our final selection.

The final models were then analysed using the models' coefficients as well as common statistics such as $R^2$ and $R^2_{adj}$. We also performed a t-test for the regression coefficients to test their significance for the model, followed by an F-test to test the significance of regression.


4 Results

In this section, we will present the results of the methodology presented in the previous section. The final models will then be discussed in the following section.

4.1 Residual Analysis

Firstly, we will present the results from the residual analysis. The results generally indicate that the underlying assumptions of the linear regression analysis are not perfectly fulfilled.

4.1.1 Normal Q-Q Plot and Heteroscedasticity

The results from the residual analysis are presented in Figure 1 below. Only the Basic Materials sector's residual analysis is presented since the results are similar for all the sectors. The Q-Q plots and residual plots for all sectors can be found in Figure 3, and the untransformed residual plots in Figure 4, both in the appendix. The Normal Q-Q plots indicate a heavy-tailed distribution for all the sectors, and the residual plots show an indication of heteroscedasticity.

[Figure 1 panels, Basic Materials sector: "Residuals vs Fitted" (x-axis: Fitted values; y-axis: Residuals) and "Normal Q-Q" (x-axis: Theoretical Quantiles; y-axis: Standardized residuals).]

Figure 1: Residual plots for the Basic Materials sector.


4.1.2 Multicollinearity

The calculated Variance Inflation Factors (VIFs) are presented in Table 6 in the appendix. Two of the 25 predictors used in the study have VIFs that are larger than the recommended limit of 10, which indicates that the data displays moderate multicollinearity. The equity indices and the interest swap rates are the groups of predictor variables containing the highest VIF values. In the commodities group, the crude oil spot prices show moderate multicollinearity. The obtained condition number of 161.99 lies between 100 and 1000, which is consistent with moderate, but not severe, multicollinearity.

The correlation plot for the variables is presented below in Figure 2. Since the correlation plots for the different sectors are almost identical, we have here only presented the results for the Basic Materials sector. The correlation plots for all the sectors can be found in Figures 5 to 7 in the appendix, where the y variable is the CDS spread for each sector.

[Figure 2 plot: correlation matrix with a colour scale from −1 to 1. Variables shown on both axes: y, X1.Corn, X2.Wheat, X3.Gold, X4.Brent.Crude.Oil, X5.Aluminium, X6.Copper, X7.Lead, X8.Nickel, X9.Electricity.MWh, X10.Coffee, X11.Sugar..11, X12.Nymex.Crude.Oil, X13.FX.EUR.USD, X14.FX.GBP.USD, X15.FX.USD.NOK, X16.FX.USD.SEK, X17.DAX.Index, X18.OBX.Index, X19.OMX.Index, X20.SPX.Index, X21.IR.DKK.10y, X22.IR.EUR.10y, X23.IR.NOK.10y, X24.IR.SEK.10y, X25.IR.USD.10y.]

Figure 2: Simple correlation matrix for the response y, here for the CDS spread of the Basic Materials sector, and the predictors.


4.2 Variable Selection

The resulting models after exhaustive selection with respect to the three statistical measures $R^2_{adj}$, BIC, and Mallows's $C_p$ are presented in Tables 7 to 9 in the appendix, one for each sector. The $R^2_{adj}$ statistic chooses the largest models, and the BIC chooses the smallest.

The models for the Basic Materials sector had the highest $R^2_{adj}$, at approximately 47%, while the Energy sector had the lowest at approximately 38%. All the models show significance of regression at the 99% confidence level.

4.3 Model Choice and Resulting Models

We then inspected the obtained models using the PRESS statistic, presented in Table 3 below. The best (smallest) PRESS statistic for each sector is shown in bold. The corresponding model is used as the final model in the rest of the study.

Table 3: The PRESS statistics for the models obtained by the variable selection techniques. The best (smallest) PRESS statistic for each sector is shown in bold.

| Model \ Sector | Basic Materials | Energy | Financials |
|---|---|---|---|
| $R^2_{adj}$ | 0.6773 | 0.9989 | 0.8099 |
| BIC | 0.6807 | 0.9928 | 0.8142 |
| Mallows's $C_p$ | **0.6679** | **0.9882** | **0.8028** |

The resulting models for each sector, including only the statistically significant predictors on a 95 % confidence level or above, thus are:

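Basic Materials

$$\begin{aligned} y_{\text{Basic Materials}} = {} & \beta_0 + \beta_6\,\text{Copper} + \beta_7\,\text{Lead} + \beta_{13}\,\text{FX EUR/USD} + \beta_{15}\,\text{FX USD/NOK} \\ & + \beta_{17}\,\text{DAX Index} + \beta_{19}\,\text{OMX Index} + \beta_{22}\,\text{IR EUR 10y} + \beta_{25}\,\text{IR USD 10y} \end{aligned}$$

Energy

$$\begin{aligned} y_{\text{Energy}} = {} & \beta_3\,\text{Gold} + \beta_4\,\text{Brent Crude Oil} + \beta_6\,\text{Copper} + \beta_{14}\,\text{FX GBP/USD} \\ & + \beta_{15}\,\text{FX USD/NOK} + \beta_{17}\,\text{DAX Index} + \beta_{20}\,\text{SPX Index} \end{aligned}$$

Financials

$$\begin{aligned} y_{\text{Financials}} = {} & \beta_0 + \beta_3\,\text{Gold} + \beta_6\,\text{Copper} + \beta_{13}\,\text{FX EUR/USD} + \beta_{14}\,\text{FX GBP/USD} \\ & + \beta_{17}\,\text{DAX Index} + \beta_{19}\,\text{OMX Index} + \beta_{20}\,\text{SPX Index} \end{aligned}$$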

Table 4 presents a more detailed view of the final models, including the regression coefficients, statistical measures, and standard errors. These also include the coefficients that had a significance level below 95 %.


Table 4: Final models for each sector. Response variable: CDS spread. Standard errors are given in parentheses.

| Predictor | Basic Materials | Energy | Financials |
|---|---|---|---|
| X3.Gold | | 0.336∗∗∗ (0.125) | 0.410∗∗∗ (0.114) |
| X4.Brent.Crude.Oil | | −0.171∗∗ (0.076) | |
| X6.Copper | −0.352∗∗∗ (0.105) | −0.330∗∗∗ (0.110) | −0.199∗∗ (0.097) |
| X7.Lead | 0.144∗ (0.081) | | |
| X9.Electricity.MWh | | | −0.066 (0.044) |
| X13.FX.EUR.USD | −0.469∗ (0.256) | | −1.495∗∗∗ (0.236) |
| X14.FX.GBP.USD | | 0.439∗ (0.246) | 0.451∗ (0.237) |
| X15.FX.USD.NOK | 0.420∗ (0.219) | 0.760∗∗∗ (0.210) | |
| X17.DAX.Index | −0.555∗∗∗ (0.158) | −0.367∗∗ (0.148) | −0.732∗∗∗ (0.181) |
| X19.OMX.Index | −0.495∗∗∗ (0.184) | | −0.385∗ (0.207) |
| X20.SPX.Index | | −0.901∗∗∗ (0.228) | −0.397∗ (0.218) |
| X22.IR.EUR.10y | 0.106∗∗∗ (0.040) | | |
| X25.IR.USD.10y | −0.258∗∗∗ (0.072) | | |
| Constant | 1.458∗∗∗ (0.481) | 0.236 (0.459) | 2.416∗∗∗ (0.240) |
| Observations | 365 | 366 | 363 |
| R² | 0.483 | 0.395 | 0.474 |
| Adjusted R² | 0.472 | 0.383 | 0.462 |
| Residual Std. Error | 0.042 (df = 356) | 0.051 (df = 358) | 0.046 (df = 354) |
| F Statistic | 41.603∗∗∗ (df = 8; 356) | 33.420∗∗∗ (df = 7; 358) | 39.935∗∗∗ (df = 8; 354) |

Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01
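As a rough consistency check on the reported summary statistics, the overall F statistic of a regression with an intercept can be recovered from R² through the standard identity F = (R²/k) / ((1 − R²)/(n − k − 1)). The sketch below applies this to the rounded values in Table 4; the function name is hypothetical, and small deviations are expected because R² is reported to three decimals only.

```python
# Minimal sketch: recompute the overall F statistic from the rounded R^2
# and the degrees of freedom reported in Table 4.


def f_from_r_squared(r_squared, k, df_residual):
    """F statistic implied by R^2 with k predictors and df_residual = n - k - 1."""
    return (r_squared / k) / ((1.0 - r_squared) / df_residual)


# sector: (R^2, number of predictors, residual df, reported F statistic)
reported = {
    "Basic Materials": (0.483, 8, 356, 41.603),
    "Energy": (0.395, 7, 358, 33.420),
    "Financials": (0.474, 8, 354, 39.935),
}

implied = {sector: f_from_r_squared(r2, k, df)
           for sector, (r2, k, df, _) in reported.items()}

for sector, (_, _, _, f_reported) in reported.items():
    print(sector, round(implied[sector], 2), f_reported)
```

The implied values agree with the reported F statistics to within the rounding of R², which supports the internal consistency of the table.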


Summary (Sammanfattning)

The empirical results suggest that many of the factors are significant in their ability to explain the credit default swap spread. The regression models show significance at the 99% level, and the majority of the variables show correlations that reflect previous research in the area. In particular, we see that the factors show different levels of significance for the different business sectors. We therefore conclude that there are relationships between the market factors and the credit default swap spread, and that these relationships are business sector specific.

Acknowledgements

We would like to thank our supervisors at the Royal Institute of Technology (KTH), Dr. Pierre Nyquist at the Department of Mathematics and Prof. Hans Lööf at the Department of Industrial Economics and Management, for the assistance before and during the study.

We would also like to thank Skandinaviska Enskilda Banken AB (SEB), and especially Dr. Morten Karlsmark and Dr. Salla Franzén, for the help with choosing a subject, providing us with data and for the valuable feedback.

Contents

1 Introduction
1.1 Background
1.2 Research Question
1.3 Goal and Purpose
1.4 Scope and Limitations
2 Theory
2.1 Counterparty Risk Theory
2.1.1 Lending Risk vs. Counterparty Risk
2.1.2 Pre-settlement Risk and Settlement Risk
2.1.3 Components of Counterparty Risk and Wrong-Way Risk
2.1.4 The Credit Default Swap
2.1.5 Pricing Credit Default Swaps
2.2 Mathematical Theory
2.2.1 Multiple Linear Regression
2.2.2 Underlying Assumptions
2.2.3 Method of Least Squares
2.2.4 t-test and F-test
2.2.5 Confidence Intervals of Estimated Regression Coefficients
2.2.6 Heteroscedasticity
2.2.7 Multicollinearity Diagnostics
2.2.8 Basic Statistical Measures
2.2.9 Model Evaluation Methods
2.2.10 Variable Selection Techniques
2.2.11 Transformations
2.3 Literature Review
3 Methodology
3.1 Data
3.1.1 Response Variables
3.1.2 Predictor Variables
3.1.3 Data Structuring
3.2 Model Analysis
3.2.1 Structure of Regression Model
3.2.2 Residual Analysis
3.2.3 Variable Selection
3.2.4 Reduced Model Summary
4 Results
4.1 Residual Analysis
4.1.1 Normal Q-Q Plot and Heteroscedasticity
4.1.2 Multicollinearity
4.2 Variable Selection
4.3 Model Choice and Resulting Models
5 Discussion
5.1 Discussion of Residual Analysis
5.2 Discussion of Final Models
5.2.1 Commodities
5.2.2 Foreign Exchange Spot Rates
5.2.3 Equity Indices
5.2.4 Interest Swap Rates
5.3 Evaluation of Chosen Methods
5.4 Conclusion
5.4.1 Further Studies
References
6 Appendix
6.1 Table 5: Predictor Variables
6.2 Table 6: Variance Inflation Factors
6.3 Figure 3: Residual analyses after transformation
6.4 Figure 4: Residual analyses before transformation