Modelling Dependence of Insurance Risks

Academic year: 2021

Degree Project

Marie Manyi Taku

Modelling Dependence of Insurance Risks

Abstract

Modelling one-dimensional data can be performed in several well-known ways. Modelling two-dimensional data is a more open question: there is no unique way to describe the dependency of two-dimensional data. In this thesis, dependency is modelled by copulas.

Contents

1 Introduction
2 Marginal distributions
  2.1 Normal Distribution
  2.2 NIG Distribution
  2.3 Edgeworth Expansion
  2.4 Empirical Comparisons
3 Model Dependence
  3.1 Linear Correlation
  3.2 Rank Correlation
    3.2.1 Kendall's tau
  3.3 Copulas
  3.4 Tail Dependence
  3.5 Elliptical Copulas
    3.5.1 Gaussian Copulas
    3.5.2 t-copulas
  3.6 Archimedean Copulas
    3.6.1 Clayton Copula
    3.6.2 Gumbel Copula
    3.6.3 Properties of Archimedean Copulas
4 Data Characterisation and Analysis
  4.1 Nature of Data
  4.2 Results and Explanations
5 Conclusion
A Appendix: Matlab codes
  A.1 Random variate generation, an example

1 Introduction

The Concise Oxford English Dictionary defines risk as "hazard, a chance of bad consequences, loss or exposure to mischance". In [8], A.J. McNeil, R. Frey and P. Embrechts define financial risk as "any event or action that may adversely affect an organisation's ability to achieve its objectives and execute its strategies" or, alternatively, "the quantifiable likelihood of loss or less-than-expected returns". Types of risk include market risk, credit risk and operational risk; these pertain mostly to banking. An additional risk category entering through insurance is underwriting risk, the risk inherent in insurance policies sold. Examples of risk factors that play a role here are changing patterns of natural catastrophes or changing customer behaviour (such as payment patterns).

In risk theory, a topic that has attracted much interest during the last decade is the dependence between risks. It is sometimes striking how dependence may entail large loss amounts. For example, hurricanes or earthquakes may cause many types of claims, such as damage to buildings, car crashes and accidental deaths. In such cases the risks cannot be regarded as independent, so it is important to have appropriate models for modelling dependence.

The main aim of this thesis is to describe appropriate distributions to model risks and to collect and clarify the useful ideas of dependence: linear correlation, rank correlation and copulas, which should be known by anyone who wishes to model dependence phenomena. A number of properties, advantages and drawbacks of these dependence measures are highlighted. Of particular interest is the concept of the copula as a model of dependence between risks, because the copula avoids the pitfalls of correlation. As an illustration, daily reported damages from an insurance company are investigated for dependence.

2 Marginal distributions

In this section we will study the Normal distribution, the Normal Inverse Gaussian (NIG) distribution and Edgeworth expansions. The Normal distribution is known to be one of the most important distributions and is found in many areas of study. The Normal Inverse Gaussian (NIG) distribution, for its part, is suitable for modelling stochastic payments (in our work payments reflect damages) in insurance, as investigated by [2]. The NIG distribution is generally very interesting for applications in finance due to its specific characteristics: it is a continuous four-parameter distribution family that can produce fat tails and skewness, the class is convolution stable under certain conditions, and the cumulative distribution function, density and inverse distribution function can be computed sufficiently fast. It should be noted that the Normal distribution is not an appropriate distribution for modelling stochastic payments due to the absence of skewness and excess kurtosis, i.e. it underestimates both the thickness of the tails of the marginals and their dependence structure.

2.1 Normal Distribution

The Normal distribution with parameters $\mu \in \mathbb{R}$ and $\sigma^2 > 0$ has the density function

$$f_{\mathrm{Norm}}(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right). \qquad (1)$$

The parameters $\mu$ and $\sigma^2$ are the mean and the variance. The distribution with $\mu = 0$ and $\sigma^2 = 1$ is the Standard Normal Distribution.

Moment generating function

The moment generating function is given by

$$M_X(t) = E[e^{tX}] = e^{\mu t + \frac{1}{2}\sigma^2 t^2}. \qquad (2)$$

The cumulant generating function (the logarithm of the moment generating function) is given by

$$\Phi(t) = \log(M_X(t)) = \mu t + \tfrac{1}{2}\sigma^2 t^2.$$

Properties

The Normal distribution is symmetric around its mean, always has a kurtosis equal to 3, and skewness 0. The first and second cumulants are

$$\kappa_1 = E[X] = \Phi'(0) = \mu, \qquad \kappa_2 = \mathrm{Var}[X] = \Phi''(0) = \sigma^2.$$

All higher cumulants are zero.
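These properties can be verified numerically. The following is a small Python sketch (Python and scipy stand in here for the Matlab used in the appendix): for any choice of $\mu$ and $\sigma$, the reported skewness and excess kurtosis of a Normal distribution are exactly zero.

```python
from scipy.stats import norm

# First two cumulants of N(mu, sigma^2); all higher cumulants vanish.
mu, sigma = 2.0, 3.0
mean, var, skew, ex_kurt = norm.stats(loc=mu, scale=sigma, moments='mvsk')

print(mean, var)      # kappa_1 = mu, kappa_2 = sigma^2
print(skew, ex_kurt)  # skewness 0 and excess kurtosis 0 (i.e. kurtosis 3)
```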

2.2 NIG Distribution

The Normal Inverse Gaussian (NIG) distribution with parameters α, β, δ, µ has density function given by

$$f_{NIG}(x; \alpha, \beta, \delta, \mu) = \frac{\alpha\delta}{\pi} \cdot \frac{K_1\left(\alpha\sqrt{\delta^2 + (x-\mu)^2}\right)}{\sqrt{\delta^2 + (x-\mu)^2}} \cdot e^{\delta\sqrt{\alpha^2 - \beta^2} + \beta(x-\mu)}, \qquad (3)$$

$x, \beta, \mu \in \mathbb{R}$, $\alpha, \delta \in \mathbb{R}^+$, $|\beta| < \alpha$, where $K_1(\omega) := \frac{1}{2}\int_0^\infty \exp\left(-\frac{1}{2}\omega(t + t^{-1})\right)dt$ is the modified Bessel function of the third kind with index 1. The four parameters can be interpreted as follows: $\alpha$ determines how "steep" the density is (tail heaviness), $\beta$ determines how skewed it is, $\delta$ determines the scale, and $\mu$ the location.

Moment generating function

The moment generating function of the NIG distribution has a simple form even though the density function is quite complicated. The moment generating function of a random variable $X \sim NIG(\alpha, \beta, \delta, \mu)$ is

$$M_X(t) = e^{\delta\left(\sqrt{\alpha^2 - \beta^2} - \sqrt{\alpha^2 - (\beta + t)^2}\right) + t\mu}, \qquad (4)$$

for $\alpha > |\beta + t|$. The cumulant generating function is given by

$$\Phi(t) = \log(M_X(t)) = \delta\left\{\sqrt{\alpha^2 - \beta^2} - \sqrt{\alpha^2 - (\beta + t)^2}\right\} + \mu t, \qquad \alpha > |\beta + t|.$$

Properties and Parameter estimation of the NIG distribution

The main properties of the NIG distribution are the scaling property,

$$X \sim NIG(\alpha, \beta, \delta, \mu) \;\Rightarrow\; cX \sim NIG\left(\frac{\alpha}{c}, \frac{\beta}{c}, c\delta, c\mu\right),$$

and the closure under convolution for independent random variables $X$ and $Y$:

$$X \sim NIG(\alpha, \beta, \delta_1, \mu_1),\; Y \sim NIG(\alpha, \beta, \delta_2, \mu_2) \;\Rightarrow\; X + Y \sim NIG(\alpha, \beta, \delta_1 + \delta_2, \mu_1 + \mu_2).$$

From the cumulant generating function we obtain

$$\kappa_1 = E[X] = \Phi'(0) = \frac{\delta\beta}{\sqrt{\alpha^2 - \beta^2}} + \mu$$

$$\kappa_2 = \mathrm{Var}[X] = \Phi''(0) = \frac{\delta\alpha^2}{\left(\sqrt{\alpha^2 - \beta^2}\right)^3}$$

$$\kappa_3 = E[(X - E[X])^3] = \Phi'''(0) = \frac{3\delta\alpha^2\beta}{\left(\sqrt{\alpha^2 - \beta^2}\right)^5}$$

$$\kappa_4 = E[(X - E[X])^4] - 3(\mathrm{Var}[X])^2 = \Phi^{(4)}(0) = \frac{3\delta\alpha^2(\alpha^2 + 4\beta^2)}{\left(\sqrt{\alpha^2 - \beta^2}\right)^7}$$

Let $\gamma_1 = \dfrac{\kappa_3}{\sqrt{(\kappa_2)^3}}$ and $\gamma_2 = \dfrac{\kappa_4}{(\kappa_2)^2}$. We then get:

$$\hat\alpha = \frac{3\sqrt{3\gamma_2 - 4\gamma_1^2}}{(3\gamma_2 - 5\gamma_1^2)\sqrt{\kappa_2}}, \qquad
\hat\beta = \frac{3\gamma_1}{(3\gamma_2 - 5\gamma_1^2)\sqrt{\kappa_2}}, \qquad
\hat\delta = \frac{3\sqrt{\kappa_2}\,\sqrt{3\gamma_2 - 5\gamma_1^2}}{3\gamma_2 - 4\gamma_1^2}, \qquad
\hat\mu = \kappa_1 - \frac{3\gamma_1\sqrt{\kappa_2}}{3\gamma_2 - 4\gamma_1^2}.$$
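As a consistency check of the estimators above, one can compute $\kappa_1$-$\kappa_4$ from known parameters via the closed-form cumulants and feed them back through the estimation formulas; the original $(\alpha, \beta, \delta, \mu)$ should be recovered exactly. A Python sketch (the function names are ours):

```python
import math

def nig_cumulants(a, b, d, mu):
    """kappa_1..kappa_4 of NIG(alpha, beta, delta, mu) from the closed forms above."""
    g = math.sqrt(a**2 - b**2)
    k1 = d * b / g + mu
    k2 = d * a**2 / g**3
    k3 = 3 * d * a**2 * b / g**5
    k4 = 3 * d * a**2 * (a**2 + 4 * b**2) / g**7
    return k1, k2, k3, k4

def nig_fit_moments(k1, k2, k3, k4):
    """Moment-matching estimates (alpha^, beta^, delta^, mu^)."""
    g1 = k3 / k2**1.5          # skewness
    g2 = k4 / k2**2            # excess kurtosis
    s = 3 * g2 - 5 * g1**2
    r = 3 * g2 - 4 * g1**2
    a_hat = 3 * math.sqrt(r) / (s * math.sqrt(k2))
    b_hat = 3 * g1 / (s * math.sqrt(k2))
    d_hat = 3 * math.sqrt(k2) * math.sqrt(s) / r
    mu_hat = k1 - 3 * g1 * math.sqrt(k2) / r
    return a_hat, b_hat, d_hat, mu_hat

# Round trip: cumulants of NIG(2, 1, 1, 0) should give back (2, 1, 1, 0).
print(nig_fit_moments(*nig_cumulants(2.0, 1.0, 1.0, 0.0)))
```

In practice the sample cumulants are plugged into `nig_fit_moments` instead of the exact ones.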

The NIG distributions have semi-heavy tails. The right tail behaves as

$$f_{NIG}(x; \alpha, \beta, \delta, \mu) \sim |x|^{-3/2} e^{(-\alpha + \beta)x} \quad \text{as } x \to +\infty.$$

The tails of the distribution are heavier than those of the Normal distribution.

2.3 Edgeworth Expansion

The Edgeworth expansion relates the probability density function $f$ of a random variable $X$, having expectation 0 and variance 1, to the probability density function of the Standard Normal distribution, using the Chebyshev-Hermite polynomials.

With the first four cumulants

$$\kappa_1 = E[X] = \mu, \qquad \kappa_2 = \mathrm{Var}[X] = \sigma^2 = s^2, \qquad \kappa_3 = E[(X - E[X])^3], \qquad \kappa_4 = E[(X - E[X])^4] - 3(\mathrm{Var}[X])^2,$$

and their respective estimates

$$\hat\mu = \bar{x}, \qquad \hat\sigma^2 = \frac{1}{n-1}\sum_{i=1}^n (x_i - \bar{x})^2, \qquad \hat\kappa_3 = \frac{1}{n-1}\sum_{i=1}^n (x_i - \bar{x})^3, \qquad \hat\kappa_4 = \frac{1}{n-1}\sum_{i=1}^n (x_i - \bar{x})^4 - 3(s^2)^2,$$

and Chebyshev-Hermite polynomials

$$H_3(x) = x^3 - 3x, \qquad H_4(x) = x^4 - 6x^2 + 3,$$

the Edgeworth expansion is given by

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)\left[1 + \frac{\kappa_3}{3!\,\sigma^3} H_3\!\left(\frac{x-\mu}{\sigma}\right) + \frac{\kappa_4}{4!\,\sigma^4} H_4\!\left(\frac{x-\mu}{\sigma}\right) + \ldots\right]. \qquad (5)$$

See [12]. The approximation which stops at the term with $H_3$ is called the third-order Edgeworth expansion.
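The expansion (5) is straightforward to evaluate directly. Since the Hermite correction terms integrate to zero against the Gaussian factor, the truncated expansion still integrates to one, which gives a quick sanity check. A Python sketch (helper name and the particular $\kappa_3, \kappa_4$ values are ours):

```python
import math

def edgeworth_pdf(x, mu, sigma, k3, k4):
    """Edgeworth density (5), truncated after the H4 term."""
    z = (x - mu) / sigma
    h3 = z**3 - 3 * z                  # Chebyshev-Hermite H3
    h4 = z**4 - 6 * z**2 + 3           # Chebyshev-Hermite H4
    phi = math.exp(-z**2 / 2) / (math.sqrt(2 * math.pi) * sigma)
    return phi * (1 + k3 / (6 * sigma**3) * h3 + k4 / (24 * sigma**4) * h4)

# Riemann-sum check that the truncated expansion still integrates to ~1.
mu, sigma, k3, k4 = 0.0, 1.0, 0.3, 0.2
xs = [-10 + i * 0.01 for i in range(2001)]
total = sum(edgeworth_pdf(x, mu, sigma, k3, k4) for x in xs) * 0.01
print(round(total, 4))
```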

2.4 Empirical Comparisons

Here real data $X_1$ for Kronoberg and $X_2$ for Göinge are modelled alongside the Normal and Normal Inverse Gaussian distributions, and the Edgeworth expansion of the third order. The parameters are as estimated above. $X_1$ is the logarithm of home damages in Kronoberg with costs bigger than zero and $X_2$ is the logarithm of home damages in Göinge with costs bigger than zero.

[Figure 1: real data, normal cdf, third-order Edgeworth cdf, NIG cdf]

[Figure 2: real data, normal cdf, Edgeworth cdf, NIG cdf]

3 Model Dependence

This section focuses on modelling dependence between risks. It starts by looking at dependence measures: linear correlation and rank correlation. These dependence measures yield scalar measurements for any pair of random variables $(X_1, X_2)$, although the nature and properties of the measures might be different in each case. Also described in this section are the concept of copulas (an alternative way to describe dependence) and tail dependence (an important concept since it addresses the phenomenon of joint extreme values in several risk factors).

3.1 Linear Correlation

The purpose of a linear correlation analysis is to determine whether there is a relationship between two sets of variables. An example would be the relationship between $X_1$, home damage costs in Göinge, and $X_2$, home damage costs in Kronoberg. This correlation could be due to seasonal or weather effects.

Definition 3.1. The correlation $\rho(X_1, X_2)$ between random variables $X_1$ and $X_2$ is defined as

$$\rho_{1,2} = \frac{\mathrm{Cov}(X_1, X_2)}{\sqrt{V(X_1)\,V(X_2)}}, \qquad (6)$$

where $\mathrm{Cov}(X_1, X_2) = E[(X_1 - E[X_1])(X_2 - E[X_2])]$ and $V(X_1) = \mathrm{Cov}(X_1, X_1)$.

It is a measure of linear dependence and takes values in $[-1, 1]$. If $X_1$ and $X_2$ are independent then $\rho(X_1, X_2) = 0$, but it is well known that the converse is false: uncorrelatedness of $X_1$ and $X_2$ does not in general imply independence, except in the case of joint normality.

Example 3.1: Consider the variate pair $(X_1, X_2)$ which takes the values $\{(0, 1), (0, -1), (1, 0), (-1, 0)\}$, each with probability $\frac{1}{4}$. The linear correlation of the co-dependent variates $X_1$ and $X_2$ for this discrete distribution is identically zero, but in fact these variates are strongly dependent: if $X_1 = 0$ then $X_2 = 1$ or $X_2 = -1$; however, for $X_1 \neq 0$, $X_2$ is fully determined: it equals 0.

If $|\rho(X_1, X_2)| = 1$, then this is equivalent to saying that $X_1$ and $X_2$ are perfectly linearly dependent, that is, $X_2 = \alpha + \beta X_1$ almost surely for some constants $\alpha$ and $\beta \neq 0$.

Limitations of Linear Correlation as a Dependence Measure

• In strongly non-Normal distributions, as in Example 3.1 above, and in non-elliptical distributions, linear correlation can actually conceal the strong co-dependence information contained in the full joint distribution.

• The marginal distributions and pairwise correlations of a random vector do not determine its joint distribution. Indeed, for a given pair of marginal distributions $F_{X_1}$ and $F_{X_2}$, there may not even be a joint distribution $F(X_1, X_2)$ for every possible $\rho \in [-1, 1]$.

• Correlation is not invariant under nonlinear strictly increasing transformations. That is, for two real-valued random variables we have, in general, $\rho(T(X_1), T(X_2)) \neq \rho(X_1, X_2)$, where $T : \mathbb{R} \to \mathbb{R}$ is a nonlinear strictly increasing transformation. For example, $\log(X)$ and $\log(Y)$ generally do not have the same correlation as $X$ and $Y$.

• Correlation is only defined when the variances of $X_1$ and $X_2$ are finite. This restriction to finite-variance models is not ideal for a dependence measure and can cause problems when we work with heavy-tailed distributions. For example, actuaries who model losses in different business lines with infinite-variance distributions may not describe the dependence of their risks using correlation. Indeed, the covariance and correlation between the two components of a bivariate $t_2$-distributed random variable are not defined because of infinite second moments.

Advantages of Linear Correlation

• Correlation is invariant under strictly increasing linear transformations. That is, for $\beta_1, \beta_2 > 0$, $\rho(\alpha_1 + \beta_1 X_1, \alpha_2 + \beta_2 X_2) = \rho(X_1, X_2)$.

3.2 Rank Correlation

There are two types of rank correlation: Kendall's tau and Spearman's rho. They are better alternatives to the linear correlation coefficient as measures of dependence for non-elliptical distributions, for which the linear correlation coefficient is inappropriate and often misleading.

3.2.1 Kendall’s tau

Kendall's tau is a co-dependence measure that focuses on the idea of concordance and discordance. Two points in $\mathbb{R}^2$, $(X_1, X_2)$ and $(Y_1, Y_2)$, are said to be concordant if $(X_1 - Y_1)(X_2 - Y_2) > 0$ and discordant if $(X_1 - Y_1)(X_2 - Y_2) < 0$.

Definition 3.2. Kendall's tau for the random vector $(X_1, X_2)$ is defined as

$$\tau_k = \tau_k(X_1, X_2) = P[(X_1 - Y_1)(X_2 - Y_2) > 0] - P[(X_1 - Y_1)(X_2 - Y_2) < 0],$$

where $(Y_1, Y_2)$ is an independent copy of $(X_1, X_2)$. Hence Kendall's tau for $(X_1, X_2)$ is simply the probability of concordance minus the probability of discordance.

Theorem 3.1. Let two variables $X_1 \in \mathbb{R}$ and $X_2 \in \mathbb{R}$ have marginal distribution densities $f_{X_1}$ and $f_{X_2}$, respective cumulative marginal distributions

$$F_{X_1}(X_1) = \int_{-\infty}^{X_1} f(x_1)\,dx_1, \qquad (7)$$

$$F_{X_2}(X_2) = \int_{-\infty}^{X_2} f(x_2)\,dx_2, \qquad (8)$$

and joint distribution function $F(X_1, X_2)$ with

$$F(X_1, X_2) = \int_{-\infty}^{X_1}\int_{-\infty}^{X_2} f(x_1, x_2)\,dx_2\,dx_1. \qquad (9)$$

Then Kendall's tau for the continuous distribution is

$$\tau_k = 4\iint F(x_1, x_2)\, f(x_1, x_2)\,dx_1\,dx_2 - 1. \qquad (10)$$

Properties

1. Kendall's tau is symmetric, that is, $\tau_k(X_1, X_2) = \tau_k(X_2, X_1)$.

2. $\tau_k(X_1, X_2) \in [-1, 1]$.

3. Kendall's tau gives the value zero for independent random variables, although a rank correlation of 0 does not necessarily imply independence.

4. $\tau_k(X_1, X_2) = 1$ when $X_1$ and $X_2$ are comonotonic (common monotonicity). By comonotonicity we mean that $(X_1, X_2) \stackrel{d}{=} \left(F_{X_1}^{-1}(U), F_{X_2}^{-1}(U)\right)$, where $\stackrel{d}{=}$ stands for equality in distribution and $U$ is a uniformly distributed random variable on the unit interval (see [5]). Furthermore, $\tau_k(X_1, X_2) = -1$ when $X_1$ and $X_2$ are countermonotonic.

The second limitation of linear correlation remains a pitfall under rank correlation: the marginal distributions and pairwise correlations of a random vector do not determine its joint distribution. Indeed, for a given pair of marginal distributions $F_{X_1}$ and $F_{X_2}$, there may not even be a joint distribution $F(X_1, X_2)$ for every possible $\rho \in [-1, 1]$!

3.3 Copulas

In [13], copulas are described as 'functions that join or couple multivariate distribution functions to their one-dimensional marginal distribution functions' and as 'distribution functions whose one-dimensional margins are uniform'. But neither of these is a definition, so [13] goes on to give a proper definition of copulas together with some examples, which we present in this section. We are motivated to study copulas because they help in understanding dependence at a deeper level. They let us see the potential pitfalls of approaches to dependence that focus only on correlation, and they show us how to define a number of useful alternative dependence measures. See [8].

The notion of an increasing function is used here. An increasing function is a function $f$ such that $x \leq y$ implies $f(x) \leq f(y)$. Also, $\mathrm{Dom}H$ and $\mathrm{Ran}H$ denote the domain and range of the function $H$, respectively.

Definition 3.3. Let $H$ be a function such that $\mathrm{Dom}H = S_1 \times S_2$ is a subset of $\mathbb{R}^2$ and $\mathrm{Ran}H$ is a subset of $\mathbb{R}$. $H$ is said to be 2-increasing if

$$V_H(B) := H(x_2, y_2) - H(x_2, y_1) - H(x_1, y_2) + H(x_1, y_1) \geq 0$$

for every rectangle $B = [x_1, x_2] \times [y_1, y_2]$ whose vertices lie in $\mathrm{Dom}H$.

Definition 3.4. A function $H$ from $\mathrm{Dom}H = S_1 \times S_2$ into $\mathbb{R}$ is said to be grounded if $H(x, a_2) = 0 = H(a_1, y)$ for all $(x, y)$ in $S_1 \times S_2$, where $a_1$ and $a_2$ are the least elements of $S_1$ and $S_2$ respectively.

Definition 3.5. A 2-dimensional copula is a function $C$ from $[0, 1]^2$ to $[0, 1]$ with the following properties:

1. For every $u, v$ in $[0, 1]$,
$$C(u, 0) = 0 = C(0, v)$$
and
$$C(u, 1) = u, \qquad C(1, v) = v;$$

2. For every $u_1, u_2, v_1, v_2$ in $[0, 1]$ such that $u_1 \leq u_2$ and $v_1 \leq v_2$,
$$C(u_2, v_2) - C(u_2, v_1) - C(u_1, v_2) + C(u_1, v_1) \geq 0.$$

The following theorems are important results regarding copulas and their applications.

Theorem 3.2 (Sklar's theorem). Let $H$ be a joint distribution function with cumulative marginals $F_{X_1}(x_1)$ and $G_{X_2}(x_2)$. Then there exists a copula $C$ such that for all $x_1, x_2$ in $\mathbb{R}$,

$$H(x_1, x_2) = C\left(F_{X_1}(x_1), G_{X_2}(x_2)\right). \qquad (11)$$

If $F_{X_1}$ and $G_{X_2}$ are continuous, then $C$ is unique; otherwise, $C$ is uniquely determined on $\mathrm{Ran}F \times \mathrm{Ran}G$. Conversely, if $C$ is a copula and $F$ and $G$ are distribution functions, then the function $H$ defined by (11) is a joint distribution function with cumulative marginals $F$ and $G$.

Proof. See [13]

From the above theorem we see that, for continuous multivariate distribution functions, the univariate marginals and the multivariate dependence structure can be separated, and the dependence structure can be represented by a copula.

Theorem 3.3. Let $H$ be a 2-dimensional distribution function with continuous marginals $F$ and $G$ and copula $C$ (where $C$ satisfies (11)). Then, for any $(u, v)$ in $[0, 1]^2$,

$$C(u, v) = H\left(F_{X_1}^{-1}(u), G_{X_2}^{-1}(v)\right).$$

Theorem 3.4. For every copula $C(u, v)$, we have the bounds

$$\max(u + v - 1, 0) \leq C(u, v) \leq \min(u, v).$$

Here is a general algorithm for random variate generation from copulas. The properties of the specific copula family are often essential for the efficiency of the corresponding algorithm. To generate observations $(x, y)$ of a pair of random variables $(X, Y)$ with a joint distribution function $H$, first generate a pair $(u, v)$ of observations of uniform $(0, 1)$ random variables $(U, V)$ whose joint distribution function is $C$, the copula of $X$ and $Y$ (this is by virtue of Sklar's theorem), and then transform those uniform variates using the inverse transform method. The conditional distribution method is used here to generate the pair $(u, v)$. The conditional distribution of $V$ given $U$ is given by:

$$C_u(v) = P[V \leq v \mid U = u] = \lim_{\Delta u \to 0} \frac{C(u + \Delta u, v) - C(u, v)}{\Delta u} = \frac{\partial C(u, v)}{\partial u}. \qquad (12)$$

Algorithm

1. Generate two independent $U(0, 1)$ variates $u$ and $t$.
2. Set $v = C_u^{-1}(t)$, where $C_u^{-1}$ denotes a quasi-inverse of $C_u$.
3. The desired pair is $(u, v)$.

For other algorithms, see [7] and [9]. See [10] for a generalization.

Example [13]. Let $X$ and $Y$ be random variables whose joint distribution function $H$ ($\mathrm{Dom}H = \mathbb{R}^2$) is

$$H(x, y) = \begin{cases} \dfrac{(x+1)(e^y - 1)}{x + 2e^y - 1}, & (x, y) \in [-1, 1] \times [0, \infty], \\[4pt] 1 - e^{-y}, & (x, y) \in (1, \infty] \times [0, \infty], \\[4pt] 0, & \text{elsewhere.} \end{cases} \qquad (13)$$

The copula $C$ of $X$ and $Y$ is

$$C(u, v) = \frac{uv}{u + v - uv},$$

and so the conditional distribution function $C_u$ and its inverse $C_u^{-1}$ are given by

$$C_u(v) = \frac{\partial C(u, v)}{\partial u} = \left(\frac{v}{u + v - uv}\right)^2 \qquad \text{and} \qquad C_u^{-1}(t) = \frac{u\sqrt{t}}{1 - (1 - u)\sqrt{t}}.$$

Thus an algorithm to generate random variates $(x, y)$ is:

1. Generate two independent $U(0, 1)$ variates $u$ and $t$;

2. Set $v = \dfrac{u\sqrt{t}}{1 - (1 - u)\sqrt{t}}$;

3. Set $x = 2u - 1$ and $y = -\ln(1 - v)$.

Note: $x$ and $y$ are obtained from the inverses of the marginals $F$ and $G$, given by $F(x) = (x + 1)/2$ on $[-1, 1]$ and $G(y) = 1 - e^{-y}$.


4. The desired pair is $(x, y)$.

[Figure 3: Random variates generated from a sample of 500.]

See [10] for another example.
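The example's algorithm can be sketched in Python (the helper names are ours); the round trip $C_u(C_u^{-1}(t)) = t$ gives a quick correctness check before sampling:

```python
import math
import random

def c_u(u, v):
    """Conditional distribution C_u(v) = dC/du = (v / (u + v - u*v))**2."""
    return (v / (u + v - u * v)) ** 2

def c_u_inv(u, t):
    """Quasi-inverse of C_u: v = u*sqrt(t) / (1 - (1 - u)*sqrt(t))."""
    st = math.sqrt(t)
    return u * st / (1 - (1 - u) * st)

def sample(rng):
    """One (x, y) variate from H via the conditional distribution method."""
    u, t = rng.random(), rng.random()
    v = c_u_inv(u, t)
    return 2 * u - 1, -math.log(1 - v)   # inverse marginals F^-1 and G^-1

# Round-trip check: C_u(C_u^-1(t)) should recover t.
print(round(c_u(0.3, c_u_inv(0.3, 0.7)), 6))

rng = random.Random(1)
pts = [sample(rng) for _ in range(500)]   # as in Figure 3
print(min(p[0] for p in pts) >= -1 and max(p[0] for p in pts) <= 1)
```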

We now take a look at tail dependence and then study some copulas in detail.

3.4 Tail Dependence

Tail dependence measures the dependence between the variables in the upper-right-quadrant tail or lower-left-quadrant tail of $[0, 1]^2$. It is a concept that is relevant for the study of dependence between extreme values. Tail dependence between two continuous random variables $X$ and $Y$ is a copula property, and hence the amount of tail dependence is invariant under strictly increasing transformations of $X$ and $Y$.

Definition 3.6. Let $X$ and $Y$ be continuous random variables with marginal distribution functions $F$ and $G$ respectively. The coefficient of upper tail dependence of $X$ and $Y$ is

$$\lambda_U = \lim_{u \nearrow 1} P\left[Y > G^{-1}(u) \mid X > F^{-1}(u)\right],$$

provided that the limit $\lambda_U \in [0, 1]$ exists. If $\lambda_U \in (0, 1]$, $X$ and $Y$ are said to be asymptotically dependent in the upper tail; if $\lambda_U = 0$, $X$ and $Y$ are said to be asymptotically independent in the upper tail.

Similarly, the coefficient of lower tail dependence of $X$ and $Y$ is

$$\lambda_L = \lim_{u \searrow 0} P\left[Y \leq G^{-1}(u) \mid X \leq F^{-1}(u)\right],$$

provided that the limit $\lambda_L \in [0, 1]$ exists.

Since $P[Y > G^{-1}(u) \mid X > F^{-1}(u)]$ can be written as

$$\frac{1 - P[X \leq F^{-1}(u)] - P[Y \leq G^{-1}(u)] + P[X \leq F^{-1}(u), Y \leq G^{-1}(u)]}{1 - P[X \leq F^{-1}(u)]}$$

(see [15, p. 245]), an alternative and equivalent definition (for continuous random variables), from which it is seen that the concept of tail dependence is indeed a copula property, is the following, found in [4]:

Definition 3.7. If a bivariate copula $C$ is such that

$$\lim_{u \nearrow 1} \frac{1 - 2u + C(u, u)}{1 - u} = \lambda_U$$

exists, then $C$ has upper tail dependence if $\lambda_U \in (0, 1]$, and upper tail independence if $\lambda_U = 0$.

In the last part of this section, we take a look at the Gaussian Copula, the t-copula, the Gumbel copula and the Clayton copula. The first two copulas are elliptical while the Gumbel and Clayton copulas are Archimedean copulas.

3.5 Elliptical Copulas

Elliptical copulas are simply the copulas of elliptical distributions. Since it is easy to simulate from elliptical distributions, it is easy to simulate from elliptical copulas by making use of Sklar's theorem. They are, however, not without drawbacks: they do not have closed-form expressions and are restricted to radial symmetry, $C = \hat{C}$. In many finance and insurance applications it seems reasonable that there is stronger dependence between big losses (e.g. a stock market crash) than between big gains. Such asymmetries cannot be modelled with elliptical copulas.

A general definition of elliptical distributions is the following:

Definition 3.8 (Elliptical Distributions). If $X$ is an n-dimensional random vector and, for some vector $\mu \in \mathbb{R}^n$, some $n \times n$ nonnegative definite symmetric matrix $\Sigma$ and some function $\phi : [0, \infty) \to \mathbb{R}$, the characteristic function of $X - \mu$ is of the form $t \mapsto \phi(t^{\mathsf{T}}\Sigma t)$, then $X$ is said to have an elliptical distribution with parameters $\mu$, $\Sigma$ and $\phi$, written $X \sim E_n(\mu, \Sigma, \phi)$.

When $n = 1$, the class of elliptical distributions coincides with the class of one-dimensional symmetric distributions. A function $\phi$ as in Definition 3.8 is called a characteristic generator.

An important result to take note of is the following theorem, which provides a relation between Kendall's tau and the correlation matrix $R$ (with $R_{ij} = \Sigma_{ij}/\sqrt{\Sigma_{ii}\Sigma_{jj}}$) for nondegenerate elliptical distributions. $R$ is easily estimated from data.

Theorem 3.5. Let $X \sim E_n(\mu, \Sigma, \phi)$ with $P[X_i = \mu_i] < 1$ and $P[X_j = \mu_j] < 1$. Then

$$\tau(X_i, X_j) = \left(1 - \sum_x (P[X_i = x])^2\right)\frac{2}{\pi}\arcsin(R_{ij}), \qquad (16)$$

where the sum extends over all atoms of the distribution of $X_i$. If in addition $\mathrm{rank}(\Sigma) \geq 2$, then equation (16) simplifies to

$$\tau(X_i, X_j) = \left(1 - (P[X_i = \mu_i])^2\right)\frac{2}{\pi}\arcsin(R_{ij}). \qquad (17)$$

Proof. See [3].

Note: for the 2-dimensional normal distribution with linear correlation coefficient $\rho$, $\tau = \frac{2}{\pi}\arcsin\rho$.

3.5.1 Gaussian Copulas

The Gaussian copula of the n-variate normal distribution with linear correlation matrix $R$ is

$$C_R^{Ga}(\mathbf{u}) = \Phi_R^n\left(\Phi^{-1}(u_1), \ldots, \Phi^{-1}(u_n)\right),$$

where $\Phi_R^n$ denotes the joint distribution function of the n-variate standard normal distribution with linear correlation matrix $R$, and $\Phi^{-1}$ denotes the inverse of the distribution function of the univariate standard normal distribution. For the bivariate case, which we are more concerned with, the copula can be written as

$$C_R^{Ga}(u, v) = \int_{-\infty}^{\Phi^{-1}(u)}\int_{-\infty}^{\Phi^{-1}(v)} \frac{1}{2\pi(1 - \rho^2)^{1/2}} \exp\left(-\frac{s^2 - 2\rho s t + t^2}{2(1 - \rho^2)}\right) ds\,dt,$$

where $\rho = \rho(X_1, X_2)$.
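By Sklar's theorem this copula can be evaluated directly as $\Phi_R(\Phi^{-1}(u), \Phi^{-1}(v))$. A Python/scipy sketch (scipy assumed available); at $\rho = 0$ the Gaussian copula must reduce to the independence copula $uv$:

```python
from scipy.stats import norm, multivariate_normal

def gaussian_copula(u, v, rho):
    """C(u, v) = Phi_R(Phi^-1(u), Phi^-1(v)) for correlation rho."""
    x, y = norm.ppf(u), norm.ppf(v)
    return multivariate_normal(mean=[0, 0], cov=[[1, rho], [rho, 1]]).cdf([x, y])

print(gaussian_copula(0.3, 0.8, 0.0))         # ~ 0.3 * 0.8 = 0.24
print(gaussian_copula(0.3, 0.8, 0.9) > 0.24)  # positive dependence raises C
```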


3.5.2 t-copulas

The t-copula is conceptually very similar to the Gaussian copula. It is given by the distribution function of the marginals of correlated t-variates. If $X$ has the stochastic representation

$$X \stackrel{d}{=} \mu + \frac{\sqrt{\nu}}{\sqrt{S}} Z, \qquad (18)$$

where $\mu \in \mathbb{R}^n$, $S \sim \chi^2_\nu$ and $Z \sim N(0, \Sigma)$ are independent, then $X$ has an n-variate $t_\nu$-distribution with mean $\mu$ (for $\nu > 1$) and covariance matrix $\frac{\nu}{\nu - 2}\Sigma$ (for $\nu > 2$). If $\nu \leq 2$ then $\mathrm{Cov}(X)$ is not defined; in this case we interpret $\Sigma$ as the shape parameter of the distribution of $X$. The n-dimensional t-copula of $X$ given by (18) is

$$C_{\nu, R}^t(\mathbf{u}) = t_{\nu, R}^n\left(t_\nu^{-1}(u_1), \ldots, t_\nu^{-1}(u_n)\right),$$

where $R$ is a correlation matrix and $t_{\nu, R}^n$ is the joint distribution function of $X$ with marginals $t_\nu$. In the bivariate case the t-copula can be expressed as

$$C_\rho^t(u, v) = \int_{-\infty}^{t_\nu^{-1}(u)}\int_{-\infty}^{t_\nu^{-1}(v)} \frac{1}{2\pi(1 - \rho^2)^{1/2}} \left(1 + \frac{s^2 - 2\rho s t + t^2}{\nu(1 - \rho^2)}\right)^{-\frac{\nu + 2}{2}} ds\,dt,$$

where $\rho$ is the linear correlation of the corresponding bivariate $t_\nu$ distribution if $\nu > 2$.

Tail dependence: the t-copula has upper tail dependence and (because of radial symmetry) equal lower tail dependence.

3.6 Archimedean Copulas

Archimedean copulas are a class of copulas worth studying for the following reasons:

1. The ease with which they can be constructed.

2. The great variety of families of copulas which belong to this class.

3. The many nice properties possessed by the members of this class; in particular, they have closed-form expressions, as opposed to elliptical copulas.

Definition 3.9. Let $\varphi$ be a continuous, strictly decreasing function from $[0, 1]$ to $[0, \infty]$ such that $\varphi(1) = 0$. The pseudo-inverse of $\varphi$ is the function $\varphi^{[-1]}$ with $\mathrm{Dom}\,\varphi^{[-1]} = [0, \infty]$ and $\mathrm{Ran}\,\varphi^{[-1]} = [0, 1]$ given by

$$\varphi^{[-1]}(t) = \begin{cases} \varphi^{-1}(t), & 0 \leq t \leq \varphi(0), \\ 0, & \varphi(0) \leq t \leq \infty. \end{cases} \qquad (19)$$

$\varphi^{[-1]}$ so defined is continuous and nonincreasing on $[0, \infty]$, and strictly decreasing on $[0, \varphi(0)]$. Furthermore, $\varphi^{[-1]}(\varphi(u)) = u$ on $[0, 1]$, and

$$\varphi\left(\varphi^{[-1]}(t)\right) = \begin{cases} t, & 0 \leq t \leq \varphi(0), \\ \varphi(0), & \varphi(0) \leq t \leq \infty \end{cases} \;=\; \min(t, \varphi(0)).$$

Finally, if $\varphi(0) = \infty$, then $\varphi^{[-1]} = \varphi^{-1}$.

Lemma 3.1. Let $\varphi$ be a continuous, strictly decreasing function from $[0, 1]$ to $[0, \infty]$ such that $\varphi(1) = 0$, and let $\varphi^{[-1]}$ be the pseudo-inverse of $\varphi$ defined by (19). Let $C$ be the function from $[0, 1]^2$ to $[0, 1]$ given by

$$C(u, v) = \varphi^{[-1]}(\varphi(u) + \varphi(v)). \qquad (20)$$

Then $C$ is 2-increasing if and only if, for all $v \in [0, 1]$, whenever $u_1 \leq u_2$,

$$C(u_2, v) - C(u_1, v) \leq u_2 - u_1. \qquad (21)$$

Proof [13]. Because (21) is equivalent to

$$C(u_2, 1) - C(u_2, v) - C(u_1, 1) + C(u_1, v) = u_2 - u_1 - \left[C(u_2, v) - C(u_1, v)\right] \geq 0,$$

it holds whenever $C$ is 2-increasing. Hence assume that $C$ satisfies (21). Choose $v_1, v_2$ in $[0, 1]$ such that $v_1 \leq v_2$, and note that $C(0, v_2) = 0 \leq v_1 \leq v_2 = C(1, v_2)$. But $C$ is continuous (because $\varphi$ and $\varphi^{[-1]}$ are), and thus there is a $t$ in $[0, 1]$ such that $C(t, v_2) = v_1$, i.e. $\varphi(v_2) + \varphi(t) = \varphi(v_1)$. Hence

$$C(u_2, v_1) - C(u_1, v_1) = \varphi^{[-1]}(\varphi(u_2) + \varphi(v_1)) - \varphi^{[-1]}(\varphi(u_1) + \varphi(v_1))$$
$$= \varphi^{[-1]}(\varphi(u_2) + \varphi(v_2) + \varphi(t)) - \varphi^{[-1]}(\varphi(u_1) + \varphi(v_2) + \varphi(t))$$
$$= C(C(u_2, v_2), t) - C(C(u_1, v_2), t)$$
$$\leq C(u_2, v_2) - C(u_1, v_2),$$

by (21), so that $C$ is 2-increasing.

Theorem 3.6. Let $\varphi$ be a continuous, strictly decreasing function from $[0, 1]$ to $[0, \infty]$ such that $\varphi(1) = 0$, and let $\varphi^{[-1]}$ be the pseudo-inverse of $\varphi$ defined by (19). Let $C$ be the function from $[0, 1]^2$ to $[0, 1]$ given by (20). Then $C$ is a copula if and only if $\varphi$ is convex.

Proof [13].

1. First, the boundary conditions:

$$C(u, 0) = \varphi^{[-1]}(\varphi(u) + \varphi(0)) = 0;$$

$$C(u, 1) = \varphi^{[-1]}(\varphi(u) + \varphi(1)) = \varphi^{[-1]}(\varphi(u)) = u.$$

By symmetry, $C(0, v) = 0$ and $C(1, v) = v$.

2. To complete the proof, we show that (21) holds (by the last lemma) if and only if $\varphi$ is convex. We note that $\varphi$ is convex if and only if $\varphi^{[-1]}$ is convex. Observe that (21) is equivalent to

$$u_1 + \varphi^{[-1]}(\varphi(u_2) + \varphi(v)) \leq u_2 + \varphi^{[-1]}(\varphi(u_1) + \varphi(v))$$

for $u_1 \leq u_2$, so that if we set $a = \varphi(u_1)$, $b = \varphi(u_2)$, and $c = \varphi(v)$, then (21) is equivalent to

$$\varphi^{[-1]}(a) + \varphi^{[-1]}(b + c) \leq \varphi^{[-1]}(b) + \varphi^{[-1]}(a + c), \qquad (22)$$

where $a \geq b$ and $c \geq 0$. Now suppose that (21) holds, i.e. suppose that $\varphi^{[-1]}$ satisfies (22). Choose any $s, t$ in $[0, \infty]$ such that $0 \leq s < t$. If we set $a = (s + t)/2$, $b = s$, and $c = (t - s)/2$ in (22), we have

$$\varphi^{[-1]}\left(\frac{s + t}{2}\right) \leq \frac{\varphi^{[-1]}(s) + \varphi^{[-1]}(t)}{2}. \qquad (23)$$

Thus $\varphi^{[-1]}$ is midconvex, and because $\varphi^{[-1]}$ is continuous it follows that $\varphi^{[-1]}$ is convex.

In the other direction, assume that $\varphi^{[-1]}$ is convex. Fix $a$, $b$, and $c$ in $[0, \infty]$ such that $a \geq b$ and $c \geq 0$, and let $\gamma = (a - b)/(a - b + c)$, so that $a = (1 - \gamma)b + \gamma(a + c)$ and $b + c = \gamma b + (1 - \gamma)(a + c)$. Convexity then gives

$$\varphi^{[-1]}(a) \leq (1 - \gamma)\varphi^{[-1]}(b) + \gamma\varphi^{[-1]}(a + c)$$

and

$$\varphi^{[-1]}(b + c) \leq \gamma\varphi^{[-1]}(b) + (1 - \gamma)\varphi^{[-1]}(a + c).$$

Adding these two inequalities yields (22), and hence (21) holds.

Copulas of the form (20) are called Archimedean copulas. The function $\varphi$ is called a generator of the copula. If $\varphi(0) = \infty$, we say that $\varphi$ is a strict generator. In this case $\varphi^{[-1]} = \varphi^{-1}$, and $C(u, v) = \varphi^{-1}(\varphi(u) + \varphi(v))$ is said to be a strict Archimedean copula.

For some examples of Archimedean copulas, see [13].

3.6.1 Clayton Copula

The Clayton copula is generated by

$$\varphi(t) = (t^{-\theta} - 1)/\theta,$$

where $\theta \in [-1, \infty) \setminus \{0\}$, and defined as

$$C_\theta(u, v) = \max\left(\left(u^{-\theta} + v^{-\theta} - 1\right)^{-1/\theta}, 0\right).$$

For $\theta > 0$ the copulas are strict and the copula expression simplifies to

$$C_\theta(u, v) = \left(u^{-\theta} + v^{-\theta} - 1\right)^{-1/\theta}.$$

The Clayton family has lower tail dependence for $\theta > 0$, and $C_{-1} = \max(u + v - 1, 0)$, $\lim_{\theta \to 0} C_\theta = uv$ and $\lim_{\theta \to \infty} C_\theta = \min(u, v)$.

Kendall's tau for this family is given by $\tau_\theta = \frac{\theta}{\theta + 2}$.
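A numerical sketch of the strict Clayton copula (θ > 0). The lower tail dependence coefficient, a standard result not derived here, is $\lambda_L = \lim_{u \searrow 0} C(u, u)/u = 2^{-1/\theta}$; the code below checks this limit along with Kendall's tau $\theta/(\theta + 2)$ (helper names are ours):

```python
def clayton(u, v, theta):
    """Strict Clayton copula (theta > 0)."""
    return (u ** -theta + v ** -theta - 1) ** (-1 / theta)

theta = 2.0
u = 1e-8
lam_l = clayton(u, u, theta) / u     # approximates the limit as u -> 0
print(abs(lam_l - 2 ** (-1 / theta)) < 1e-6)

print(theta / (theta + 2))           # Kendall's tau for theta = 2
```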

3.6.2 Gumbel Copula

The Gumbel copula (sometimes also referred to as the Gumbel-Hougaard copula) is controlled by a single parameter $\theta \geq 1$. It is generated by

$$\varphi(t) = (-\ln t)^\theta,$$

and defined as

$$C_\theta(u, v) = \exp\left(-\left[(-\ln u)^\theta + (-\ln v)^\theta\right]^{1/\theta}\right).$$

Indeed, $\varphi(t)$ is continuous with $\varphi(1) = 0$, and $\varphi'(t) = -\theta(-\ln t)^{\theta - 1}\frac{1}{t}$, so $\varphi$ is a strictly decreasing function from $[0, 1]$ to $[0, \infty]$. Moreover, $\varphi''(t) \geq 0$ on $[0, 1]$, so $\varphi$ is convex, and $\varphi(0) = \infty$, so $\varphi$ is a strict generator. Thus $\varphi^{[-1]}(t) = \varphi^{-1}(t) = \exp(-t^{1/\theta})$, and from (20) we get

$$C_\theta(u, v) = \varphi^{[-1]}(\varphi(u) + \varphi(v)) = \exp\left(-\left[(-\ln u)^\theta + (-\ln v)^\theta\right]^{1/\theta}\right).$$

If $\theta = 1$ then $C_1(u, v) = uv$, and $\lim_{\theta \to \infty} C_\theta = \min(u, v)$.

Tail dependence: Consider the bivariate Gumbel family of copulas given above. Then, by Definition 3.7,

$$\frac{1 - 2u + C(u, u)}{1 - u} = \frac{1 - 2u + \exp(2^{1/\theta}\ln u)}{1 - u} = \frac{1 - 2u + u^{2^{1/\theta}}}{1 - u},$$

and hence, by l'Hôpital's rule,

$$\lim_{u \nearrow 1} \frac{1 - 2u + C(u, u)}{1 - u} = \lim_{u \nearrow 1}\left(\frac{-2 + 2^{1/\theta}u^{2^{1/\theta} - 1}}{-1}\right) = 2 - \lim_{u \nearrow 1} 2^{1/\theta}u^{2^{1/\theta} - 1} = 2 - 2^{1/\theta}.$$

Thus for $\theta > 1$, $C_\theta$ has upper tail dependence.
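The limit $2 - 2^{1/\theta}$ can be verified numerically by evaluating the ratio of Definition 3.7 at $u$ close to 1 (a Python sketch, helper name ours):

```python
import math

def gumbel(u, v, theta):
    """Gumbel copula C_theta(u, v)."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1 / theta)))

theta = 2.0
u = 1 - 1e-6
ratio = (1 - 2 * u + gumbel(u, u, theta)) / (1 - u)
print(abs(ratio - (2 - 2 ** (1 / theta))) < 1e-3)   # ratio ~ 2 - 2^(1/theta)
```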

Kendall's tau: With the generator for the Gumbel family given above,

$$\frac{\varphi(t)}{\varphi'(t)} = \frac{t\ln t}{\theta}.$$

Using Theorem 21 in [1], we can calculate Kendall's tau for the Gumbel family.

3.6.3 Properties of Archimedean Copulas

Stated below are two theorems concerning some algebraic properties of Archimedean copulas.

Theorem 3.7. Let $C$ be an Archimedean copula with generator $\varphi$. Then

1. $C$ is symmetric, i.e. $C(u, v) = C(v, u)$ for all $u, v$ in $[0, 1]$.

2. $C$ is associative, i.e. $C(C(u, v), w) = C(u, C(v, w))$ for all $u, v, w$ in $[0, 1]$.

Proof.

1. $C(u, v) = \varphi^{[-1]}(\varphi(u) + \varphi(v)) = \varphi^{[-1]}(\varphi(v) + \varphi(u)) = C(v, u)$.

2. $C(C(u, v), w) = \varphi^{[-1]}\left(\varphi(\varphi^{[-1]}(\varphi(u) + \varphi(v))) + \varphi(w)\right) = \varphi^{[-1]}(\varphi(u) + \varphi(v) + \varphi(w)) = \varphi^{[-1]}\left(\varphi(u) + \varphi(\varphi^{[-1]}(\varphi(v) + \varphi(w)))\right) = C(u, C(v, w))$.

Note: The associative property of Archimedean copulas is not shared by copulas in general. See [10] for an example.

Theorem 3.8. Let $C$ be an associative copula such that the diagonal section $\delta_C$ satisfies $\delta_C(u) < u$ for all $u$ in $(0, 1)$. Then $C$ is Archimedean.

4 Data Characterisation and Analysis

4.1 Nature of Data

Table 1: Data for Kronoberg

No. Region Damage yr. ObjNo. Objtype Damagedate payment Damage cost


Table 2: Data for Göinge

No. Region Damage yr. ObjNo. Objtype Damagedate payment Damage cost

1 Göinge 1998 3 Villa 19980102 0 0
2 Göinge 1998 3 Villa 19980101 0 0
3 Göinge 1998 3 Villa 19980101 0 0
4 Göinge 1998 3 Villa 19980102 0 0
5 Göinge 1998 3 Villa 19980101 0 0
6 Göinge 1998 3 Villa 19980105 1588 1588
7 Göinge 1998 3 Villa 19980101 4000 4000
8 Göinge 1998 6 Farms 19980105 0 0
9 Göinge 1998 11 Business 19980102 3705 3705
10 Göinge 1998 6 Farms 19980102 0 0
11 Göinge 1998 4 Holiday 19980105 1200 1200
12 Göinge 1998 12 koff 19980102 17405 17405
13 Göinge 1998 5 property 19980102 0 0
.. ... .... .. .... ... .... ...

The data above consists of daily reported damages obtained from Länsförsäkringar Kronoberg, collected over a five-year period, 1998-2002. The damage costs are from two regions in Sweden (Göinge and Kronoberg) and are of different types: home/villa, farms, business and accidents. It should not be surprising to see holiday as an entry with object number 4, since if an individual is involved in an accident while on vacation outside Sweden, he is covered by his home insurance.

The following table describes column 4 of the data.

Table 3: Data Description

Insurance Type Object number

Home and Villa 01,02,03,04,20

Farms 06

Property,Company 05,07,08,09,11,12,13,14

Accidents 30

4.2 Results and Explanations

Copulas are the preferred dependence measure, and the Normal and NIG are the marginal distributions used.

[Figure 4: Simulated points from the Gaussian Copula using the Normal marginals]

[Figure 5: Simulated points from the Gaussian Copula using the Normal Inverse Gaussian marginals]

Figure 6: Empirical CDF of the real data compared with the Gaussian copula models using Normal and NIG marginals (legend: real data, N marg Gauss cop, NIG marg Gauss cop; full view and zoomed panel)


Figure 7: Simulated points from the t Copula with Normal marginals

Figure 8: Simulated points from the t copula with NIG marginals


Figure 9: Simulated points from the Clayton copula using the NIG marginals

To summarize, consider the total home damage cost exp(X1) + exp(X2). Figure 10 below illustrates how each of the marginal distributions with their corresponding copulas performs. Note that a good dependence model is reflected by a good fit of the total cost.

One can quantify the error by a version of the two-sample Kolmogorov-Smirnov test.

Table 4: Two-sample Kolmogorov-Smirnov test

Model                                  p-value   k-value
NIG marginals, Gaussian copula         0.9366    0.0178
NIG marginals, Clayton copula          0.4600    0.0283
Normal marginals, Gaussian copula      0.1389    0.0383
NIG marginals, Gumbel copula           0.1180    0.0399
Edgeworth marginals, Gaussian copula   0.0856    0.0417
Normal marginals, Independence         0.0157    0.0517

Here k = max_x |F̂_model(x) − F̂_empirical(x)|.
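The statistic k above can be computed directly from the two empirical CDFs. A minimal Python sketch (illustrative only; the function name ks_statistic is hypothetical, and this computes only k, not the p-value):

```python
import bisect

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic:
    k = max_x |F_a(x) - F_b(x)| over the pooled observations,
    where F is each sample's empirical CDF."""
    a, b = sorted(sample_a), sorted(sample_b)

    def ecdf(sorted_sample, x):
        # proportion of observations <= x
        return bisect.bisect_right(sorted_sample, x) / len(sorted_sample)

    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in a + b)

# Identical samples give k = 0; fully separated samples give k = 1.
print(ks_statistic([1, 2, 3], [1, 2, 3]))     # 0.0
print(ks_statistic([1, 2, 3], [10, 11, 12]))  # 1.0
```

Since the empirical CDFs are step functions, the maximum is attained at one of the pooled data points, which is why checking only those points suffices.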

Figure 10: Empirical CDFs of the total home damage cost (log scale; legend: real data, N marg Gauss cop, NIG marg Gauss cop, NIG marg Clayton copula, Normal marg indep, Edg marg Gauss cop)


5 Conclusion

The main objective of this thesis is to model dependency of two-dimensional data using the concept of copulas. The investigated data reflects damage costs from two different regions.

The copula performs best where linear correlation fails. A copula joins marginal distributions together into a joint distribution, which a pairwise correlation coefficient alone cannot do. Dependence is therefore better described by a copula. Nevertheless, the Gaussian copula with Normal margins performs like linearly correlated random variables modelled with Normal marginals.


Acknowledgement


A Appendix: Matlab codes

Data in file damagedata.mat:

load damagedata
X1=log(hem(hem>0&ghem>0));   %Kronoberg
X2=log(ghem(hem>0&ghem>0));  %Göinge

Estimated parameters:

[alpha beta delta mu]=cumnigest(X1);%1.5429 0.0261 3.7127 9.7679

[galpha gbeta gdelta gmu]=cumnigest(X2);%1.1875 0.3429 2.3686 8.5467

[r p]=corr(X1',X2')  %0.1345 6.2524e-05
[rKendall pKendall]=corr(X1',X2','type','Kendall')  %0.0869 1.1487e-04
tau=rKendall;
rho=sin(tau*pi/2);
n=length(X1)
m1=mean(X1); sigma1=std(X1);
m2=mean(X2); sigma2=std(X2);
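The assignment rho = sin(tau*pi/2) uses the relation τ = (2/π) arcsin(ρ) for elliptical copulas (cf. Lindskog et al. [3]). The following Python sketch (illustrative only; the parameter values and sample size are arbitrary) checks that inverting a sample Kendall's tau this way approximately recovers the correlation of simulated bivariate normal data:

```python
import math
import random

random.seed(1)
rho_true = 0.6
n = 400

# Simulate n points from a bivariate normal with correlation rho_true.
xs, ys = [], []
for _ in range(n):
    z1, z2 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    xs.append(z1)
    ys.append(rho_true * z1 + math.sqrt(1.0 - rho_true**2) * z2)

# Sample Kendall's tau: (concordant - discordant pairs) / (n choose 2).
num = 0
for i in range(n):
    for j in range(i + 1, n):
        s = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if s > 0:
            num += 1
        elif s < 0:
            num -= 1
tau_hat = num / (n * (n - 1) / 2)

# Invert via rho = sin(pi*tau/2); should be close to rho_true.
rho_hat = math.sin(math.pi * tau_hat / 2)
print(round(tau_hat, 3), round(rho_hat, 3))
```

The O(n^2) pair loop is fine for a few hundred points; for large samples one would use an O(n log n) implementation such as Matlab's corr(...,'type','Kendall').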

Cumulative distribution functions of the approximated distributions:

[F1 x1]=ecdf(X1);
plot(x1,F1,x1,normcdf(x1,m1,sigma1),'--',x1,FEdg1_3(x1,m1,sigma1),':',...
     x1,nigcdf(x1,alpha,beta,delta,mu),'-.')
legend('real data','normal cdf','3 order Edgew cdf','NIG cdf')
print -depsc densX1Edge.eps

[F2 x2]=ecdf(X2);
plot(x2,F2,x2,normcdf(x2,m2,sigma2),'--',x2,FEdg2_3(x2,m2,sigma2),':',...
     x2,nigcdf(x2,galpha,gbeta,gdelta,gmu),'-.')


A.1 Random variate generation, an example

% Conditional sampling from the copula C(u,v)=u*v/(u+v-u*v)
% (Clayton with theta=1): draw u,t ~ U(0,1), solve dC/du(u,v)=t for v.
n=500;
u=rand(n,1);
t=rand(n,1);
v=u.*sqrt(t)./(1-(1-u).*sqrt(t));
x=2*u-1;
y=-log(1-v);
plot(x,y,'.')

print -depsc copex.eps
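The same conditional-distribution method can be sketched in Python (illustrative only; this samples what appears to be the copula C(u, v) = uv/(u + v − uv), i.e. Clayton with θ = 1, since ∂C/∂u = (v/(u + v − uv))² = t solves to v = u√t/(1 − (1 − u)√t)):

```python
import math
import random

random.seed(0)

def clayton1_pair():
    """Draw (u, v) from C(u, v) = uv / (u + v - uv) (Clayton, theta = 1)
    by the conditional distribution method: u ~ U(0,1), t ~ U(0,1),
    then v solves dC/du(u, v) = t."""
    u = random.random()
    t = random.random()
    v = u * math.sqrt(t) / (1.0 - (1.0 - u) * math.sqrt(t))
    return u, v

pairs = [clayton1_pair() for _ in range(500)]
```

As in the Matlab example, the uniform pair (u, v) can then be pushed through any marginal quantile functions, e.g. x = 2u − 1 and y = −log(1 − v).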

Normal marginals Gaussian copula:

%Xnorm=[norminv(U(:,1),m1,sigma1) norminv(U(:,2),m2,sigma2)]
XNmargGcop=[m1+sigma1*norminv(U(:,1)) m2+sigma2*norminv(U(:,2))]
%figure(1);
plot(X1,X2,'.',XNmargGcop(:,1),XNmargGcop(:,2),'+')
legend('data','norm marg Gauss cop')
%print -depsc NmargGausscop.eps
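A Python sketch of the same recipe (Gaussian copula joining Normal margins; the parameter values here are illustrative placeholders, not the estimates from the data):

```python
import math
import random
from statistics import NormalDist

random.seed(42)
Phi = NormalDist()  # standard normal cdf / quantile function

# Illustrative placeholder parameters (NOT the estimates from the data).
rho = 0.14              # copula correlation, cf. rho = sin(tau*pi/2)
m1, sigma1 = 9.0, 1.6   # marginal mean / std, region 1
m2, sigma2 = 8.5, 1.7   # marginal mean / std, region 2

def gaussian_copula_pair():
    """Draw (X1, X2) with Normal margins joined by a Gaussian copula:
    correlated standard normals -> U(0,1) via Phi -> marginal quantiles."""
    z1 = random.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho**2) * random.gauss(0.0, 1.0)
    u1, u2 = Phi.cdf(z1), Phi.cdf(z2)
    return (m1 + sigma1 * Phi.inv_cdf(u1), m2 + sigma2 * Phi.inv_cdf(u2))

sample = [gaussian_copula_pair() for _ in range(1000)]
```

For Normal margins the detour through (u1, u2) is redundant, since Phi.inv_cdf(Phi.cdf(z)) = z up to rounding; it is written out because the same pattern carries over to the NIG margins, where inv_cdf is replaced by the NIG quantile function (niginv in the Matlab code).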

NIG marginals Gaussian copula:

XNIGmargGcop=[niginv(U(:,1),alpha,beta,delta,mu) ...
              niginv(U(:,2),galpha,gbeta,gdelta,gmu)]
%subplot(211),plot(X1,X2,'.')
%subplot(212),plot(X(:,1),X(:,2),'.')
%figure(2);
plot(X1,X2,'.',XNIGmargGcop(:,1),XNIGmargGcop(:,2),'+')
legend('data','NIG marg Gauss cop')
%print -depsc NIGmargGausscop.eps

Models together:

semilogx(xe,Fe,xNmargGcop,FNmargGcop,'--',xNIGmargGcop,FNIGmargGcop,':',...
         xNIGmargCcop,FNIGmargCcop,'-.',xNmargind,FNmargind,'-',...
         xEdgmarGcop,FEdgmargGcop,'--')
legend('real data','N marg Gauss cop','NIG marg Gauss cop',...
       'NIG marg Clayton copula','Normal marg indep','Edg marg Gauss cop')
print -depsc sumGcopcdf.eps


References

[1] C. Genest and J. MacKay, The Joy of Copulas: Bivariate Distributions with Uniform Marginals. The American Statistician, Vol. 40, 280-283, 1986.

[2] E. Novikova, Modellering av försäkringsdata med normal invers gaussisk (NIG)-fördelning. Magister thesis, Växjö universitet, 2006.

[3] F. Lindskog, A. McNeil and U. Schmock, Kendall's Tau for Elliptical Distributions. 2001.

[4] H. Joe, Multivariate Models and Dependence Concepts. Chapman and Hall, London, 1997.

[5] J. Dhaene, S. Vanduffel and M. Goovaerts, Comonotonicity. 2007.

[6] J. Ekström, Uppskattning av svanssannolikhet från försäkringsdata med Edgeworthutveckling. Magister thesis, Växjö universitet, 2004.

[7] M. Johnson, Multivariate Statistical Simulation. Wiley, New York, 1987.

[8] A.J. McNeil, R. Frey and P. Embrechts, Quantitative Risk Management: Concepts, Techniques and Tools. Princeton University Press, 2005.

[9] L. Devroye, Non-Uniform Random Variate Generation. Springer, New York, 1986.

[10] P. Embrechts, F. Lindskog and A. McNeil, Modelling Dependence with Copulas and Applications to Risk Management. 2001.

[11] P. Jäckel, Monte Carlo Methods in Finance. Wiley, 2002.

[12] P. Hall, The Bootstrap and Edgeworth Expansion. Springer, New York, 1992.

[13] R.B. Nelsen, An Introduction to Copulas. Springer, New York, 2006.

[14] S. Cambanis, S. Huang and G. Simons, On the Theory of Elliptically Contoured Distributions. Journal of Multivariate Analysis, 11:368-385, 1981.

