
U.U.D.M. Project Report 2008:22

Degree project in mathematical statistics, 30 credits. Supervisor and examiner: Silvelyn Zwanzig. December 2008

Department of Mathematics

A simulation method for skewness correction

Måns Eriksson


Abstract

Let $X_1, \dots, X_n$ be i.i.d. random variables with known variance and skewness. A one-sided confidence interval for the mean with approximate confidence level $\alpha$ can be constructed using normal approximation. For skew distributions the actual confidence level will then be $\alpha + O(n^{-1/2})$. We propose a method for obtaining confidence intervals with confidence level $\alpha + o(n^{-1/2})$ using skewness correcting pseudo-random variables. The method is compared with a known method: Edgeworth correction.


Acknowledgements

I would like to thank my advisor Silvelyn Zwanzig for introducing me to the subject, for mathematical and stylistic guidance and for always encouraging me.

I would also like to thank my friends and teachers at and around the department of mathematics for having inspired me to study mathematics, and for continuing to inspire me.


Contents

1 Introduction
  1.1 Skewness
  1.2 Setting and notation

2 The Edgeworth expansion
  2.1 Definition and formal conditions
  2.2 Derivations
    2.2.1 Edgeworth expansion for $S_n$
    2.2.2 Edgeworth expansions for more general statistics
    2.2.3 Edgeworth expansion for $T_n$
    2.2.4 Some remarks; skewness correction
  2.3 Cornish-Fisher expansions for quantiles

3 Methods for skewness correction
  3.1 Coverages of confidence intervals
  3.2 Edgeworth correction
  3.3 The bootstrap
    3.3.1 The bootstrap and $S_n$
    3.3.2 Bootstrap confidence intervals
  3.4 A new simulation method
    3.4.1 Skewness correction through addition of a random variable
    3.4.2 Simulation procedure

4 Comparison
  4.1 Coverages of confidence intervals
  4.2 Criteria for the comparison
  4.3 Comparisons of the upper limits
  4.4 Simulation results
  4.5 Discussion

A Appendix: Skewness and kurtosis
  A.1 The skewness of a sum of random variables
  A.2 The kurtosis of a sum of random variables

B Appendix: $P(\hat\theta_{\mathrm{new}} \le \hat\theta_{\mathrm{Ecorr}})$

C Appendix: Simulation results


1 Introduction

1.1 Skewness

The notion of skewness has been a part of statistics for a long time. It dates back to the 19th century and most notably to an article by Karl Pearson from 1895 ([26]). Skew distributions are found in all areas of application, ranging from finance to biology and physics. It has been seen both in theory and in practice that deviations from normality in the form of skewness can have effects too large to ignore on the validity and performance of many statistical methods and procedures.

This thesis discusses skewness in the context of the central limit theorem and normal approximation, in particular applied to confidence intervals. Some methods for skewness correction are discussed and a new simulation method is proposed. We assume that the concept of skewness is known and refer to Appendix A for some basic facts about skewness.

1.2 Setting and notation

Throughout the thesis we assume that we have an i.i.d. univariate sample $X_1, \dots, X_n$, with $EX = \mu$, $\mathrm{Var}(X) = \sigma^2$ and $E|X|^3 < \infty$, such that $X$ satisfies Cramér's condition $\limsup_{t\to\infty}|\varphi(t)| < 1$, where $\varphi$ is the characteristic function of $X$. At times we will also assume that $EX^4 < \infty$. We use $X$ to denote a generic $X_i$, that is, $X$ is a random variable with the same distribution as the $X_i$. Thus, for instance, $EX$ is the mean of the distribution of the observations; $EX = EX_i$ for all $i$.

The $\alpha$-quantile $v_\alpha$ of the distribution of some random variable $X$ is defined to be such that $P(X \le v_\alpha) = \alpha$. When $X \sim N(0,1)$ we denote the quantile $\lambda_\alpha$, that is, $\Phi(\lambda_\alpha) = \alpha$.

We use $A_n$ to denote a general statistic, $S_n = n^{1/2}(\bar X - \mu)/\sigma$ to denote the standardized sample mean and $T_n = n^{1/2}(\bar X - \mu)/\hat\sigma$ to denote the studentized sample mean, where $\hat\sigma^2 = n^{-1}\sum(X_i - \bar X)^2$.

The skewness $E(X-\mu)^3/\sigma^3$ of a random variable $X$ is denoted $\mathrm{Skew}(X)$, $\gamma$ or $\gamma_X$ if we need to distinguish between different random variables. The kurtosis of $X$, $E(X-\mu)^4/\sigma^4 - 3$, is denoted $\mathrm{Kurt}(X)$, $\kappa$ or $\kappa_X$. Basic facts about skewness and kurtosis are stated in Appendix A.

As for asymptotic notation, for real-valued sequences $a_n$ and $b_n$ we say that $a_n = o(b_n)$ if $a_n/b_n \to 0$ as $n \to \infty$ and $a_n = O(b_n)$ if $a_n/b_n$ is bounded as $n \to \infty$.

Finally, we say that a sequence $X_n$ of random variables is bounded in probability if $\lim_{c\to\infty}\limsup_{n\to\infty}P(|X_n| > c) = 0$. We write this as $X_n = O_P(1)$, and if, for some sequence $a_n$, $a_nX_n = O_P(1)$ we write $X_n = O_P(1/a_n)$.


2 The Edgeworth expansion

In this section we introduce our main tool, the Edgeworth expansion. Later we will use it to determine the coverage of confidence intervals.

2.1 Definition and formal conditions

Theorem 1. Assume that $X_1, \dots, X_n$ is an i.i.d. sample from a univariate distribution, with mean $\mu$, variance $\sigma^2$ and $E|X|^{j+2} < \infty$, that satisfies $\limsup_{t\to\infty}|\varphi(t)| < 1$. Let $S_n = n^{1/2}(\bar X - \mu)/\sigma$. Then
$$P(S_n \le x) = \Phi(x) + n^{-1/2}p_1(x)\phi(x) + \dots + n^{-j/2}p_j(x)\phi(x) + o(n^{-j/2}) \tag{1}$$
uniformly in $x$, where $\Phi(x)$ and $\phi(x)$ are the standard normal distribution function and density function and $p_k$ is a polynomial of degree $3k-1$. In particular
$$p_1(x) = -\frac{1}{6}\gamma(x^2-1) \quad\text{and}\quad p_2(x) = -x\Big(\frac{1}{24}\kappa(x^2-3) + \frac{1}{72}\gamma^2(x^4-10x^2+15)\Big).$$

Proof. A proof is given in Section 2.2.1. See also [7] and [8].

(1) is called an Edgeworth expansion for $S_n$.
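As a quick numerical sanity check (not part of the thesis), the one-term expansion in Theorem 1 can be compared with a Monte Carlo estimate of $P(S_n \le x)$. The choice of Exp(1) data (skewness $\gamma = 2$, kurtosis $\kappa = 6$), $n = 20$ and the evaluation point $x = 1.645$ below are illustrative assumptions of ours:

```python
import numpy as np
from math import erf, exp, pi, sqrt

def Phi(x):            # standard normal distribution function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def phi(x):            # standard normal density
    return exp(-x * x / 2.0) / sqrt(2.0 * pi)

def p1(x, gamma):      # first Edgeworth polynomial from Theorem 1
    return -gamma * (x * x - 1.0) / 6.0

rng = np.random.default_rng(0)
n, reps, x = 20, 200_000, 1.645
gamma = 2.0            # skewness of the Exp(1) distribution

# Simulate S_n = n^{1/2}(Xbar - mu)/sigma for Exp(1), where mu = sigma = 1
samples = rng.exponential(1.0, size=(reps, n))
s_n = sqrt(n) * (samples.mean(axis=1) - 1.0)
empirical = (s_n <= x).mean()

normal_approx = Phi(x)
edgeworth = Phi(x) + p1(x, gamma) * phi(x) / sqrt(n)

# The one-term expansion should track the empirical CDF better
assert abs(edgeworth - empirical) < abs(normal_approx - empirical)
```

With these numbers the plain normal approximation overshoots by roughly $0.01$ while the skewness-corrected value lands within a few thousandths of the simulated probability.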

The condition $\limsup_{t\to\infty}|\varphi(t)| < 1$ is known as Cramér's condition and was derived by Cramér in [8]. Note that the condition holds whenever $X$ is absolutely continuous. This is an immediate consequence of the Riemann-Lebesgue lemma (Theorem 1.5 in Chapter 4 of [18]). Moreover, if we limit the expansion to $P(S_n \le x) = \Phi(x) + n^{-1/2}p_1(x)\phi(x) + o(n^{-1/2})$, so that the remainder term is $o(n^{-1/2})$, then it suffices that $X$ has a non-lattice distribution, which was shown by Esseen in [16].

The Edgeworth expansion was first developed for the statistic $S_n$ but has later been extended to other statistics.

Definition 1. Let $A_n$ denote a statistic. Then if
$$P(A_n \le x) = \Phi(x) + n^{-1/2}a_1(x)\phi(x) + \dots + n^{-j/2}a_j(x)\phi(x) + o(n^{-j/2}), \tag{2}$$
where $\Phi(x)$ and $\phi(x)$ are the standard normal distribution function and density function and $a_k$ is a polynomial of degree $3k-1$, (2) is called the Edgeworth expansion for $A_n$.

If $a_k = 0$ for all $k < i$ the normal approximation of the distribution is said to be $i$:th order correct.

In general, the polynomials $a_k$ will depend on the moments of the statistic. We can therefore, in a sense, view the Edgeworth expansion as an extension of the central limit theorem, where information about the higher moments of the involved random variables is used to obtain a better approximation of the distribution function of $A_n$. The expansion gives an expression for the size of the remainder term depending on the sample size.


We might want to compare this to the Berry-Esseen theorem (see for instance [16] or Section 7.6 of [18]), which essentially, in our context, says that the error in the normal approximation is of order $n^{-1/2}$.

The Edgeworth expansion for simple statistics was first introduced in papers by Chebyshev in 1890 ([4]) and Edgeworth in 1894, 1905 and 1907 ([12, 13, 14]). The idea was made mathematically rigorous by Cramér in 1928 ([7]) and Esseen in 1945 ([16]). The expansions and their applications for more general statistics were then developed in several papers, including [2], by various authors in the mid-1900s.

A thorough treatment of the Edgeworth expansion is found in Chapter 2 of [22].

Chapter 13 of [10] gives a brief introduction to the Edgeworth expansion with some of the most important results, and Section 17.7 of [8] is a standard reference for the case where the expansion for $S_n$ is considered.

Conditions for (2) to hold are given next. Although we will focus on expansions of quite simple statistics, the Edgeworth expansion can be used in very general circumstances. We state a theorem by Bhattacharya and Ghosh ([2]) that provides conditions for the Edgeworth expansion (2) to hold in a general case and illustrate how the theorem relates to the expansion for $S_n$.

Let $X, X_1, X_2, \dots, X_n$ be i.i.d. random column vectors in $\mathbb{R}^d$ with mean $\mu$ and let $\bar X = n^{-1}\sum_{i=1}^n X_i$. Let $A : \mathbb{R}^d \to \mathbb{R}$ be a function of the form $A_S(x) = (g(x) - g(\mu))/h(\mu)$ or $A_T(x) = (g(x) - g(\mu))/h(x)$, where $g$ and $h$ are known, $\hat\theta = g(\bar X)$ is an estimator of the scalar $\theta = g(\mu)$, $h(\mu)^2$ is the asymptotic variance of $n^{1/2}\hat\theta$ and $h(\bar X)$ is an estimator of $h(\mu)$.

For $t \in \mathbb{R}^d$, $t = (t^{(1)}, t^{(2)}, \dots, t^{(d)})$, define $\|t\| = ((t^{(1)})^2 + \dots + (t^{(d)})^2)^{1/2}$ and, for a random $d$-vector $X$, $\varphi(t) = E(\exp(i\sum_{j=1}^d t^{(j)}X^{(j)}))$.

Theorem 2. Assume that $A$ has $j+2$ continuous derivatives in a neighbourhood of $\mu = E(X)$ and that $A(\mu) = 0$. Furthermore, assume that $E(\|X\|^{j+2}) < \infty$ and that the characteristic function $\varphi$ of $X$ is such that $\limsup_{\|t\|\to\infty}|\varphi(t)| < 1$. Let $\sigma_A$ be the asymptotic standard deviation of $n^{1/2}A(\bar X)$ and assume that $\sigma_A > 0$. Then, with $A_n = n^{1/2}A(\bar X)/\sigma_A$ and for $j \ge 1$,
$$P(A_n \le x) = \Phi(x) + n^{-1/2}a_1(x)\phi(x) + \dots + n^{-j/2}a_j(x)\phi(x) + o(n^{-j/2}) \tag{3}$$
uniformly in $x$. $a_k$ is a polynomial of degree $3k-1$ with coefficients depending on $A$ and on moments of $X$ of order less than or equal to $k+2$. $a_k$ is odd for even $k$ and even for odd $k$.

Proof. See [2].

Note that the above theorem is a summary of the results in Bhattacharya and Ghosh's 1978 paper and thus not stated as one single theorem in the original work. Our summary largely resembles that in Chapter 2 of [22].


The class of functions that satisfy the conditions in Theorem 2 contains many functions of great interest. In particular, we can write any moment estimator, i.e. an estimator based on sample moments, as a function $A$ of vector means in the same way as we do below for the mean.

The Edgeworth expansion is sometimes written as an infinite series, $P(A_n \le x) = \Phi(x) + n^{-1/2}a_1(x)\phi(x) + \dots + n^{-j/2}a_j(x)\phi(x) + \dots$. However, for the series to converge it is required, for an absolutely continuous random variable, that
$$E\exp\big((X-\mu)^2/(4\sigma^2)\big) < \infty;$$
a condition that fails even for exponentially distributed $X$ (see [7]). Thus we prefer the stopped series used in (3), which also turns out to be more useful in practice.

Before we show how Theorem 2 relates to the expansion for $S_n$ we state a slightly more general corollary.

Corollary 1. Let $A_n$ be either $A_S = n^{1/2}(\hat\theta - \theta)/\sigma$ or $A_T = n^{1/2}(\hat\theta - \theta)/\hat\sigma$, where $\theta$ is some unknown scalar parameter, $\hat\theta$ is an asymptotically unbiased estimator of $\theta$, $\sigma^2$ is the asymptotic variance of $n^{1/2}\hat\theta$ and $\hat\sigma^2$ is some consistent estimator of $\sigma^2$. Then the first two polynomials in the Edgeworth expansion are
$$a_1(x) = -\Big(k_{1,2} + \frac{1}{6}k_{3,1}(x^2-1)\Big)$$
and
$$a_2(x) = -x\Big(\frac{1}{2}(k_{2,2} + k_{1,2}^2) + \frac{1}{24}(k_{4,1} + 4k_{1,2}k_{3,1})(x^2-3) + \frac{1}{72}k_{3,1}^2(x^4-10x^2+15)\Big)$$
where the $k_{j,i}$ come from an expansion of the $j$:th cumulant of $A_n$:
$$\kappa_{j,n} = n^{-(j-2)/2}(k_{j,1} + n^{-1}k_{j,2} + n^{-2}k_{j,3} + \dots).$$

Proof. Details are given in Section 2.2.2.

Theorem 2 can be used to show that $S_n$ as well as the studentized sample mean $T_n = n^{1/2}(\bar X - \mu)/\hat\sigma$ admit Edgeworth expansions. Assume that $X_1, \dots, X_n$ is a sample from a univariate distribution, that the unknown parameter $\theta$ is the mean $\mu$ of the distribution and that the distribution has variance $\sigma^2$. Take $d = 2$, $X_i = (X_i, X_i^2)^T$ and $\mu = E(X) = (EX, EX^2)^T$ and let $g(x^{(1)}, x^{(2)}) = x^{(1)}$ and $h(x^{(1)}, x^{(2)}) = x^{(2)} - (x^{(1)})^2$.

Then $g(\mu) = \mu$ and $g(\bar X) = \bar X$. Furthermore $h(\mu) = \sigma^2$ and
$$h(\bar X) = n^{-1}\sum_{i=1}^n X_i^2 - \Big(n^{-1}\sum_{i=1}^n X_i\Big)^2 = n^{-1}\sum_{i=1}^n (X_i - \bar X)^2 = \hat\sigma^2.$$

$A_S = (g(x) - g(\mu))/h(\mu)$ and $A_T = (g(x) - g(\mu))/h(x)$ both fulfill the conditions in Theorem 2 and the asymptotic standard deviation $\sigma_A$ of $n^{1/2}A(\bar X)$ is 1. Furthermore, the condition $E\big((X^2 + (X^2)^2)^{(j+2)/2}\big) < \infty$ can be reduced to $E(|X|^{j+2}) < \infty$ and Cramér's condition reduces to $\limsup_{t\to\infty}|\varphi(t)| < 1$. Thus the conditions for the existence of the expansions for $S_n$ and $T_n$ follow and turn out to be those in Theorem 1. Both statistics are of the form that is considered in Corollary 1 and the polynomials in their expansions can thus be found.


2.2 Derivations

Knowing under which conditions the Edgeworth expansion exists, we are ready to derive the expressions for the polynomials $p_1$ and $p_2$.

2.2.1 Edgeworth expansion for $S_n$

As before, let $S_n = n^{1/2}(\bar X - \mu)/\sigma$. Assuming that $E|X|^{j+2} < \infty$ and that $X$ fulfills Cramér's condition, by Corollary 1
$$P(S_n \le x) = \Phi(x) + n^{-1/2}p_1(x)\phi(x) + \dots + n^{-j/2}p_j(x)\phi(x) + o(n^{-j/2}). \tag{4}$$
To actually be able to make reasonable use of the expansion we need to find expressions for the polynomials $p_k$. The case $j = 2$ will prove to be of special interest to us, so although we consider general $j$ we will in the end only derive explicit expressions for $p_1$ and $p_2$. Our exposition is mainly based on those in [8] and [22] but aims to be somewhat more thorough than those we have seen in the existing literature.

Since we will assume the existence of moments of order higher than 2, $S_n$ is asymptotically $N(0,1)$-distributed, i.e. $S_n$ converges in distribution to $S \sim N(0,1)$, and thus the characteristic function $\varphi_n$ of $S_n$ converges to $e^{-t^2/2}$, the characteristic function of the standard normal distribution, as $n$ tends to infinity. That is, as $n \to \infty$,
$$\varphi_n(t) = E(\exp(itS_n)) \longrightarrow E(\exp(itS)) = e^{-t^2/2} \quad\text{for } -\infty < t < \infty,$$
where $S \sim N(0,1)$.
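This convergence can be illustrated numerically. The sketch below assumes Exp(1) observations, for which $Y = X - 1$ has the closed-form characteristic function $\varphi_Y(t) = e^{-it}/(1-it)$; the sample sizes and the evaluation point $t = 1$ are illustrative choices of ours:

```python
import cmath

def phi_Y(t):
    # characteristic function of Y = X - 1 with X ~ Exp(1):
    # E exp(itX) = 1/(1 - it), shifted by the factor e^{-it}
    return cmath.exp(-1j * t) / (1 - 1j * t)

def phi_n(t, n):
    # characteristic function of S_n = n^{-1/2} * sum of n copies of Y
    return phi_Y(t / n ** 0.5) ** n

t = 1.0
target = cmath.exp(-t * t / 2)   # N(0,1) characteristic function at t
errors = [abs(phi_n(t, n) - target) for n in (10, 100, 1000)]
assert errors[0] > errors[1] > errors[2]   # error shrinks as n grows
```

The error shrinks roughly like $n^{-1/2}$, in line with the $n^{-1/2}\kappa_3$ term derived below.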

Recall that if $Y_1, \dots, Y_n$ are i.i.d. and $S_n = Y_1 + \dots + Y_n$ then $\varphi_{S_n}(t) = (\varphi_{Y_1}(t))^n$ (see Theorem 1.8 in Chapter 4 of [18]). Now, let $Y_i = (X_i - \mu)/\sigma$. Then $S_n = n^{1/2}(\bar X - \mu)/\sigma = n^{1/2}\frac{1}{n}\sum_{i=1}^n Y_i = n^{-1/2}\sum_{i=1}^n Y_i$ and thus
$$\varphi_n(t) = E(\exp(itS_n)) = E\Big(\exp\Big(itn^{-1/2}\sum_{i=1}^n Y_i\Big)\Big) = \big(\varphi_Y(tn^{-1/2})\big)^n.$$

Next we define the cumulant generating function for $Y$ as $\ln \varphi_Y(t)$. A Maclaurin expansion of $\ln \varphi_Y(t)$ shows that if $E|Y|^k < \infty$ then, after a rearrangement of the terms, we can write $\ln \varphi_Y(t)$ in the form
$$\ln \varphi_Y(t) = \sum_{j=1}^k \frac{(it)^j}{j!}\kappa_j + o(|t|^k) \quad\text{as } t \to 0$$
for some $\{\kappa_j\}$. We call the coefficients $\kappa_j$ the cumulants of $Y$; in particular $\kappa_j$ is the $j$:th cumulant. See Section 15.10 of [8] for details.

By Theorem 4.2 in Chapter 4 of [18] we have that $\varphi_Y(t) = 1 + \sum_{j=1}^k \frac{(it)^j}{j!}EY^j + o(|t|^k)$ as $t \to 0$ when $E|Y|^k < \infty$. If we for a moment don't worry about the existence of moments and convergence of the series we conclude that
$$\sum_{j=1}^\infty \frac{(it)^j}{j!}\kappa_j = \ln \varphi_Y(t) = \ln\Big(1 + \sum_{j=1}^\infty \frac{(it)^j}{j!}EY^j\Big)$$
and by looking at the Maclaurin expansion of the right-hand side (i.e. the Maclaurin expansion of $\ln(1+x)$ with $x$ replaced by $\sum EY^j (it)^j/j!$) it follows that
$$\sum_{j=1}^\infty \frac{(it)^j}{j!}\kappa_j = \sum_{k=1}^\infty (-1)^{k+1}\frac{1}{k}\Big(\sum_{j=1}^\infty \frac{(it)^j}{j!}EY^j\Big)^k.$$
Comparing the coefficients of $(it)^j$ we find that
$$\kappa_1 = EY = 0,$$
$$\kappa_2 = EY^2 - (EY)^2 = 1,$$
$$\kappa_3 = EY^3 - 3EY\,EY^2 + 2(EY)^3 = EY^3,$$
$$\kappa_4 = EY^4 - 3(EY^2)^2 - 4EY\,EY^3 + 12(EY)^2EY^2 - 6(EY)^4 = EY^4 - 3.$$
The expression for $\kappa_j$ holds whenever $E|Y|^j < \infty$. Note that the assumption $E|X|^j < \infty$ implies that $E|Y|^j < \infty$.
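The comparison of coefficients above can be packaged as a small function. The check against the standardized Exp(1) moments ($EY^3 = 2$, $EY^4 = 9$, so skewness 2 and kurtosis 6) is an illustrative example of ours, not taken from the thesis:

```python
def cumulants_from_moments(m1, m2, m3, m4):
    """First four cumulants from the raw moments EY, EY^2, EY^3, EY^4,
    using the formulas obtained by comparing coefficients of (it)^j."""
    k1 = m1
    k2 = m2 - m1 ** 2
    k3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    k4 = m4 - 3 * m2 ** 2 - 4 * m1 * m3 + 12 * m1 ** 2 * m2 - 6 * m1 ** 4
    return k1, k2, k3, k4

# Standardized Exp(1): Y = X - 1 has EY = 0, EY^2 = 1, EY^3 = 2, EY^4 = 9,
# so the cumulants should be (0, 1, 2, 6): skewness 2 and kurtosis 6.
assert cumulants_from_moments(0, 1, 2, 9) == (0, 1, 2, 6)
```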

Returning our attention to $S_n$, the relation $\varphi_n(t) = (\varphi_Y(tn^{-1/2}))^n$ and the fact that $\kappa_1 = 0$ and $\kappa_2 = 1$ now give us that
$$\varphi_n(t) = \bigg(\exp\Big(\sum_{j=1}^\infty \frac{(itn^{-1/2})^j}{j!}\kappa_j\Big)\bigg)^n = \exp\Big(\sum_{j=1}^\infty n^{-(j-2)/2}\frac{(it)^j}{j!}\kappa_j\Big)$$
$$= \exp\Big(-\frac{1}{2}t^2 + n^{-1/2}\frac{1}{3!}\kappa_3(it)^3 + \dots + n^{-(j-2)/2}\frac{1}{j!}\kappa_j(it)^j + \dots\Big)$$
$$= e^{-t^2/2}\exp\Big(n^{-1/2}\frac{1}{3!}\kappa_3(it)^3 + \dots + n^{-(j-2)/2}\frac{1}{j!}\kappa_j(it)^j + \dots\Big)$$
$$= e^{-t^2/2}\big(1 + n^{-1/2}r_1(it) + n^{-1}r_2(it) + \dots + n^{-j/2}r_j(it) + \dots\big),$$
where the last equality is obtained through the Maclaurin expansion $e^x = 1 + x + x^2/2! + \dots$

$r_j$ is a polynomial of degree $3j$ that depends on $\kappa_3, \dots, \kappa_{j+2}$. By comparing the coefficients of $n^{-j/2}$ in the last two lines we find that
$$r_1(x) = \frac{1}{6}\kappa_3 x^3 \quad\text{and}\quad r_2(x) = \frac{1}{24}\kappa_4 x^4 + \frac{1}{72}\kappa_3^2 x^6.$$
We can rewrite the expression for $\varphi_n(t)$ above as
$$\varphi_n(t) = e^{-t^2/2} + n^{-1/2}r_1(it)e^{-t^2/2} + n^{-1}r_2(it)e^{-t^2/2} + \dots + n^{-j/2}r_j(it)e^{-t^2/2} + \dots \tag{5}$$


Now, since $\varphi_n(t) = \int_{-\infty}^\infty e^{itx}\,dP(S_n \le x)$ and $e^{-t^2/2} = \int_{-\infty}^\infty e^{itx}\,d\Phi(x)$, it seems plausible that there is an inversion of (5) of the form
$$P(S_n \le x) = \Phi(x) + n^{-1/2}R_1(x) + \dots + n^{-j/2}R_j(x) + \dots$$
where $R_k$ is a function such that $\int_{-\infty}^\infty e^{itx}\,dR_k(x) = r_k(it)e^{-t^2/2}$, so that
$$\varphi_n(t) = \int_{-\infty}^\infty e^{itx}\,dP(S_n \le x) = \int_{-\infty}^\infty e^{itx}\,d\Phi(x) + n^{-1/2}\int_{-\infty}^\infty e^{itx}\,dR_1(x) + \dots = (5).$$
We would thus like to try to find such $R_k$. By repeating integration by parts $j$ times we find that
$$e^{-t^2/2} = \int_{-\infty}^\infty e^{itx}\,d\Phi(x) = (-it)^{-1}\int_{-\infty}^\infty e^{itx}\,d\Phi^{(1)}(x) = \dots = (-it)^{-j}\int_{-\infty}^\infty e^{itx}\,d\Phi^{(j)}(x)$$
where $\Phi^{(k)}(x) = \frac{d^k}{dx^k}\Phi(x) = \big(\frac{d}{dx}\big)^k\Phi(x) = D^k\Phi(x)$. Hence $\int_{-\infty}^\infty e^{itx}\,d\big((-D)^k\Phi(x)\big) = (it)^k e^{-t^2/2}$. Interpreting $r_k(-D)$ as a polynomial in $D$, making $r_k(-D)$ a differential operator, we thus have that $\int_{-\infty}^\infty e^{itx}\,d\big(r_k(-D)\Phi(x)\big) = r_k(it)e^{-t^2/2}$. Thus $R_k(x) = r_k(-D)\Phi(x)$.

By differentiating $\Phi(x)$ we find that, for $k \ge 1$, $(-D)^k\Phi(x) = -H_{k-1}(x)\phi(x)$, where $\phi(x)$ is the density function of the standard normal distribution and the $H_k$ are the Hermite polynomials:
$$H_0(x) = 1, \quad H_1(x) = x, \quad H_2(x) = x^2 - 1, \quad H_3(x) = x(x^2 - 3),$$
$$H_4(x) = x^4 - 6x^2 + 3, \quad H_5(x) = x(x^4 - 10x^2 + 15), \quad\dots$$
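The listed polynomials are the probabilists' Hermite polynomials and satisfy the recurrence $H_{k+1}(x) = xH_k(x) - kH_{k-1}(x)$ (a standard fact, not stated in the thesis). A minimal sketch verifying the entries above; the function name and test point are ours:

```python
def hermite(k, x):
    """Probabilists' Hermite polynomial H_k(x) via the recurrence
    H_{k+1}(x) = x*H_k(x) - k*H_{k-1}(x), with H_0 = 1 and H_1 = x."""
    h_prev, h = 1.0, x
    if k == 0:
        return h_prev
    for j in range(1, k):
        h_prev, h = h, x * h - j * h_prev
    return h

x = 2.0
assert hermite(2, x) == x**2 - 1
assert hermite(3, x) == x * (x**2 - 3)
assert hermite(4, x) == x**4 - 6 * x**2 + 3
assert hermite(5, x) == x * (x**4 - 10 * x**2 + 15)
```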

We concluded above that $r_1(x) = \frac{1}{6}\kappa_3 x^3$ and $r_2(x) = \frac{1}{24}\kappa_4 x^4 + \frac{1}{72}\kappa_3^2 x^6$, and since $R_k(x) = r_k(-D)\Phi(x)$ we thus have that
$$R_1(x) = \frac{1}{6}\kappa_3(-D)^3\Phi(x) = -\frac{1}{6}\kappa_3 H_2(x)\phi(x) = -\frac{1}{6}\kappa_3(x^2 - 1)\phi(x)$$
and
$$R_2(x) = -\frac{1}{24}\kappa_4 H_3(x)\phi(x) - \frac{1}{72}\kappa_3^2 H_5(x)\phi(x) = -x\Big(\frac{1}{24}\kappa_4(x^2 - 3) + \frac{1}{72}\kappa_3^2(x^4 - 10x^2 + 15)\Big)\phi(x).$$


Thus
$$P(S_n \le x) = \Phi(x) + n^{-1/2}R_1(x) + \dots + n^{-j/2}R_j(x) + \dots$$
$$= \Phi(x) + n^{-1/2}p_1(x)\phi(x) + n^{-1}p_2(x)\phi(x) + \dots + n^{-j/2}p_j(x)\phi(x) + \dots \tag{6}$$
with
$$p_1(x) = -\frac{1}{6}\kappa_3(x^2 - 1) \quad\text{and}\quad p_2(x) = -x\Big(\frac{1}{24}\kappa_4(x^2 - 3) + \frac{1}{72}\kappa_3^2(x^4 - 10x^2 + 15)\Big).$$
$\kappa_3$ is the skewness of $X$ and $\kappa_4$ the kurtosis.

It can be shown (see Section 2.4 of [22]) that the inversion of (5) leading to (6) is valid when $X$ is nonsingular and $E(|X|^{j+2}) < \infty$ if we limit the series to $j$ terms:
$$P(S_n \le x) = \Phi(x) + n^{-1/2}R_1(x) + \dots + n^{-j/2}R_j(x) + o(n^{-j/2})$$
$$= \Phi(x) + n^{-1/2}p_1(x)\phi(x) + n^{-1}p_2(x)\phi(x) + \dots + n^{-j/2}p_j(x)\phi(x) + o(n^{-j/2}). \tag{7}$$
This completes the proof of half of Corollary 1.

If $X \sim N(\mu, \sigma^2)$ then $S_n \sim N(0,1)$, so we would expect $p_j$ to be 0 for all $j$. This is indeed the case, since $\kappa_j = 0$ for $j \ge 3$ for the standard normal distribution. Thus the "expansion" still holds, with $P(S_n \le x) = \Phi(x)$.

It is of interest to note that both the skewness and the kurtosis are scale and translation invariant, so that the third and fourth cumulants of $X$ and $Y$ coincide (we use $Y$ in the calculations above because standardized variables are easier to handle). It can be shown, by looking at characteristic functions or by straightforward calculation (as is done in Appendix A), that the skewness of $\bar X$ is $n^{-1/2}$ times the skewness of $X$. Similarly the kurtosis of $\bar X$ is $n^{-1}$ times the kurtosis of $X$, and in general, for $j \ge 2$, the $j$:th cumulant of $\bar X$ is the $j$:th cumulant of $X$ times $n^{-(j-2)/2}$. Thus we can view the factors $n^{-(j-2)/2}$ in the Edgeworth expansion for $S_n$ as coming from the cumulants of $\bar X$.
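The $n^{-1/2}$ scaling of the skewness of $\bar X$ is easy to check by simulation. The Exp(1) example below (skewness 2, so $\mathrm{Skew}(\bar X) \approx 2/\sqrt{n}$) and the sample sizes are illustrative assumptions of ours:

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 10, 400_000

def sample_skewness(z):
    # moment estimator of the skewness E(Z - EZ)^3 / Var(Z)^{3/2}
    z = z - z.mean()
    return (z ** 3).mean() / (z ** 2).mean() ** 1.5

# Exp(1) has skewness 2, so the mean of n observations should have
# skewness close to 2 / sqrt(n)
means = rng.exponential(1.0, size=(reps, n)).mean(axis=1)
assert abs(sample_skewness(means) - 2 / np.sqrt(n)) < 0.05
```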

2.2.2 Edgeworth expansions for more general statistics

The procedure for finding the Edgeworth expansion for more general statistics $A_n$ is essentially the same as that in the previous section. We briefly mention the result. Let $A_n$ be either $A_S = n^{1/2}(\hat\theta - \theta)/\sigma$ or $A_T = n^{1/2}(\hat\theta - \theta)/\hat\sigma$, where $\theta$ is some unknown scalar parameter, $\hat\theta$ is an asymptotically unbiased estimator of $\theta$, $\sigma^2$ is the asymptotic variance of $n^{1/2}\hat\theta$ and $\hat\sigma^2$ is some consistent estimator of $\sigma^2$.

Denote by $\kappa_{j,n}$ the $j$:th cumulant of $A_n$. Under the regularity conditions stated in Theorem 2, for $j \ge 1$, we can expand $\kappa_{j,n}$ as
$$\kappa_{j,n} = n^{-(j-2)/2}(k_{j,1} + n^{-1}k_{j,2} + n^{-2}k_{j,3} + \dots)$$


for some $k_{j,i}$, where $k_{1,1} = 0$ and $k_{2,1} = 1$. It can be shown, through calculations that are completely analogous to the $S_n$ case, where the cumulants of $A_n$ are replaced by their expansions, that the first two polynomials in the Edgeworth expansion for $A_n$ are
$$a_1(x) = -\Big(k_{1,2} + \frac{1}{6}k_{3,1}H_2(x)\Big) = -\Big(k_{1,2} + \frac{1}{6}k_{3,1}(x^2 - 1)\Big) \tag{8}$$
and
$$a_2(x) = -\Big(\frac{1}{2}(k_{2,2} + k_{1,2}^2)H_1(x) + \frac{1}{24}(k_{4,1} + 4k_{1,2}k_{3,1})H_3(x) + \frac{1}{72}k_{3,1}^2 H_5(x)\Big)$$
$$= -x\Big(\frac{1}{2}(k_{2,2} + k_{1,2}^2) + \frac{1}{24}(k_{4,1} + 4k_{1,2}k_{3,1})(x^2 - 3) + \frac{1}{72}k_{3,1}^2(x^4 - 10x^2 + 15)\Big). \tag{9}$$

As before the inversion is valid, giving the truncated expansion
$$P(A_n \le x) = \Phi(x) + n^{-1/2}a_1(x)\phi(x) + \dots + n^{-j/2}a_j(x)\phi(x) + o(n^{-j/2})$$
when $E|X|^{j+2} < \infty$ and $X$ satisfies Cramér's condition.

Thus the problem of finding the Edgeworth expansion for $A_n$ of the form $n^{1/2}(\hat\theta - \theta)/\sigma$ or $n^{1/2}(\hat\theta - \theta)/\hat\sigma$ amounts to finding the terms $k_{j,i}$ in the expansion
$$\kappa_{j,n} = n^{-(j-2)/2}(k_{j,1} + n^{-1}k_{j,2} + n^{-2}k_{j,3} + \dots)$$
of the cumulants of $A_n$.

As we discussed in the previous section, if $A_n = S_n$ then $\kappa_{j,n} = n^{-(j-2)/2}\kappa_j$ for $j \ge 2$, where $\kappa_j$ is the $j$:th cumulant of $Y = (X - \mu)/\sigma$. Thus $k_{j,1} = \kappa_j$ and $k_{j,i} = 0$ for $i \ge 2$. This reduces the expressions for $a_1$ and $a_2$ above to those for $p_1$ and $p_2$ in (6).

2.2.3 Edgeworth expansion for $T_n$

Finally, consider the statistic $T_n = n^{1/2}(\bar X - \mu)/\hat\sigma$ where $\hat\sigma^2 = n^{-1}\sum(X_i - \bar X)^2$. It can be shown that the $k_{j,i}$ in the expansion of the cumulants of $T_n$ are
$$k_{1,2} = -\frac{1}{2}\gamma, \quad k_{2,2} = \frac{1}{4}(7\gamma^2 + 12), \quad k_{3,1} = -2\gamma \quad\text{and}\quad k_{4,1} = 12\gamma^2 - 2\kappa + 6.$$
Inserting these into (8) and (9) we get
$$q_1(x) = a_1(x) = -\Big(-\frac{1}{2}\gamma - \frac{2}{6}\gamma(x^2 - 1)\Big) = \frac{1}{6}\gamma(2x^2 + 1)$$


and
$$q_2(x) = a_2(x) = -x\bigg(\frac{1}{2}\Big(\frac{1}{4}(7\gamma^2 + 12) + \big(-\tfrac{1}{2}\gamma\big)^2\Big) + \frac{1}{24}\Big(12\gamma^2 - 2\kappa + 6 + 4\big(-\tfrac{1}{2}\gamma\big)(-2\gamma)\Big)(x^2 - 3) + \frac{1}{72}(-2\gamma)^2(x^4 - 10x^2 + 15)\bigg)$$
$$= x\Big(\frac{1}{12}\kappa(x^2 - 3) - \frac{1}{18}\gamma^2(x^4 + 2x^2 - 3) - \frac{1}{4}(x^2 + 3)\Big).$$

We require that $EX^4 < \infty$ and that $X$ satisfies Cramér's condition for the expansion
$$P(T_n \le x) = \Phi(x) + n^{-1/2}q_1(x)\phi(x) + n^{-1}q_2(x)\phi(x) + o(n^{-1})$$
to hold.
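That (8) with these $k_{j,i}$ collapses to the closed form for $q_1$ can be verified mechanically. The following sketch (function names and test values are ours) simply re-does the algebra numerically:

```python
def a1(x, k12, k31):
    # first Edgeworth polynomial from Corollary 1 / equation (8)
    return -(k12 + k31 * (x * x - 1) / 6.0)

def q1(x, gamma):
    # claimed closed form for the studentized mean T_n
    return gamma * (2 * x * x + 1) / 6.0

# with k_{1,2} = -gamma/2 and k_{3,1} = -2*gamma, (8) collapses to q1
for gamma in (0.5, 1.0, 2.0):
    for x in (-2.0, 0.0, 1.3, 3.0):
        assert abs(a1(x, -gamma / 2, -2 * gamma) - q1(x, gamma)) < 1e-12
```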

2.2.4 Some remarks; skewness correction

We have seen above that for both $S_n$ and $T_n$ the first polynomial $a_1$ depends on the skewness $\kappa_3 = \gamma = E(X-\mu)^3/\sigma^3$ and that the second polynomial $a_2$ depends on $\gamma^2$ and the kurtosis $\kappa_4 = \kappa = E(X-\mu)^4/\sigma^4 - 3$. This is also true for many more general statistics $A_n$. In such cases, $a_1$ is said to describe the primary effect of skewness, while $a_2$ is said to describe the primary effect of kurtosis and the secondary effect of skewness.

A skewness corrected statistic is thus a statistic that has been modified in some way so that a1 = 0.

2.3 Cornish-Fisher expansions for quantiles

An interesting use of the Edgeworth expansion is asymptotic expansions of the quantiles of $A_n$, obtained by what is essentially an inversion of the Edgeworth expansion. Such expansions are called Cornish-Fisher expansions and first appeared in [6] and [17].

Let $A_n$ be a statistic with the Edgeworth expansion
$$P(A_n \le x) = \Phi(x) + n^{-1/2}a_1(x)\phi(x) + \dots + n^{-j/2}a_j(x)\phi(x) + \dots \tag{10}$$
and let $v_\alpha$ be the $\alpha$-quantile of $A_n$, so that $P(A_n \le v_\alpha) = \alpha$. Furthermore, let $\lambda_\alpha$ be the $\alpha$-quantile of the $N(0,1)$-distribution, i.e. let $\Phi(\lambda_\alpha) = \alpha$. Then there exists an expansion of $v_\alpha$ in terms of $\lambda_\alpha$:
$$v_\alpha = \lambda_\alpha + n^{-1/2}s_1(\lambda_\alpha) + n^{-1}s_2(\lambda_\alpha) + \dots + n^{-j/2}s_j(\lambda_\alpha) + \dots \tag{11}$$
(11) is called the Cornish-Fisher expansion of $v_\alpha$.

The functions $s_k$ are polynomials of degree at most $k+1$, odd for even $k$ and even for odd $k$, that depend on cumulants of order at most $k+2$. They are determined by the polynomials $a_k$ in (10).

[22] contains a short introduction to the Cornish-Fisher expansion, where it is shown that $s_1(x) = -a_1(x)$ and $s_2(x) = a_1(x)a_1'(x) - \frac{1}{2}x\,a_1(x)^2 - a_2(x)$.
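As an illustrative check of the one-term Cornish-Fisher correction $v_\alpha \approx \lambda_\alpha + n^{-1/2}s_1(\lambda_\alpha)$ for $S_n$: with Exp(1) data (our assumption, so $\gamma = 2$ and $\mu = \sigma = 1$), the corrected quantile should land closer to a Monte Carlo quantile of $S_n$ than the plain normal quantile does:

```python
import numpy as np
from math import sqrt

rng = np.random.default_rng(2)
n, reps, alpha = 20, 200_000, 0.95
gamma = 2.0                     # skewness of Exp(1)
lam = 1.6448536269514722        # lambda_0.95, the N(0,1) 0.95-quantile

def s1(x, gamma):
    # s_1 = -p_1 for the standardized mean S_n
    return gamma * (x * x - 1) / 6.0

# One-term Cornish-Fisher approximation of the 0.95-quantile of S_n
cf_quantile = lam + s1(lam, gamma) / sqrt(n)

# Monte Carlo quantile of S_n for Exp(1) data
s_n = sqrt(n) * (rng.exponential(1.0, size=(reps, n)).mean(axis=1) - 1.0)
mc_quantile = np.quantile(s_n, alpha)

# The corrected quantile should beat the plain normal quantile
assert abs(cf_quantile - mc_quantile) < abs(lam - mc_quantile)
```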


3 Methods for skewness correction

We discuss the need for skewness correction. Some methods for obtaining second order correct confidence intervals are described. We assume throughout the section that $X$ is absolutely continuous.

3.1 Coverages of confidence intervals

Definition 2. Let $I_\theta(\alpha)$ be a confidence interval for an unknown parameter $\theta$ with approximate confidence level $\alpha$. We call $\alpha$ the nominal coverage of $I_\theta(\alpha)$. Furthermore, we call $\alpha_0 = P(\theta \in I_\theta(\alpha))$ the actual coverage of the confidence interval and define the coverage error as the difference between the actual coverage and the nominal coverage, $\alpha_0 - \alpha$.

If Iθ(α) has been reasonably constructed then this difference will converge to zero as the sample size n increases. We will illustrate how the Edgeworth expansion can be used to estimate the order of coverage errors.

Let $X_1, \dots, X_n$ be an i.i.d. sample from a distribution, with mean $\mu$ and variance $\sigma^2$, such that the Edgeworth expansions for $S_n$ and $T_n$ exist. Consider the two-sided confidence interval $J_\mu(\alpha) = (\bar X - n^{-1/2}\sigma z_\alpha,\ \bar X + n^{-1/2}\sigma z_\alpha)$ where $P(|Z| \le z_\alpha) = \alpha$ when $Z \sim N(0,1)$. We find that
$$P(\mu \in J_\mu(\alpha)) = P(S_n > -z_\alpha) - P(S_n > z_\alpha) = P(S_n \le z_\alpha) - P(S_n \le -z_\alpha)$$
$$= \Phi(z_\alpha) - \Phi(-z_\alpha) + n^{-1/2}\big(p_1(z_\alpha)\phi(z_\alpha) - p_1(-z_\alpha)\phi(-z_\alpha)\big) + n^{-1}\big(p_2(z_\alpha)\phi(z_\alpha) - p_2(-z_\alpha)\phi(-z_\alpha)\big)$$
$$+ n^{-3/2}\big(p_3(z_\alpha)\phi(z_\alpha) - p_3(-z_\alpha)\phi(-z_\alpha)\big) + o(n^{-3/2}) = \alpha + 2n^{-1}p_2(z_\alpha)\phi(z_\alpha) + o(n^{-3/2}).$$
The last equality follows since $p_2$ is odd and $p_1$, $p_3$ and $\phi$ are even. The same result holds if we have $\hat\sigma$ instead of $\sigma$, with $q_2$ instead of $p_2$. Thus the coverage error of two-sided normal approximation confidence intervals is of order $n^{-1}$. In some sense we can think of the two-sided confidence intervals as containing an implicit skewness correction.

The situation is not as good for one-sided confidence intervals. Consider the one-sided normal approximation confidence interval $I_\mu(\alpha) = (-\infty,\ \bar X + n^{-1/2}\sigma\lambda_\alpha)$, where $\Phi(\lambda_\alpha) = \alpha$. The coverage of $I_\mu(\alpha)$ is
$$P(\mu \in I_\mu(\alpha)) = P(\mu \le \bar X + n^{-1/2}\sigma\lambda_\alpha) = P(S_n \ge -\lambda_\alpha)$$
$$= 1 - \big(\Phi(-\lambda_\alpha) + n^{-1/2}p_1(-\lambda_\alpha)\phi(-\lambda_\alpha) + o(n^{-1/2})\big) = \alpha - n^{-1/2}p_1(\lambda_\alpha)\phi(\lambda_\alpha) + o(n^{-1/2}).$$


The coverage of the interval $I_\mu'(\alpha) = (-\infty,\ \bar X + n^{-1/2}\hat\sigma\lambda_\alpha)$ is analogously found to be $\alpha - n^{-1/2}q_1(\lambda_\alpha)\phi(\lambda_\alpha) + o(n^{-1/2})$.

Thus, for one-sided confidence intervals, normal approximation gives a coverage error of order $n^{-1/2}$.
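This $n^{-1/2}$ coverage error is visible in simulation. The sketch below assumes Exp(1) data with known $\mu = \sigma = 1$ and $\gamma = 2$ (an illustrative choice of ours, not the thesis's simulation setup) and compares the Monte Carlo coverage of $I_\mu(\alpha)$ with the Edgeworth prediction $\alpha - n^{-1/2}p_1(\lambda_\alpha)\phi(\lambda_\alpha)$:

```python
import numpy as np
from math import exp, pi, sqrt

rng = np.random.default_rng(3)
n, reps, alpha = 20, 200_000, 0.95
mu, sigma, gamma = 1.0, 1.0, 2.0   # Exp(1) parameters
lam = 1.6448536269514722           # lambda_0.95

def phi(x):
    return exp(-x * x / 2) / sqrt(2 * pi)

def p1(x):
    return -gamma * (x * x - 1) / 6.0

# Actual coverage of I_mu(alpha) = (-inf, Xbar + n^{-1/2} sigma lambda_alpha)
xbar = rng.exponential(1.0, size=(reps, n)).mean(axis=1)
coverage = (mu <= xbar + sigma * lam / sqrt(n)).mean()

# Theory: coverage = alpha - n^{-1/2} p1(lambda) phi(lambda) + o(n^{-1/2})
predicted = alpha - p1(lam) * phi(lam) / sqrt(n)

# The Edgeworth prediction should be closer to the truth than alpha itself
assert abs(coverage - predicted) < abs(coverage - alpha)
```

For right-skewed data the upper one-sided interval over-covers: here both the simulated and the predicted coverage land above 0.96 at nominal level 0.95.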

The polynomials $p_1$ and $q_1$ both contain the skewness $\gamma$. Thus we see that the skewness of $X$ affects the actual coverage of the normal approximation confidence intervals. In particular, when the skewness is zero the $n^{-1/2}$ term of the coverage error disappears, and when the skewness is large the coverage error might be large. When $X$ is skew it is possible to obtain confidence intervals with better coverage by correcting for skewness. Some methods for this are presented next.

We will assume that the variance $\sigma^2$ and the skewness $\gamma$ are known. It might seem like a bit of a contradiction that the second and third central moments are known, but not the mean. An example where such a situation could occur is when a measuring instrument that has been used long enough for the variance and skewness of its measurements to be known is used to measure something that has not been measured before. One could of course argue that in that case the distribution, or at least the quantiles, of the measurement errors might be known as well and that a parametric confidence interval would make more sense. Let us however assume that the quantiles are unknown and that the density function is unknown or too complicated to work with for such procedures to be fruitful.

3.2 Edgeworth correction

In many cases we wish to derive a confidence interval using some statistic $A_n$. In cases where the Edgeworth expansion for $A_n$ is known, we can obtain confidence intervals with a coverage error of smaller order than that of the normal approximation interval. In particular, we can make an explicit correction for skewness using the following theorem, various versions of which were proved in [27], [19], [29] and [1].

Theorem 3. Let $A_n$ be either $S_n = n^{1/2}(\bar X - \mu)/\sigma$ or $T_n = n^{1/2}(\bar X - \mu)/\hat\sigma$ and assume that $A_n$ admits the Edgeworth expansion
$$P(A_n \le x) = \Phi(x) + n^{-1/2}a_1(x)\phi(x) + o(n^{-1/2}).$$
Then
$$P\big(A_n \le x - n^{-1/2}a_1(x)\big) = \Phi(x) + o(n^{-1/2}) \tag{12}$$
and
$$P\big(A_n \le x - n^{-1/2}\hat a_1(x)\big) = \Phi(x) + o(n^{-1/2})$$
where $\hat a_1$ is the polynomial $a_1$ with population moments replaced by sample moments.
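Applying (12) at $x = -\lambda_\alpha$ (and using that $a_1$ is even) suggests the corrected one-sided upper limit $\bar X + n^{-1/2}\sigma\big(\lambda_\alpha + n^{-1/2}a_1(\lambda_\alpha)\big)$ when $\sigma$ and $\gamma$ are known. A Monte Carlo sketch under illustrative Exp(1) assumptions (our choice of distribution and sample size, not the thesis's simulation study):

```python
import numpy as np
from math import sqrt

rng = np.random.default_rng(4)
n, reps, alpha = 20, 200_000, 0.95
mu, sigma, gamma = 1.0, 1.0, 2.0   # Exp(1): known sigma and skewness
lam = 1.6448536269514722           # lambda_0.95

def a1(x):
    # a_1 = p_1 for the standardized mean S_n
    return -gamma * (x * x - 1) / 6.0

xbar = rng.exponential(1.0, size=(reps, n)).mean(axis=1)

# Normal-approximation upper limit vs Edgeworth-corrected upper limit,
# the latter using the shifted critical point lam + n^{-1/2} a_1(lam)
upper_normal = xbar + sigma * lam / sqrt(n)
upper_corrected = xbar + sigma * (lam + a1(lam) / sqrt(n)) / sqrt(n)

cov_normal = (mu <= upper_normal).mean()
cov_corrected = (mu <= upper_corrected).mean()

# The corrected interval should have coverage closer to the nominal level
assert abs(cov_corrected - alpha) < abs(cov_normal - alpha)
```

In this setup the uncorrected coverage sits near 0.966 while the corrected one lands within a few thousandths of 0.95, illustrating the removal of the $n^{-1/2}$ term.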
