
Örebro University

Örebro University School of Business
Statistics, Paper, Second level, 15 Credits
Supervisor: Sune Karlsson
Examiner: Thomas Laitila
Spring 2014

Another Student’s T-test

Proposal and evaluation of a modified T-test

Jonas Englund 880131


Abstract

In this paper we propose a way of performing hypothesis tests that utilizes all information that is known under the null hypothesis. W.S. Gosset, also known as Student, derived the famous Student's T-test in the early twentieth century, and that is where this paper departs from. It turns out that when using Student's T-test we are not using all the information that is available under the null hypothesis. By using all known information we can get a better variance estimator than the usual one. The test based on this variance estimator is in this paper called Another Student's Test (AST). The test is evaluated by simulation and compared to Student's T-test. The conclusion we arrived at was that AST and Student's T cannot be said to perform differently, at least not under the settings used in this paper. Despite this, further analysis should be carried out to investigate AST, and a couple of situations where the ideas can be used are proposed.


Table of contents

1 Introduction
2 Hypothesis testing in general
2.1 Test evaluation
3 Student's T-test
4 Another Student's T-test
4.1 Proof of $\tilde{\sigma}^2$ as a superior variance estimator under the null
4.2 $\tilde{\sigma}^2$ as a maximum likelihood estimator
4.3 Formal derivation of the $\tilde{T}$-statistic
5 Probability distribution of $\tilde{T}$
5.1 Probability distribution of $\tilde{T}$ under $H_0$
5.2 Probability distribution of $\tilde{T}$ under $H_1$
6 Evaluation
6.1 Power estimation
6.2 Graphical evaluation
6.3 Non-graphical evaluation
6.4 Analysis
6.5 Summary of evaluation
7 Extensions of the test
8 Summary and conclusions
References
Appendix A: Review of method of evaluation
A.1 Code in R
Appendix B: Lyapunov's CLT
B.1 Code in R
Appendix C: More graphs


1 Introduction

In this paper we introduce a variant of Student's T-test which we call Another Student's Test (AST). The rationale behind this test is thoroughly reviewed and explained. The characteristics of the test are examined, such as its probability distribution under both the null and the alternative. Critical values are derived via Monte Carlo simulation, and power in various situations is also estimated via simulation. The test is evaluated by estimating its power in various situations and comparing it with the power of the one-sample Student's T-test. Since Student's T-test is the gold standard for testing whether the mean of a single group is equal to some constant, given that the assumption of normality is met, we also test the hypothesis of no difference between the tests. We begin with an introduction to hypothesis testing, followed by an introduction to Student's T-test and then the AST. This is followed by an evaluation of the tests, that is, a comparison between the tests' performances in terms of power, and then extensions of the test are proposed. The purpose of this paper is simply to examine AST, evaluate it and test whether it performs as well as Student's T-test, all under the assumption of a normally distributed variable.

2 Hypothesis testing in general

“A hypothesis is a statement about a population parameter”, as Casella & Berger (2002, p. 373) express it. In most cases we only have sample data, and the aim of a hypothesis test is to get an indication of whether the null can be rejected or not. In a hypothesis testing situation, a null and an alternative hypothesis are predetermined before the test is carried out. The null hypothesis can, in mathematical notation, be expressed as

$$H_0: \theta \in \Theta_0$$

where the alternative hypothesis is, in general, the complement of the null. When performing hypothesis tests we assume that the null is true and evaluate whether the result we got is probable. In other words: we calculate or estimate the probability, given that the null is true, of attaining a result at least as extreme as the one we got. There are two types of errors that can occur when carrying out hypothesis tests. A type I error occurs when the test tells us to reject the null when the null is true (the probability of this occurrence when the null is true is denoted $\alpha$), and a type II error occurs when the test tells us not to reject the null when it is false (probability denoted $\beta$). Then there are two other possibilities: not rejecting the null when the null is true; and correctly rejecting the null when the null is false, the probability of which is the power of the test (often denoted $1-\beta$). The larger the probability of correctly rejecting the null, given some level of significance, the better the test (Casella & Berger, 2002).


2.1 Test evaluation

There are many ways of evaluating tests. In this paper we evaluate whether AST is as powerful as Student's T-test, which is carried out via simulation-based methods. See section 6 for more on this topic.

3 Student's T-test

Gosset, who published under the pseudonym Student, was interested in the behaviour of the probability distribution of the T-statistic in small samples. Early in the twentieth century many statisticians did not distinguish between the true population variance, $\sigma^2$, and the estimated variance, $s^2$. Gosset worked as a brewer in the research department of the Guinness brewery, and the research done there often relied on small samples. So, Gosset started his work on Student's T-test with some help from another famous statistician, namely Ronald Fisher (Box, 1987).

Gosset and his team at the brewery ran experiments, and when they treated the sample variance as if it were the population variance they found that the results were not trustworthy, which in turn led him to dig into the derivation of Student's T-test. Gosset derived the distribution of the following statistic

$$T = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

which he found to have the following probability density function

$$f(t) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)}\left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}$$

where $\Gamma$ denotes the gamma function and $\nu$ the degrees of freedom. This finding made it possible to test hypotheses reliably in small samples (Box, 1987; Casella & Berger, 2002).

4 Another Student's T-test

This test is a modified one-sample Student's T-test. An assumption used throughout this section is that $X$ is normally distributed and that the sampled observations are independent and identically distributed (i.i.d.). The ordinary T-test has the following form (Casella & Berger, 2002)

$$T = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \qquad (1)$$

where $\bar{x}$ is the sample mean, $\mu_0$ is the hypothesized population mean or expected value and

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2.$$

The hypothesis in such a test is, in the simplest case, stated as

$$H_0: \mu = \mu_0 \quad \text{vs.} \quad H_1: \mu \neq \mu_0. \qquad (2)$$

The test statistic in (1) follows a central t-distribution with $n-1$ degrees of freedom if $X$ is normally distributed (Casella & Berger, 2002). Below, a modified T-test is introduced that uses a variance estimator which is superior to $s^2$ under the null hypothesis,

$$\tilde{T} = \frac{\bar{x} - \mu_0}{\tilde{\sigma}/\sqrt{n}} \qquad (3)$$

where

$$\tilde{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu_0)^2.$$

The logical foundation for this test is somewhat similar to that of the score test for a binomially distributed variable, for which the test statistic looks like

$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} \qquad (4)$$

and is asymptotically standard normal (Casella & Berger, 2002). In this test, the standard error (see the denominator above) of $\hat{p}$ is a function of $p_0$, which is equal to $p$ if the null is true. By utilizing the same idea we can make use of the same methodology as in (4) also in the case of a normally distributed variable $X$. We begin by showing that the usual variance estimator, $s^2$, is an inferior estimator of $\sigma^2$ when the null is true, compared to $\tilde{\sigma}^2$. The usual variance estimator $s^2$ has expected value $\sigma^2$ and variance $2\sigma^4/(n-1)$ (Wackerly, Mendenhall & Scheaffer, 2002). Ghosh (1979)


gives a thorough review of the score statistic described in (4). He provides evidence that the power functions of the score test and the usual approximate Z-test cross one another; we therefore hypothesize that the test based on $\tilde{T}$ likewise is more powerful than Student's T-test in some regions. In order to use this modified T-test we need to find the distribution of $\tilde{T}$, or at least its critical value at level $\alpha$ for a given sample size.

In fact, this test is more rational than Student's T-test since it utilizes more information: information that is assumed known under the null hypothesis (not truly known, of course, since then we would not have to perform a test, but known under the null, which we assume is true).
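As a concrete illustration of the two statistics, here is a minimal Python sketch (the paper's own code is in R; the function names `student_t` and `ast` are ours) of the ordinary T statistic in (1) and the AST statistic in (3):

```python
import numpy as np

def student_t(x, mu0):
    """Ordinary one-sample T statistic, eq. (1): variance s^2 with divisor n-1."""
    n = len(x)
    s2 = np.sum((x - x.mean()) ** 2) / (n - 1)
    return (x.mean() - mu0) / np.sqrt(s2 / n)

def ast(x, mu0):
    """AST statistic, eq. (3): variance estimated around mu0 with divisor n."""
    n = len(x)
    sigma2_tilde = np.sum((x - mu0) ** 2) / n
    return (x.mean() - mu0) / np.sqrt(sigma2_tilde / n)

x = np.array([1.0, 2.0, 3.0])
print(student_t(x, 0.0))   # 2*sqrt(3), about 3.464
print(ast(x, 0.0))         # 6/sqrt(14), about 1.604
```

Note that $\tilde{\sigma}^2$ spreads the sum of squares around $\mu_0$ rather than around $\bar{x}$, so the two statistics coincide in sign but not in magnitude.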

4.1 Proof of $\tilde{\sigma}^2$ as a superior variance estimator under the null

To begin with, the following moments have to be established in order to complete the proof¹. Let $X \sim N(\mu, \sigma^2)$ and write $Z = X - \mu$; then

$$E[Z] = 0, \quad E[Z^2] = \sigma^2, \quad E[Z^3] = 0, \quad E[Z^4] = 3\sigma^4.$$

The following general result is also used to establish the proof:

$$\operatorname{Var}[Y] = E[Y^2] - (E[Y])^2.$$

Now we can start by giving a proof of this variance estimator's unbiasedness under the null,

$$E[\tilde{\sigma}^2] = \frac{1}{n}\sum_{i=1}^{n} E[(X_i - \mu_0)^2] = E[(Z + (\mu - \mu_0))^2] = \sigma^2 + (\mu - \mu_0)^2.$$

We can now see that, under $H_0$ ($\mu = \mu_0$), this is equal to $\sigma^2$, and from this it follows that $E[\tilde{\sigma}^2 \mid H_0] = \sigma^2$. So, when the null is true this variance estimator is unbiased, and next is a proof of its superior (that is, lower) variance. Writing $\delta = \mu - \mu_0$ and using the independence of the observations,

$$\operatorname{Var}[\tilde{\sigma}^2] = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}[(X_i - \mu_0)^2] = \frac{1}{n}\operatorname{Var}[(Z + \delta)^2].$$

Since $(Z + \delta)^2 = Z^2 + 2\delta Z + \delta^2$, the moments above give

$$\operatorname{Var}[(Z + \delta)^2] = \operatorname{Var}[Z^2] + 4\delta^2\operatorname{Var}[Z] + 4\delta\operatorname{Cov}[Z^2, Z] = (3\sigma^4 - \sigma^4) + 4\delta^2\sigma^2 + 0 = 2\sigma^4 + 4\sigma^2(\mu - \mu_0)^2,$$

so that

$$\operatorname{Var}[\tilde{\sigma}^2] = \frac{2\sigma^4 + 4\sigma^2(\mu - \mu_0)^2}{n}. \qquad (5)$$

Now, by setting $\mu = \mu_0$, this expression simplifies to

$$\operatorname{Var}[\tilde{\sigma}^2 \mid H_0] = \frac{2\sigma^4}{n} < \frac{2\sigma^4}{n-1} = \operatorname{Var}[s^2].$$

We have now established that $\tilde{\sigma}^2$ is a superior variance estimator under the null.
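The inequality $\operatorname{Var}[\tilde{\sigma}^2 \mid H_0] = 2\sigma^4/n < 2\sigma^4/(n-1) = \operatorname{Var}[s^2]$ can be checked numerically; a small Monte Carlo sketch in Python (the sample size, seed and replication count here are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps, mu0, sigma = 5, 200_000, 0.0, 1.0
x = rng.normal(mu0, sigma, size=(reps, n))   # samples drawn under H0

s2 = x.var(axis=1, ddof=1)                   # usual estimator s^2 (divisor n-1)
sig2_tilde = ((x - mu0) ** 2).mean(axis=1)   # AST estimator (divisor n, centered at mu0)

# Theory under H0: Var[s^2] = 2*sigma^4/(n-1) = 0.5, Var[sigma_tilde^2] = 2*sigma^4/n = 0.4
print(s2.mean(), sig2_tilde.mean())          # both close to 1 (unbiased under H0)
print(s2.var(), sig2_tilde.var())            # close to 0.5 and 0.4 respectively
```

Both estimators are (approximately) unbiased under the null, while the AST estimator has the smaller sampling variance, as the proof predicts.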

4.2 $\tilde{\sigma}^2$ as a maximum likelihood estimator

We can also show that $\tilde{\sigma}^2$ is the maximum likelihood estimator of $\sigma^2$ when $\mu = \mu_0$ is known. For an introduction to maximum likelihood estimation see, for example, Casella and Berger (2002). Consider the following pdf of $X$, which is the pdf of a normally distributed variable,

$$f(x \mid \mu_0, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(x - \mu_0)^2}{2\sigma^2}\right),$$

from which we maximize the log-likelihood with respect to $\hat{\sigma}^2$:

$$\frac{\partial}{\partial \hat{\sigma}^2}\,\ell(\hat{\sigma}^2 \mid \mathbf{x}, \mu_0) = \frac{\partial}{\partial \hat{\sigma}^2}\left(-\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\hat{\sigma}^2 - \frac{1}{2\hat{\sigma}^2}\sum_{i=1}^{n}(x_i - \mu_0)^2\right) = -\frac{n}{2\hat{\sigma}^2} + \frac{1}{2\hat{\sigma}^4}\sum_{i=1}^{n}(x_i - \mu_0)^2.$$

By setting this equal to zero and solving for $\hat{\sigma}^2$, we get

$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu_0)^2 = \tilde{\sigma}^2.$$

Now we have established that this estimator is the maximum likelihood estimator and that it has a smaller variance than the usual variance estimator under the null. It can also be shown that this estimator is, in fact, the best unbiased estimator, that is, the unbiased estimator with least variance; in proving this it suffices to show that the variance equals the Cramér-Rao lower bound. That proof is omitted here; instead we refer to page 340 in Casella and Berger (2002). Now that arguments for the use of $\tilde{\sigma}^2$ instead of $s^2$ when testing the hypothesis in (2) have been given, we proceed to a more formal derivation of the test described in (3).


4.3 Formal derivation of the $\tilde{T}$-statistic

A Wald statistic is, asymptotically, a standard normal random variable and can be derived in the following way, in accordance with Casella and Berger's (2002) terminology,

$$W = \frac{\bar{x} - \mu_0}{\sqrt{\hat{\sigma}^2/n}} = \frac{\bar{x} - \mu_0}{\sqrt{\widehat{\operatorname{Var}}[\bar{x} \mid H_0]}}$$

where the denominator is the standard error of $\bar{x}$; if the variance estimate is the MLE, the denominator on the right-hand side (based on the observed information number) is a reasonable estimate (Casella & Berger, 2002). Therefore, we can derive the AST as a Wald statistic. Based on the following results,

$$X \mid H_0 \sim N(\mu_0, \sigma^2), \qquad \bar{X} \mid H_0 \sim N\!\left(\mu_0, \frac{\sigma^2}{n}\right),$$

we can see that

$$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}\;\Big|\;H_0 \sim N(0, 1).$$

If we estimate $\sigma^2$ with $s^2$ we have the usual T-test. But, as said before, we have an estimator of $\sigma^2$ that is better than $s^2$ when the null is true, hence the use of $\tilde{\sigma}^2$. We know the asymptotic distribution of the resulting statistic, but not its distribution in finite samples, which is reviewed in the next section.

5 Probability distribution of $\tilde{T}$

In this section we provide visual estimates of the probability distribution of $\tilde{T}$ under various circumstances.

5.1 Probability distribution of $\tilde{T}$ under $H_0$

The estimated probability distributions are displayed in histograms based on simulation. When estimating the probability distribution for a given sample size under the null, we generate a sample under the null, calculate and save the $\tilde{T}$-statistic, and repeat this a large number of times. We can begin by noting the following in the case of a sample size of one, before turning to distributions for more interesting sample sizes:

$$\tilde{T} = \frac{\bar{x} - \mu_0}{\tilde{\sigma}/\sqrt{1}} = \frac{x_1 - \mu_0}{|x_1 - \mu_0|} = \begin{cases} 1 & \text{if } x_1 > \mu_0 \\ -1 & \text{if } x_1 < \mu_0 \end{cases}$$

so for $n = 1$ the statistic takes only the values $\pm 1$, each with probability one half under the null.
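The degenerate $n = 1$ case can be verified directly; a quick Python sketch (seed and replication count arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
mu0 = 0.0
x1 = rng.normal(mu0, 1.0, size=100_000)   # repeated samples of size n = 1 under H0

# For n = 1: x-bar = x1 and sigma_tilde = |x1 - mu0|, so T-tilde = sign(x1 - mu0)
t_tilde = (x1 - mu0) / np.abs(x1 - mu0)

print(np.unique(t_tilde))        # only the two values -1 and 1
print((t_tilde == 1.0).mean())   # close to 0.5 under H0
```
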


[Figures 1-6: Histograms of the estimated probability distribution of $\tilde{T}$ under $H_0$ for increasing sample sizes. Figure 6 also shows a fitted standard normal density line.]

As seen above, the distribution for small samples is somewhat peculiar, while it seems to follow a Gaussian distribution in the "asymptotic" case, as expected.

5.2 Probability distribution of $\tilde{T}$ under $H_1$

In this section estimated probability distributions are displayed for the same sample sizes as in the last section. Distributions are displayed for a set of negative values of Cohen's $d$, where $d$ is a standard measure of effect size (Cohen, 1992), defined by

$$d = \frac{\mu - \mu_0}{\sigma}.$$


[Figures 7-18: Histograms of the estimated probability distribution of $\tilde{T}$ under $H_1$ for the same sample sizes and various negative values of Cohen's $d$.]

Figures where Cohen's d is positive are not displayed since they look the same, only mirrored in the other "direction". Histograms for positive values of $d$ can be given upon request.

6 Evaluation

In the evaluation of the tests we have to consider various circumstances, such as different sample sizes and effect sizes. Comparisons between the tests will mainly be displayed graphically with the use of power function plots, with the exact power of Student's T-test on the x-axis and the estimated difference in power between the tests on the y-axis. The evaluation is made for all sample sizes between 2 and 30.

6.1 Power estimation

Since we do not know the probability distribution function of AST, we have to estimate its critical values. We estimate critical values under $H_0$; when estimating the power, the null is rejected if the test statistic takes on a value in the rejection region. For the estimation of critical values, $N = 10^7$ replications have been used; the number of replications used for power estimation is denoted $M$. The method is outlined below.

(1) Generate a sample of size $n$: $x_1, \ldots, x_n \sim N(\mu_0, \sigma^2)$.
(2) Calculate $\tilde{T}$.
(3) Repeat (1)-(2) $N$ times.
(4) Calculate the critical value $c$ as the empirical $(1 - \alpha/2)$ quantile of $|\tilde{T}|$.
(5) Generate $x_1, \ldots, x_n \sim N(\mu, \sigma^2)$, where $\mu \neq \mu_0$.
(6) Calculate $\tilde{T}$.
(7) Repeat (5)-(6) $M$ times.
(8) Count the proportion of times $|\tilde{T}| > c$, which is the power estimate.

When estimating the critical value we assume a symmetric distribution. Moreover, when estimating the power of $\tilde{T}$ for different $n$ and $d$, the random numbers are generated independently of each other.

6.2 Graphical evaluation

In this section we will evaluate the power of AST graphically by plotting the estimated difference in power between AST and Student's T-test². This is done for each sample size from 2 to 30. In each figure, the estimated difference in power is computed for each value of the power of Student's T-test ($1-\beta$) on a grid from just above the alpha level up to 0.99, in steps of 0.01. In other words, the estimated difference is displayed for $1-\beta$ equal to 0.02, 0.03, ..., 0.99 when the alpha level is 0.01. For example, see the figure below:


Figure 19: The estimated difference in power between AST and Student's T-test for a two-sided hypothesis. Dashed lines are exact 5 percent critical values under the null.

For each power value on the grid, the corresponding Cohen's $d$ is attained, the power of AST is estimated via simulation given that $d$, and the estimated difference in power is plotted. This is repeated for all values of the power of Student's T-test, as described right before Figure 19.

Values of $d$ that give the desired power for Student's T-test were derived exactly, not estimated, by specifying the sample size, power, type of test and direction of the hypothesis. Then $d$ is attained by finding which value of the noncentrality parameter gives the area $1-\beta$ outside of the critical values of the t-distribution. The noncentrality parameter can be characterized through

$$T' = \frac{Z + \delta}{\sqrt{V/\nu}}, \qquad \delta = d\sqrt{n},$$

where $Z$ is standard normal and $V$ is a $\chi^2$-distributed random variable with $\nu$ degrees of freedom, independent of $Z$. In this particular case, $\nu$ is equal to $n - 1$. Since the power is determined by $n$, $\alpha$ and $\delta$, where $\delta$ is the only unknown quantity, the equation can be solved. Fortunately, the more than 16 thousand different values of $d$ used in this paper were not calculated by hand. A program called G*Power 3.1.7 was used to attain the values of $d$ for the different sample sizes and powers.

Each power estimate in Figure 19 is carried out pseudo-independently³ of each and every other estimate. The fact that they are pseudo-independent enables us to carry out a simple test of the hypothesis that the two tests are equally powerful, but more about this in section 6.5. In Figure 19 we can also see dashed lines; these are 5 percent critical values for the difference in power when the null is true (that is, when the tests have equal power) and are calculated as

$$\pm 1.96\sqrt{\frac{(1-\beta)\left(1-(1-\beta)\right)}{M}}$$

where $1-\beta$ is the exact power of Student's T-test at that point and $M$ is the number of power-estimation replications. As we can see in the following sections, the difference in power between the tests seems to be: none! In the following section only a portion of the results is displayed; see Appendix C for the rest. Next, some more power function graphs are displayed.

³ The only dependency between the power estimates is that they are based on the same estimated critical value, but since the critical value is estimated from ten million simulations we can say that the power estimates are almost independent of each other, hence the term "pseudo-independent".


Figure 20: The estimated difference in power between AST and Student's T-test for a two-sided hypothesis. Dashed lines are exact 5 percent critical values under the null.

Figure 21: The estimated difference in power between AST and Student's T-test.

Figure 22: The estimated difference in power between AST and Student's T-test for a one-sided hypothesis. Dashed lines are exact 5 percent critical values under the null.

As we can see in the figures above, they all seem to indicate that there is no difference in power between the tests evaluated. The same pattern can also be seen for many other sample sizes, different alpha levels and both uni- and bi-directional hypotheses; see Appendix C for power function plots for sample sizes from 2 to 30, for significance levels 0.01, 0.05 and 0.1, and for both one- and two-sided hypotheses.

6.3 Non-graphical evaluation

The distribution of the estimated difference in power is displayed next, and visually it may seem to follow a normal distribution, but this is not the case!


Figure 20: Estimated distribution of the difference in power between AST and Student's T-test.

Next is a table with descriptive statistics of the estimated difference in power between the tests. The kurtosis estimate indicates that the distribution is non-normal. What is also apparent is that the estimated expected value is in favor of Student's T, but not significantly, as we will see in the next section.

Mean Variance Skewness Kurtosis Observations

-0.00000141 0.0000179 0.0134 3.53 16298

Table 1: Descriptive statistics of the estimated difference in power

6.4 Analysis

So far there does not seem to be any difference whatsoever between the tests. Fortunately, we can test this. It is tempting to think that we can test whether the tests perform equally well by using a normal large-sample test as described next.

$$Z = \frac{\bar{D}}{\operatorname{se}(\bar{D})} \qquad (6)$$

The numerator above is the mean sample difference in power between Student's T and AST, and the denominator is the standard error of $\bar{D}$. This statistic is asymptotically standard normal if the null is true and the differences $D_i$ are independent and identically distributed with $E[D_i] = 0$, where $D_i$ is the difference in power between Student's T and AST for observation $i$. That is not exactly the case here, since the variance of an estimated power is $p_i(1 - p_i)/M$, where $p_i$ is the power at point $i$ and varies over the whole grid of power values considered.


The statistic can nevertheless be standard normally distributed despite different variances between observations⁴. So we can carry out the test described above, and from Table 1 we can, after some calculation, find the p-value for a two-sided test, which is 0.966. Another way of testing the hypothesis above is to count the number of power estimates outside of the 95 percent confidence intervals (see section 6.2). Each of these points has probability 0.05 of being in the rejection region (i.e., outside the confidence interval) if the null is true. From this it follows that the number of power estimates in the rejection region is binomially distributed, and thus we can make use of the following fact (Casella & Berger, 2002):

$$\frac{\hat{p} - p}{\sqrt{\hat{p}(1-\hat{p})/n}} \xrightarrow{d} N(0, 1)$$

Let us define $Y_i = 1$ if power estimate $i$ falls outside its confidence interval and $Y_i = 0$ otherwise; then $\hat{p} = \frac{1}{n}\sum_i Y_i$, and from this it follows that we can form 95 percent confidence intervals. The following is attained when analysing whether there is statistical evidence that $E[\hat{p}]$ differs from 0.05:

Mean Lower confidence limit Upper confidence limit

0.0523 0.0489 0.0558

Table 2: Estimated expected proportion of power estimates outside of the 95 percent confidence interval of that power estimate.
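The interval in Table 2 is the standard normal-approximation confidence interval for a binomial proportion; a quick Python check using the proportion from Table 2 and the number of observations from Table 1:

```python
import math

p_hat = 0.0523   # proportion of power estimates outside their 95% intervals (Table 2)
n = 16298        # number of power estimates (Table 1)

se = math.sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - 1.96 * se, p_hat + 1.96 * se
print(round(lower, 4), round(upper, 4))   # matches the limits in Table 2 up to rounding
```

The value 0.05 lies inside the interval, consistent with the conclusion of no statistical evidence of a difference.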

As we can see, we do not have statistical evidence that Student's T-test and AST are different in terms of power, at least not for samples smaller than or equal to 30. We can also base a test on the number of times AST is estimated to be more powerful than Student's T, which is a binomially distributed variable with probability of success equal to one half under the null. The p-value for this test is 0.033 based on a two-sided hypothesis, so this test does reject the null hypothesis of equally powerful tests.

⁴ Identical variance for each observation is not a necessity, as is shown in Appendix B with a simple simulation.


6.5 Summary of evaluation

Three hypothesis tests that tested the same hypothesis were carried out; two of them failed to reject the null and one rejected it. If we had taken into account that multiple tests were performed, by using, say, a Bonferroni correction, we would not have obtained any significant results. Given the very large sample size and the high p-values in the hypothesis tests, arguments for any practical difference between the usual T-test and the test proposed in this paper can hardly be made. Based on this, we can fairly confidently say that it does not matter which of the two tests discussed in this paper is used, at least under the circumstances tested here and given that the normality assumption is met.

7 Extensions of the test

There are many situations where the same ideas can be used. One such situation is ordinary least squares (OLS) regression. When testing whether a parameter is significantly different from some value, we usually do not utilize all information known under the null. The idea is the same as discussed in this paper: instead of estimating the standard error the usual way, we can estimate it with the information known under the null. In OLS we estimate the model's parameters as

$$\hat{\beta} = (X'X)^{-1}X'y$$

and the covariance matrix of the estimates is $\sigma^2(X'X)^{-1}$. An unbiased estimate of $\sigma^2$ is

$$s^2 = \frac{\hat{e}'\hat{e}}{n - k} \qquad (7)$$

where $\hat{e} = y - X\hat{\beta}$ and $k$ is the number of estimated parameters (Greene, 2000). Now, by utilizing the same idea as in the variance estimator of AST, we can get a better estimate under the null of the standard error of a parameter when testing hypotheses about one or many parameters via an F-test. This is carried out by setting the parameter estimates we wish to test equal to what is stated in the null hypothesis when estimating $\sigma^2$. In other words, instead of using the vector of estimated parameters,


$$\hat{\beta} = (\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2)'$$

when estimating $\sigma^2$, we can use the parameter estimates together with the information known under the null. For example, if we want to test whether all parameters except the intercept are zero in a regression model with two independent variables, then $\sigma^2$ would be estimated using the following vector instead of $\hat{\beta}$:

$$\tilde{\beta} = (\hat{\beta}_0, 0, 0)'$$
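A hedged Python sketch of this restricted-residual idea (the variable names and the divisor in the restricted estimator are our assumptions, not the paper's; the data are simulated with the null true):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 3                                  # k = number of estimated parameters
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([1.0, 0.0, 0.0])          # H0 (both slopes zero) is true here
y = X @ beta_true + rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Usual unbiased estimate of sigma^2, eq. (7): e'e / (n - k)
e = y - X @ beta_hat
s2_usual = e @ e / (n - k)

# Restricted variant in the spirit of AST: impose H0 on the tested slopes
# before forming residuals, keeping the estimated intercept
beta_tilde = np.array([beta_hat[0], 0.0, 0.0])
e0 = y - X @ beta_tilde
s2_restricted = e0 @ e0 / n                    # divisor n is our assumption, by analogy with sigma_tilde^2
```

When the null is true, both estimates should be close to the true error variance (here 1).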

Another situation in which we can make use of the idea discussed in this paper is when testing for equal means between two groups where we assume equal variances. The test statistic in this case is formed in the following way (Wackerly, Mendenhall & Scheaffer, 2002):

$$T = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

where $s_p^2$ is an estimate of $\sigma^2$ and is estimated as follows:

$$s_p^2 = \frac{\sum_{i=1}^{n_1}(x_{1i} - \bar{x}_1)^2 + \sum_{i=1}^{n_2}(x_{2i} - \bar{x}_2)^2}{n_1 + n_2 - 2}.$$

Another way of estimating the population variance is by using more information, which we are able to do since, under the null, we are assuming equal means. By using this information we can estimate $\sigma^2$ with the statistic described next:

$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n_1}(x_{1i} - \bar{\bar{x}})^2 + \sum_{i=1}^{n_2}(x_{2i} - \bar{\bar{x}})^2}{n_1 + n_2 - 1}$$

where $\bar{\bar{x}}$ is the estimated grand mean of both groups. In mathematical notation:

$$\bar{\bar{x}} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}.$$


In this way we are getting a better estimate, under the null, of the population variance.
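A small Python sketch of the two pooled-variance estimators (the divisor $n_1 + n_2 - 1$ in the modified estimator is our assumption of the natural unbiased choice under the null):

```python
import numpy as np

def pooled_var(x, y):
    # Usual pooled estimator: squares around each group's own mean, divisor n1+n2-2
    n1, n2 = len(x), len(y)
    return (np.sum((x - x.mean()) ** 2) + np.sum((y - y.mean()) ** 2)) / (n1 + n2 - 2)

def pooled_var_h0(x, y):
    # Modified estimator in the spirit of AST: under H0 the means are equal,
    # so squares are taken around the grand mean of the pooled sample.
    # The divisor n1+n2-1 is our assumption (the unbiased choice under H0).
    z = np.concatenate([x, y])
    return np.sum((z - z.mean()) ** 2) / (len(z) - 1)

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 3.0, 4.0])
print(pooled_var(x, y), pooled_var_h0(x, y))
```
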

8 Summary and conclusions

The AST test has been derived and evaluated with Student's T-test as the reference point. It utilizes information known under the null hypothesis to get a better estimate of the population variance. By doing this we hoped to get a test that performed better than Student's T-test. It did not! The conclusion is that, in the settings tested in this paper, AST performed neither worse nor better than Student's T-test. A couple of extensions of the test are proposed and should be evaluated in further analysis.

The findings here imply that there is no need to use AST instead of Student's T-test. AST has not been evaluated in situations where the assumption of normality is not met, where there are other, well-explored tests that should be used instead. Nevertheless, further analysis of test situations where we can utilize as much information as possible from the null hypothesis should be carried out. And who knows, it may outperform the T-test in the regression example discussed in section 7, but probably not!


References

Box, J. F. (1987). Guinness, Gosset, Fisher, and small samples. Statistical Science, 2, 45-52.

Bryc, W. (1995). The Normal Distribution: Characterizations with Applications. Springer-Verlag.

Casella, G., & Berger, R. (2002). Statistical Inference (2nd ed.). Duxbury Press.

Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155-159.

Ghosh, B. K. (1979). A comparison of some approximate confidence intervals for the binomial parameter. Journal of the American Statistical Association, 74, 894-900.

Greene, W. (2000). Econometric Analysis (4th ed.). Prentice-Hall: New Jersey.

Wackerly, D., Mendenhall, W., & Scheaffer, R. (2002). Mathematical Statistics with Applications.


Appendix A: Review of method of evaluation

In order to keep this section as simple as possible we will only go through one of the cases, but the other cases are carried out in basically the same way. To generate the plots described in 6.2 we can start by attaining values of Cohen's d for each sample size between 2 and 30, and for each point of the power grid. These values are then put into a matrix, d, where, for example, element $d_{ij}$ is the value of Cohen's d that gives the $i$:th exact power for Student's T-test at the $j$:th sample size. The next step is to attain the critical values for AST for each sample size and put these into a vector, crit.

The method of attaining these critical values is explained in 6.2. When this is done we can run the code given in section A.1; only the important parts of the code will be explained. The function is reviewed semantically below:

1. A function that returns the estimated power of AST for a specified Cohen's d, sample size and critical value.
2. Run 1 for all points of the power grid.
3. Save the plot and the power estimates.
4. Run 2 and 3 for each sample size.

The vector crit is used to get the correct critical value for each sample size and the matrix d is used to get the correct Cohen's d. The code used to carry out these steps is given below and was written in R.

A.1 Code in R

prog <- function(M, alfa) {
  e <- 100*alfa; f <- 100*alfa + 1; g <- 99 - 100*alfa; l <- length(d[,1])
  if ((alfa == 0.1 | alfa == 0.05 | alfa == 0.01) & g == l) {

    # Estimate the power of AST for one setting: sample size n, true mean mu,
    # hypothesized mean muzero, standard deviation sd and critical value cri
    poweronly <- function(N, n, mu, muzero, sd, cri) {
      p <- numeric(N)
      for (i in 1:N) {
        x <- rnorm(n, mu, sd)
        s2 <- sum((x - muzero)^2)/n               # AST variance estimator
        test <- (mean(x) - muzero)/sqrt(s2/n)     # AST statistic, eq. (3); this line was lost at a page break and is reconstructed
        if (abs(test) > cri) { p[i] <- 1 }
      }
      mean(p)
    }

    tp <- numeric(g)
    for (i in f:99) { tp[i - e] <- 0.01*i }       # grid of exact powers for Student's T
    r <- matrix(0, g, 29)

    for (j in 1:29) {                             # sample sizes 2 to 30
      a <- crit[j, 1]                             # estimated critical value for this n
      for (i in 1:g) {
        r[i, j] <- poweronly(M, j + 1, 0, d[i, j], 1, a) - tp[i]
      }
      plot(tp, r[, j], type = "l",
           xlab = "Actual power for Student's T-test",
           ylab = "Estimated difference in power", ylim = c(-0.018, 0.018))
      curve( 1.96*sqrt(x*(1 - x)/M), lty = "dashed", add = TRUE)
      curve(-1.96*sqrt(x*(1 - x)/M), lty = "dashed", add = TRUE)
    }
    #return(r)
  } else {
    "Wrong alfa level and/or non-conformable alfa level and matrix d"
  }
}


Appendix B: Lyapunov’s CLT

In this appendix I give both a reference to Lyapunov's CLT and a simulation-based demonstration that we can make use of a central limit theorem despite unequal variances of the observations. The outline of the simulation is as follows: (1) simulate observations independently from Bernoulli distributions with success probabilities $p_i = 0.04i$, $i = 1, \ldots, 20$, centered so that each observation has mean zero and variance $p_i(1 - p_i)$; (2) calculate the test statistic described in (6); (3) repeat steps (1) and (2) 100 times and attain the p-value from the Shapiro-Wilk normality test; (4) repeat steps (1)-(3) 10 000 times; (5) count the proportion of p-values below 0.05. If (6) is standard normally distributed then this simulation should yield a result between 0.0457 and 0.0543 95 percent of the times. The result of the simulation was 0.0488, as expected.

This simulation merely illustrates the Russian mathematician Alexander Lyapunov's central limit theorem, which states that for independent observations $X_i$ with means $\mu_i$ and (possibly unequal) variances $\sigma_i^2$, the standardized sum

$$\frac{\sum_{i=1}^{n}(X_i - \mu_i)}{s_n}, \qquad s_n^2 = \sum_{i=1}^{n}\sigma_i^2,$$

is asymptotically standard normally distributed if

$$\lim_{n \to \infty} \frac{1}{s_n^{2+\delta}} \sum_{i=1}^{n} E\left[|X_i - \mu_i|^{2+\delta}\right] = 0$$

for some $\delta > 0$.

B.1 Code in R

prog <- function(n, N, M) {
  set.seed(12345)
  Z <- numeric(N); shap <- numeric(M); x <- numeric(n); q <- numeric(20)
  for (i in 1:20) { q[i] <- 0.04*i*(1 - 0.04*i)/1 }   # variances p_i(1 - p_i)
  s <- sqrt(10*sum(q))

  for (j in 1:M) {
    for (a in 1:N) {
      for (k in 0:(n/20 - 1)) {
        for (i in 1:20) {
          x[20*k + i] <- 0.04*i - rbinom(1, 1, 0.04*i)   # centered Bernoulli(p_i)
        }
      }
      Z[a] <- mean(x)/s
    }
    shap[j] <- shapiro.test(Z)$p.value
  }

  p1 <- shap < 0.1; p05 <- shap < 0.05; p01 <- shap < 0.01
  return(list(mean(p1), mean(p05), mean(p01)))
}


Appendix C: More graphs

C.1 Two-sided hypothesis and a significance level of 0.1

[Power function plots for n = 2 to 30.]

C.2 Two-sided hypothesis and a significance level of 0.05

[Power function plots for n = 2 to 30.]

C.3 Two-sided hypothesis and a significance level of 0.01

[Power function plots for n = 2 to 30.]

C.4 One-sided hypothesis and a significance level of 0.1

[Power function plots for n = 2 to 30.]

C.5 One-sided hypothesis and a significance level of 0.05

[Power function plots for n = 2 to 30.]

C.6 One-sided hypothesis and a significance level of 0.01

[Power function plots for n = 2 to 30.]
