Mean-Squared errors of small area
estimators under a multivariate linear
model for repeated measures data
Innocent Ngaruye1,2, Dietrich von Rosen1,3 and Martin Singull1
LiTH-MAT-R--2017/05--SE
1Department of Mathematics, Linköping University,
SE–581 83 Linköping, Sweden
E-mail: {innocent.ngaruye, martin.singull}@liu.se
2Department of Mathematics,
College of Science and Technology, University of Rwanda, P.O. Box 3900 Kigali, Rwanda
E-mail: i.ngaruye@ur.ac.rw
3Department of Energy and Technology,
Swedish University of Agricultural Sciences, SE–750 07 Uppsala, Sweden
E-mail: dietrich.von.rosen@slu.se
Abstract
In this paper, we derive the first and second moments of previously proposed small area estimators under a multivariate linear model for repeated measures data. The aim is to use these moments to estimate the mean-squared errors (MSE) of the predicted small area means as a measure of precision. A two-stage estimator of the MSE is obtained. At the first stage, we derive the MSE when the covariance matrices are known. At the second stage, to obtain an unbiased estimator of the MSE, a method based on the parametric bootstrap is proposed for bias correction and for a prediction error that reflects the uncertainty introduced when the unknown covariance is replaced by a suitable estimator.
Keywords: Mean-squared errors, Multivariate linear model, Repeated measures data, Small area estimation
1 Introduction
Reliable information about various population characteristics of interest is needed by policy and decision makers for planning. There is therefore a great need to estimate these characteristics via survey sampling, not only for the total target population, but also for local sub-population units (domains). However, most sampling surveys are designed to target much larger populations, and direct survey estimators computed only from the data of a small target domain have been found to lack precision due to the small sample size within that domain. The development of estimation techniques that provide reliable estimates, with standard errors, for such small domains or small areas has been a major concern in recent years. These techniques are commonly known as Small Area Estimation (SAE) methods. For comprehensive reviews of SAE, one may refer to Rao (2003); Rao and Molina (2015); Pfeffermann (2002, 2013).
Longitudinal surveys with repeated measures data over time are designed to study patterns of change and trends over time. The demand for SAE statistics concerns not only cross-sectional data, but also repeated measures data. Ngaruye et al. (2016) have proposed a multivariate linear model for repeated measures data in a small area estimation setting which accounts for grouped response units and random effects variation. One way to assess the precision of model-based estimators is through their mean-squared error (MSE). There is an extensive literature on approaches to estimating the mean-squared error of model-based small area estimators. The second-order approximation of an asymptotically unbiased mean-squared error based on Taylor series expansion has been considered by various authors, such as Kackar and Harville (1984); Prasad and Rao (1990); Das et al. (2004); Datta et al. (1999), among others. However, as pointed out by Kubokawa and Nagashima (2012), the Taylor series expansion is sometimes difficult to implement for complicated models with many unknown parameters, since it requires the computation of the asymptotic bias and the asymptotic variances and covariances of the estimators of the unknown parameters.
In this paper, we aim to derive the first and second moments of the proposed estimators of the unknown parameters in the model considered by Ngaruye et al. (2016), and we use these moments to derive the mean-squared errors (MSE) of the predicted small area means. Further, following Butar and Lahiri (2003); Kubokawa and Nagashima (2012), we propose an unbiased estimator of the MSE based on a parametric bootstrap method.
The article is organized as follows: In Section 2, the considered model is reviewed. In Section 3, the approach used for estimation and prediction is presented. In Section 4, some preliminary results that are referred to in the subsequent sections are provided. The first and second moments of the proposed estimators are presented in Section 5, and in Section 6 an unbiased estimator of the MSE of the predicted small area means under a multivariate linear model for repeated measures data is derived.
2 Description of the model
Consider the multivariate linear regression model for repeated measurements defined in Ngaruye et al. (2016). Assume that a $p$-vector of measurements over time, together with an $r$-vector of auxiliary variables, is available for every unit of a finite population of size $N$ that is divided into $m$ non-overlapping small areas of sizes $N_i$, $i = 1, \ldots, m$. Suppose also that the target population is composed of $k$ group units, and denote by $N_{ig}$ the population size of the $g$-th group unit, $g = 1, \ldots, k$, so that $\sum_{g=1}^{k} N_{ig} = N_i$. Assume the mean growth of the $j$-th unit, $j = 1, \ldots, N_i$, in area $i$ for each group to be a polynomial in time of degree $q-1$. Then the unit level regression model for the $j$-th unit from small area $i$ at time $t$, which applies to each of the $k$ group units, can be expressed as
\[
y_{ijt} = \beta_0 + \beta_1 t + \cdots + \beta_{q-1} t^{q-1} + \gamma' x_{ij} + u_{it} + e_{ijt},
\quad i = 1, \ldots, m; \; j = 1, \ldots, N_i; \; t = t_1, \ldots, t_p,
\]
where the random errors $e_{ijt}$ and the random effects $u_{it}$ are independent and assumed to be i.i.d. normal with mean zero and variances $\sigma^2_e$ and $\sigma^2_{u_t}$, respectively. The vector $\gamma$ collects the fixed regression coefficients representing the effects of the auxiliary variables, and $\beta_0, \ldots, \beta_{q-1}$ are unknown parameters. For all time points, the model can be written in matrix form as
\[
y_{ij} = A\beta + 1_p \gamma' x_{ij} + u_i + e_{ij}, \quad i = 1, \ldots, m; \; j = 1, \ldots, N_i,
\]
where $1_p$ is a $p$-vector of ones and $u_i$ is assumed to be multivariate normally distributed with zero mean and unknown positive definite covariance matrix $\Sigma_u$.
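As a minimal illustration, the unit level model above can be simulated directly; the dimensions, time points and parameter values below are hypothetical choices for a sketch, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
p = q = 4                     # time points and polynomial terms (the case p = q)
r = 2                         # number of auxiliary variables
t = np.arange(1.0, p + 1)     # hypothetical observation times t_1, ..., t_p
A = np.vander(t, q, increasing=True)     # within-individual design: columns 1, t, ..., t^(q-1)

beta = np.array([1.0, 0.5, -0.1, 0.01])  # hypothetical beta_0, ..., beta_{q-1}
gamma = np.array([0.3, -0.2])            # hypothetical fixed effects of the auxiliaries
Sigma_u = 0.5 * np.eye(p)                # random effects covariance
sigma_e = 0.5                            # error standard deviation

x_ij = rng.normal(size=r)                            # auxiliary variables for one unit
u_i = rng.multivariate_normal(np.zeros(p), Sigma_u)  # area random effect
e_ij = rng.normal(scale=sigma_e, size=p)             # measurement errors

# y_ij = A beta + 1_p gamma' x_ij + u_i + e_ij
y_ij = A @ beta + np.ones(p) * (gamma @ x_ij) + u_i + e_ij
assert y_ij.shape == (p,)
```

The Vandermonde matrix is one concrete choice of the polynomial within-individual design $A$; any full column rank $p \times q$ design plays the same role.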
The associated multivariate linear regression model for all units from the $i$-th small area belonging to the $g$-th group unit can be expressed as
\[
Y_{ig} = A\beta_g 1'_{N_{ig}} + 1_p\gamma' X_{ig} + u_i z'_{ig} + E_{ig}, \quad i = 1, \ldots, m, \; g = 1, \ldots, k, \quad (1)
\]
and the model at small area level for all $k$ group units together, belonging to the $i$-th small area, can be expressed as
\begin{align*}
Y_i &= ABC_i + 1_p\gamma' X_i + u_i z'_i + E_i, \\
u_i &\sim N_p(0, \Sigma_u), \quad (2) \\
E_i &\sim N_{p,N_i}(0, \Sigma_e, I_{N_i}),
\end{align*}
where $z_i = \frac{1}{\sqrt{N_i}} 1_{N_i}$. The corresponding model combining all $m$ disjoint small areas and all $N$ units divided into $k$ non-overlapping group units is given by
\begin{align*}
Y &= ABHC + 1_p\gamma' X + UZ + E, \\
E &\sim N_{p,N}(0, \Sigma_e, I_N), \quad (3)
\end{align*}
where the matrices $Z$, $H$, $C$ are of full row rank and the matrix $A$ is of full column rank. In addition, $\mathcal{C}(Z') \subseteq \mathcal{C}(C')$ and $ZZ' = I_m$, where $\mathcal{C}(A)$ denotes the column vector space generated by the columns of the matrix $A$.

In model (3), $Y: p \times N$ is the data matrix, $A: p \times q$, $q \leq p$, is the within-individual design matrix indicating the time dependency within individuals, $B: q \times k$ is an unknown parameter matrix, and $C: mk \times N$, with $\mathrm{rank}(C) + p \leq N$ and $p \leq m$, is the between-individual design matrix accounting for group effects. The matrix $U: p \times m$ is a matrix of random effects whose columns are assumed to be independently distributed as multivariate normal with mean zero and positive definite dispersion matrix $\Sigma_u$, i.e., $U \sim N_{p,m}(0, \Sigma_u, I_m)$. Further, $Z: m \times N$ is the design matrix for the random effects, and the columns of the error matrix $E$ are assumed to be independently distributed as $p$-variate normal with mean zero and known covariance matrix $\Sigma_e$, i.e., $E \sim N_{p,N}(0, \Sigma_e, I_N)$. The matrix $H: k \times mk$ captures all $k$ group units by stacking the $m$ data matrices of model (2) together as column blocks in a new matrix, and is included in the model for technical estimation purposes. More details about the model formulation can be found in Ngaruye et al. (2016).
3 Estimation and prediction
Model (3) is considered as a random effects growth curve model with covariates. For a comprehensive review of different treatments of the random effects growth curve model, see, e.g., Yokoyama and Fujikoshi (1992); Yokoyama (1995); Nummi (1997); Pan and Fang (2002). The estimation and prediction are performed with a likelihood based approach. In what follows, $A^o$ stands for any matrix of full rank spanning $\mathcal{C}(A)^\perp$, i.e., $\mathcal{C}(A^o) = \mathcal{C}(A)^\perp$, where $\mathcal{C}(A)^\perp$ is the orthogonal complement to $\mathcal{C}(A)$. Moreover, $A^-$ denotes an arbitrary generalized inverse of the matrix $A$ such that $AA^-A = A$. We also denote by $P_A = A(A'A)^-A'$ and $Q_A = I - P_A$ the orthogonal projection matrices onto the column space $\mathcal{C}(A)$ and onto its orthogonal complement $\mathcal{C}(A)^\perp$, respectively. More details about the growth curve model can be found, for example, in Kollo and von Rosen (2005), and the derivation of the estimators and predictors of model (3) is developed in Ngaruye et al. (2016).
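The projectors $P_A$ and $Q_A$ just defined are easy to check numerically; a small sketch (toy dimensions, Moore-Penrose inverse as one particular choice of generalized inverse) verifies idempotency and the defining properties:

```python
import numpy as np

def proj(A):
    """P_A = A (A'A)^- A', the orthogonal projector onto the column space C(A)."""
    # np.linalg.pinv gives the Moore-Penrose inverse, one valid generalized inverse;
    # the projector itself does not depend on this choice.
    return A @ np.linalg.pinv(A.T @ A) @ A.T

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 2))   # toy full column rank matrix
P_A = proj(A)
Q_A = np.eye(6) - P_A         # projector onto the orthogonal complement C(A)^perp

assert np.allclose(P_A @ P_A, P_A)          # idempotent
assert np.allclose(P_A @ A, A)              # reproduces C(A)
assert np.allclose(Q_A @ A, np.zeros_like(A))  # annihilates C(A)
```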
We make an orthogonal transformation of model (3) and partition it into three independent models. This partition is based on the orthogonal diagonalization of the idempotent matrix $(CC')^{-1/2}CZ'ZC'(CC')^{-1/2}$ by $\Gamma = [\Gamma_1 : \Gamma_2]$, the orthogonal matrix of eigenvectors corresponding to the $m$ unit eigenvalues and the $mk - m$ zero eigenvalues. We use the following notation to shorten matrix expressions:
\[
K_1 = H(CC')^{1/2}\Gamma_1, \qquad R_1 = C'(CC')^{-1/2}\Gamma_1, \quad (4)
\]
\[
K_2 = H(CC')^{1/2}\Gamma_2, \qquad R_2 = C'(CC')^{-1/2}\Gamma_2. \quad (5)
\]
With this orthogonal transformation, we obtain the independently distributed models
\begin{align*}
V_1 &= YR_1 \sim N_{p,m}(M_1, \Sigma_u + \Sigma_e, I_m), \\
V_2 &= YR_2 \sim N_{p,mk-m}(M_2, \Sigma_e, I_{mk-m}), \\
W &= Y(C')^o \sim N_{p,N-mk}(M_3, \Sigma_e, I_{N-mk}),
\end{align*}
with the means
\begin{align*}
M_1 &= ABK_1 + 1_p\gamma'XR_1, \\
M_2 &= ABK_2 + 1_p\gamma'XR_2, \\
M_3 &= 1_p\gamma'X(C')^o.
\end{align*}
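The construction of $\Gamma$, $R_1$ and $R_2$ from the eigendecomposition can be sketched on a toy design; all sizes and the group assignment below are hypothetical. The sketch also confirms numerically the properties used later (the diagonalized matrix is idempotent, $R_1'R_1 = I_m$ and $R_2'Z' = 0$, as in Lemma 4.1):

```python
import numpy as np

m, k = 2, 2                       # toy numbers of small areas and group units
Ni = [3, 4]                       # units per area (hypothetical)
N = sum(Ni)
area = np.repeat(np.arange(m), Ni)
group = np.array([0, 1, 1, 0, 0, 1, 1])   # group label of each unit (hypothetical)

# Between-individual design C (mk x N) and random effects design Z (m x N)
C = np.zeros((m * k, N))
Z = np.zeros((m, N))
for j in range(N):
    C[area[j] * k + group[j], j] = 1.0
    Z[area[j], j] = 1.0 / np.sqrt(Ni[area[j]])   # z_i = N_i^{-1/2} 1_{N_i}

assert np.allclose(Z @ Z.T, np.eye(m))           # ZZ' = I_m

# (CC')^{-1/2} via the symmetric eigendecomposition of CC'
w, V = np.linalg.eigh(C @ C.T)
CC_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T

Mmat = CC_inv_sqrt @ C @ Z.T @ Z @ C.T @ CC_inv_sqrt   # idempotent of rank m
vals, Gam = np.linalg.eigh(Mmat)
Gam1 = Gam[:, vals > 0.5]          # eigenvectors of the m unit eigenvalues
Gam2 = Gam[:, vals <= 0.5]         # eigenvectors of the mk - m zero eigenvalues

R1 = C.T @ CC_inv_sqrt @ Gam1
R2 = C.T @ CC_inv_sqrt @ Gam2
assert np.allclose(Mmat @ Mmat, Mmat)        # idempotency
assert np.allclose(R1.T @ R1, np.eye(m))     # R_1'R_1 = I_m
assert np.allclose(R2.T @ Z.T, 0.0)          # R_2'Z' = 0
```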
Throughout this paper, we consider only the particular choice $p = q$. The estimation of the mean and covariance is performed using a likelihood based approach, while the prediction of the random effects is performed using Henderson's approach, which consists of maximizing the joint density $f(Y, U)$ with respect to $U$ under the assumption of known $\Sigma_e$ and $\Sigma_u$ (Henderson, 1973). Then we have the following theorem:

Theorem 3.1. (Ngaruye et al. (2016)) Consider the model (3) and assume that the matrices $X$ and $K_2^{o\prime}K_1$ are of full rank. For the particular case $p = q$, the within-individual design matrix $A: p \times p$ is non-singular, and the estimators of $\gamma$, $\Sigma_u$ and the linear combination $BHC$, together with the predictor of $U$, are given by
\begin{align*}
\hat{\gamma} &= \frac{1}{p}(XPX')^{-1}XPY'1_p, \\
\widehat{BHC} &= A^{-1}\Big(Y - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}X\Big)R_2K_2'(K_2K_2')^-HC + A^{-1}V_3K_1'P_1HC, \\
\hat{\Sigma}_u &= \frac{1}{m}V_3Q_{K_1'K_2^o}V_3' - \Sigma_e, \text{ assumed to be positive definite}, \\
\hat{U} &= (\Sigma_e + \hat{\Sigma}_u)^{-1}\hat{\Sigma}_u V_3Q_{K_1'K_2^o}R_1'Z',
\end{align*}
where
\begin{align*}
P &= (C')^o\big((C')^o\big)' + R_2Q_{K_2'}R_2', \qquad P_1 = K_2^o(K_2^{o\prime}K_1K_1'K_2^o)^{-1}K_2^{o\prime}, \\
V_3 &= \Big(Y - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}X\Big)G_2R_1, \qquad G_2 = I - R_2K_2'(K_2K_2')^-HC.
\end{align*}
The proof of Theorem 3.1 is presented in the Appendix. We note that the estimators given in Theorem 3.1 are unique. The following two lemmas establish the uniqueness of the estimator $\widehat{BHC}$ and the predictor $\hat{U}$; the proof of the uniqueness of the other estimators is straightforward.

Lemma 3.1. The estimator $\widehat{BHC}$ given in Theorem 3.1 is invariant with respect to the choice of generalized inverse.
Proof. Replacing $V_3$ by its expression in Theorem 3.1, we can rewrite $\widehat{BHC}$ as
\begin{align*}
\widehat{BHC} &= A^{-1}\Big(Y - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}X\Big)\big(R_2K_2'(K_2K_2')^- + G_2R_1K_1'P_1\big)HC \\
&= A^{-1}\Big(Y - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}X\Big)\big(R_2K_2'(K_2K_2')^-(I_k - K_1K_1'P_1) + R_1K_1'P_1\big)HC.
\end{align*}
We can write
\[
K_1K_1'P_1 = K_1K_1'K_2^o(K_2^{o\prime}K_1K_1'K_2^o)^{-1}K_2^{o\prime} = I_k - K_2\big(K_2'(K_1K_1')^{-1}K_2\big)^-K_2'(K_1K_1')^{-1}. \quad (6)
\]
Therefore,
\[
\widehat{BHC} = A^{-1}\Big(Y - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}X\Big)\Big(R_2K_2'(K_2K_2')^-K_2\big(K_2'(K_1K_1')^{-1}K_2\big)^-K_2'(K_1K_1')^{-1} + R_1K_1'P_1\Big)HC
\]
is unique, which completes the proof of the lemma.
Lemma 3.2. The predictor $\hat{U}$ given in Theorem 3.1 is invariant with respect to the choice of generalized inverse.

Proof. From the expression of $\hat{U}$ in Theorem 3.1, inserting the value of $\hat{\Sigma}_u$ yields
\[
\hat{U} = (\Sigma_e + \hat{\Sigma}_u)^{-1}\hat{\Sigma}_u V_3Q_{K_1'K_2^o}R_1'Z' = \Big(I_p - m\sigma^2_e\big(V_3Q_{K_1'K_2^o}V_3'\big)^{-1}\Big)V_3Q_{K_1'K_2^o}R_1'Z'.
\]
First observe that
\begin{align*}
V_3Q_{K_1'K_2^o} &= \Big(Y - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}X\Big)R_1Q_{K_1'K_2^o} \\
&\quad - \Big(Y - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}X\Big)R_2K_2'(K_2K_2')^-K_1Q_{K_1'K_2^o},
\end{align*}
and also note that $K_1'P_1K_1 = P_{K_1'K_2^o}$. From relation (6), it follows that
\begin{align*}
K_1Q_{K_1'K_2^o} &= K_1 - K_1P_{K_1'K_2^o} = (I_k - K_1K_1'P_1)K_1 \\
&= K_2\big(K_2'(K_1K_1')^{-1}K_2\big)^-K_2'(K_1K_1')^{-1}K_1.
\end{align*}
Thus, $V_3Q_{K_1'K_2^o}$ does not depend on the choice of generalized inverse, and the uniqueness of $\hat{U}$ follows.
3.1 Prediction of small area means
The prediction of small area means is based on the prediction approach to finite populations under model-based theory. In this approach, the target population under study is considered as a random sample from a larger population characterized by a suitable model, and the predictive distribution of the values for the non-sampled units is obtained given the realized values of the sampled units (Bolfarine and Zacks, 1992). The model (3) considered in this paper is an extension of the unit level model, often known as the nested error regression model, which was originally proposed by Battese et al. (1988) for the prediction of county crop areas in the United States. Following these authors, the estimation of a population mean from the sample reduces to the prediction of the mean of the non-sampled values.
For the $k$ group units in all small areas, we consider the partition of the $N_i$ units into $N_{ig}$, $g = 1, \ldots, k$, and of the $n_i$ sampled units into $n_{ig}$, such that $N_i = \sum_{g=1}^{k} N_{ig}$ and $n_i = \sum_{g=1}^{k} n_{ig}$, and similarly for $Y_i$ such that $Y_i = [Y_{i1}, \ldots, Y_{ik}]$. Then the corresponding target small area means at each time point for each group unit are given by
\[
\hat{\mu}_{ig} = \frac{1}{N_{ig}}\Big(Y^{(s)}_{ig}1_{n_{ig}} + \hat{Y}^{(r)}_{ig}1_{N_{ig}-n_{ig}}\Big), \quad (7)
\]
where $Y^{(s)}_i = (y_{i1}, \ldots, y_{in_i}): p \times n_i$ stands for the $n_i$ sampled observations from the $i$-th small area and $\hat{Y}^{(r)}_i = (y_{i,n_i+1}, \ldots, y_{iN_i}): p \times (N_i - n_i)$ corresponds to the predicted values for the $N_i - n_i$ non-sampled units from the $i$-th small area. The first term on the right-hand side of (7) is known from the sampled observations, and the second term is the prediction of the non-sampled observations obtained using the considered model (Henderson, 1975); it is given by
\[
\frac{1}{N_{ig}}\hat{Y}^{(r)}_{ig}1_{N_{ig}-n_{ig}} = \Big(1 - \frac{n_{ig}}{N_{ig}}\Big)A\hat{\beta}_g + \frac{1}{N_{ig}}1_p1'_{N_{ig}-n_{ig}}X^{(r)\prime}_{ig}\hat{\gamma} + \frac{\sqrt{N_{ig}-n_{ig}}}{N_{ig}}\hat{u}_i, \quad (8)
\]
where $X^{(r)}_{ig}$ stands for the matrix of auxiliary information for the non-sampled units in the $i$-th area belonging to group unit $g$. Note that $\hat{\beta}_g$ is the estimator of $\beta_g$, the $g$-th column of the estimated matrix $\hat{B}$, and $\hat{u}_i$ is the $i$-th column of the predicted matrix $\hat{U}$. Throughout this paper, we are interested in the estimation of the mean-squared errors (MSE) of the predicted small area means given in relation (8) above. Therefore, we need to calculate the moments of the proposed estimators in order to derive the estimator of the MSE.
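The combination of sampled and predicted totals in (7) can be sketched numerically; the dimensions and the stand-in prediction matrix below are hypothetical:

```python
import numpy as np

# Hypothetical dimensions for one small area i and group unit g
p, N_ig, n_ig = 3, 10, 4
rng = np.random.default_rng(3)
Y_s = rng.normal(size=(p, n_ig))               # sampled observations Y^(s)_ig
Y_r_hat = rng.normal(size=(p, N_ig - n_ig))    # stand-in for the model predictions

# (7): the small area mean combines the sampled total and the predicted total
mu_hat = (Y_s @ np.ones(n_ig) + Y_r_hat @ np.ones(N_ig - n_ig)) / N_ig

# Sanity check: this equals the time-point-wise mean over all N_ig columns
assert np.allclose(mu_hat, np.hstack([Y_s, Y_r_hat]).mean(axis=1))
```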
4 Preparations for moments calculation
In this section, a definition and a lemma are given containing some technical results that are referred to later on for the moment derivations and other calculations. The following definition defines the covariance between two random matrices.
Definition 4.1. Let $X$ and $Y$ be two random matrices. The covariance between $X$ and $Y$ is defined by
\[
\mathrm{cov}(X, Y) = E[\mathrm{vec}X\,\mathrm{vec}'Y] - E[\mathrm{vec}X]E[\mathrm{vec}'Y],
\]
and the dispersion matrix of $X$ is defined by $D[X] = \mathrm{cov}(X, X)$, where $\mathrm{vec}$ is the usual column-wise vectorization operator and $\mathrm{vec}'$ denotes its transpose.
To simplify the calculations later on, we can use some identities for the matrices $K_1$, $K_2$, $R_1$ and $R_2$.

Lemma 4.1. From the definition of the matrices $K_1$, $K_2$, $R_1$ and $R_2$ given in equations (4) and (5), together with the fact that $\mathcal{C}(Z') \subseteq \mathcal{C}(C')$ and $ZZ' = I_m$, the following identities hold:
(i) $R_1'R_1Z = Z$;
(ii) $Z'ZR_1 = R_1$;
(iii) $K_1K_1' = HCZ'ZC'H'$;
(iv) $R_2'R_1 = R_2'Z' = 0$, and hence $PZ' = 0$;
(v) $HCP = 0$.
Proof of Lemma 4.1. We prove the first three identities; the others are obtained by straightforward calculations. From the orthogonal diagonalization
\[
(CC')^{-1/2}CZ'ZC'(CC')^{-1/2} = \Gamma D\Gamma' = [\Gamma_1 : \Gamma_2]\begin{bmatrix} I_m & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} \Gamma_1' \\ \Gamma_2' \end{bmatrix},
\]
it follows that
\[
\Gamma_1'(CC')^{-1/2}CZ'ZC'(CC')^{-1/2}\Gamma_1 = R_1'Z'ZR_1 = I_m.
\]
Furthermore, since $ZZ' = R_1'R_1 = I_m$, it follows that $ZZ'Z = R_1'R_1Z = Z$. Similarly, from $R_1'Z'ZR_1 = R_1'R_1$, we deduce that $Z'ZR_1 = R_1$. Moreover, since $K_1$ is of full row rank,
\[
K_1K_1' = K_1R_1'Z'ZR_1K_1' = HCR_1R_1'Z'ZR_1R_1'C'H' = HCZ'ZC'H'.
\]
Later in our calculations we will also use the following well-known result:
\[
\mathrm{vec}(ABC) = (C' \otimes A)\mathrm{vec}B,
\]
where the symbol $\otimes$ denotes the Kronecker product.
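The vec-Kronecker identity is easy to confirm numerically on arbitrary conformable matrices (toy sizes below); note that `order="F"` gives the column-wise vectorization that the identity requires:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(2, 3))
B = rng.normal(size=(3, 4))
C = rng.normal(size=(4, 5))

vec = lambda M: M.reshape(-1, order="F")   # column-wise vectorization

# vec(ABC) = (C' kron A) vec(B)
assert np.allclose(vec(A @ B @ C), np.kron(C.T, A) @ vec(B))
```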
5 Moments of proposed estimators
In this section, we present the calculations of the first and second moments of the proposed estimators. For the purpose of the moment calculations, from now on we suppose that we have complete knowledge of the covariance matrix $\Sigma_u$, or that it has been estimated previously so that it can be taken as known. In that regard, we use the following predictor of $U$:
\[
\tilde{U} = (\Sigma_e + \Sigma_u)^{-1}\Sigma_u V_3Q_{K_1'K_2^o}R_1'Z'. \quad (9)
\]
Theorem 5.1. Given the estimators in Theorem 3.1, $\hat{Y} = A\widehat{BHC} + 1_p\hat{\gamma}'X + \tilde{U}Z$ is an unbiased predictor, i.e., $E[\hat{Y}] = E[Y]$.
Proof. First we show that $\hat{\gamma}$ is an unbiased estimator. We have
\begin{align*}
E[\hat{\gamma}] &= \frac{1}{p}(XPX')^{-1}XPE[Y']1_p = \frac{1}{p}(XPX')^{-1}XP\big(C'H'B'A' + X'\gamma 1_p'\big)1_p \\
&= \frac{1}{p}(XPX')^{-1}XPX'\gamma 1_p'1_p = \gamma.
\end{align*}
Therefore, $E[1_p\hat{\gamma}'X] = 1_p\gamma'X$. Moreover, from the expression of $\widehat{BHC}$ given in Theorem 3.1, it follows that
\begin{align*}
E[A\widehat{BHC}] &= \Big(E[Y] - \frac{1}{p}1_p1_p'E[Y]PX'(XPX')^{-1}X\Big)\big(R_2K_2'(K_2K_2')^- + G_2R_1K_1'P_1\big)HC \\
&= \Big(ABHC + 1_p\gamma'X - \frac{1}{p}1_p1_p'(ABHC + 1_p\gamma'X)PX'(XPX')^{-1}X\Big) \\
&\quad \times \big(R_2K_2'(K_2K_2')^- + G_2R_1K_1'P_1\big)HC \\
&= ABHC\big(R_2K_2'(K_2K_2')^- + G_2R_1K_1'P_1\big)HC \\
&= ABK_2K_2'(K_2K_2')^-HC + ABK_1K_1'P_1HC - ABK_2K_2'(K_2K_2')^-K_1K_1'P_1HC.
\end{align*}
Inserting equation (6) into the expression of $E[A\widehat{BHC}]$ yields
\begin{align*}
E[A\widehat{BHC}] &= ABK_2K_2'(K_2K_2')^-HC + ABHC - ABK_2\big(K_2'(K_1K_1')^{-1}K_2\big)^-K_2'(K_1K_1')^{-1}HC \\
&\quad - ABK_2K_2'(K_2K_2')^-HC + ABK_2K_2'(K_2K_2')^-K_2\big(K_2'(K_1K_1')^{-1}K_2\big)^-K_2'(K_1K_1')^{-1}HC \\
&= ABHC.
\end{align*}
Hence, $A\widehat{BHC}$ is also an unbiased estimator. Finally, we have
\[
E[\tilde{U}Z] = (\Sigma_e + \Sigma_u)^{-1}\Sigma_u E[V_3]Q_{K_1'K_2^o}R_1'Z'Z = (\Sigma_e + \Sigma_u)^{-1}\Sigma_u AB\big(I_k - K_2K_2'(K_2K_2')^-\big)K_1Q_{K_1'K_2^o}R_1'Z'Z, \quad (10)
\]
where $\tilde{U}$ is given in (9), since from the expression of $V_3$ it follows that
\[
E[V_3] = ABHCG_2R_1 = AB\big(I_k - K_2K_2'(K_2K_2')^-\big)K_1.
\]
Recall that from the proof of Lemma 3.2 we have
\[
K_1Q_{K_1'K_2^o} = K_2\big(K_2'(K_1K_1')^{-1}K_2\big)^-K_2'(K_1K_1')^{-1}K_1,
\]
and plugging this expression into $E[\tilde{U}Z]$ given in (10) above leads to
\[
E[\tilde{U}Z] = (\Sigma_e + \Sigma_u)^{-1}\Sigma_u AB\big(I_k - K_2K_2'(K_2K_2')^-\big)K_2\big(K_2'(K_1K_1')^{-1}K_2\big)^-K_2'(K_1K_1')^{-1}K_1R_1'Z'Z = 0.
\]
Hence, it is straightforward to see that
\[
E[\hat{Y}] = E[A\widehat{BHC} + 1_p\hat{\gamma}'X + \tilde{U}Z] = ABHC + 1_p\gamma'X = E[Y],
\]
which completes the proof of the theorem.
The following two theorems give the main results of the paper concerning the moments of the proposed estimators. For simplicity, in the following calculations the matrix $A$ involved in $\hat{B}$ will be taken as an identity matrix, since $A$ is non-singular and it is always possible to reparameterize and transform back to $\hat{B}$.
Theorem 5.2. Given the estimators in Theorem 3.1, the dispersion matrices $D[\hat{\gamma}]$, $D[\widehat{BHC}]$ and $D[\tilde{U}]$ are given by
\begin{align*}
D[\hat{\gamma}] &= \frac{\sigma^2_e}{p}(XPX')^{-1}, \\
D[\widehat{BHC}] &= C'H'P_1HC \otimes (\Sigma_u + \Sigma_e) + C'H'\big(I_k - P_1K_1K_1'\big)(K_2K_2')^-\big(I_k - K_1K_1'P_1\big)HC \otimes \Sigma_e \\
&\quad + \frac{\sigma^2_e}{p}C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big)X'(XPX')^{-1}X\big(R_2K_2'(K_2K_2')^- + G_2R_1K_1'P_1\big)HC \otimes 1_p1_p', \\
D[\tilde{U}] &= ZR_1Q_{K_1'K_2^o}\big(I_m + K_1'(K_2K_2')^-K_1\big)Q_{K_1'K_2^o}R_1'Z' \otimes \Sigma_u(\Sigma_e + \Sigma_u)^{-1}\Sigma_u \\
&\quad + \frac{\sigma^2_e}{p}ZR_1Q_{K_1'K_2^o}R_1'G_2'X'(XPX')^{-1}XG_2R_1Q_{K_1'K_2^o}R_1'Z' \otimes \Sigma_u(\Sigma_e + \Sigma_u)^{-1}1_p1_p'(\Sigma_e + \Sigma_u)^{-1}\Sigma_u.
\end{align*}
In what follows, we use the notation $(Y)(\;)'$ instead of $(Y)(Y)'$ in order to shorten matrix expressions.
Proof. Recall that $\hat{\gamma} = \frac{1}{p}(XPX')^{-1}XPY'1_p$. Hence, it follows that
\begin{align*}
D[\hat{\gamma}] &= \frac{1}{p^2}\big(1_p' \otimes (XPX')^{-1}XP\big)D[Y']\big(\;\big)' \\
&= \frac{1}{p^2}\big(1_p' \otimes (XPX')^{-1}XP\big)\big(\Sigma_u \otimes Z'Z + \Sigma_e \otimes I_N\big)\big(\;\big)' \\
&= \frac{\sigma^2_e}{p}(XPX')^{-1}XPX'(XPX')^{-1} = \frac{\sigma^2_e}{p}(XPX')^{-1},
\end{align*}
since $PZ' = 0$. Moreover, replacing $V_3$ by its expression in Theorem 3.1, we can rewrite $\widehat{BHC}$ as
\[
\widehat{BHC} = \Big(Y - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}X\Big)\big(R_2K_2'(K_2K_2')^- + G_2R_1K_1'P_1\big)HC,
\]
and therefore the dispersion matrix $D[\widehat{BHC}]$ is given by
\begin{align*}
D[\widehat{BHC}] &= \Big(C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big) \otimes I_p \\
&\quad - \frac{1}{p}C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big)X'(XPX')^{-1}XP \otimes 1_p1_p'\Big)\big(Z'Z \otimes \Sigma_u + I_N \otimes \Sigma_e\big)\big(\;\big)'.
\end{align*}
Using the results of Lemma 4.1, we obtain
\begin{align*}
D[\widehat{BHC}] &= C'H'P_1HC \otimes (\Sigma_u + \Sigma_e) + C'H'\big(I_k - P_1K_1K_1'\big)(K_2K_2')^-\big(I_k - K_1K_1'P_1\big)HC \otimes \Sigma_e \\
&\quad + \frac{\sigma^2_e}{p}C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big)X'(XPX')^{-1}X\big(R_2K_2'(K_2K_2')^- + G_2R_1K_1'P_1\big)HC \otimes 1_p1_p'.
\end{align*}
Moreover, from $\tilde{U} = \Sigma_u(\Sigma_e + \Sigma_u)^{-1}V_3Q_{K_1'K_2^o}R_1'Z'$, replacing $V_3$ by its expression, we can rewrite
\[
\tilde{U} = \Sigma_u(\Sigma_e + \Sigma_u)^{-1}\Big(Y - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}X\Big)G_2R_1Q_{K_1'K_2^o}R_1'Z'.
\]
Given this, the dispersion matrix $D[\tilde{U}]$ has the form
\begin{align*}
D[\tilde{U}] &= \Big(ZR_1Q_{K_1'K_2^o}R_1'G_2' \otimes \Sigma_u(\Sigma_e + \Sigma_u)^{-1} \\
&\quad - \frac{1}{p}ZR_1Q_{K_1'K_2^o}R_1'G_2'X'(XPX')^{-1}XP \otimes \Sigma_u(\Sigma_e + \Sigma_u)^{-1}1_p1_p'\Big)\big(Z'Z \otimes \Sigma_u + I_N \otimes \Sigma_e\big)\big(\;\big)' \\
&= ZR_1Q_{K_1'K_2^o}\big(I_m + K_1'(K_2K_2')^-K_1\big)Q_{K_1'K_2^o}R_1'Z' \otimes \Sigma_u(\Sigma_e + \Sigma_u)^{-1}\Sigma_u \\
&\quad + \frac{\sigma^2_e}{p}ZR_1Q_{K_1'K_2^o}R_1'G_2'X'(XPX')^{-1}XG_2R_1Q_{K_1'K_2^o}R_1'Z' \otimes \Sigma_u(\Sigma_e + \Sigma_u)^{-1}1_p1_p'(\Sigma_e + \Sigma_u)^{-1}\Sigma_u.
\end{align*}
In Theorem 5.2, the dispersion matrices of the estimators and the predictor are given. However, it is also of interest to derive the covariances between them.
Theorem 5.3. Consider the estimators given in Theorem 3.1. Then the covariances $\mathrm{cov}[\tilde{U}, \hat{\gamma}]$, $\mathrm{cov}[\widehat{BHC}, \tilde{U}]$ and $\mathrm{cov}[\widehat{BHC}, \hat{\gamma}]$ are given by
\begin{align*}
\mathrm{cov}[\tilde{U}, \hat{\gamma}] &= -\frac{\sigma^2_e}{p}ZR_1Q_{K_1'K_2^o}R_1'G_2'X'(XPX')^{-1} \otimes \Sigma_u(\Sigma_e + \Sigma_u)^{-1}1_p, \\
\mathrm{cov}[\widehat{BHC}, \tilde{U}] &= \frac{\sigma^2_e}{p}C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big)X'(XPX')^{-1}XG_2R_1Q_{K_1'K_2^o}R_1'Z' \otimes 1_p1_p'(\Sigma_e + \Sigma_u)^{-1}\Sigma_u, \\
\mathrm{cov}[\widehat{BHC}, \hat{\gamma}] &= -\frac{\sigma^2_e}{p}C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big)X'(XPX')^{-1} \otimes 1_p.
\end{align*}

Proof. Using the results of Lemma 4.1, we get
\begin{align*}
\mathrm{cov}[\tilde{U}, \hat{\gamma}] &= \Big(ZR_1Q_{K_1'K_2^o}R_1'G_2' \otimes \Sigma_u(\Sigma_e + \Sigma_u)^{-1} \\
&\quad - \frac{1}{p}ZR_1Q_{K_1'K_2^o}R_1'G_2'X'(XPX')^{-1}XP \otimes \Sigma_u(\Sigma_e + \Sigma_u)^{-1}1_p1_p'\Big) \\
&\quad \times \big(Z'Z \otimes \Sigma_u + I_N \otimes \Sigma_e\big)\Big(\frac{1}{p}PX'(XPX')^{-1} \otimes 1_p\Big) \\
&= -\frac{\sigma^2_e}{p}ZR_1Q_{K_1'K_2^o}R_1'G_2'X'(XPX')^{-1} \otimes \Sigma_u(\Sigma_e + \Sigma_u)^{-1}1_p,
\end{align*}
and
\begin{align*}
\mathrm{cov}[\widehat{BHC}, \tilde{U}] &= \Big(C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big) \otimes I_p \\
&\quad - \frac{1}{p}C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big)X'(XPX')^{-1}XP \otimes 1_p1_p'\Big)\big(Z'Z \otimes \Sigma_u + I_N \otimes \Sigma_e\big) \\
&\quad \times \Big(G_2R_1Q_{K_1'K_2^o}R_1'Z' \otimes (\Sigma_e + \Sigma_u)^{-1}\Sigma_u - \frac{1}{p}PX'(XPX')^{-1}XG_2R_1Q_{K_1'K_2^o}R_1'Z' \otimes 1_p1_p'(\Sigma_e + \Sigma_u)^{-1}\Sigma_u\Big) \\
&= \frac{\sigma^2_e}{p}C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big)X'(XPX')^{-1}XG_2R_1Q_{K_1'K_2^o}R_1'Z' \otimes 1_p1_p'(\Sigma_e + \Sigma_u)^{-1}\Sigma_u.
\end{align*}
Similarly, we obtain the last covariance as
\begin{align*}
\mathrm{cov}[\widehat{BHC}, \hat{\gamma}] &= \Big(C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big) \otimes I_p \\
&\quad - \frac{1}{p}C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big)X'(XPX')^{-1}XP \otimes 1_p1_p'\Big) \\
&\quad \times \big(Z'Z \otimes \Sigma_u + I_N \otimes \Sigma_e\big)\Big(\frac{1}{p}PX'(XPX')^{-1} \otimes 1_p\Big) \\
&= -\frac{\sigma^2_e}{p}C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big)X'(XPX')^{-1} \otimes 1_p.
\end{align*}
6 Mean-squared errors of predicted small area means

6.1 Derivation of MSE($\hat{\tau}_{ig}$)
In this section, following Butar and Lahiri (2003), we estimate the mean-squared error of the predicted small area means in two steps. In the first step, the MSE is derived under the assumption of known covariance matrices $\Sigma_e$ and $\Sigma_u$. In the second step, a parametric bootstrap approach is proposed for bias correction and for approximating the uncertainty due to the estimation of $\Sigma_u$. Put
\[
K = \Big(1 - \frac{n_{ig}}{N_{ig}}\Big)A, \quad L = \frac{1}{N_{ig}}1_p1'_{N_{ig}-n_{ig}}X^{(r)\prime}_{ig}, \quad M = \frac{\sqrt{N_{ig}-n_{ig}}}{N_{ig}}I_p, \quad (11)
\]
for $i = 1, \ldots, m$, $g = 1, \ldots, k$.
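The matrices in (11) can be assembled directly; the sizes and the auxiliary matrix below are hypothetical stand-ins chosen only to make the dimensions concrete:

```python
import numpy as np

# Hypothetical sizes for one area/group cell
p, r, N_ig, n_ig = 3, 2, 10, 4
t = np.arange(1.0, p + 1)
A = np.vander(t, p, increasing=True)                          # within-individual design, p = q
X_r = np.random.default_rng(5).normal(size=(r, N_ig - n_ig))  # auxiliaries of non-sampled units

K = (1.0 - n_ig / N_ig) * A
L = np.ones((p, 1)) @ np.ones((1, N_ig - n_ig)) @ X_r.T / N_ig
M = np.sqrt(N_ig - n_ig) / N_ig * np.eye(p)

# K, L and M map beta_g (p-vector), gamma (r-vector) and u_i (p-vector) to a p-vector
assert K.shape == (p, p) and L.shape == (p, r) and M.shape == (p, p)
```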
Then the linear prediction and empirical linear prediction quantities for the small area means given in (8) can be written as
\begin{align*}
\tilde{\tau}_{ig} &= K\hat{\beta}_g + L\hat{\gamma} + M\tilde{u}_i, \\
\hat{\tau}_{ig} &= K\hat{\beta}_g + L\hat{\gamma} + M\hat{u}_i, \quad i = 1, \ldots, m, \; g = 1, \ldots, k. \quad (12)
\end{align*}
Let $\nu_g$ and $\delta_i$ be a $k$-vector and an $m$-vector of zeros with one in the $g$-th and $i$-th position, respectively. Choose $\alpha_g = C'H'(HCC'H')^{-1}\nu_g$, so that $\hat{\beta}_g = \widehat{BHC}\alpha_g$, $\tilde{u}_i = \tilde{U}\delta_i$ and $\hat{u}_i = \hat{U}\delta_i$.
Furthermore, let us write $\hat{\tau}_{ig} - \tau_{ig} = (\hat{\tau}_{ig} - \tilde{\tau}_{ig}) + (\tilde{\tau}_{ig} - \tau_{ig})$. The MSE of $\hat{\tau}_{ig}$ can then be obtained as
\[
\mathrm{MSE}(\hat{\tau}_{ig}) = \mathrm{MSE}(\tilde{\tau}_{ig}) + 2E\big[(\hat{\tau}_{ig} - \tilde{\tau}_{ig})(\tilde{\tau}_{ig} - \tau_{ig})'\big] + E\big[(\hat{\tau}_{ig} - \tilde{\tau}_{ig})(\hat{\tau}_{ig} - \tilde{\tau}_{ig})'\big]. \quad (13)
\]
The first term on the right-hand side of (13) has the form
\begin{align*}
\mathrm{MSE}(\tilde{\tau}_{ig}) &= E\big[(\tilde{\tau}_{ig} - \tau_{ig})(\tilde{\tau}_{ig} - \tau_{ig})'\big] \\
&= E\big[\big(K(\hat{\beta}_g - \beta_g) + L(\hat{\gamma} - \gamma) + M(\tilde{u}_i - u_i)\big)\big(\;\big)'\big] \\
&= KD[\hat{\beta}_g]K' + K\mathrm{cov}[\hat{\beta}_g, \hat{\gamma}]L' + K\mathrm{cov}[\hat{\beta}_g, \tilde{u}_i]M' + LD[\hat{\gamma}]L' \\
&\quad + L\mathrm{cov}[\hat{\gamma}, \tilde{u}_i]M' + M\mathrm{cov}[\tilde{u}_i, \hat{\beta}_g]K' + L\mathrm{cov}[\hat{\gamma}, \hat{\beta}_g]K' \\
&\quad + M\mathrm{cov}[\tilde{u}_i, \hat{\gamma}]L' + MD[\tilde{u}_i - u_i]M'. \quad (14)
\end{align*}
Observe that, from the definitions $\hat{\beta}_g = \widehat{BHC}\alpha_g$, $\tilde{u}_i = \tilde{U}\delta_i$ and $\hat{u}_i = \hat{U}\delta_i$, and taking into account that $A^{-1}$ was omitted when calculating the second moments involving $\widehat{BHC}$, the covariances appearing in equation (14) are expressed as
\begin{align*}
\mathrm{cov}[\hat{\beta}_g, \tilde{u}_i] &= \mathrm{cov}[\widehat{BHC}\alpha_g, \tilde{U}\delta_i] = (\alpha_g' \otimes A^{-1})\mathrm{cov}[\widehat{BHC}, \tilde{U}](\delta_i \otimes I_p), \\
\mathrm{cov}[\tilde{u}_i, \hat{\gamma}] &= \mathrm{cov}[\tilde{U}\delta_i, \hat{\gamma}] = (\delta_i' \otimes I_p)\mathrm{cov}[\tilde{U}, \hat{\gamma}], \\
\mathrm{cov}[\hat{\beta}_g, \hat{\gamma}] &= \mathrm{cov}[\widehat{BHC}\alpha_g, \hat{\gamma}] = (\alpha_g' \otimes A^{-1})\mathrm{cov}[\widehat{BHC}, \hat{\gamma}].
\end{align*}
Similarly, the dispersion matrices appearing in equation (14) are expressed as
\begin{align*}
D[\hat{\beta}_g] &= D[\widehat{BHC}\alpha_g] = (\alpha_g' \otimes A^{-1})D[\widehat{BHC}](\alpha_g \otimes A'^{-1}), \\
D[\tilde{u}_i] &= D[\tilde{U}\delta_i] = (\delta_i' \otimes I_p)D[\tilde{U}](\delta_i \otimes I_p),
\end{align*}
and
\begin{align*}
D[\tilde{u}_i - u_i] &= (\delta_i' \otimes I_p)D[\tilde{U} - U](\delta_i \otimes I_p) \\
&= (\delta_i' \otimes I_p)\big(D[\tilde{U}] - 2\mathrm{cov}(\tilde{U}, U) + D[U]\big)(\delta_i \otimes I_p) \\
&= (\delta_i' \otimes I_p)\big(D[U] - D[\tilde{U}]\big)(\delta_i \otimes I_p).
\end{align*}
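The selection pattern $(\delta_i' \otimes I_p)D(\delta_i \otimes I_p)$ used above simply extracts the $p \times p$ diagonal block belonging to the $i$-th column of the vectorized matrix; a toy check (hypothetical dimensions, random symmetric stand-in for the dispersion matrix):

```python
import numpy as np

p, m, i = 2, 3, 1                  # toy dimensions; pick out area i = 1
rng = np.random.default_rng(6)
S = rng.normal(size=(p * m, p * m))
D = S @ S.T                        # symmetric stand-in for the pm x pm dispersion matrix

delta_i = np.zeros((m, 1))
delta_i[i] = 1.0                   # m-vector with a one in position i

# (delta_i' kron I_p) D (delta_i kron I_p) extracts the p x p block of column i
block = np.kron(delta_i.T, np.eye(p)) @ D @ np.kron(delta_i, np.eye(p))
assert np.allclose(block, D[i * p:(i + 1) * p, i * p:(i + 1) * p])
```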
Altogether, we can rewrite $\mathrm{MSE}(\tilde{\tau}_{ig})$ as
\begin{align*}
\mathrm{MSE}(\tilde{\tau}_{ig}) &= K(\alpha_g' \otimes A^{-1})D[\widehat{BHC}](\alpha_g \otimes A'^{-1})K' + K(\alpha_g' \otimes A^{-1})\mathrm{cov}[\widehat{BHC}, \hat{\gamma}]L' \\
&\quad + K(\alpha_g' \otimes A^{-1})\mathrm{cov}[\widehat{BHC}, \tilde{U}](\delta_i \otimes I_p)M' + LD[\hat{\gamma}]L' \\
&\quad + L\mathrm{cov}[\tilde{U}, \hat{\gamma}]'(\delta_i \otimes I_p)M' + M(\delta_i' \otimes I_p)\mathrm{cov}[\widehat{BHC}, \tilde{U}]'(\alpha_g \otimes A'^{-1})K' \\
&\quad + L\mathrm{cov}[\widehat{BHC}, \hat{\gamma}]'(\alpha_g \otimes A'^{-1})K' + M(\delta_i' \otimes I_p)\mathrm{cov}[\tilde{U}, \hat{\gamma}]L' \\
&\quad + M\Sigma_uM' - M(\delta_i' \otimes I_p)D[\tilde{U}](\delta_i \otimes I_p)M', \quad (15)
\end{align*}
where the dispersion matrices $D[\hat{\gamma}]$, $D[\widehat{BHC}]$, $D[\tilde{U}]$ and the covariance matrices $\mathrm{cov}[\tilde{U}, \hat{\gamma}]$, $\mathrm{cov}[\widehat{BHC}, \tilde{U}]$, $\mathrm{cov}[\widehat{BHC}, \hat{\gamma}]$ are presented in Theorem 5.2 and Theorem 5.3, respectively.
The last two terms on the right-hand side of equation (13) are intractable and need to be approximated. It is important to note that, in practice, the covariance matrix $\Sigma_u$ is unknown. As pointed out by several authors (see, for example, Das et al. (2004)), a naive estimator of the MSE obtained by replacing the unknown covariance matrix $\Sigma_u$ in (15) by its estimator $\hat{\Sigma}_u$, thereby ignoring the variability associated with $\hat{\Sigma}_u$, can lead to underestimation of the true MSE. Therefore, in the next section, we propose a parametric bootstrap method to estimate the MSE when $\Sigma_u$ is replaced by its estimator.
6.2 Estimation of MSE($\hat{\tau}_{ig}$)
The approximate estimator of $\mathrm{MSE}(\hat{\tau}_{ig})$ given in equation (13) can be decomposed as
\[
\mathrm{MSE}(\hat{\tau}_{ig}) = G_{1i}(\Sigma_u) + G_{2i}(\Sigma_u) + G_{3i}(\Sigma_u), \quad (16)
\]
where
\begin{align*}
G_{1i}(\Sigma_u) + G_{2i}(\Sigma_u) &= \mathrm{MSE}(\tilde{\tau}_{ig}), \\
G_{3i}(\Sigma_u) &= 2E\big[(\hat{\tau}_{ig} - \tilde{\tau}_{ig})(\tilde{\tau}_{ig} - \tau_{ig})'\big] + E\big[(\hat{\tau}_{ig} - \tilde{\tau}_{ig})(\hat{\tau}_{ig} - \tilde{\tau}_{ig})'\big].
\end{align*}
For known $\Sigma_u$, the quantity $G_{1i}(\Sigma_u) + G_{2i}(\Sigma_u)$ is given by equation (15). When $\Sigma_u$ is replaced by its estimator, the quantity $G_{1i}(\hat{\Sigma}_u) + G_{2i}(\hat{\Sigma}_u)$ introduces an additional bias related to $\hat{\Sigma}_u$, i.e., $E[\hat{\Sigma}_u] - \Sigma_u$ (see Datta and Lahiri (2000)). Following Butar and Lahiri (2003); Kubokawa and Nagashima (2012), we propose a parametric bootstrap method, first to estimate the first two terms of (16) by correcting the bias of $G_{1i}(\hat{\Sigma}_u) + G_{2i}(\hat{\Sigma}_u)$ that arises when $\Sigma_u$ is replaced by its estimator $\hat{\Sigma}_u$ in equation (15), and secondly to estimate the third term $G_{3i}(\Sigma_u)$ of equation (16).
Consider the bootstrap model
\[
Y^*_i \mid u^*_i \sim N_{p,n_i}\big(A\hat{B}C_i + 1_p\hat{\gamma}'X_i + u^*_iz'_i, \Sigma_e, I_{n_i}\big), \quad (17)
\]
where $u^*_i \sim N_p(0, \hat{\Sigma}_u)$. If we put
\begin{align*}
\hat{\tau}_{ig}(Y_i; \hat{\beta}_g, \hat{\gamma}, \hat{\Sigma}_u) &= K\hat{\beta}_g + L\hat{\gamma} + M\tilde{u}_i, \quad (18) \\
\hat{\tau}_{ig}(Y_i; \hat{\beta}^*_g, \hat{\gamma}^*, \hat{\Sigma}_u) &= K\hat{\beta}^*_g + L\hat{\gamma}^* + M\tilde{u}^*_i, \quad (19) \\
\hat{\tau}_{ig}(Y_i; \hat{\beta}^*_g, \hat{\gamma}^*, \hat{\Sigma}^*_u) &= K\hat{\beta}^*_g + L\hat{\gamma}^* + M\hat{u}^*_i, \quad i = 1, \ldots, m, \; g = 1, \ldots, k, \quad (20)
\end{align*}
then the quantity G1i(Σu) + G2i(Σu) can be estimated by
G1i( bΣu) + G2i( bΣu) − E∗G1i( bΣ ∗ u) + G2i( bΣ ∗ u) − G1i( bΣu) − G2i( bΣu)
and the quantity G3i(Σu) estimated by
2E∗ h b τig(Yi; bβ ∗ g,bγ ∗ , bΣ∗u) −τbig(Y∗i; bβ ∗ g,γb ∗ , bΣu) b τig(Yi; bβ ∗ g,bγ ∗ , bΣu) −τbig(Yi; bβg,bγ, bΣu) 0i + E∗ h b τig(Yi; bβ ∗ g,γb ∗ , bΣ∗u) −τbig(Yi; bβ ∗ g,γb ∗ , bΣu) 0i ,
where E∗ is the expectation with respect to model (17), and the calculation ofγb
∗
, bβ∗g, bΣ∗u is performed similarly to that ofγ, bb βg, bΣu except thatγb
∗
, bβ∗g, bΣ∗u are calculated based on Y∗i instead of Yi.
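The bootstrap bookkeeping described above can be sketched in a much simplified scalar analogue; the toy model, the moment estimators and all names below are illustrative assumptions, not the paper's estimators, and only the structure (refit on bootstrap data, compare predictions under $\hat{\Sigma}_u$ and $\hat{\Sigma}^*_u$) carries over:

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy scalar analogue: y_i = mu + u_i + e_i, u_i ~ N(0, s2u), e_i ~ N(0, s2e), s2e known
m, s2e = 100, 1.0

def fit(y):
    """Moment estimators of mu and s2u (truncated at zero); s2e is treated as known."""
    return y.mean(), max(y.var(ddof=1) - s2e, 0.0)

def predict_u(y, mu, s2u):
    """BLUP-type predictor of the random effects for given variance components."""
    return s2u / (s2u + s2e) * (y - mu)

y = 2.0 + rng.normal(size=m) + rng.normal(size=m)   # data generated with mu = 2, s2u = 1
mu_hat, s2u_hat = fit(y)

B, G3_hat = 500, 0.0
for _ in range(B):
    # Draw bootstrap data from the fitted model (the analogue of model (17))
    y_star = (mu_hat + rng.normal(scale=np.sqrt(s2u_hat), size=m)
                     + rng.normal(scale=np.sqrt(s2e), size=m))
    mu_s, s2u_s = fit(y_star)   # re-estimated parameters based on the bootstrap data
    # Extra variability from plugging in the re-estimated variance component
    d = predict_u(y_star, mu_s, s2u_s) - predict_u(y_star, mu_s, s2u_hat)
    G3_hat += np.mean(d ** 2) / B

assert G3_hat >= 0.0   # bootstrap estimate of a G3-type correction term
```

In the paper's multivariate setting, `fit` and `predict_u` would be replaced by the estimators of Theorem 3.1 and the predictor (9), and the averages by the bootstrap expectations $E_*$.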
Thus, when the unknown covariance $\Sigma_u$ is replaced by its estimator in $\mathrm{MSE}(\hat{\tau}_{ig})$ of equation (16), we obtain the proposed estimator of the MSE, which is summarized in the following theorem.
Theorem 6.1. Consider the model (3) and assume that the matrices $X$ and $K_2^{o\prime}K_1$ are of full rank. In addition, let $p = q$, so that the matrix $A$ is non-singular. Consider also the bootstrap model (17) and the estimated quantities (18), (19) and (20). Then the estimator of the MSE of the predicted small area means given in (12) can be expressed as
\begin{align*}
\widehat{\mathrm{MSE}}(\hat{\tau}_{ig}) &= 2\big[G_{1i}(\hat{\Sigma}_u) + G_{2i}(\hat{\Sigma}_u)\big] - E_*\big[G_{1i}(\hat{\Sigma}^*_u) + G_{2i}(\hat{\Sigma}^*_u)\big] \\
&\quad + 2E_*\Big[\big(\hat{\tau}_{ig}(Y_i; \hat{\beta}^*_g, \hat{\gamma}^*, \hat{\Sigma}^*_u) - \hat{\tau}_{ig}(Y^*_i; \hat{\beta}^*_g, \hat{\gamma}^*, \hat{\Sigma}_u)\big) \\
&\qquad \times \big(\hat{\tau}_{ig}(Y_i; \hat{\beta}^*_g, \hat{\gamma}^*, \hat{\Sigma}_u) - \hat{\tau}_{ig}(Y_i; \hat{\beta}_g, \hat{\gamma}, \hat{\Sigma}_u)\big)'\Big] \\
&\quad + E_*\Big[\big(\hat{\tau}_{ig}(Y_i; \hat{\beta}^*_g, \hat{\gamma}^*, \hat{\Sigma}^*_u) - \hat{\tau}_{ig}(Y_i; \hat{\beta}^*_g, \hat{\gamma}^*, \hat{\Sigma}_u)\big)\big(\;\big)'\Big], \quad (21)
\end{align*}
where $G_{1i}(\hat{\Sigma}_u) + G_{2i}(\hat{\Sigma}_u)$ and $G_{1i}(\hat{\Sigma}^*_u) + G_{2i}(\hat{\Sigma}^*_u)$ are given by equation (15) with $\Sigma_u$ replaced by $\hat{\Sigma}_u$ and $\hat{\Sigma}^*_u$, respectively. In addition, the dispersion matrices $D[\hat{\gamma}]$, $D[\widehat{BHC}]$, $D[\tilde{U}]$ and the covariance matrices $\mathrm{cov}[\tilde{U}, \hat{\gamma}]$, $\mathrm{cov}[\widehat{BHC}, \tilde{U}]$, $\mathrm{cov}[\widehat{BHC}, \hat{\gamma}]$ involved in equation (15) are presented in Theorem 5.2 and Theorem 5.3, respectively.
Appendix
Proof of Theorem 3.1. From the results of Theorem 4.1 in Ngaruye et al. (2016), for $A$, $K_2^{o\prime}K_1$ and $X$ of full rank, the parameters $\gamma$ and $B$ are uniquely estimated. Furthermore,
\begin{align*}
\hat{T}_1K_2^{o\prime} &= A^{-1}(A'S^{-1}A)^{-1}A'S^{-1}V_3K_1'K_2^o(K_2^{o\prime}K_1K_1'K_2^o)^{-1}K_2^{o\prime} \\
&= A^{-1}V_3K_1'K_2^o(K_2^{o\prime}K_1K_1'K_2^o)^{-1}K_2^{o\prime}, \\
A\hat{T}_1K_2^{o\prime}K_1 &= V_3P_{K_1'K_2^o}.
\end{align*}
Therefore, we can deduce the expressions of $\hat{B}$ and $\hat{\Sigma}_u$. Moreover, using the results of Lemma 4.1, straightforward calculations show that
\[
\big(Y - A\widehat{BHC} - 1_p\hat{\gamma}'X\big)Z' = V_3Q_{K_1'K_2^o}R_1'Z'.
\]
Hence, the predictor of the random effects matrix $U$ is given by
\[
\hat{U} = \big(\Sigma_e\hat{\Sigma}_u^{-1} + I_p\big)^{-1}\big(Y - A\widehat{BHC} - 1_p\hat{\gamma}'X\big)Z' = (\Sigma_e + \hat{\Sigma}_u)^{-1}\hat{\Sigma}_u V_3Q_{K_1'K_2^o}R_1'Z'.
\]
Furthermore, we observe that the matrix $V_3$, given by
\begin{align*}
V_3 &= YR_1 - YR_2K_2'(K_2K_2')^-K_1 - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}XR_1 \\
&\quad + \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}XR_2K_2'(K_2K_2')^-K_1
\end{align*}
in Theorem 4.1 of Ngaruye et al. (2016), can be simplified to
\[
V_3 = \Big(Y - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}X\Big)G_2R_1,
\]
where $G_2 = I - R_2K_2'(K_2K_2')^-HC$. Thus, replacing the estimators by their respective expressions yields the desired results.
References
Battese, G. E., Harter, R. M., and Fuller, W. A. (1988). An error-components model for prediction of county crop areas using survey and satellite data. Journal of the American Statistical Association, 83(401):28–36.
Bolfarine, H. and Zacks, S. (1992). Prediction theory for finite populations. Springer.
Butar, F. B. and Lahiri, P. (2003). On measures of uncertainty of empirical Bayes small-area estimators. Journal of Statistical Planning and Inference, 112(1):63–76.
Das, K., Jiang, J., and Rao, J. (2004). Mean squared error of empirical predictor. The Annals of Statistics, 32(2):818–840.
Datta, G. S. and Lahiri, P. (2000). A unified measure of uncertainty of estimated best linear unbiased predictors in small area estimation problems. Statistica Sinica, pages 613–627.
Datta, G. S., Lahiri, P., Maiti, T., and Lu, K. L. (1999). Hierarchical Bayes estimation of unemployment rates for the states of the US. Journal of the American Statistical Association, 94(448):1074–1082.
Henderson, C. R. (1973). Sire evaluation and genetic trends. Journal of Animal Science, 1973(Symposium):10–41.
Henderson, C. R. (1975). Best linear unbiased estimation and prediction under a selection model. Biometrics, 31(2):423–447.
Kackar, R. N. and Harville, D. A. (1984). Approximations for standard errors of estimators of fixed and random effects in mixed linear models. Journal of the American Statistical Association, 79(388):853–862.
Kollo, T. and von Rosen, D. (2005). Advanced Multivariate Statistics with Matrices. Springer, Dordrecht.
Kubokawa, T. and Nagashima, B. (2012). Parametric bootstrap methods for bias correction in linear mixed models. Journal of Multivariate Analysis, 106:1–16.
Ngaruye, I., Nzabanita, J., von Rosen, D., and Singull, M. (2016). Small area estimation under a multivariate linear model for repeated measures data. Communications in Statistics - Theory and Methods. http://dx.doi.org/10.1080/03610926.2016.1248784.
Nummi, T. (1997). Estimation in a random effects growth curve model. Journal of Applied Statistics, 24(2):157–168.
Pan, J.-X. and Fang, K.-T. (2002). Growth Curve Models and Statistical Diagnostics. Springer Science & Business Media.
Pfeffermann, D. (2002). Small area estimation - new developments and directions. International Statistical Review, 70(1):125–143.
Pfeffermann, D. (2013). New important developments in small area estimation. Statistical Science, 28(1):40–68.
Prasad, N. and Rao, J. (1990). The estimation of the mean squared error of small-area estimators. Journal of the American Statistical Association, 85(409):163–171.
Rao, J. N. K. (2003). Small Area Estimation. John Wiley and Sons, New York.
Rao, J. N. K. and Molina, I. (2015). Small Area Estimation. John Wiley and Sons, New York, 2 edition.
Yokoyama, T. (1995). Statistical inference on some mixed MANOVA-GMANOVA models with random effects. Hiroshima Mathematical Journal, 25(3):441–474.
Yokoyama, T. and Fujikoshi, Y. (1992). Tests for random-effects covariance structures in the growth curve model with covariates. Hiroshima Mathematical Journal, 22:195–202.