Mean-Squared errors of small area
estimators under a multivariate linear
model for repeated measures data
Innocent Ngaruye1,2, Dietrich von Rosen1,3 and Martin Singull1
LiTH-MAT-R--2017/05--SE
1Department of Mathematics, Linköping University,
SE–581 83 Linköping, Sweden
E-mail: {innocent.ngaruye, martin.singull}@liu.se
2Department of Mathematics,
College of Science and Technology, University of Rwanda, P.O. Box 3900 Kigali, Rwanda
E-mail: i.ngaruye@ur.ac.rw
3Department of Energy and Technology,
Swedish University of Agricultural Sciences, SE–750 07 Uppsala, Sweden
E-mail: dietrich.von.rosen@slu.se
Abstract
In this paper, we derive the first and second moments of previously proposed small area estimators under a multivariate linear model for repeated measures data. The aim is to use these moments to estimate the mean-squared errors (MSE) of the predicted small area means as a measure of precision. A two-stage estimator of the MSE is obtained. At the first stage, we derive the MSE when the covariance matrices are known. At the second stage, to obtain an unbiased estimator of the MSE, a method based on the parametric bootstrap is proposed for bias correction and for a prediction error that reflects the uncertainty introduced when the unknown covariance is replaced by a suitable estimator.
Keywords: Mean-squared errors, Multivariate linear model, Repeated measures data, Small area estimation
1 Introduction
Reliable information about various population characteristics of interest is needed by policy and decision makers for planning. There is therefore a great need to estimate these characteristics via survey sampling, not only for the total target population, but also for local sub-population units (domains). However, most sampling surveys are designed to target much larger populations, and direct survey estimators computed only from the data of a small target domain have been found to lack precision due to the small sample size within that domain. The development of estimation techniques that provide reliable estimates, with standard errors, for such small domains or small areas has been a major concern in recent years. These techniques are commonly known as Small Area Estimation (SAE) methods. For comprehensive reviews of SAE, one may refer to Rao (2003); Rao and Molina (2015); Pfeffermann (2002, 2013).
Longitudinal surveys with repeated measures data over time are designed to study patterns of change and trends over time. The demand for SAE statistics concerns not only cross-sectional data, but also repeated measures data. Ngaruye et al. (2016) have proposed a multivariate linear model for repeated measures data in a small area estimation setting which accounts for grouped response units and random effects variation. One way to assess the precision of model-based estimators is through their mean-squared error (MSE). There is an extensive literature on approaches to estimating the mean-squared error of model-based small area estimators. The second-order approximation of an asymptotically unbiased mean-squared error based on Taylor series expansion has been considered by various authors, such as Kackar and Harville (1984); Prasad and Rao (1990); Das et al. (2004); Datta et al. (1999), among others. However, as pointed out by Kubokawa and Nagashima (2012), the Taylor series expansion is sometimes difficult to implement for complicated models with many unknown parameters, since it requires the computation of the asymptotic bias and the asymptotic variances and covariances of the estimators of the unknown parameters.
In this paper, we aim to derive the first and second moments of the proposed estimators of the unknown parameters in the model considered by Ngaruye et al. (2016), and we use these moments to derive the mean-squared errors (MSE) of the predicted small area means. Further, following Butar and Lahiri (2003); Kubokawa and Nagashima (2012), we propose an unbiased estimator of the MSE based on a parametric bootstrap method.
The article is organized as follows: In Section 2, the considered model is reviewed. In Section 3, the approach used for estimation and prediction is presented. In Section 4, some preliminary results that are referred to in the subsequent sections are provided. The first and second moments of the proposed estimators are presented in Section 5, and in Section 6 an unbiased estimator of the MSE of the predicted small area means under a multivariate linear model for repeated measures data is derived.
2 Description of the model
Consider the multivariate linear regression model for repeated measurements defined in Ngaruye et al. (2016). Assume that a $p$-vector of measurements over time, together with an $r$-vector of auxiliary variables, is available for every unit of a finite population of size $N$ that is divided into $m$ non-overlapping small areas of sizes $N_i$, $i = 1, \ldots, m$. Suppose also that the target population is composed of $k$ group units, and denote by $N_{ig}$ the population size of the $g$-th group unit, $g = 1, \ldots, k$, so that $\sum_{g=1}^{k} N_{ig} = N_i$. Assume the mean growth of the $j$-th unit, $j = 1, \ldots, N_i$, in area $i$ for each group to be a polynomial in time of degree $q-1$. Then the unit level regression model for the $j$-th unit from small area $i$ at time $t$, which applies to each of the $k$ group units, can be expressed as
\[
y_{ijt} = \beta_0 + \beta_1 t + \cdots + \beta_{q-1} t^{q-1} + \gamma' x_{ij} + u_{it} + e_{ijt},
\quad i = 1, \ldots, m; \; j = 1, \ldots, N_i; \; t = t_1, \ldots, t_p,
\]
where the random errors $e_{ijt}$ and the random effects $u_{it}$ are independent and assumed to be i.i.d. normal with mean zero and variances $\sigma^2_e$ and $\sigma^2_{u_t}$, respectively. The vector $\gamma$ collects the fixed regression coefficients representing the effects of the auxiliary variables, and $\beta_0, \ldots, \beta_{q-1}$ are unknown parameters. For all time points, the model can be written in matrix form as
\[
y_{ij} = A\beta + 1_p \gamma' x_{ij} + u_i + e_{ij}, \quad i = 1, \ldots, m; \; j = 1, \ldots, N_i,
\]
where $1_p$ is a $p$-vector of ones and $u_i$ is assumed to be multivariate normally distributed with zero mean and unknown positive definite covariance matrix $\Sigma_u$.
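As a minimal illustration, the unit level model above can be simulated directly; the dimensions, time points and parameter values below are hypothetical choices for a sketch, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
p = q = 4                     # time points and polynomial terms (the case p = q)
r = 2                         # number of auxiliary variables
t = np.arange(1.0, p + 1)     # hypothetical observation times t_1, ..., t_p
A = np.vander(t, q, increasing=True)     # within-individual design: columns 1, t, ..., t^(q-1)

beta = np.array([1.0, 0.5, -0.1, 0.01])  # hypothetical beta_0, ..., beta_{q-1}
gamma = np.array([0.3, -0.2])            # hypothetical fixed effects of the auxiliaries
Sigma_u = 0.5 * np.eye(p)                # random effects covariance
sigma_e = 0.5                            # error standard deviation

x_ij = rng.normal(size=r)                            # auxiliary variables for one unit
u_i = rng.multivariate_normal(np.zeros(p), Sigma_u)  # area random effect
e_ij = rng.normal(scale=sigma_e, size=p)             # measurement errors

# y_ij = A beta + 1_p gamma' x_ij + u_i + e_ij
y_ij = A @ beta + np.ones(p) * (gamma @ x_ij) + u_i + e_ij
assert y_ij.shape == (p,)
```

The Vandermonde matrix is one concrete choice of the polynomial within-individual design $A$; any full column rank $p \times q$ design plays the same role.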
The associated multivariate linear regression model for all units from the $i$-th small area belonging to the $g$-th group unit can be expressed as
\[
Y_{ig} = A\beta_g 1'_{N_{ig}} + 1_p\gamma' X_{ig} + u_i z'_{ig} + E_{ig}, \quad i = 1, \ldots, m, \; g = 1, \ldots, k, \quad (1)
\]
and the model at small area level for all $k$ group units together, belonging to the $i$-th small area, can be expressed as
\begin{align*}
Y_i &= ABC_i + 1_p\gamma' X_i + u_i z'_i + E_i, \\
u_i &\sim N_p(0, \Sigma_u), \quad (2) \\
E_i &\sim N_{p,N_i}(0, \Sigma_e, I_{N_i}),
\end{align*}
where $z_i = \frac{1}{\sqrt{N_i}} 1_{N_i}$. The corresponding model combining all $m$ disjoint small areas and all $N$ units divided into $k$ non-overlapping group units is given by
\begin{align*}
Y &= ABHC + 1_p\gamma' X + UZ + E, \\
E &\sim N_{p,N}(0, \Sigma_e, I_N), \quad (3)
\end{align*}
where the matrices $Z$, $H$, $C$ are of full row rank and the matrix $A$ is of full column rank. In addition, $\mathcal{C}(Z') \subseteq \mathcal{C}(C')$ and $ZZ' = I_m$, where $\mathcal{C}(A)$ denotes the column vector space generated by the columns of the matrix $A$.

In model (3), $Y: p \times N$ is the data matrix, $A: p \times q$, $q \leq p$, is the within-individual design matrix indicating the time dependency within individuals, $B: q \times k$ is an unknown parameter matrix, and $C: mk \times N$, with $\mathrm{rank}(C) + p \leq N$ and $p \leq m$, is the between-individual design matrix accounting for group effects. The matrix $U: p \times m$ is a matrix of random effects whose columns are assumed to be independently distributed as multivariate normal with mean zero and positive definite dispersion matrix $\Sigma_u$, i.e., $U \sim N_{p,m}(0, \Sigma_u, I_m)$. Further, $Z: m \times N$ is the design matrix for the random effects, and the columns of the error matrix $E$ are assumed to be independently distributed as $p$-variate normal with mean zero and known covariance matrix $\Sigma_e$, i.e., $E \sim N_{p,N}(0, \Sigma_e, I_N)$. The matrix $H: k \times mk$ captures all $k$ group units by stacking the $m$ data matrices of model (2) together as column blocks in a new matrix, and is included in the model for technical estimation purposes. More details about the model formulation can be found in Ngaruye et al. (2016).
3 Estimation and prediction
Model (3) is considered as a random effects growth curve model with covariates. For a comprehensive review of different treatments of the random effects growth curve model, see, e.g., Yokoyama and Fujikoshi (1992); Yokoyama (1995); Nummi (1997); Pan and Fang (2002). The estimation and prediction are performed with a likelihood based approach. In what follows, $A^o$ stands for any matrix of full rank spanning $\mathcal{C}(A)^\perp$, i.e., $\mathcal{C}(A^o) = \mathcal{C}(A)^\perp$, where $\mathcal{C}(A)^\perp$ is the orthogonal complement to $\mathcal{C}(A)$. Moreover, $A^-$ denotes an arbitrary generalized inverse of the matrix $A$ such that $AA^-A = A$. We also denote by $P_A = A(A'A)^-A'$ and $Q_A = I - P_A$ the orthogonal projection matrices onto the column space $\mathcal{C}(A)$ and onto its orthogonal complement $\mathcal{C}(A)^\perp$, respectively. More details about the growth curve model can be found, for example, in Kollo and von Rosen (2005), and the derivation of the estimators and predictors of model (3) is developed in Ngaruye et al. (2016).
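The projectors $P_A$ and $Q_A$ just defined are easy to check numerically; a small sketch (toy dimensions, Moore-Penrose inverse as one particular choice of generalized inverse) verifies idempotency and the defining properties:

```python
import numpy as np

def proj(A):
    """P_A = A (A'A)^- A', the orthogonal projector onto the column space C(A)."""
    # np.linalg.pinv gives the Moore-Penrose inverse, one valid generalized inverse;
    # the projector itself does not depend on this choice.
    return A @ np.linalg.pinv(A.T @ A) @ A.T

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 2))   # toy full column rank matrix
P_A = proj(A)
Q_A = np.eye(6) - P_A         # projector onto the orthogonal complement C(A)^perp

assert np.allclose(P_A @ P_A, P_A)          # idempotent
assert np.allclose(P_A @ A, A)              # reproduces C(A)
assert np.allclose(Q_A @ A, np.zeros_like(A))  # annihilates C(A)
```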
We make an orthogonal transformation of model (3) and partition it into three independent models. This partition is based on the orthogonal diagonalization of the idempotent matrix $(CC')^{-1/2}CZ'ZC'(CC')^{-1/2}$ by $\Gamma = [\Gamma_1 : \Gamma_2]$, the orthogonal matrix of eigenvectors corresponding to the $m$ unit eigenvalues and the $mk - m$ zero eigenvalues. We use the following notation to shorten matrix expressions:
\[
K_1 = H(CC')^{1/2}\Gamma_1, \qquad R_1 = C'(CC')^{-1/2}\Gamma_1, \quad (4)
\]
\[
K_2 = H(CC')^{1/2}\Gamma_2, \qquad R_2 = C'(CC')^{-1/2}\Gamma_2. \quad (5)
\]
With this orthogonal transformation, we obtain the independently distributed models
\begin{align*}
V_1 &= YR_1 \sim N_{p,m}(M_1, \Sigma_u + \Sigma_e, I_m), \\
V_2 &= YR_2 \sim N_{p,mk-m}(M_2, \Sigma_e, I_{mk-m}), \\
W &= Y(C')^o \sim N_{p,N-mk}(M_3, \Sigma_e, I_{N-mk}),
\end{align*}
with the means
\begin{align*}
M_1 &= ABK_1 + 1_p\gamma'XR_1, \\
M_2 &= ABK_2 + 1_p\gamma'XR_2, \\
M_3 &= 1_p\gamma'X(C')^o.
\end{align*}
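The construction of $\Gamma$, $R_1$ and $R_2$ from the eigendecomposition can be sketched on a toy design; all sizes and the group assignment below are hypothetical. The sketch also confirms numerically the properties used later (the diagonalized matrix is idempotent, $R_1'R_1 = I_m$ and $R_2'Z' = 0$, as in Lemma 4.1):

```python
import numpy as np

m, k = 2, 2                       # toy numbers of small areas and group units
Ni = [3, 4]                       # units per area (hypothetical)
N = sum(Ni)
area = np.repeat(np.arange(m), Ni)
group = np.array([0, 1, 1, 0, 0, 1, 1])   # group label of each unit (hypothetical)

# Between-individual design C (mk x N) and random effects design Z (m x N)
C = np.zeros((m * k, N))
Z = np.zeros((m, N))
for j in range(N):
    C[area[j] * k + group[j], j] = 1.0
    Z[area[j], j] = 1.0 / np.sqrt(Ni[area[j]])   # z_i = N_i^{-1/2} 1_{N_i}

assert np.allclose(Z @ Z.T, np.eye(m))           # ZZ' = I_m

# (CC')^{-1/2} via the symmetric eigendecomposition of CC'
w, V = np.linalg.eigh(C @ C.T)
CC_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T

Mmat = CC_inv_sqrt @ C @ Z.T @ Z @ C.T @ CC_inv_sqrt   # idempotent of rank m
vals, Gam = np.linalg.eigh(Mmat)
Gam1 = Gam[:, vals > 0.5]          # eigenvectors of the m unit eigenvalues
Gam2 = Gam[:, vals <= 0.5]         # eigenvectors of the mk - m zero eigenvalues

R1 = C.T @ CC_inv_sqrt @ Gam1
R2 = C.T @ CC_inv_sqrt @ Gam2
assert np.allclose(Mmat @ Mmat, Mmat)        # idempotency
assert np.allclose(R1.T @ R1, np.eye(m))     # R_1'R_1 = I_m
assert np.allclose(R2.T @ Z.T, 0.0)          # R_2'Z' = 0
```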
Throughout this paper, we consider only the particular choice $p = q$. The estimation of the mean and covariance is performed using a likelihood based approach, while the prediction of the random effects is performed using Henderson's approach, which consists of maximizing the joint density $f(Y, U)$ with respect to $U$ under the assumption of known $\Sigma_e$ and $\Sigma_u$ (Henderson, 1973). Then we have the following theorem:

Theorem 3.1. (Ngaruye et al. (2016)) Consider the model (3) and assume that the matrices $X$ and $K_2^{o\prime}K_1$ are of full rank. For the particular case $p = q$, the within-individual design matrix $A: p \times p$ is non-singular, and the estimators of $\gamma$, $\Sigma_u$ and the linear combination $BHC$, together with the predictor of $U$, are given by
\begin{align*}
\hat{\gamma} &= \frac{1}{p}(XPX')^{-1}XPY'1_p, \\
\widehat{BHC} &= A^{-1}\Big(Y - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}X\Big)R_2K_2'(K_2K_2')^-HC + A^{-1}V_3K_1'P_1HC, \\
\hat{\Sigma}_u &= \frac{1}{m}V_3Q_{K_1'K_2^o}V_3' - \Sigma_e, \text{ assumed to be positive definite}, \\
\hat{U} &= (\Sigma_e + \hat{\Sigma}_u)^{-1}\hat{\Sigma}_u V_3Q_{K_1'K_2^o}R_1'Z',
\end{align*}
where
\begin{align*}
P &= (C')^o\big((C')^o\big)' + R_2Q_{K_2'}R_2', \qquad P_1 = K_2^o(K_2^{o\prime}K_1K_1'K_2^o)^{-1}K_2^{o\prime}, \\
V_3 &= \Big(Y - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}X\Big)G_2R_1, \qquad G_2 = I - R_2K_2'(K_2K_2')^-HC.
\end{align*}
The proof of Theorem 3.1 is presented in the Appendix. We note that the estimators given in Theorem 3.1 are unique. The following two lemmas establish the uniqueness of the estimator $\widehat{BHC}$ and the predictor $\hat{U}$; the proof of the uniqueness of the other estimators is straightforward.

Lemma 3.1. The estimator $\widehat{BHC}$ given in Theorem 3.1 is invariant with respect to the choice of generalized inverse.
Proof. Replacing $V_3$ by its expression in Theorem 3.1, we can rewrite $\widehat{BHC}$ as
\begin{align*}
\widehat{BHC} &= A^{-1}\Big(Y - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}X\Big)\big(R_2K_2'(K_2K_2')^- + G_2R_1K_1'P_1\big)HC \\
&= A^{-1}\Big(Y - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}X\Big)\big(R_2K_2'(K_2K_2')^-(I_k - K_1K_1'P_1) + R_1K_1'P_1\big)HC.
\end{align*}
We can write
\[
K_1K_1'P_1 = K_1K_1'K_2^o(K_2^{o\prime}K_1K_1'K_2^o)^{-1}K_2^{o\prime} = I_k - K_2\big(K_2'(K_1K_1')^{-1}K_2\big)^-K_2'(K_1K_1')^{-1}. \quad (6)
\]
Therefore,
\[
\widehat{BHC} = A^{-1}\Big(Y - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}X\Big)\Big(R_2K_2'(K_2K_2')^-K_2\big(K_2'(K_1K_1')^{-1}K_2\big)^-K_2'(K_1K_1')^{-1} + R_1K_1'P_1\Big)HC
\]
is unique, which completes the proof of the lemma.
Lemma 3.2. The predictor $\hat{U}$ given in Theorem 3.1 is invariant with respect to the choice of generalized inverse.

Proof. From the expression of $\hat{U}$ in Theorem 3.1, inserting the value of $\hat{\Sigma}_u$ yields
\[
\hat{U} = (\Sigma_e + \hat{\Sigma}_u)^{-1}\hat{\Sigma}_u V_3Q_{K_1'K_2^o}R_1'Z' = \Big(I_p - m\sigma^2_e\big(V_3Q_{K_1'K_2^o}V_3'\big)^{-1}\Big)V_3Q_{K_1'K_2^o}R_1'Z'.
\]
First observe that
\begin{align*}
V_3Q_{K_1'K_2^o} &= \Big(Y - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}X\Big)R_1Q_{K_1'K_2^o} \\
&\quad - \Big(Y - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}X\Big)R_2K_2'(K_2K_2')^-K_1Q_{K_1'K_2^o},
\end{align*}
and also note that $K_1'P_1K_1 = P_{K_1'K_2^o}$. From relation (6), it follows that
\begin{align*}
K_1Q_{K_1'K_2^o} &= K_1 - K_1P_{K_1'K_2^o} = (I_k - K_1K_1'P_1)K_1 \\
&= K_2\big(K_2'(K_1K_1')^{-1}K_2\big)^-K_2'(K_1K_1')^{-1}K_1.
\end{align*}
Thus, $V_3Q_{K_1'K_2^o}$ does not depend on the choice of generalized inverse, and the uniqueness of $\hat{U}$ follows.
3.1 Prediction of small area means
The prediction of small area means is based on the prediction approach to finite populations under model-based theory. In this approach, the target population under study is considered as a random sample from a larger population characterized by a suitable model, and the predictive distribution of the values for the non-sampled units is obtained given the realized values of the sampled units (Bolfarine and Zacks, 1992). The model (3) considered in this paper is an extension of the unit level model, often known as the nested error regression model, which was originally proposed by Battese et al. (1988) for the prediction of county crop areas in the United States. Following these authors, the estimation of a population mean from the sample reduces to the prediction of the mean of the non-sampled values.
For the $k$ group units in all small areas, we consider the partition of the $N_i$ units into $N_{ig}$, $g = 1, \ldots, k$, and of the $n_i$ sampled units into $n_{ig}$, such that $N_i = \sum_{g=1}^{k} N_{ig}$ and $n_i = \sum_{g=1}^{k} n_{ig}$, and similarly for $Y_i$ such that $Y_i = [Y_{i1}, \ldots, Y_{ik}]$. Then the corresponding target small area means at each time point for each group unit are given by
\[
\hat{\mu}_{ig} = \frac{1}{N_{ig}}\Big(Y^{(s)}_{ig}1_{n_{ig}} + \hat{Y}^{(r)}_{ig}1_{N_{ig}-n_{ig}}\Big), \quad (7)
\]
where $Y^{(s)}_i = (y_{i1}, \ldots, y_{in_i}): p \times n_i$ stands for the $n_i$ sampled observations from the $i$-th small area and $\hat{Y}^{(r)}_i = (y_{i,n_i+1}, \ldots, y_{iN_i}): p \times (N_i - n_i)$ corresponds to the predicted values for the $N_i - n_i$ non-sampled units from the $i$-th small area. The first term on the right-hand side of (7) is known from the sampled observations, and the second term is the prediction of the non-sampled observations obtained using the considered model (Henderson, 1975); it is given by
\[
\frac{1}{N_{ig}}\hat{Y}^{(r)}_{ig}1_{N_{ig}-n_{ig}} = \Big(1 - \frac{n_{ig}}{N_{ig}}\Big)A\hat{\beta}_g + \frac{1}{N_{ig}}1_p1'_{N_{ig}-n_{ig}}X^{(r)\prime}_{ig}\hat{\gamma} + \frac{\sqrt{N_{ig}-n_{ig}}}{N_{ig}}\hat{u}_i, \quad (8)
\]
where $X^{(r)}_{ig}$ stands for the matrix of auxiliary information for the non-sampled units in the $i$-th area belonging to group unit $g$. Note that $\hat{\beta}_g$ is the estimator of $\beta_g$, the $g$-th column of the estimated matrix $\hat{B}$, and $\hat{u}_i$ is the $i$-th column of the predicted matrix $\hat{U}$. Throughout this paper, we are interested in the estimation of the mean-squared errors (MSE) of the predicted small area means given in relation (8) above. Therefore, we need to calculate the moments of the proposed estimators in order to derive the estimator of the MSE.
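The combination of sampled and predicted totals in (7) can be sketched numerically; the dimensions and the stand-in prediction matrix below are hypothetical:

```python
import numpy as np

# Hypothetical dimensions for one small area i and group unit g
p, N_ig, n_ig = 3, 10, 4
rng = np.random.default_rng(3)
Y_s = rng.normal(size=(p, n_ig))               # sampled observations Y^(s)_ig
Y_r_hat = rng.normal(size=(p, N_ig - n_ig))    # stand-in for the model predictions

# (7): the small area mean combines the sampled total and the predicted total
mu_hat = (Y_s @ np.ones(n_ig) + Y_r_hat @ np.ones(N_ig - n_ig)) / N_ig

# Sanity check: this equals the time-point-wise mean over all N_ig columns
assert np.allclose(mu_hat, np.hstack([Y_s, Y_r_hat]).mean(axis=1))
```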
4 Preparations for moments calculation
In this section, a definition and a lemma are given containing some technical results that are referred to later on for the moment derivations and other calculations. The following definition defines the covariance between two random matrices.
Definition 4.1. Let $X$ and $Y$ be two random matrices. The covariance between $X$ and $Y$ is defined by
\[
\mathrm{cov}(X, Y) = E[\mathrm{vec}X\,\mathrm{vec}'Y] - E[\mathrm{vec}X]E[\mathrm{vec}'Y],
\]
and the dispersion matrix of $X$ is defined by $D[X] = \mathrm{cov}(X, X)$, where $\mathrm{vec}$ is the usual column-wise vectorization operator and $\mathrm{vec}'$ denotes its transpose.
To simplify the calculations later on, we can use some identities for the matrices $K_1$, $K_2$, $R_1$ and $R_2$.

Lemma 4.1. From the definition of the matrices $K_1$, $K_2$, $R_1$ and $R_2$ given in equations (4) and (5), together with the fact that $\mathcal{C}(Z') \subseteq \mathcal{C}(C')$ and $ZZ' = I_m$, the following identities hold:
(i) $R_1'R_1Z = Z$;
(ii) $Z'ZR_1 = R_1$;
(iii) $K_1K_1' = HCZ'ZC'H'$;
(iv) $R_2'R_1 = R_2'Z' = 0$, and hence $PZ' = 0$;
(v) $HCP = 0$.
Proof of Lemma 4.1. We prove the first three identities; the others are obtained by straightforward calculations. From the orthogonal diagonalization
\[
(CC')^{-1/2}CZ'ZC'(CC')^{-1/2} = \Gamma D\Gamma' = [\Gamma_1 : \Gamma_2]\begin{bmatrix} I_m & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} \Gamma_1' \\ \Gamma_2' \end{bmatrix},
\]
it follows that
\[
\Gamma_1'(CC')^{-1/2}CZ'ZC'(CC')^{-1/2}\Gamma_1 = R_1'Z'ZR_1 = I_m.
\]
Furthermore, since $ZZ' = R_1'R_1 = I_m$, it follows that $ZZ'Z = R_1'R_1Z = Z$. Similarly, from $R_1'Z'ZR_1 = R_1'R_1$, we deduce that $Z'ZR_1 = R_1$. Moreover, since $K_1$ is of full row rank,
\[
K_1K_1' = K_1R_1'Z'ZR_1K_1' = HCR_1R_1'Z'ZR_1R_1'C'H' = HCZ'ZC'H'.
\]
Later in our calculations we will also use the following well-known result:
\[
\mathrm{vec}(ABC) = (C' \otimes A)\mathrm{vec}B,
\]
where the symbol $\otimes$ denotes the Kronecker product.
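The vec-Kronecker identity is easy to confirm numerically on arbitrary conformable matrices (toy sizes below); note that `order="F"` gives the column-wise vectorization that the identity requires:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(2, 3))
B = rng.normal(size=(3, 4))
C = rng.normal(size=(4, 5))

vec = lambda M: M.reshape(-1, order="F")   # column-wise vectorization

# vec(ABC) = (C' kron A) vec(B)
assert np.allclose(vec(A @ B @ C), np.kron(C.T, A) @ vec(B))
```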
5 Moments of proposed estimators
In this section, we present the calculations of the first and second moments of the proposed estimators. For the purpose of the moment calculations, from now on we suppose that we have complete knowledge of the covariance matrix $\Sigma_u$, or that it has been estimated previously so that it can be taken as known. In that regard, we use the following predictor of $U$:
\[
\tilde{U} = (\Sigma_e + \Sigma_u)^{-1}\Sigma_u V_3Q_{K_1'K_2^o}R_1'Z'. \quad (9)
\]
Theorem 5.1. Given the estimators in Theorem 3.1, $\hat{Y} = A\widehat{BHC} + 1_p\hat{\gamma}'X + \tilde{U}Z$ is an unbiased predictor, i.e., $E[\hat{Y}] = E[Y]$.
Proof. First we show that $\hat{\gamma}$ is an unbiased estimator. We have
\begin{align*}
E[\hat{\gamma}] &= \frac{1}{p}(XPX')^{-1}XPE[Y']1_p = \frac{1}{p}(XPX')^{-1}XP\big(C'H'B'A' + X'\gamma 1_p'\big)1_p \\
&= \frac{1}{p}(XPX')^{-1}XPX'\gamma 1_p'1_p = \gamma.
\end{align*}
Therefore, $E[1_p\hat{\gamma}'X] = 1_p\gamma'X$. Moreover, from the expression of $\widehat{BHC}$ given in Theorem 3.1, it follows that
\begin{align*}
E[A\widehat{BHC}] &= \Big(E[Y] - \frac{1}{p}1_p1_p'E[Y]PX'(XPX')^{-1}X\Big)\big(R_2K_2'(K_2K_2')^- + G_2R_1K_1'P_1\big)HC \\
&= \Big(ABHC + 1_p\gamma'X - \frac{1}{p}1_p1_p'(ABHC + 1_p\gamma'X)PX'(XPX')^{-1}X\Big) \\
&\quad \times \big(R_2K_2'(K_2K_2')^- + G_2R_1K_1'P_1\big)HC \\
&= ABHC\big(R_2K_2'(K_2K_2')^- + G_2R_1K_1'P_1\big)HC \\
&= ABK_2K_2'(K_2K_2')^-HC + ABK_1K_1'P_1HC - ABK_2K_2'(K_2K_2')^-K_1K_1'P_1HC.
\end{align*}
Inserting equation (6) into the expression of $E[A\widehat{BHC}]$ yields
\begin{align*}
E[A\widehat{BHC}] &= ABK_2K_2'(K_2K_2')^-HC + ABHC - ABK_2\big(K_2'(K_1K_1')^{-1}K_2\big)^-K_2'(K_1K_1')^{-1}HC \\
&\quad - ABK_2K_2'(K_2K_2')^-HC + ABK_2K_2'(K_2K_2')^-K_2\big(K_2'(K_1K_1')^{-1}K_2\big)^-K_2'(K_1K_1')^{-1}HC \\
&= ABHC.
\end{align*}
Hence, $A\widehat{BHC}$ is also an unbiased estimator. Finally, we have
\[
E[\tilde{U}Z] = (\Sigma_e + \Sigma_u)^{-1}\Sigma_u E[V_3]Q_{K_1'K_2^o}R_1'Z'Z = (\Sigma_e + \Sigma_u)^{-1}\Sigma_u AB\big(I_k - K_2K_2'(K_2K_2')^-\big)K_1Q_{K_1'K_2^o}R_1'Z'Z, \quad (10)
\]
where $\tilde{U}$ is given in (9), since from the expression of $V_3$ it follows that
\[
E[V_3] = ABHCG_2R_1 = AB\big(I_k - K_2K_2'(K_2K_2')^-\big)K_1.
\]
Recall that from the proof of Lemma 3.2 we have
\[
K_1Q_{K_1'K_2^o} = K_2\big(K_2'(K_1K_1')^{-1}K_2\big)^-K_2'(K_1K_1')^{-1}K_1,
\]
and plugging this expression into $E[\tilde{U}Z]$ given in (10) above leads to
\[
E[\tilde{U}Z] = (\Sigma_e + \Sigma_u)^{-1}\Sigma_u AB\big(I_k - K_2K_2'(K_2K_2')^-\big)K_2\big(K_2'(K_1K_1')^{-1}K_2\big)^-K_2'(K_1K_1')^{-1}K_1R_1'Z'Z = 0.
\]
Hence, it is straightforward to see that
\[
E[\hat{Y}] = E[A\widehat{BHC} + 1_p\hat{\gamma}'X + \tilde{U}Z] = ABHC + 1_p\gamma'X = E[Y],
\]
which completes the proof of the theorem.
The following two theorems give the main results of the paper concerning the moments of the proposed estimators. For simplicity, in the following calculations the matrix $A$ involved in $\hat{B}$ will be taken as an identity matrix, since $A$ is non-singular and it is always possible to reparameterize and transform back to $\hat{B}$.
Theorem 5.2. Given the estimators in Theorem 3.1, the dispersion matrices $D[\hat{\gamma}]$, $D[\widehat{BHC}]$ and $D[\tilde{U}]$ are given by
\begin{align*}
D[\hat{\gamma}] &= \frac{\sigma^2_e}{p}(XPX')^{-1}, \\
D[\widehat{BHC}] &= C'H'P_1HC \otimes (\Sigma_u + \Sigma_e) + C'H'\big(I_k - P_1K_1K_1'\big)(K_2K_2')^-\big(I_k - K_1K_1'P_1\big)HC \otimes \Sigma_e \\
&\quad + \frac{\sigma^2_e}{p}C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big)X'(XPX')^{-1}X\big(R_2K_2'(K_2K_2')^- + G_2R_1K_1'P_1\big)HC \otimes 1_p1_p', \\
D[\tilde{U}] &= ZR_1Q_{K_1'K_2^o}\big(I_m + K_1'(K_2K_2')^-K_1\big)Q_{K_1'K_2^o}R_1'Z' \otimes \Sigma_u(\Sigma_e + \Sigma_u)^{-1}\Sigma_u \\
&\quad + \frac{\sigma^2_e}{p}ZR_1Q_{K_1'K_2^o}R_1'G_2'X'(XPX')^{-1}XG_2R_1Q_{K_1'K_2^o}R_1'Z' \otimes \Sigma_u(\Sigma_e + \Sigma_u)^{-1}1_p1_p'(\Sigma_e + \Sigma_u)^{-1}\Sigma_u.
\end{align*}
In what follows, we use the notation $(Y)(\;)'$ instead of $(Y)(Y)'$ in order to shorten matrix expressions.
Proof. Recall that $\hat{\gamma} = \frac{1}{p}(XPX')^{-1}XPY'1_p$. Hence, it follows that
\begin{align*}
D[\hat{\gamma}] &= \frac{1}{p^2}\big(1_p' \otimes (XPX')^{-1}XP\big)D[Y']\big(\;\big)' \\
&= \frac{1}{p^2}\big(1_p' \otimes (XPX')^{-1}XP\big)\big(\Sigma_u \otimes Z'Z + \Sigma_e \otimes I_N\big)\big(\;\big)' \\
&= \frac{\sigma^2_e}{p}(XPX')^{-1}XPX'(XPX')^{-1} = \frac{\sigma^2_e}{p}(XPX')^{-1},
\end{align*}
since $PZ' = 0$. Moreover, replacing $V_3$ by its expression in Theorem 3.1, we can rewrite $\widehat{BHC}$ as
\[
\widehat{BHC} = \Big(Y - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}X\Big)\big(R_2K_2'(K_2K_2')^- + G_2R_1K_1'P_1\big)HC,
\]
and therefore the dispersion matrix $D[\widehat{BHC}]$ is given by
\begin{align*}
D[\widehat{BHC}] &= \Big(C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big) \otimes I_p \\
&\quad - \frac{1}{p}C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big)X'(XPX')^{-1}XP \otimes 1_p1_p'\Big)\big(Z'Z \otimes \Sigma_u + I_N \otimes \Sigma_e\big)\big(\;\big)'.
\end{align*}
Using the results of Lemma 4.1, we obtain
\begin{align*}
D[\widehat{BHC}] &= C'H'P_1HC \otimes (\Sigma_u + \Sigma_e) + C'H'\big(I_k - P_1K_1K_1'\big)(K_2K_2')^-\big(I_k - K_1K_1'P_1\big)HC \otimes \Sigma_e \\
&\quad + \frac{\sigma^2_e}{p}C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big)X'(XPX')^{-1}X\big(R_2K_2'(K_2K_2')^- + G_2R_1K_1'P_1\big)HC \otimes 1_p1_p'.
\end{align*}
Moreover, from $\tilde{U} = \Sigma_u(\Sigma_e + \Sigma_u)^{-1}V_3Q_{K_1'K_2^o}R_1'Z'$, replacing $V_3$ by its expression, we can rewrite
\[
\tilde{U} = \Sigma_u(\Sigma_e + \Sigma_u)^{-1}\Big(Y - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}X\Big)G_2R_1Q_{K_1'K_2^o}R_1'Z'.
\]
Given this, the dispersion matrix $D[\tilde{U}]$ has the form
\begin{align*}
D[\tilde{U}] &= \Big(ZR_1Q_{K_1'K_2^o}R_1'G_2' \otimes \Sigma_u(\Sigma_e + \Sigma_u)^{-1} \\
&\quad - \frac{1}{p}ZR_1Q_{K_1'K_2^o}R_1'G_2'X'(XPX')^{-1}XP \otimes \Sigma_u(\Sigma_e + \Sigma_u)^{-1}1_p1_p'\Big)\big(Z'Z \otimes \Sigma_u + I_N \otimes \Sigma_e\big)\big(\;\big)' \\
&= ZR_1Q_{K_1'K_2^o}\big(I_m + K_1'(K_2K_2')^-K_1\big)Q_{K_1'K_2^o}R_1'Z' \otimes \Sigma_u(\Sigma_e + \Sigma_u)^{-1}\Sigma_u \\
&\quad + \frac{\sigma^2_e}{p}ZR_1Q_{K_1'K_2^o}R_1'G_2'X'(XPX')^{-1}XG_2R_1Q_{K_1'K_2^o}R_1'Z' \otimes \Sigma_u(\Sigma_e + \Sigma_u)^{-1}1_p1_p'(\Sigma_e + \Sigma_u)^{-1}\Sigma_u.
\end{align*}
In Theorem 5.2, the dispersion matrices of the estimators and the predictor are given. However, it is also of interest to derive the covariances between them.
Theorem 5.3. Consider the estimators given in Theorem 3.1. Then the covariances $\mathrm{cov}[\tilde{U}, \hat{\gamma}]$, $\mathrm{cov}[\widehat{BHC}, \tilde{U}]$ and $\mathrm{cov}[\widehat{BHC}, \hat{\gamma}]$ are given by
\begin{align*}
\mathrm{cov}[\tilde{U}, \hat{\gamma}] &= -\frac{\sigma^2_e}{p}ZR_1Q_{K_1'K_2^o}R_1'G_2'X'(XPX')^{-1} \otimes \Sigma_u(\Sigma_e + \Sigma_u)^{-1}1_p, \\
\mathrm{cov}[\widehat{BHC}, \tilde{U}] &= \frac{\sigma^2_e}{p}C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big)X'(XPX')^{-1}XG_2R_1Q_{K_1'K_2^o}R_1'Z' \otimes 1_p1_p'(\Sigma_e + \Sigma_u)^{-1}\Sigma_u, \\
\mathrm{cov}[\widehat{BHC}, \hat{\gamma}] &= -\frac{\sigma^2_e}{p}C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big)X'(XPX')^{-1} \otimes 1_p.
\end{align*}

Proof. Using the results of Lemma 4.1, we get
\begin{align*}
\mathrm{cov}[\tilde{U}, \hat{\gamma}] &= \Big(ZR_1Q_{K_1'K_2^o}R_1'G_2' \otimes \Sigma_u(\Sigma_e + \Sigma_u)^{-1} \\
&\quad - \frac{1}{p}ZR_1Q_{K_1'K_2^o}R_1'G_2'X'(XPX')^{-1}XP \otimes \Sigma_u(\Sigma_e + \Sigma_u)^{-1}1_p1_p'\Big) \\
&\quad \times \big(Z'Z \otimes \Sigma_u + I_N \otimes \Sigma_e\big)\Big(\frac{1}{p}PX'(XPX')^{-1} \otimes 1_p\Big) \\
&= -\frac{\sigma^2_e}{p}ZR_1Q_{K_1'K_2^o}R_1'G_2'X'(XPX')^{-1} \otimes \Sigma_u(\Sigma_e + \Sigma_u)^{-1}1_p,
\end{align*}
and
\begin{align*}
\mathrm{cov}[\widehat{BHC}, \tilde{U}] &= \Big(C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big) \otimes I_p \\
&\quad - \frac{1}{p}C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big)X'(XPX')^{-1}XP \otimes 1_p1_p'\Big)\big(Z'Z \otimes \Sigma_u + I_N \otimes \Sigma_e\big) \\
&\quad \times \Big(G_2R_1Q_{K_1'K_2^o}R_1'Z' \otimes (\Sigma_e + \Sigma_u)^{-1}\Sigma_u - \frac{1}{p}PX'(XPX')^{-1}XG_2R_1Q_{K_1'K_2^o}R_1'Z' \otimes 1_p1_p'(\Sigma_e + \Sigma_u)^{-1}\Sigma_u\Big) \\
&= \frac{\sigma^2_e}{p}C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big)X'(XPX')^{-1}XG_2R_1Q_{K_1'K_2^o}R_1'Z' \otimes 1_p1_p'(\Sigma_e + \Sigma_u)^{-1}\Sigma_u.
\end{align*}
Similarly, we obtain the last covariance as
\begin{align*}
\mathrm{cov}[\widehat{BHC}, \hat{\gamma}] &= \Big(C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big) \otimes I_p \\
&\quad - \frac{1}{p}C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big)X'(XPX')^{-1}XP \otimes 1_p1_p'\Big) \\
&\quad \times \big(Z'Z \otimes \Sigma_u + I_N \otimes \Sigma_e\big)\Big(\frac{1}{p}PX'(XPX')^{-1} \otimes 1_p\Big) \\
&= -\frac{\sigma^2_e}{p}C'H'\big((K_2K_2')^-K_2R_2' + P_1K_1R_1'G_2'\big)X'(XPX')^{-1} \otimes 1_p.
\end{align*}
6 Mean-squared errors of predicted small area means

6.1 Derivation of MSE($\hat{\tau}_{ig}$)
In this section, following Butar and Lahiri (2003), we estimate the mean-squared error of the predicted small area means in two steps. In the first step, the MSE is derived under the assumption of known covariance matrices $\Sigma_e$ and $\Sigma_u$. In the second step, a parametric bootstrap approach is proposed for bias correction and for approximating the uncertainty due to the estimation of $\Sigma_u$. Put
\[
K = \Big(1 - \frac{n_{ig}}{N_{ig}}\Big)A, \quad L = \frac{1}{N_{ig}}1_p1'_{N_{ig}-n_{ig}}X^{(r)\prime}_{ig}, \quad M = \frac{\sqrt{N_{ig}-n_{ig}}}{N_{ig}}I_p, \quad (11)
\]
for $i = 1, \ldots, m$, $g = 1, \ldots, k$.
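The matrices in (11) can be assembled directly; the sizes and the auxiliary matrix below are hypothetical stand-ins chosen only to make the dimensions concrete:

```python
import numpy as np

# Hypothetical sizes for one area/group cell
p, r, N_ig, n_ig = 3, 2, 10, 4
t = np.arange(1.0, p + 1)
A = np.vander(t, p, increasing=True)                          # within-individual design, p = q
X_r = np.random.default_rng(5).normal(size=(r, N_ig - n_ig))  # auxiliaries of non-sampled units

K = (1.0 - n_ig / N_ig) * A
L = np.ones((p, 1)) @ np.ones((1, N_ig - n_ig)) @ X_r.T / N_ig
M = np.sqrt(N_ig - n_ig) / N_ig * np.eye(p)

# K, L and M map beta_g (p-vector), gamma (r-vector) and u_i (p-vector) to a p-vector
assert K.shape == (p, p) and L.shape == (p, r) and M.shape == (p, p)
```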
Then the linear prediction and empirical linear prediction quantities for the small area means given in (8) can be written as
\begin{align*}
\tilde{\tau}_{ig} &= K\hat{\beta}_g + L\hat{\gamma} + M\tilde{u}_i, \\
\hat{\tau}_{ig} &= K\hat{\beta}_g + L\hat{\gamma} + M\hat{u}_i, \quad i = 1, \ldots, m, \; g = 1, \ldots, k. \quad (12)
\end{align*}
Let $\nu_g$ and $\delta_i$ be a $k$-vector and an $m$-vector of zeros with one in the $g$-th and $i$-th position, respectively. Choose $\alpha_g = C'H'(HCC'H')^{-1}\nu_g$, so that $\hat{\beta}_g = \widehat{BHC}\alpha_g$, $\tilde{u}_i = \tilde{U}\delta_i$ and $\hat{u}_i = \hat{U}\delta_i$.
Furthermore, let us write $\hat{\tau}_{ig} - \tau_{ig} = (\hat{\tau}_{ig} - \tilde{\tau}_{ig}) + (\tilde{\tau}_{ig} - \tau_{ig})$. The MSE of $\hat{\tau}_{ig}$ can then be obtained as
\[
\mathrm{MSE}(\hat{\tau}_{ig}) = \mathrm{MSE}(\tilde{\tau}_{ig}) + 2E\big[(\hat{\tau}_{ig} - \tilde{\tau}_{ig})(\tilde{\tau}_{ig} - \tau_{ig})'\big] + E\big[(\hat{\tau}_{ig} - \tilde{\tau}_{ig})(\hat{\tau}_{ig} - \tilde{\tau}_{ig})'\big]. \quad (13)
\]
The first term on the right-hand side of (13) has the form
\begin{align*}
\mathrm{MSE}(\tilde{\tau}_{ig}) &= E\big[(\tilde{\tau}_{ig} - \tau_{ig})(\tilde{\tau}_{ig} - \tau_{ig})'\big] \\
&= E\big[\big(K(\hat{\beta}_g - \beta_g) + L(\hat{\gamma} - \gamma) + M(\tilde{u}_i - u_i)\big)\big(\;\big)'\big] \\
&= KD[\hat{\beta}_g]K' + K\mathrm{cov}[\hat{\beta}_g, \hat{\gamma}]L' + K\mathrm{cov}[\hat{\beta}_g, \tilde{u}_i]M' + LD[\hat{\gamma}]L' \\
&\quad + L\mathrm{cov}[\hat{\gamma}, \tilde{u}_i]M' + M\mathrm{cov}[\tilde{u}_i, \hat{\beta}_g]K' + L\mathrm{cov}[\hat{\gamma}, \hat{\beta}_g]K' \\
&\quad + M\mathrm{cov}[\tilde{u}_i, \hat{\gamma}]L' + MD[\tilde{u}_i - u_i]M'. \quad (14)
\end{align*}
Observe that, from the definitions $\hat{\beta}_g = \widehat{BHC}\alpha_g$, $\tilde{u}_i = \tilde{U}\delta_i$ and $\hat{u}_i = \hat{U}\delta_i$, and taking into account that $A^{-1}$ was omitted when calculating the second moments involving $\widehat{BHC}$, the covariances appearing in equation (14) are expressed as
\begin{align*}
\mathrm{cov}[\hat{\beta}_g, \tilde{u}_i] &= \mathrm{cov}[\widehat{BHC}\alpha_g, \tilde{U}\delta_i] = (\alpha_g' \otimes A^{-1})\mathrm{cov}[\widehat{BHC}, \tilde{U}](\delta_i \otimes I_p), \\
\mathrm{cov}[\tilde{u}_i, \hat{\gamma}] &= \mathrm{cov}[\tilde{U}\delta_i, \hat{\gamma}] = (\delta_i' \otimes I_p)\mathrm{cov}[\tilde{U}, \hat{\gamma}], \\
\mathrm{cov}[\hat{\beta}_g, \hat{\gamma}] &= \mathrm{cov}[\widehat{BHC}\alpha_g, \hat{\gamma}] = (\alpha_g' \otimes A^{-1})\mathrm{cov}[\widehat{BHC}, \hat{\gamma}].
\end{align*}
Similarly, the dispersion matrices appearing in equation (14) are expressed as
\begin{align*}
D[\hat{\beta}_g] &= D[\widehat{BHC}\alpha_g] = (\alpha_g' \otimes A^{-1})D[\widehat{BHC}](\alpha_g \otimes A'^{-1}), \\
D[\tilde{u}_i] &= D[\tilde{U}\delta_i] = (\delta_i' \otimes I_p)D[\tilde{U}](\delta_i \otimes I_p),
\end{align*}
and
\begin{align*}
D[\tilde{u}_i - u_i] &= (\delta_i' \otimes I_p)D[\tilde{U} - U](\delta_i \otimes I_p) \\
&= (\delta_i' \otimes I_p)\big(D[\tilde{U}] - 2\mathrm{cov}(\tilde{U}, U) + D[U]\big)(\delta_i \otimes I_p) \\
&= (\delta_i' \otimes I_p)\big(D[U] - D[\tilde{U}]\big)(\delta_i \otimes I_p).
\end{align*}
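The selection pattern $(\delta_i' \otimes I_p)D(\delta_i \otimes I_p)$ used above simply extracts the $p \times p$ diagonal block belonging to the $i$-th column of the vectorized matrix; a toy check (hypothetical dimensions, random symmetric stand-in for the dispersion matrix):

```python
import numpy as np

p, m, i = 2, 3, 1                  # toy dimensions; pick out area i = 1
rng = np.random.default_rng(6)
S = rng.normal(size=(p * m, p * m))
D = S @ S.T                        # symmetric stand-in for the pm x pm dispersion matrix

delta_i = np.zeros((m, 1))
delta_i[i] = 1.0                   # m-vector with a one in position i

# (delta_i' kron I_p) D (delta_i kron I_p) extracts the p x p block of column i
block = np.kron(delta_i.T, np.eye(p)) @ D @ np.kron(delta_i, np.eye(p))
assert np.allclose(block, D[i * p:(i + 1) * p, i * p:(i + 1) * p])
```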
Altogether, we can rewrite $\mathrm{MSE}(\tilde{\tau}_{ig})$ as
\begin{align*}
\mathrm{MSE}(\tilde{\tau}_{ig}) &= K(\alpha_g' \otimes A^{-1})D[\widehat{BHC}](\alpha_g \otimes A'^{-1})K' + K(\alpha_g' \otimes A^{-1})\mathrm{cov}[\widehat{BHC}, \hat{\gamma}]L' \\
&\quad + K(\alpha_g' \otimes A^{-1})\mathrm{cov}[\widehat{BHC}, \tilde{U}](\delta_i \otimes I_p)M' + LD[\hat{\gamma}]L' \\
&\quad + L\mathrm{cov}[\tilde{U}, \hat{\gamma}]'(\delta_i \otimes I_p)M' + M(\delta_i' \otimes I_p)\mathrm{cov}[\widehat{BHC}, \tilde{U}]'(\alpha_g \otimes A'^{-1})K' \\
&\quad + L\mathrm{cov}[\widehat{BHC}, \hat{\gamma}]'(\alpha_g \otimes A'^{-1})K' + M(\delta_i' \otimes I_p)\mathrm{cov}[\tilde{U}, \hat{\gamma}]L' \\
&\quad + M\Sigma_uM' - M(\delta_i' \otimes I_p)D[\tilde{U}](\delta_i \otimes I_p)M', \quad (15)
\end{align*}
where the dispersion matrices $D[\hat{\gamma}]$, $D[\widehat{BHC}]$, $D[\tilde{U}]$ and the covariance matrices $\mathrm{cov}[\tilde{U}, \hat{\gamma}]$, $\mathrm{cov}[\widehat{BHC}, \tilde{U}]$, $\mathrm{cov}[\widehat{BHC}, \hat{\gamma}]$ are presented in Theorem 5.2 and Theorem 5.3, respectively.
The last two terms on the right-hand side of equation (13) are intractable and need to be approximated. It is important to note that, in practice, the covariance matrix $\Sigma_u$ is unknown. As pointed out by several authors (see, for example, Das et al. (2004)), a naive estimator of the MSE obtained by replacing the unknown covariance matrix $\Sigma_u$ in (15) by its estimator $\hat{\Sigma}_u$, thereby ignoring the variability associated with $\hat{\Sigma}_u$, can lead to underestimation of the true MSE. Therefore, in the next section, we propose a parametric bootstrap method to estimate the MSE when $\Sigma_u$ is replaced by its estimator.
6.2 Estimation of MSE($\hat{\tau}_{ig}$)
The approximate estimator of $\mathrm{MSE}(\hat{\tau}_{ig})$ given in equation (13) can be decomposed as
\[
\mathrm{MSE}(\hat{\tau}_{ig}) = G_{1i}(\Sigma_u) + G_{2i}(\Sigma_u) + G_{3i}(\Sigma_u), \quad (16)
\]
where
\begin{align*}
G_{1i}(\Sigma_u) + G_{2i}(\Sigma_u) &= \mathrm{MSE}(\tilde{\tau}_{ig}), \\
G_{3i}(\Sigma_u) &= 2E\big[(\hat{\tau}_{ig} - \tilde{\tau}_{ig})(\tilde{\tau}_{ig} - \tau_{ig})'\big] + E\big[(\hat{\tau}_{ig} - \tilde{\tau}_{ig})(\hat{\tau}_{ig} - \tilde{\tau}_{ig})'\big].
\end{align*}
For known $\Sigma_u$, the quantity $G_{1i}(\Sigma_u) + G_{2i}(\Sigma_u)$ is given by equation (15). When $\Sigma_u$ is replaced by its estimator, the quantity $G_{1i}(\hat{\Sigma}_u) + G_{2i}(\hat{\Sigma}_u)$ introduces an additional bias related to $\hat{\Sigma}_u$, i.e., $E[\hat{\Sigma}_u] - \Sigma_u$ (see Datta and Lahiri (2000)). Following Butar and Lahiri (2003); Kubokawa and Nagashima (2012), we propose a parametric bootstrap method, first to estimate the first two terms of (16) by correcting the bias of $G_{1i}(\hat{\Sigma}_u) + G_{2i}(\hat{\Sigma}_u)$ that arises when $\Sigma_u$ is replaced by its estimator $\hat{\Sigma}_u$ in equation (15), and secondly to estimate the third term $G_{3i}(\Sigma_u)$ of equation (16).
Consider the bootstrap model
\[
Y^*_i \mid u^*_i \sim N_{p,n_i}\big(A\hat{B}C_i + 1_p\hat{\gamma}'X_i + u^*_iz'_i, \Sigma_e, I_{n_i}\big), \quad (17)
\]
where $u^*_i \sim N_p(0, \hat{\Sigma}_u)$. If we put
\begin{align*}
\hat{\tau}_{ig}(Y_i; \hat{\beta}_g, \hat{\gamma}, \hat{\Sigma}_u) &= K\hat{\beta}_g + L\hat{\gamma} + M\tilde{u}_i, \quad (18) \\
\hat{\tau}_{ig}(Y_i; \hat{\beta}^*_g, \hat{\gamma}^*, \hat{\Sigma}_u) &= K\hat{\beta}^*_g + L\hat{\gamma}^* + M\tilde{u}^*_i, \quad (19) \\
\hat{\tau}_{ig}(Y_i; \hat{\beta}^*_g, \hat{\gamma}^*, \hat{\Sigma}^*_u) &= K\hat{\beta}^*_g + L\hat{\gamma}^* + M\hat{u}^*_i, \quad i = 1, \ldots, m, \; g = 1, \ldots, k, \quad (20)
\end{align*}
then the quantity G1i(Σu) + G2i(Σu) can be estimated by
G1i( bΣu) + G2i( bΣu) − E∗G1i( bΣ ∗ u) + G2i( bΣ ∗ u) − G1i( bΣu) − G2i( bΣu)
and the quantity G3i(Σu) estimated by
2E∗ h b τig(Yi; bβ ∗ g,bγ ∗ , bΣ∗u) −τbig(Y∗i; bβ ∗ g,γb ∗ , bΣu) b τig(Yi; bβ ∗ g,bγ ∗ , bΣu) −τbig(Yi; bβg,bγ, bΣu) 0i + E∗ h b τig(Yi; bβ ∗ g,γb ∗ , bΣ∗u) −τbig(Yi; bβ ∗ g,γb ∗ , bΣu) 0i ,
where E∗ is the expectation with respect to model (17), and the calculation ofγb
∗
, bβ∗g, bΣ∗u is performed similarly to that ofγ, bb βg, bΣu except thatγb
∗
, bβ∗g, bΣ∗u are calculated based on Y∗i instead of Yi.
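The bootstrap bookkeeping described above can be sketched in a much simplified scalar analogue; the toy model, the moment estimators and all names below are illustrative assumptions, not the paper's estimators, and only the structure (refit on bootstrap data, compare predictions under $\hat{\Sigma}_u$ and $\hat{\Sigma}^*_u$) carries over:

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy scalar analogue: y_i = mu + u_i + e_i, u_i ~ N(0, s2u), e_i ~ N(0, s2e), s2e known
m, s2e = 100, 1.0

def fit(y):
    """Moment estimators of mu and s2u (truncated at zero); s2e is treated as known."""
    return y.mean(), max(y.var(ddof=1) - s2e, 0.0)

def predict_u(y, mu, s2u):
    """BLUP-type predictor of the random effects for given variance components."""
    return s2u / (s2u + s2e) * (y - mu)

y = 2.0 + rng.normal(size=m) + rng.normal(size=m)   # data generated with mu = 2, s2u = 1
mu_hat, s2u_hat = fit(y)

B, G3_hat = 500, 0.0
for _ in range(B):
    # Draw bootstrap data from the fitted model (the analogue of model (17))
    y_star = (mu_hat + rng.normal(scale=np.sqrt(s2u_hat), size=m)
                     + rng.normal(scale=np.sqrt(s2e), size=m))
    mu_s, s2u_s = fit(y_star)   # re-estimated parameters based on the bootstrap data
    # Extra variability from plugging in the re-estimated variance component
    d = predict_u(y_star, mu_s, s2u_s) - predict_u(y_star, mu_s, s2u_hat)
    G3_hat += np.mean(d ** 2) / B

assert G3_hat >= 0.0   # bootstrap estimate of a G3-type correction term
```

In the paper's multivariate setting, `fit` and `predict_u` would be replaced by the estimators of Theorem 3.1 and the predictor (9), and the averages by the bootstrap expectations $E_*$.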
Thus, when the unknown covariance $\Sigma_u$ is replaced by its estimator in $\mathrm{MSE}(\hat{\tau}_{ig})$ of equation (16), we obtain the proposed estimator of the MSE, which is summarized in the following theorem.
Theorem 6.1. Consider the model (3) and assume that the matrices $X$ and $K_2^{o\prime}K_1$ are of full rank. In addition, let $p = q$, so that the matrix $A$ is non-singular. Consider also the bootstrap model (17) and the estimated quantities (18), (19) and (20). Then the estimator of the MSE of the predicted small area means given in (12) can be expressed as
\begin{align*}
\widehat{\mathrm{MSE}}(\hat{\tau}_{ig}) &= 2\big[G_{1i}(\hat{\Sigma}_u) + G_{2i}(\hat{\Sigma}_u)\big] - E_*\big[G_{1i}(\hat{\Sigma}^*_u) + G_{2i}(\hat{\Sigma}^*_u)\big] \\
&\quad + 2E_*\Big[\big(\hat{\tau}_{ig}(Y_i; \hat{\beta}^*_g, \hat{\gamma}^*, \hat{\Sigma}^*_u) - \hat{\tau}_{ig}(Y^*_i; \hat{\beta}^*_g, \hat{\gamma}^*, \hat{\Sigma}_u)\big) \\
&\qquad \times \big(\hat{\tau}_{ig}(Y_i; \hat{\beta}^*_g, \hat{\gamma}^*, \hat{\Sigma}_u) - \hat{\tau}_{ig}(Y_i; \hat{\beta}_g, \hat{\gamma}, \hat{\Sigma}_u)\big)'\Big] \\
&\quad + E_*\Big[\big(\hat{\tau}_{ig}(Y_i; \hat{\beta}^*_g, \hat{\gamma}^*, \hat{\Sigma}^*_u) - \hat{\tau}_{ig}(Y_i; \hat{\beta}^*_g, \hat{\gamma}^*, \hat{\Sigma}_u)\big)\big(\;\big)'\Big], \quad (21)
\end{align*}
where $G_{1i}(\hat{\Sigma}_u) + G_{2i}(\hat{\Sigma}_u)$ and $G_{1i}(\hat{\Sigma}^*_u) + G_{2i}(\hat{\Sigma}^*_u)$ are given by equation (15) with $\Sigma_u$ replaced by $\hat{\Sigma}_u$ and $\hat{\Sigma}^*_u$, respectively. In addition, the dispersion matrices $D[\hat{\gamma}]$, $D[\widehat{BHC}]$, $D[\tilde{U}]$ and the covariance matrices $\mathrm{cov}[\tilde{U}, \hat{\gamma}]$, $\mathrm{cov}[\widehat{BHC}, \tilde{U}]$, $\mathrm{cov}[\widehat{BHC}, \hat{\gamma}]$ involved in equation (15) are presented in Theorem 5.2 and Theorem 5.3, respectively.
Appendix
Proof of Theorem 3.1. From the results of Theorem 4.1 in Ngaruye et al. (2016), for $A$, $K_2^{o\prime}K_1$ and $X$ of full rank, the parameters $\gamma$ and $B$ are uniquely estimated. Furthermore,
\begin{align*}
\hat{T}_1K_2^{o\prime} &= A^{-1}(A'S^{-1}A)^{-1}A'S^{-1}V_3K_1'K_2^o(K_2^{o\prime}K_1K_1'K_2^o)^{-1}K_2^{o\prime} \\
&= A^{-1}V_3K_1'K_2^o(K_2^{o\prime}K_1K_1'K_2^o)^{-1}K_2^{o\prime}, \\
A\hat{T}_1K_2^{o\prime}K_1 &= V_3P_{K_1'K_2^o}.
\end{align*}
Therefore, we can deduce the expressions of $\hat{B}$ and $\hat{\Sigma}_u$. Moreover, using the results of Lemma 4.1, straightforward calculations show that
\[
\big(Y - A\widehat{BHC} - 1_p\hat{\gamma}'X\big)Z' = V_3Q_{K_1'K_2^o}R_1'Z'.
\]
Hence, the predictor of the random effects matrix $U$ is given by
\[
\hat{U} = \big(\Sigma_e\hat{\Sigma}_u^{-1} + I_p\big)^{-1}\big(Y - A\widehat{BHC} - 1_p\hat{\gamma}'X\big)Z' = (\Sigma_e + \hat{\Sigma}_u)^{-1}\hat{\Sigma}_u V_3Q_{K_1'K_2^o}R_1'Z'.
\]
Furthermore, we observe that the matrix $V_3$, given by
\begin{align*}
V_3 &= YR_1 - YR_2K_2'(K_2K_2')^-K_1 - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}XR_1 \\
&\quad + \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}XR_2K_2'(K_2K_2')^-K_1
\end{align*}
in Theorem 4.1 of Ngaruye et al. (2016), can be simplified to
\[
V_3 = \Big(Y - \frac{1}{p}1_p1_p'YPX'(XPX')^{-1}X\Big)G_2R_1,
\]
where $G_2 = I - R_2K_2'(K_2K_2')^-HC$. Thus, replacing the estimators by their respective expressions yields the desired results.
References
Battese, G. E., Harter, R. M., and Fuller, W. A. (1988). An error-components model for prediction of county crop areas using survey and satellite data. Journal of the American Statistical Association, 83(401):28–36.
Bolfarine, H. and Zacks, S. (1992). Prediction theory for finite populations. Springer.
Butar, F. B. and Lahiri, P. (2003). On measures of uncertainty of empirical Bayes small-area estimators. Journal of Statistical Planning and Inference, 112(1):63–76.
Das, K., Jiang, J., and Rao, J. (2004). Mean squared error of empirical predictor. The Annals of Statistics, 32(2):818–840.
Datta, G. S. and Lahiri, P. (2000). A unified measure of uncertainty of estimated best linear unbiased predictors in small area estimation problems. Statistica Sinica, pages 613–627.
Datta, G. S., Lahiri, P., Maiti, T., and Lu, K. L. (1999). Hierarchical Bayes estimation of unemployment rates for the states of the US. Journal of the American Statistical Association, 94(448):1074–1082.
Henderson, C. R. (1973). Sire evaluation and genetic trends. Journal of Animal Science, 1973(Symposium):10–41.
Henderson, C. R. (1975). Best linear unbiased estimation and prediction under a selection model. Biometrics, 31(2):423–447.
Kackar, R. N. and Harville, D. A. (1984). Approximations for standard errors of estimators of fixed and random effects in mixed linear models. Journal of the American Statistical Association, 79(388):853–862.
Kollo, T. and von Rosen, D. (2005). Advanced Multivariate Statistics with Matrices. Springer, Dordrecht.
Kubokawa, T. and Nagashima, B. (2012). Parametric bootstrap methods for bias correction in linear mixed models. Journal of Multivariate Analysis, 106:1–16.
Ngaruye, I., Nzabanita, J., von Rosen, D., and Singull, M. (2016). Small area estimation under a multivariate linear model for repeated measures data. Communications in Statistics - Theory and Methods. http://dx.doi.org/10.1080/03610926.2016.1248784.
Nummi, T. (1997). Estimation in a random effects growth curve model. Journal of Applied Statistics, 24(2):157–168.
Pan, J.-X. and Fang, K.-T. (2002). Growth Curve Models and Statistical Diagnostics. Springer Science & Business Media.
Pfeffermann, D. (2002). Small area estimation - new developments and directions. International Statistical Review, 70(1):125–143.
Pfeffermann, D. (2013). New important developments in small area estimation. Statistical Science, 28(1):40–68.
Prasad, N. and Rao, J. (1990). The estimation of the mean squared error of small-area estimators. Journal of the American Statistical Association, 85(409):163–171.
Rao, J. N. K. (2003). Small Area Estimation. John Wiley and Sons, New York.
Rao, J. N. K. and Molina, I. (2015). Small Area Estimation. John Wiley and Sons, New York, 2 edition.
Yokoyama, T. (1995). Statistical inference on some mixed MANOVA-GMANOVA models with random effects. Hiroshima Mathematical Journal, 25(3):441–474.
Yokoyama, T. and Fujikoshi, Y. (1992). Tests for random-effects covariance structures in the growth curve model with covariates. Hiroshima Mathematical Journal, 22:195–202.