Calibration Adjustment for Nonresponse in Sample Surveys

by

Bernardo João Rota

Academic dissertation

Dissertation for the degree of Doctor of Philosophy in Statistics, to be publicly defended on Thursday, 27 October 2016, at 14:15,

Lecture Hall M (Hörsal M), Örebro University. Opponent: Professor Risto Lehtonen,

University of Helsinki

Örebro University, School of Business, 701 82 ÖREBRO

Abstract

Bernardo João Rota (2016): Calibration Adjustment for Nonresponse in Sample Surveys. Örebro Studies in Statistics 8.

In this thesis, we discuss calibration estimation in the presence of nonresponse, with a focus on the linear calibration estimator and the propensity calibration estimator, along with the use of different levels of auxiliary information, namely the sample and population levels. The thesis is based on four papers, two of which discuss estimation in two steps. The two-step estimator suggested here is an improved compromise between the linear calibration and propensity calibration estimators mentioned above. Assuming that the functional form of the response model is known, the model is estimated in the first step using a calibration approach. In the second step, the linear calibration estimator is constructed by replacing the design weights with the products of these weights and the inverse of the response probabilities estimated in the first step. The first step uses sample-level auxiliary information, and we demonstrate that this results in more efficient estimated response probabilities than using population-level information, as suggested earlier. The variance expression for the two-step estimator is derived, and an estimator of this variance is suggested. The two other papers address the use of auxiliary variables in estimation. One of them introduces the use of principal components theory in calibration for nonresponse adjustment and suggests selecting components using the theory of canonical correlation. Principal components are used as a means of handling estimation in the presence of large sets of candidate auxiliary variables. In addition to the use of auxiliary variables, the last paper also discusses the use of explicit models representing the true response behavior. Usually, simple models such as logistic, probit, linear or log-linear are used for this purpose. However, given the possible complexity of the structure of the true response probability, the question arises whether these simple models are effective. We use an example of a telephone-based survey data collection process and demonstrate that the logistic model is generally not appropriate.
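
The two-step construction described in the abstract can be sketched in symbols as follows. The notation is an assumption made here for illustration and is not taken from the thesis: s is the sample, r the response set, d_k the design weights, z_k an auxiliary vector known for every sampled unit, x_k an auxiliary vector with known population total X, and p(z_k; g) the assumed response model with parameter g. The calibration equation in the first step is one common form of sample-level propensity calibration; the thesis may use a different variant.

```latex
% Step 1 (sample-level calibration for the response model):
% find \hat{g} such that the response-set total, weighted by inverse
% response probabilities, reproduces the design-weighted sample total of z_k.
\sum_{k \in r} \frac{d_k}{p(\mathbf{z}_k; \hat{g})}\, \mathbf{z}_k
  = \sum_{k \in s} d_k \mathbf{z}_k ,
\qquad
\hat{p}_k = p(\mathbf{z}_k; \hat{g}).

% Step 2 (linear calibration with adjusted initial weights d_k / \hat{p}_k):
w_k = \frac{d_k}{\hat{p}_k}\left(1 + \hat{\boldsymbol{\lambda}}^{\top}\mathbf{x}_k\right),
\qquad
\hat{\boldsymbol{\lambda}}
  = \Bigl(\sum_{k \in r} \frac{d_k}{\hat{p}_k}\, \mathbf{x}_k \mathbf{x}_k^{\top}\Bigr)^{-1}
    \Bigl(\mathbf{X} - \sum_{k \in r} \frac{d_k}{\hat{p}_k}\, \mathbf{x}_k\Bigr).

% The resulting weights satisfy the calibration constraint
% \sum_{k \in r} w_k \mathbf{x}_k = \mathbf{X}, and the estimator of the
% population total of y is
\hat{Y} = \sum_{k \in r} w_k\, y_k .
```

Under these assumptions, the second step is the usual linear calibration estimator with d_k replaced by d_k / \hat{p}_k, which matches the verbal description above.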

Keywords: auxiliary variables, calibration, nonresponse, principal components, regression estimator, response probability, survey sampling, two-step estimator, variance estimator, weighting.

Bernardo João Rota, School of Business