Citation:
Molina, I., & Martín, N. (2018). Empirical best prediction under a nested error model with log transformation. The Annals of Statistics, 46(5), pp. 1961-1993.
Funder:
Ministerio de Economía y Competitividad (España)
Sponsor:
Supported by the Spanish Grants SEJ-2007-64500 and MTM2012-37077-C02-01. Supported by the Spanish Grants MTM-2012-33740 and ECO-2011-25706.
Project:
Gobierno de España. SEJ-2007-64500; Gobierno de España. MTM2012-37077-C02-01; Gobierno de España. MTM-2012-33740; Gobierno de España. ECO-2011-25706
Keywords:
Empirical best estimator, Mean squared error, Parametric bootstrap
In regression models involving economic variables such as income, log
transformation is typically taken to achieve approximate normality and stabilize
the variance. However, often the interest is predicting individual values
or means of the variable in the original scale. Under a nested error model for
the log transformation of the target variable, we show that the usual approach
of back transforming the predicted values may introduce a substantial bias.
We obtain the optimal (or “best”) predictors of individual values of the original
variable and of small area means under that model. Empirical best predictors
are defined by estimating the unknown model parameters in the best
predictors. When estimation is desired for subpopulations with small sample
sizes (small areas), nested error models are widely used to “borrow strength”
from the other areas and obtain estimators with greater efficiency than direct
estimators based on the scarce area-specific data. We show that naive predictors
of small area means obtained by back-transformation under the mentioned
model may even underperform direct estimators. Moreover, assessing
the uncertainty of the considered predictor is not straightforward. Exact mean
squared errors of the best predictors and second-order approximations to the
mean squared errors of the empirical best predictors are derived. Estimators
of the mean squared errors that are second-order correct are also obtained.
Simulation studies and an example with Mexican data on living conditions
illustrate the procedures.
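The back-transformation bias mentioned in the abstract can be seen in a much simpler setting than the paper's nested error model. The sketch below is only an illustration under an assumed i.i.d. lognormal model with known parameters, not the predictors derived in the paper: if log(Y) is normal with mean mu and standard deviation sigma, then E[Y] = exp(mu + sigma^2 / 2), so exponentiating the mean of the logs falls short of the mean on the original scale. The function name and the parameter values are hypothetical choices made for this example.

```python
import math
import random


def back_transform_comparison(mu, sigma, n=200_000, seed=0):
    """Compare three quantities for Y = exp(Z), Z ~ Normal(mu, sigma^2).

    Illustrative assumption: i.i.d. draws with known mu and sigma.
    This is NOT the nested error model or the predictors of the paper.
    """
    rng = random.Random(seed)
    logs = [rng.gauss(mu, sigma) for _ in range(n)]

    # Simulated mean of Y on the original scale.
    sample_mean = sum(math.exp(z) for z in logs) / n

    # Naive back-transformation: exponentiate the mean of the logs.
    naive = math.exp(sum(logs) / n)

    # Lognormal mean, which includes the sigma^2 / 2 correction term.
    corrected = math.exp(mu + sigma ** 2 / 2)

    return sample_mean, naive, corrected


if __name__ == "__main__":
    sample_mean, naive, corrected = back_transform_comparison(0.0, 1.0)
    print(f"simulated mean of Y:    {sample_mean:.3f}")
    print(f"naive back-transform:   {naive:.3f}")
    print(f"exp(mu + sigma^2 / 2):  {corrected:.3f}")
```

In this toy setting the naive value stays near exp(mu) while the simulated mean tracks exp(mu + sigma^2 / 2), and the gap grows with sigma. The paper's setting adds area random effects and estimated parameters, which this sketch does not attempt to reproduce.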