A note on centering in subsample selection for linear regression

Centering is a commonly used technique in linear regression analysis. With
centered data on both the responses and covariates, the ordinary least squares
estimator of the slope parameter can be calculated from a model without the
intercept. If a subsample is selected from centered full data, the subsample
is typically un-centered. In this case, is it still appropriate to fit a model
without the intercept? The answer is yes, and we show that the least squares
estimator of the slope parameter obtained from a model without the intercept is
unbiased and has a smaller variance-covariance matrix in the Loewner order
than that obtained from a model with the intercept. We further show that, for
noninformative weighted subsampling with a weighted least squares estimator,
relocating the subsample using the full-data weighted means improves the
estimation efficiency.
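
The first claim can be illustrated with a small Monte Carlo sketch. This is not the paper's own experiment: the function name `simulate`, the parameter values, and the selection rule (keep the rows with the largest covariate values) are all assumptions chosen for illustration. The rule depends only on the covariates, and it leaves the subsample clearly un-centered. With a single covariate, the Loewner order reduces to an ordinary comparison of two variances.

```python
import numpy as np


def simulate(n_full=1000, n_sub=50, alpha=1.0, beta=2.0, sigma=1.0,
             n_rep=2000, seed=0):
    """Compare slope estimators on a subsample of centered full data.

    All names and defaults are illustrative assumptions, not from the paper.
    """
    rng = np.random.default_rng(seed)

    # Covariates are generated once and held fixed across replications.
    x = rng.normal(size=n_full)
    xc = x - x.mean()  # centered covariates (full-data mean removed)

    # Hypothetical covariate-only selection rule: keep the n_sub largest
    # values. The subsample mean of xc is then far from zero.
    idx = np.argsort(xc)[-n_sub:]
    xs = xc[idx]
    design_with_int = np.column_stack([np.ones(n_sub), xs])

    slopes_no_int = np.empty(n_rep)
    slopes_with_int = np.empty(n_rep)
    for r in range(n_rep):
        # Fresh responses from y = alpha + beta * x + noise.
        y = alpha + beta * x + rng.normal(scale=sigma, size=n_full)
        ys = (y - y.mean())[idx]  # center with the full-data mean, then subsample

        # Model without intercept, fitted to the (un-centered) subsample.
        slopes_no_int[r] = (xs @ ys) / (xs @ xs)

        # Model with intercept, fitted to the same subsample.
        coef, *_ = np.linalg.lstsq(design_with_int, ys, rcond=None)
        slopes_with_int[r] = coef[1]

    return slopes_no_int, slopes_with_int


if __name__ == "__main__":
    no_int, with_int = simulate()
    print("mean slope, no intercept:  ", no_int.mean())
    print("mean slope, with intercept:", with_int.mean())
    print("variance, no intercept:    ", no_int.var())
    print("variance, with intercept:  ", with_int.var())
```

Under these assumptions, both averages should sit close to the true slope, and the variance of the no-intercept estimator should be visibly smaller. The gap is large here because the selection rule is one-sided; a subsample whose covariate mean is near zero would show a much smaller difference.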
arxiv.org