International Statistical Review

Survey Estimates by Calibration on Complex Auxiliary Information

Victor M. Estevao and Carl-Erik Särndal

In the last decade, calibration estimation has developed into an important field of research in survey sampling. Calibration is now an important methodological instrument in the production of statistics. Several national statistical agencies have developed software designed to compute calibrated weights based on auxiliary information available in population registers and other sources.
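As a rough illustration of what computing calibrated weights involves, the following minimal sketch implements the simplest textbook case: linear calibration under a chi-square distance, in which the design weights are adjusted so that the weighted sample totals of the auxiliary variables reproduce known population totals. The function name, the toy data, and the register totals are assumptions made for illustration; this is not the software used by any statistical agency, nor the more general procedures studied in the paper.

```python
import numpy as np

def linear_calibration_weights(d, x, totals):
    """Linear calibration: w_k = d_k * (1 + x_k' lam).

    d      : design weights, shape (n,)
    x      : auxiliary variables for the sampled units, shape (n, p)
    totals : known population totals of the auxiliaries, shape (p,)
    (All names are hypothetical, chosen for this sketch.)
    """
    d = np.asarray(d, dtype=float)
    x = np.asarray(x, dtype=float)
    totals = np.asarray(totals, dtype=float)
    # Weighted cross-product matrix: sum over the sample of d_k x_k x_k'
    cross = x.T @ (d[:, None] * x)
    # Gap between the known totals and their design-weighted estimates
    gap = totals - x.T @ d
    lam = np.linalg.solve(cross, gap)
    return d * (1.0 + x @ lam)

# Toy data (assumed): 4 sampled persons, each with design weight 25.
d = np.full(4, 25.0)
# Auxiliaries: an intercept (calibrates on population size) and age.
x = np.array([[1.0, 20.0],
              [1.0, 35.0],
              [1.0, 50.0],
              [1.0, 65.0]])
# Assumed register totals: 100 persons, total age 4500.
totals = np.array([100.0, 4500.0])

w = linear_calibration_weights(d, x, totals)
print(w)        # calibrated weights
print(x.T @ w)  # reproduces the register totals
```

The linear form can produce negative weights when the sample is far from the population totals; bounded distance functions are a common alternative, which this sketch does not cover.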

This paper reviews some recent progress and offers some new perspectives. Calibration estimation can be used to advantage in a range of different survey conditions. This paper examines several situations, including estimation for domains in one-phase sampling, estimation for two-phase sampling, and estimation for two-stage sampling with integrated weighting. Typical of those situations is complex auxiliary information, a term that we use for information made up of several components. An example occurs when a two-stage sample survey has information both for units and for clusters of units, or when estimation for domains relies on information from different parts of the population.

Complex auxiliary information opens up more than one way of computing the final calibrated weights to be used in estimation. They may be computed in a single step or in two or more successive steps. Depending on the approach, the resulting estimates do differ to some degree. All significant parts of the total information should be reflected in the final weights. The effectiveness of the complex information is mirrored by the variance of the resulting calibration estimator. Its exact variance is not presentable in simple form. Close approximation is possible via the corresponding linearized statistic. We define and use automated linearization as a shortcut in finding the linearized statistic. Its variance is easy to state, to interpret and to estimate. The variance components are expressed in terms of residuals, similar to those of standard regression theory. Visual inspection of the residuals reveals how the different components of the complex auxiliary information interact and work together toward reducing the variance.
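To make the role of residuals concrete, the following sketch writes out the standard linearization for the simplest setting only: one-phase sampling with a single auxiliary vector and linear calibration. The notation is assumed for illustration ($U$ the population, $s$ the sample, $\pi_k$ and $\pi_{kl}$ the first- and second-order inclusion probabilities, $d_k = 1/\pi_k$, $w_k$ the calibrated weights). It is the familiar regression-estimator result, not the paper's expressions for complex auxiliary information, which involve several residual components.

```latex
% Linearized form of the calibration estimator (one-phase, linear case)
\hat{Y}_{\mathrm{CAL}} = \sum_{k \in s} w_k y_k
  \;\approx\; \sum_{k \in U} \mathbf{x}_k' \mathbf{B} + \sum_{k \in s} d_k E_k,
\qquad
E_k = y_k - \mathbf{x}_k' \mathbf{B},
\qquad
\mathbf{B} = \Bigl( \sum_{k \in U} \mathbf{x}_k \mathbf{x}_k' \Bigr)^{-1}
             \sum_{k \in U} \mathbf{x}_k y_k .

% Approximate variance: the design variance of the weighted residual total
V\bigl(\hat{Y}_{\mathrm{CAL}}\bigr) \;\approx\;
  \sum_{k \in U} \sum_{l \in U} (\pi_{kl} - \pi_k \pi_l)\,
  \frac{E_k}{\pi_k}\,\frac{E_l}{\pi_l} .

% Variance estimator based on sample residuals
\hat{V} = \sum_{k \in s} \sum_{l \in s}
  \frac{\pi_{kl} - \pi_k \pi_l}{\pi_{kl}}\,(w_k e_k)(w_l e_l),
\qquad
e_k = y_k - \mathbf{x}_k' \hat{\mathbf{B}},
\qquad
\hat{\mathbf{B}} = \Bigl( \sum_{k \in s} d_k \mathbf{x}_k \mathbf{x}_k' \Bigr)^{-1}
                   \sum_{k \in s} d_k \mathbf{x}_k y_k .
```

In this simple case, the better the auxiliary vector predicts $y_k$, the smaller the residuals $E_k$ and hence the smaller the approximate variance.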

Article information

Internat. Statist. Rev., Volume 74, Number 2 (2006), 127-147.

First available in Project Euclid: 24 July 2006

Keywords: Official statistics production; administrative registers; calibrated weights; design-based inference; automated linearization; integrated weighting; ecological fallacy; domains of interest; two-stage sampling; two-phase sampling


Estevao, Victor M.; Särndal, Carl-Erik. Survey Estimates by Calibration on Complex Auxiliary Information. Internat. Statist. Rev. 74 (2006), no. 2, 127-147.

