## The Annals of Applied Statistics

- Ann. Appl. Stat.
- Volume 13, Number 2 (2019), 1166-1197.

### Semiparametric empirical best prediction for small area estimation of unemployment indicators

Maria Francesca Marino, Maria Giovanna Ranalli, Nicola Salvati, and Marco Alfò

#### Abstract

The Italian National Institute for Statistics regularly provides estimates of unemployment indicators using data from the labor force survey. However, direct estimates of unemployment incidence cannot be released for local labor market areas. These are unplanned domains defined as clusters of municipalities; many are out-of-sample areas, and the majority is characterized by a small sample size, which renders direct estimates inadequate. The empirical best predictor represents an appropriate, model-based alternative. However, for non-Gaussian responses, its computation and the computation of the analytic approximation to its mean squared error require the solution of (possibly) multiple integrals that, generally, do not have a closed form. To solve the issue, Monte Carlo methods and parametric bootstrap are common choices, although their computational burden is nontrivial. In this paper, we propose a semiparametric empirical best predictor for a (possibly) nonlinear mixed effect model by leaving the distribution of the area-specific random effects unspecified and estimating it from the observed data. This approach is known to lead to a discrete mixing distribution, which helps avoid unverifiable parametric assumptions and heavy integral approximations. We also derive a second-order, bias-corrected analytic approximation to the corresponding mean squared error. Finite-sample properties of the proposed approach are tested via a large-scale simulation study. Furthermore, the proposal is applied to unit-level data from the 2012 Italian Labor Force Survey to estimate unemployment incidence for 611 local labor market areas using auxiliary information from administrative registers and the 2011 Census.
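To illustrate why a discrete mixing distribution removes the need for integral approximations, the following is a minimal sketch for a unit-level mixed logistic model. It is not the authors' R implementation. It assumes the fixed effects `beta`, the mass-point locations `locations`, and their probabilities `masses` have already been estimated; all function names, variable names, and numerical values are hypothetical. Under these assumptions, the integral over the random effect reduces to a finite sum: each area receives posterior weights over the mass points, and the predicted proportion combines observed responses with weighted predictions for non-sampled units.

```python
import math


def sigmoid(t):
    """Inverse logit link."""
    return 1.0 / (1.0 + math.exp(-t))


def posterior_weights(y, X, beta, locations, masses):
    """Posterior probability of each mass point for one area.

    w_k is proportional to masses[k] times the Bernoulli likelihood of the
    area's sampled responses evaluated at random effect locations[k].
    With no sampled units (out-of-sample area) the weights equal the masses.
    """
    log_w = []
    for xi, pi in zip(locations, masses):
        ll = math.log(pi)
        for y_ij, x_ij in zip(y, X):
            p = sigmoid(sum(b * x for b, x in zip(beta, x_ij)) + xi)
            ll += math.log(p) if y_ij == 1 else math.log(1.0 - p)
        log_w.append(ll)
    m = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(v - m) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def predict_area_proportion(y, X_sampled, X_unsampled, beta, locations, masses):
    """Predicted area proportion: observed responses plus a finite-sum
    prediction for each non-sampled unit, divided by the area size."""
    w = posterior_weights(y, X_sampled, beta, locations, masses)
    predicted = 0.0
    for x in X_unsampled:
        eta = sum(b * v for b, v in zip(beta, x))
        predicted += sum(w_k * sigmoid(eta + xi) for w_k, xi in zip(w, locations))
    return (sum(y) + predicted) / (len(y) + len(X_unsampled))


# Toy inputs (made up for illustration only)
beta = [-1.0, 0.5]                  # intercept and one covariate
locations = [-1.0, 0.0, 1.5]        # mass points of the mixing distribution
masses = [0.3, 0.5, 0.2]            # their prior probabilities
y = [1, 0, 1, 1]                    # sampled binary responses in one area
X_s = [[1.0, 0.2], [1.0, -0.4], [1.0, 1.0], [1.0, 0.3]]
X_r = [[1.0, 0.0], [1.0, 0.5], [1.0, -1.0]]  # covariates of non-sampled units

w = posterior_weights(y, X_s, beta, locations, masses)
p_hat = predict_area_proportion(y, X_s, X_r, beta, locations, masses)
print(w, p_hat)
```

For an out-of-sample area, passing empty lists for `y` and `X_sampled` makes the posterior weights fall back to the estimated masses, so the same function covers both cases.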

#### Article information

**Source**

Ann. Appl. Stat., Volume 13, Number 2 (2019), 1166-1197.

**Dates**

Received: December 2017

Revised: August 2018

First available in Project Euclid: 17 June 2019

**Permanent link to this document**

https://projecteuclid.org/euclid.aoas/1560758442

**Digital Object Identifier**

doi:10.1214/18-AOAS1226

**Mathematical Reviews number (MathSciNet)**

MR3963567

**Zentralblatt MATH identifier**

07094850

**Keywords**

Binary data; exponential family; finite mixtures; general parameters; mixed logistic model; unit-level model

#### Citation

Marino, Maria Francesca; Ranalli, Maria Giovanna; Salvati, Nicola; Alfò, Marco. Semiparametric empirical best prediction for small area estimation of unemployment indicators. Ann. Appl. Stat. 13 (2019), no. 2, 1166-1197. doi:10.1214/18-AOAS1226. https://projecteuclid.org/euclid.aoas/1560758442

#### Supplemental materials

- Supplement to “Semiparametric empirical best prediction for small area estimation of unemployment indicators”. The online Supplementary Material describes the EM algorithm for parameter estimation and the procedure for estimating the covariance matrix of the model parameters. It also reports computational details for deriving the bias correction term for the MSE estimator of the proposed sp-EBP, as well as explicit formulas for computing model derivatives in the case of binary data. Some additional simulation results are also presented. Last, a computationally efficient algorithm for estimation and inference, developed in R by the authors, is made available together with an example data set as part of the online Supplementary Material.
  Digital Object Identifier: doi:10.1214/18-AOAS1226SUPP