The Annals of Statistics

Asymptotic optimality and efficient computation of the leave-subject-out cross-validation

Ganggang Xu and Jianhua Z. Huang

Although the leave-subject-out cross-validation (CV) has been widely used in practice for tuning parameter selection for various nonparametric and semiparametric models of longitudinal data, its theoretical property is unknown and solving the associated optimization problem is computationally expensive, especially when there are multiple tuning parameters. In this paper, by focusing on the penalized spline method, we show that the leave-subject-out CV is optimal in the sense that it is asymptotically equivalent to the empirical squared error loss function minimization. An efficient Newton-type algorithm is developed to compute the penalty parameters that optimize the CV criterion. Simulated and real data are used to demonstrate the effectiveness of the leave-subject-out CV in selecting both the penalty parameters and the working correlation matrix.
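The leave-subject-out CV criterion described above can be illustrated with a minimal sketch. It assumes a linear truncated power basis, a ridge-type penalty on the knot coefficients, a working independence correlation structure, and a single penalty parameter chosen by plain grid search. The grid search stands in for, and is not, the Newton-type algorithm developed in the paper. All function names (`spline_basis`, `fit_penalized`, `lso_cv`) and the simulated data are hypothetical and serve only as illustration.

```python
import numpy as np

def spline_basis(t, knots):
    """Linear truncated power basis: columns 1, t, and (t - k)_+ for each knot."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t), t] + [np.maximum(t - k, 0.0) for k in knots]
    return np.column_stack(cols)

def fit_penalized(X, y, lam, n_unpenalized=2):
    """Penalized least squares; only the knot coefficients are penalized."""
    D = np.eye(X.shape[1])
    D[:n_unpenalized, :n_unpenalized] = 0.0
    return np.linalg.solve(X.T @ X + lam * D, X.T @ y)

def lso_cv(t, y, subject, knots, lam):
    """Leave-subject-out CV: hold out ALL observations of one subject at a time."""
    X = spline_basis(t, knots)
    sse = 0.0
    for s in np.unique(subject):
        held = subject == s
        beta = fit_penalized(X[~held], y[~held], lam)
        sse += np.sum((y[held] - X[held] @ beta) ** 2)
    return sse / len(y)

# Simulated longitudinal data (assumption): 30 subjects, 5 observations each,
# with a subject-level random intercept inducing within-subject correlation.
rng = np.random.default_rng(0)
n_subjects, n_obs = 30, 5
subject = np.repeat(np.arange(n_subjects), n_obs)
t = rng.uniform(0.0, 1.0, size=subject.size)
b = rng.normal(0.0, 0.5, size=n_subjects)
y = np.sin(2 * np.pi * t) + b[subject] + rng.normal(0.0, 0.3, size=t.size)

# Grid search over the penalty parameter (a stand-in for the paper's algorithm).
knots = np.linspace(0.1, 0.9, 9)
lams = 10.0 ** np.arange(-4, 3)
scores = np.array([lso_cv(t, y, subject, knots, lam) for lam in lams])
best_lam = lams[np.argmin(scores)]
```

Holding out whole subjects, rather than single observations, keeps correlated measurements from the same subject out of the training fit. With several penalty parameters the grid grows exponentially in their number, which is the computational cost the abstract refers to.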

Ann. Statist., Volume 40, Number 6 (2012), 3003-3030.

First available in Project Euclid: 8 February 2013

Primary: 62G08: Nonparametric regression
Secondary: 62G05: Estimation; 62G20: Asymptotic properties; 62H12: Estimation; 41A15: Spline approximation

Keywords: Cross-validation; generalized estimating equations; multiple smoothing parameters; penalized splines; working correlation matrices


Xu, Ganggang; Huang, Jianhua Z. Asymptotic optimality and efficient computation of the leave-subject-out cross-validation. Ann. Statist. 40 (2012), no. 6, 3003-3030. doi:10.1214/12-AOS1063.

Supplemental materials

  • Supplementary material: Efficient algorithm and additional proofs. The Supplementary Material gives a detailed description of the algorithm proposed in Section 3.2, together with proofs of some technical lemmas.