## Electronic Journal of Statistics

### P-splines with an $\ell_{1}$ penalty for repeated measures

#### Abstract

P-splines are penalized B-splines, in which finite-order differences of adjacent coefficients are typically penalized with an $\ell_{2}$ norm. P-splines can be used for semiparametric regression and can include random effects to account for within-subject correlations. In addition to $\ell_{2}$ penalties, $\ell_{1}$-type penalties have been used in nonparametric and semiparametric regression to achieve greater flexibility, such as in locally adaptive regression splines, $\ell_{1}$ trend filtering, and the fused lasso additive model. However, there has been less focus on using $\ell_{1}$ penalties in P-splines, particularly for estimating conditional means.
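In standard P-spline notation (assumed here; the paper's own symbols may differ), with $B$ the B-spline basis matrix and $D_k$ the $k$-th order difference matrix acting on the coefficients $\theta$, the two criteria contrast as

```latex
\hat{\theta}_{\ell_2} = \arg\min_{\theta}\; \tfrac{1}{2}\lVert y - B\theta \rVert_2^2
  + \lambda \lVert D_k \theta \rVert_2^2,
\qquad
\hat{\theta}_{\ell_1} = \arg\min_{\theta}\; \tfrac{1}{2}\lVert y - B\theta \rVert_2^2
  + \lambda \lVert D_k \theta \rVert_1 .
```

The $\ell_2$ penalty shrinks all coefficient differences toward zero smoothly, while the $\ell_1$ penalty can set many differences exactly to zero, allowing fits with abrupt level or slope changes.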

In this paper, we demonstrate the potential benefits of using an $\ell_{1}$ penalty in P-splines with an emphasis on fitting non-smooth functions. We propose an estimation procedure using the alternating direction method of multipliers and cross-validation, and provide degrees of freedom and approximate confidence bands based on a ridge approximation to the $\ell_{1}$ penalized fit. We also demonstrate potential uses through simulations and an application to electrodermal activity data collected as part of a stress study.
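The core of the estimation procedure described above can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it uses a fixed penalty level and ADMM step size (no cross-validation, no random effects, no confidence bands), and the function names `bspline_basis` and `l1_pspline` are made up for this example. The basis is built with the standard Cox–de Boor recursion, and ADMM alternates a ridge-type coefficient update, a soft-thresholding update, and a dual update.

```python
import numpy as np

def bspline_basis(x, knots, degree=3):
    """Evaluate a B-spline basis at points x via the Cox-de Boor recursion."""
    t = np.concatenate([np.repeat(knots[0], degree), knots,
                        np.repeat(knots[-1], degree)])
    # degree-0 basis: indicators of the knot intervals
    B = np.zeros((len(x), len(t) - 1))
    for i in range(len(t) - 1):
        B[:, i] = (x >= t[i]) & (x < t[i + 1])
    B[x == t[-1], len(t) - degree - 2] = 1.0  # include the right endpoint
    for d in range(1, degree + 1):
        newB = np.zeros((len(x), len(t) - d - 1))
        for i in range(len(t) - d - 1):
            if t[i + d] > t[i]:
                newB[:, i] += (x - t[i]) / (t[i + d] - t[i]) * B[:, i]
            if t[i + d + 1] > t[i + 1]:
                newB[:, i] += ((t[i + d + 1] - x)
                               / (t[i + d + 1] - t[i + 1]) * B[:, i + 1])
        B = newB
    return B

def soft_threshold(a, kappa):
    return np.sign(a) * np.maximum(np.abs(a) - kappa, 0.0)

def l1_pspline(y, B, lam, order=1, rho=1.0, n_iter=300):
    """ADMM for min_theta 0.5*||y - B theta||^2 + lam*||D theta||_1."""
    p = B.shape[1]
    D = np.diff(np.eye(p), n=order, axis=0)     # finite-difference penalty matrix
    A = np.linalg.inv(B.T @ B + rho * D.T @ D)  # factor once, reuse each iteration
    Bty = B.T @ y
    z = np.zeros(D.shape[0])
    u = np.zeros_like(z)
    for _ in range(n_iter):
        theta = A @ (Bty + rho * D.T @ (z - u))       # ridge-type theta-update
        z = soft_threshold(D @ theta + u, lam / rho)  # shrinkage z-update
        u = u + D @ theta - z                         # dual update
    return theta

# Toy example: a step function, which an l1 penalty can track sharply
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = (x > 0.5).astype(float) + 0.1 * rng.standard_normal(len(x))
B_mat = bspline_basis(x, np.linspace(0.0, 1.0, 12), degree=3)
theta_hat = l1_pspline(y, B_mat, lam=0.1)
fitted = B_mat @ theta_hat
```

With an $\ell_2$ penalty the same $\theta$-update would be the entire fit; the ADMM loop is the extra price paid for the non-smooth $\ell_1$ term, and the ridge-type $\theta$-update is what motivates a ridge approximation for degrees of freedom and confidence bands.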

#### Article information

Source
Electron. J. Statist., Volume 12, Number 2 (2018), 3554-3600.

Dates
Received: July 2017
First available in Project Euclid: 31 October 2018

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1540951342

Digital Object Identifier
doi:10.1214/18-EJS1487

Mathematical Reviews number (MathSciNet)
MR3870506

Zentralblatt MATH identifier
06970012

Subjects
Primary: 62G08: Nonparametric regression
Secondary: 62P10: Applications to biology and medical sciences

#### Citation

Segal, Brian D.; Elliott, Michael R.; Braun, Thomas; Jiang, Hui. P-splines with an $\ell_{1}$ penalty for repeated measures. Electron. J. Statist. 12 (2018), no. 2, 3554--3600. doi:10.1214/18-EJS1487. https://projecteuclid.org/euclid.ejs/1540951342

#### References

• [1] Bollaerts, K., Eilers, P. H. C. and Aerts, M. (2006). Quantile regression with monotonicity restrictions using P-splines and the L1-norm. Statistical Modelling 6 189–207.
• [2] Boyd, S., Parikh, N., Chu, E., Peleato, B. and Eckstein, J. (2011). Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends® in Machine Learning 3 1–122.
• [3] Chen, H. and Wang, Y. (2011). A penalized spline approach to functional mixed effects model analysis. Biometrics 67 861–870.
• [4] De Boor, C. (2001). A Practical Guide to Splines, Revised ed. Springer, New York, NY.
• [5] Donoho, D. L. and Johnstone, I. M. (1998). Minimax estimation via wavelet shrinkage. The Annals of Statistics 26 879–921.
• [6] Efron, B. (1986). How biased is the apparent error rate of a prediction rule? Journal of the American Statistical Association 81 461–470.
• [7] Eilers, P. H. C. (2000). Robust and quantile smoothing with P-splines and the L1 norm. In Proceedings of the 15th International Workshop on Statistical Modelling, Bilbao.
• [8] Eilers, P. H. C. and Marx, B. D. (1996). Flexible smoothing with B-splines and penalties. Statistical Science 11 89–121.
• [9] Eilers, P. H. C., Marx, B. D. and Durbán, M. (2015). Twenty years of P-splines. SORT: Statistics and Operations Research Transactions 39 149–186.
• [10] Fitzmaurice, G., Davidian, M., Verbeke, G. and Molenberghs, G. (2008). Longitudinal Data Analysis. Chapman and Hall/CRC, Boca Raton, FL.
• [11] Gelman, A., Jakulin, A., Pittau, M. G. and Su, Y.-S. (2008). A weakly informative default prior distribution for logistic and other regression models. The Annals of Applied Statistics 2 1360–1383.
• [12] Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A. and Rubin, D. B. (2014). Bayesian Data Analysis, 3rd ed. Chapman and Hall/CRC, Boca Raton, FL.
• [13] Green, P. J. (1987). Penalized likelihood for general semi-parametric regression models. International Statistical Review 55 245–259.
• [14] Guo, W. (2002). Functional mixed effects models. Biometrics 58 121–128.
• [15] Hastie, T. and Tibshirani, R. (1986). Generalized additive models. Statistical Science 1 297–318.
• [16] Hastie, T. and Tibshirani, R. (1990). Generalized Additive Models, 1st ed. Monographs on Statistics and Applied Probability. Chapman & Hall, London.
• [17] Hastie, T. and Tibshirani, R. (1993). Varying-coefficient models. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 55 757–796.
• [18] Ishwaran, H. and Rao, J. S. (2005). Spike and slab variable selection: Frequentist and Bayesian strategies. The Annals of Statistics 33 730–773.
• [19] Janson, L., Fithian, W. and Hastie, T. J. (2015). Effective degrees of freedom: a flawed metaphor. Biometrika 102 479–485.
• [20] Kim, S.-J., Koh, K., Boyd, S. and Gorinevsky, D. (2009). $\ell_1$ trend filtering. SIAM Review 51 339–360.
• [21] Lin, Y. and Zhang, H. H. (2006). Component selection and smoothing in multivariate nonparametric regression. The Annals of Statistics 34 2272–2297.
• [22] Lou, Y., Bien, J., Caruana, R. and Gehrke, J. (2016). Sparse partially linear additive models. Journal of Computational and Graphical Statistics 25.
• [23] Mammen, E. and van de Geer, S. (1997). Locally adaptive regression splines. The Annals of Statistics 25 387–413.
• [24] Meier, L., Van de Geer, S. and Bühlmann, P. (2009). High-dimensional additive modeling. The Annals of Statistics 37 3779–3821.
• [25] Petersen, A., Witten, D. and Simon, N. (2016). Fused lasso additive model. Journal of Computational and Graphical Statistics 25 1005–1025.
• [26] Ramdas, A. and Tibshirani, R. J. (2016). Fast and flexible ADMM algorithms for trend filtering. Journal of Computational and Graphical Statistics 25 839–858.
• [27] Ravikumar, P., Lafferty, J. D., Liu, H. and Wasserman, L. (2009). Sparse additive models. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 71.
• [28] Rice, J. A. and Wu, C. O. (2001). Nonparametric mixed effects models for unequally sampled noisy curves. Biometrics 57 253–259.
• [29] Ruppert, D. (2002). Selecting the number of knots for penalized splines. Journal of Computational and Graphical Statistics 11 735–757.
• [30] Ruppert, D., Wand, M. P. and Carroll, R. J. (2003). Semiparametric Regression. Cambridge University Press, New York, NY.
• [31] Sadhanala, V. and Tibshirani, R. J. (2017). Additive models with trend filtering. arXiv preprint arXiv:1702.05037.
• [32] Scheipl, F., Staicu, A.-M. and Greven, S. (2015). Functional additive mixed models. Journal of Computational and Graphical Statistics 24 447–501.
• [33] Segal, B. D., Elliott, M. R., Braun, T. and Jiang, H. (2018). Supplementary material for “P-splines with an $\ell_1$ penalty for repeated measures”.
• [34] Speed, T. (1991). Comment on “That BLUP is a Good Thing: The Estimation of Random Effects”. Statistical Science 6 42–44.
• [35] Stein, C. M. (1981). Estimation of the mean of a multivariate normal distribution. The Annals of Statistics 9 1135–1151.
• [36] Stan Development Team (2016). RStan: the R interface to Stan. R package version 2.14.1.
• [37] R Core Team (2017). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
• [38] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58 267–288.
• [39] Tibshirani, R. J. (2014a). Adaptive piecewise polynomial estimation via trend filtering. The Annals of Statistics 42 285–323.
• [40] Tibshirani, R. J. (2014b). Supplement to “Adaptive piecewise polynomial estimation via trend filtering”.
• [41] Tibshirani, R. J. and Taylor, J. (2012). Degrees of freedom in lasso problems. The Annals of Statistics 40 1198–1232.
• [42] Tibshirani, R., Saunders, M., Rosset, S., Zhu, J. and Knight, K. (2005). Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67 91–108.
• [43] Wahba, G. (1990). Spline Models for Observational Data. Society for Industrial and Applied Mathematics, Philadelphia, PA.
• [44] Wang, Y. (1998). Mixed effects smoothing spline analysis of variance. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 60 159–174.
• [45] Wang, Y.-X., Smola, A. and Tibshirani, R. (2014). The falling factorial basis and its statistical applications. In International Conference on Machine Learning 730–738.
• [46] Wood, S. N. (2004). Stable and efficient multiple smoothing parameter estimation for generalized additive models. Journal of the American Statistical Association 99 673–686.
• [47] Wood, S. N. (2006). Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC, Boca Raton, FL.
• [48] Wood, S. N. (2011). Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73 3–36.
• [49] Wood, S. N., Goude, Y. and Shaw, S. (2015). Generalized additive models for large data sets. Journal of the Royal Statistical Society: Series C (Applied Statistics) 64 139–155.
• [50] Wood, S. N., Pya, N. and Säfken, B. (2016). Smoothing parameter and model selection for general smooth models. Journal of the American Statistical Association 111 1548–1575.
• [51] Yuan, M. and Lin, Y. (2006). Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68 49–67.
• [52] Zhang, D., Lin, X., Raz, J. and Sowers, M. (1998). Semiparametric stochastic mixed models for longitudinal data. Journal of the American Statistical Association 93 710–719.
• [53] Zhao, P., Rocha, G. and Yu, B. (2009). The composite absolute penalties family for grouped and hierarchical variable selection. The Annals of Statistics 37 3468–3497.
• [54] Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67 301–320.

#### Supplemental materials

• Supplementary material for “P-splines with an $\ell_{1}$ penalty for repeated measures”. Code and R package for all simulations and analyses. These materials are also available at https://github.com/bdsegal/code-for-psplinesl1-paper (code) and https://github.com/bdsegal/psplinesl1 (R package).