The Annals of Applied Statistics

Sparse modeling of categorial explanatory variables

Jan Gertheiss and Gerhard Tutz


Shrinkage methods in regression analysis are usually designed for metric predictors. In this article, however, shrinkage methods for categorial predictors are proposed. As an application we consider data from the Munich rent standard, where, for example, urban districts are treated as a categorial predictor. If independent variables are categorial, some modifications to the usual shrinkage procedures are necessary. Two L1-penalty-based methods for factor selection and clustering of categories are presented and investigated. The first approach is designed for nominal scale levels, the second for ordinal predictors. Besides being applied to the Munich rent standard, the methods are illustrated and compared in simulation studies.
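
To make the distinction between the two approaches concrete, the following is a minimal sketch of what L1 penalties on differences of dummy coefficients might look like, in the spirit of the fused lasso named in the keywords. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the reference category is assumed to have coefficient 0, and no weights are used. For a nominal predictor every pair of categories is compared; for an ordinal predictor only adjacent categories are.

```python
from itertools import combinations


def nominal_fusion_penalty(coefs):
    """Unweighted L1 penalty on all pairwise coefficient differences.

    `coefs` holds the dummy coefficients of the non-reference categories;
    the reference category is assumed (for this sketch) to be fixed at 0.
    """
    levels = [0.0] + list(coefs)  # prepend the reference category
    return sum(abs(a - b) for a, b in combinations(levels, 2))


def ordinal_fusion_penalty(coefs):
    """Unweighted L1 penalty on differences of adjacent categories only."""
    levels = [0.0] + list(coefs)  # categories in their natural order
    return sum(abs(b - a) for a, b in zip(levels, levels[1:]))


# Hypothetical coefficients: category 1 equals the reference (0.0),
# categories 2 and 3 share the value 0.5.
coefs = [0.0, 0.5, 0.5]
nominal_value = nominal_fusion_penalty(coefs)
ordinal_value = ordinal_fusion_penalty(coefs)
print(nominal_value, ordinal_value)
```

In such a penalty, a difference shrunk exactly to zero corresponds to two categories being clustered, and a factor whose coefficients are all zero (all categories fused with the reference) is effectively removed from the model.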

Article information

Ann. Appl. Stat., Volume 4, Number 4 (2010), 2150-2180.

First available in Project Euclid: 4 January 2011

Keywords: categorial predictors; fused lasso; ordinal predictors; rent standard; variable fusion


Gertheiss, Jan; Tutz, Gerhard. Sparse modeling of categorial explanatory variables. Ann. Appl. Stat. 4 (2010), no. 4, 2150–2180. doi:10.1214/10-AOAS355.

