Open Access
ROS Regression: Integrating Regularization with Optimal Scaling Regression
Jacqueline J. Meulman, Anita J. van der Kooij, Kevin L. W. Duisters
Statist. Sci. 34(3): 361-390 (August 2019). DOI: 10.1214/19-STS697

Abstract

We present a methodology for multiple regression analysis that deals with categorical variables (possibly mixed with continuous ones), in combination with regularization, variable selection and high-dimensional data ($P\gg N$). Regularization and optimal scaling (OS) are two important extensions of ordinary least squares regression (OLS) that will be combined in this paper. There are two data analytic situations for which optimal scaling was developed. One is the analysis of categorical data, and the other is the need for transformations because of nonlinear relationships between predictors and outcome. Optimal scaling of categorical data finds quantifications for the categories, both for the predictors and for the outcome variables, that are optimal for the regression model in the sense that they maximize the multiple correlation. When nonlinear relationships exist, nonlinear transformations of the predictors and the outcome maximize the multiple correlation in the same way. We will consider a variety of transformation types; typically we use step functions for categorical variables, and smooth (spline) functions for continuous variables. Both types of functions can be restricted to be monotonic, preserving the ordinal information in the data. In combination with optimal scaling, three popular regularization methods will be considered: Ridge regression, the Lasso and the Elastic Net. The resulting method will be called ROS Regression (Regularized Optimal Scaling Regression). The OS algorithm provides straightforward and efficient estimation of the regularized regression coefficients, automatically gives the Group Lasso and Blockwise Sparse Regression, and extends them by the possibility to maintain ordinal properties in the data. Extended examples are provided.
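
The idea of combining category quantification with a Lasso penalty can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`standardize`, `soft_threshold`, `ros_sketch`), the backfitting order, the fixed iteration count and the simulated data are all assumptions made for illustration. The sketch handles nominal predictors only and a numeric outcome that is standardized but not transformed. Each predictor's categories are quantified by the category means of the partial residual, the quantified variable is standardized, and its coefficient is then shrunk by soft-thresholding.

```python
import numpy as np

def standardize(v):
    """Center v and scale it to unit mean square."""
    v = v - v.mean()
    return v / np.sqrt(np.mean(v ** 2))

def soft_threshold(b, lam):
    """Lasso shrinkage of a single coefficient."""
    return np.sign(b) * max(abs(b) - lam, 0.0)

def ros_sketch(X_cat, y, lam=0.0, n_iter=50):
    """Hypothetical sketch: nominal optimal scaling with a Lasso penalty.

    X_cat : (N, P) array of integer category codes
    y     : (N,) numeric outcome (standardized here, not transformed)
    Returns the coefficients and the quantified predictors.
    """
    N, P = X_cat.shape
    y = standardize(np.asarray(y, dtype=float))
    # Start from the standardized category codes.
    Q = np.column_stack(
        [standardize(X_cat[:, j].astype(float)) for j in range(P)]
    )
    beta = np.zeros(P)
    for _ in range(n_iter):
        for j in range(P):
            # Partial residual: remove every predictor's fit except predictor j.
            r = y - Q @ beta + Q[:, j] * beta[j]
            # Nominal quantification: category means of the partial residual.
            _, inv = np.unique(X_cat[:, j], return_inverse=True)
            means = np.bincount(inv, weights=r) / np.bincount(inv)
            q = means[inv]
            q = q - q.mean()
            scale = np.sqrt(np.mean(q ** 2))
            if scale < 1e-12:
                continue  # degenerate quantification: keep the previous one
            Q[:, j] = q / scale
            # With unit-variance q, the least squares coefficient is mean(q * r);
            # the Lasso penalty shrinks it towards zero.
            beta[j] = soft_threshold(np.mean(Q[:, j] * r), lam)
    return beta, Q

# Simulated data (an assumption): predictor 0 has a non-monotone effect,
# predictor 1 is pure noise.
rng = np.random.default_rng(0)
N = 500
X = rng.integers(0, 3, size=(N, 2))
effects = np.array([1.0, -1.0, 1.0])
y = effects[X[:, 0]] + 0.1 * rng.normal(size=N)

beta, Q = ros_sketch(X, y, lam=0.1)
print(beta)
```

In this sketch, a Ridge variant would replace the soft-thresholding step with division of `np.mean(Q[:, j] * r)` by `1 + lam`. Monotonic (ordinal) quantifications would need an additional isotonic regression step, which is omitted here.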

Citation

Jacqueline J. Meulman, Anita J. van der Kooij, Kevin L. W. Duisters. "ROS Regression: Integrating Regularization with Optimal Scaling Regression." Statist. Sci. 34(3): 361-390, August 2019. https://doi.org/10.1214/19-STS697

Information

Published: August 2019
First available in Project Euclid: 11 October 2019

zbMATH: 07162128
MathSciNet: MR4017519
Digital Object Identifier: 10.1214/19-STS697

Keywords: Lasso and Elastic Net regularization for nominal and ordinal data, linearization of nonlinear relationships, monotonic group Lasso, monotonic step functions and splines, optimal scaling, regularization for categorical high-dimensional data

Rights: Copyright © 2019 Institute of Mathematical Statistics
