Electronic Journal of Statistics

Bounded isotonic regression

Ronny Luss and Saharon Rosset

Full-text: Open access


Isotonic regression offers a flexible modeling approach under monotonicity assumptions, which are natural in many applications. Despite this attractive setting and extensive theoretical research, isotonic regression has enjoyed limited interest in practical modeling primarily due to its tendency to suffer significant overfitting, even in moderate dimension, as the monotonicity constraints do not offer sufficient complexity control. Here we propose to regularize isotonic regression by penalizing or constraining the range of the fitted model (i.e., the difference between the maximal and minimal predictions). We show that the optimal solution to this problem is obtained by constraining the non-penalized isotonic regression model to lie in the required range, and hence can be found easily given this non-penalized solution. This makes our approach applicable to large datasets and to generalized loss functions such as Huber’s loss or exponential family log-likelihoods. We also show how the problem can be reformulated as a Lasso problem in a very high dimensional basis of upper sets. Hence, range regularization inherits some of the statistical properties of Lasso, notably its degrees of freedom estimation. We demonstrate the favorable empirical performance of our approach compared to various relevant alternatives.
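The thresholding result stated in the abstract can be illustrated with a minimal sketch for the simplest case: a one-dimensional, totally ordered sequence with squared-error loss. The function names (`pava`, `clip_fit`, `bounded_isotonic`) and the ternary search over the lower bound are illustrative assumptions, not the authors' implementation. The sketch fits the unconstrained isotonic regression once, then searches for the window of width `max_range` whose clipped fit has the smallest squared error.

```python
def pava(y):
    """Unconstrained least-squares isotonic fit (pool adjacent violators)."""
    blocks = []  # each block stores [sum of values, number of values]
    for v in y:
        blocks.append([float(v), 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit


def clip_fit(fit, lo, hi):
    """Threshold a fitted sequence to the interval [lo, hi]."""
    return [min(max(f, lo), hi) for f in fit]


def sse(y, fit):
    """Sum of squared errors between observations and a fit."""
    return sum((a - b) ** 2 for a, b in zip(y, fit))


def bounded_isotonic(y, max_range, iters=200):
    """Isotonic fit whose range (max minus min) is at most max_range.

    Assumption for this sketch: the squared error of the clipped fit is
    convex in the lower bound, so a ternary search locates the best window.
    """
    fit = pava(y)
    if fit[-1] - fit[0] <= max_range:
        return fit  # the range constraint is not active
    left, right = fit[0], fit[-1] - max_range
    for _ in range(iters):
        m1 = left + (right - left) / 3
        m2 = right - (right - left) / 3
        e1 = sse(y, clip_fit(fit, m1, m1 + max_range))
        e2 = sse(y, clip_fit(fit, m2, m2 + max_range))
        if e1 <= e2:
            right = m2
        else:
            left = m1
    lo = (left + right) / 2
    return clip_fit(fit, lo, lo + max_range)
```

For example, `bounded_isotonic([1, 3, 2, 4, 10], 3)` returns a non-decreasing sequence whose largest and smallest values differ by at most 3. Because the unconstrained fit is computed only once, tracing out solutions for several values of `max_range` only repeats the inexpensive clipping step.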

Article information

Electron. J. Statist., Volume 11, Number 2 (2017), 4488-4514.

Received: February 2017
First available in Project Euclid: 17 November 2017


Primary: 62G08: Nonparametric regression
Secondary: 62J07: Ridge regression; shrinkage estimators

Multivariate isotonic regression; nonparametric regression; regularization path; range regularization; lasso regularization

Creative Commons Attribution 4.0 International License.


Luss, Ronny; Rosset, Saharon. Bounded isotonic regression. Electron. J. Statist. 11 (2017), no. 2, 4488-4514. doi:10.1214/17-EJS1365. https://projecteuclid.org/euclid.ejs/1510887944


