Bernoulli, Volume 17, Number 4 (2011), 1368–1385.

Support vector machines with a reject option

Marten Wegkamp and Ming Yuan



This paper studies $ℓ_1$ regularization with high-dimensional features for support vector machines with a built-in reject option (meaning that the decision of classifying an observation can be withheld at a cost lower than that of misclassification). The procedure can be conveniently implemented as a linear program and computed using standard software. We prove that the minimizer of the penalized population risk favors sparse solutions and show that the behavior of the empirical risk minimizer mimics that of the population risk minimizer. We also introduce a notion of classification complexity and prove that our minimizers adapt to the unknown complexity. Using a novel oracle inequality for the excess risk, we identify situations where fast rates of convergence occur.
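To illustrate the kind of linear-program formulation the abstract alludes to, here is a minimal sketch. It assumes the piecewise-linear generalized hinge loss for classification with reject cost $d < 1/2$, namely $\phi(t) = \max(0,\, 1 - t,\, 1 - at)$ with $a = (1-d)/d$, together with an $\ell_1$ penalty on the coefficients; the function name, parameter choices, and use of `scipy.optimize.linprog` are illustrative, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import linprog

def reject_svm_lp(X, y, d=0.3, lam=0.1):
    """Fit a linear score f(x) = w.x + b by minimizing the empirical
    generalized hinge loss with reject cost d (0 < d < 1/2) plus an
    l1 penalty lam * ||w||_1, written as a linear program.

    Loss: phi(t) = max(0, 1 - t, 1 - a*t), a = (1 - d)/d, at t = y_i f(x_i).
    Observations with |f(x)| small are the candidates for rejection.
    (Illustrative sketch; not the paper's own code.)
    """
    n, p = X.shape
    a = (1.0 - d) / d  # slope of the steeper hinge piece (a > 1)

    # Decision variables: [w_plus (p), w_minus (p), b (1), xi (n)],
    # with w = w_plus - w_minus so that ||w||_1 = sum(w_plus + w_minus).
    c = np.concatenate([lam * np.ones(2 * p), [0.0], np.ones(n) / n])

    Yx = y[:, None] * X  # rows y_i * x_i
    # xi_i >= 1 - y_i f(x_i)    ->   -y_i f(x_i) - xi_i <= -1
    A1 = np.hstack([-Yx, Yx, -y[:, None], -np.eye(n)])
    # xi_i >= 1 - a y_i f(x_i)  ->   -a y_i f(x_i) - xi_i <= -1
    A2 = np.hstack([-a * Yx, a * Yx, -a * y[:, None], -np.eye(n)])
    A_ub = np.vstack([A1, A2])
    b_ub = -np.ones(2 * n)

    # w_plus, w_minus, xi >= 0; intercept b is free.
    bounds = [(0, None)] * (2 * p) + [(None, None)] + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    w = res.x[:p] - res.x[p:2 * p]
    b = res.x[2 * p]
    return w, b, res
```

A fitted classifier would then predict the sign of $f(x) = w \cdot x + b$ and withhold a decision (reject) whenever $|f(x)|$ falls below a threshold tied to $d$; any standard LP solver can be substituted for `linprog`.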

Article information


First available in Project Euclid: 4 November 2011


Keywords: adaptive prediction; classification with a reject option; lasso; oracle inequalities; sparsity; support vector machines; statistical learning


Wegkamp, Marten; Yuan, Ming. Support vector machines with a reject option. Bernoulli 17 (2011), no. 4, 1368--1385. doi:10.3150/10-BEJ320.


