The Annals of Statistics

Population theory for boosting ensembles

Leo Breiman

Full-text: Open access

Abstract

Tree ensembles are looked at in distribution space, that is, the limit case of "infinite" sample size. It is shown that the simplest kind of trees is complete in D-dimensional $L_2(P)$ space if the number of terminal nodes T is greater than D. For such trees we show that the AdaBoost algorithm gives an ensemble converging to the Bayes risk.
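
As a rough illustration of the "infinite sample" setting the abstract describes, the sketch below runs discrete AdaBoost directly on a known distribution instead of on a sample. It is not the construction used in the paper: the 20-point grid, the conditional probabilities `eta`, and the names `stump`, `risk` and `population_adaboost` are assumptions made only for this example. The input is one-dimensional (D = 1) and the base classifiers are stumps (T = 2 terminal nodes), so the condition T > D from the abstract holds.

```python
import math

# Hypothetical population: X uniform on {0, ..., 19}, Y in {-1, +1},
# with eta[x] = P(Y = +1 | X = x) alternating in blocks of five points.
K = 20
p = [1.0 / K] * K
eta = [0.9 if (x // 5) % 2 == 1 else 0.1 for x in range(K)]


def stump(t, s):
    """Two-node tree: predict s for x >= t and -s for x < t."""
    return [s if x >= t else -s for x in range(K)]


# All stumps on the grid, including the two constant classifiers.
STUMPS = [stump(t, s) for t in range(K + 1) for s in (1, -1)]


def risk(F):
    """Exact misclassification probability of the classifier sign(F)."""
    return sum(p[x] * ((1 - eta[x]) if F[x] >= 0 else eta[x]) for x in range(K))


# Bayes risk: the smallest error any classifier can reach on this population.
bayes_risk = sum(p[x] * min(eta[x], 1 - eta[x]) for x in range(K))


def population_adaboost(rounds):
    """Discrete AdaBoost with weights taken over the population itself."""
    F = [0.0] * K          # current ensemble score at each grid point
    risks = []             # exact risk of sign(F) after each round
    for _ in range(rounds):
        # Unnormalized AdaBoost weights on the pairs (x, +1) and (x, -1).
        w_pos = [p[x] * eta[x] * math.exp(-F[x]) for x in range(K)]
        w_neg = [p[x] * (1 - eta[x]) * math.exp(F[x]) for x in range(K)]
        total = sum(w_pos) + sum(w_neg)

        def err(h):
            # Weighted error: predicting +1 is wrong when y = -1, and vice versa.
            return sum(w_neg[x] if h[x] == 1 else w_pos[x] for x in range(K)) / total

        h = min(STUMPS, key=err)
        e = err(h)
        if e >= 0.5 - 1e-12:   # no stump has an edge left; stop
            break
        alpha = 0.5 * math.log((1 - e) / e)
        F = [F[x] + alpha * h[x] for x in range(K)]
        risks.append(risk(F))
    return F, risks
```

Calling `F, risks = population_adaboost(50)` and comparing `risks[0]`, `risks[-1]` and `bayes_risk` shows how far the best single stump is from the Bayes risk and how the ensemble closes that gap on this toy population. Because the distribution is known exactly, every risk here is computed in closed form with no sampling noise.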

Article information

Source
Ann. Statist., Volume 32, Number 1 (2004), 1-11.

Dates
First available in Project Euclid: 12 March 2004

Permanent link to this document
https://projecteuclid.org/euclid.aos/1079120126

Digital Object Identifier
doi:10.1214/aos/1079120126

Mathematical Reviews number (MathSciNet)
MR2050998

Zentralblatt MATH identifier
1105.62308

Subjects
Primary: 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]
68T10: Pattern recognition, speech recognition {For cluster analysis, see 62H30}
68T05: Learning and adaptive systems [See also 68Q32, 91E40]

Keywords
Trees; AdaBoost; Bayes risk

Citation

Breiman, Leo. Population theory for boosting ensembles. Ann. Statist. 32 (2004), no. 1, 1-11. doi:10.1214/aos/1079120126. https://projecteuclid.org/euclid.aos/1079120126

