The Annals of Statistics

Peter Hall’s work on high-dimensional data and classification

Richard J. Samworth

Abstract

In this article, I summarise Peter Hall’s contributions to high-dimensional data, including their geometric representations and variable selection methods based on ranking. I also discuss his work on classification problems, concluding with some personal reflections on my own interactions with him. This article complements [Ann. Statist. 44 (2016) 1821–1836; Ann. Statist. 44 (2016) 1837–1853; Ann. Statist. 44 (2016) 1854–1866 and Ann. Statist. 44 (2016) 1867–1887], which focus on other aspects of Peter’s research.

Article information

Source
Ann. Statist., Volume 44, Number 5 (2016), 1888–1895.

Dates
Received: June 2016
First available in Project Euclid: 12 September 2016

Permanent link to this document
https://projecteuclid.org/euclid.aos/1473685262

Digital Object Identifier
doi:10.1214/16-AOS1493

Mathematical Reviews number (MathSciNet)
MR3546437

Zentralblatt MATH identifier
1351.62004

Subjects
Primary: 01A70: Biographies, obituaries, personalia, bibliographies; 62-03: Historical (must also be assigned at least one classification number from Section 01)

Keywords
Classification; high-dimensional data

Citation

Samworth, Richard J. Peter Hall’s work on high-dimensional data and classification. Ann. Statist. 44 (2016), no. 5, 1888–1895. doi:10.1214/16-AOS1493. https://projecteuclid.org/euclid.aos/1473685262


References

  • Biau, G. and Devroye, L. (2010). On the layered nearest neighbour estimate, the bagged nearest neighbour estimate and the random forest method in regression and classification. J. Multivariate Anal. 101 2499–2518.
  • Breiman, L. (1996). Bagging predictors. Mach. Learn. 24 123–140.
  • Chan, Y.-B. and Hall, P. (2009a). Robust nearest-neighbor methods for classifying high-dimensional data. Ann. Statist. 37 3186–3203.
  • Chan, Y.-B. and Hall, P. (2009b). Scale adjustments for classifiers in high-dimensional, low sample size settings. Biometrika 96 469–478.
  • Chan, Y.-B. and Hall, P. (2010). Using evidence of mixed populations to select variables for clustering very high-dimensional data. J. Amer. Statist. Assoc. 105 798–809.
  • Chen, S. X. (2016). Peter Hall’s contribution to the bootstrap. Ann. Statist. 44 1821–1836.
  • Cheng, M. Y. and Fan, J. (2016). Peter Hall’s contributions to nonparametric function estimation and modeling. Ann. Statist. 44 1837–1853.
  • Cristianini, N. and Shawe-Taylor, J. (2000). An Introduction to Support Vector Machines. Cambridge Univ. Press, Cambridge.
  • Delaigle, A. (2016). Peter Hall’s main contributions to deconvolution. Ann. Statist. 44 1854–1866.
  • Delaigle, A. and Hall, P. (2012). Effect of heavy tails on ultra high dimensional variable ranking methods. Statist. Sinica 22 909–932.
  • Diaconis, P. and Freedman, D. (1984). Asymptotics of graphical projection pursuit. Ann. Statist. 12 793–815.
  • Dümbgen, L., Samworth, R. J. and Schuhmacher, D. (2013). Stochastic search for semiparametric linear regression models. In From Probability to Statistics and Back: High-Dimensional Models and Processes. Inst. Math. Stat. (IMS) Collect. 9 78–90. IMS, Beachwood, OH.
  • Fan, J. and Lv, J. (2008). Sure independence screening for ultrahigh dimensional feature space. J. R. Stat. Soc. Ser. B Stat. Methodol. 70 849–911.
  • Fan, J., Samworth, R. and Wu, Y. (2009). Ultrahigh dimensional feature selection: Beyond the linear model. J. Mach. Learn. Res. 10 2013–2038.
  • Friedman, J. H. and Hall, P. (2007). On bagging and nonlinear estimation. J. Statist. Plann. Inference 137 669–683.
  • Ghosh, A. K. and Hall, P. (2008). On error-rate estimation in nonparametric classification. Statist. Sinica 18 1081–1100.
  • Hall, P. and Kang, K.-H. (2005). Bandwidth choice for nonparametric classification. Ann. Statist. 33 284–306.
  • Hall, P. and Li, K.-C. (1993). On almost linearity of low-dimensional projections from high-dimensional data. Ann. Statist. 21 867–889.
  • Hall, P., Marron, J. S. and Neeman, A. (2005). Geometric representation of high dimension, low sample size data. J. R. Stat. Soc. Ser. B Stat. Methodol. 67 427–444.
  • Hall, P. and Miller, H. (2009a). Using generalized correlation to effect variable selection in very high dimensional problems. J. Comput. Graph. Statist. 18 533–550.
  • Hall, P. and Miller, H. (2009b). Using the bootstrap to quantify the authority of an empirical ranking. Ann. Statist. 37 3929–3959.
  • Hall, P., Park, B. U. and Samworth, R. J. (2008). Choice of neighbor order in nearest-neighbor classification. Ann. Statist. 36 2135–2152.
  • Hall, P. and Pham, T. (2010). Optimal properties of centroid-based classifiers for very high-dimensional data. Ann. Statist. 38 1071–1093.
  • Hall, P. and Samworth, R. J. (2005). Properties of bagged nearest neighbour classifiers. J. R. Stat. Soc. Ser. B Stat. Methodol. 67 363–379.
  • Hall, P., Titterington, D. M. and Xue, J.-H. (2009a). Tilting methods for assessing the influence of components in a classifier. J. R. Stat. Soc. Ser. B Stat. Methodol. 71 783–803.
  • Hall, P., Titterington, D. M. and Xue, J.-H. (2009b). Median-based classifiers for high-dimensional data. J. Amer. Statist. Assoc. 104 1597–1608.
  • Hall, P., Xia, Y. and Xue, J.-H. (2013). Simple tiered classifiers. Biometrika 100 431–445.
  • Hall, P. and Xue, J.-H. (2010). Incorporating prior probabilities into high-dimensional classifiers. Biometrika 97 31–48.
  • Li, K.-C. (1991). Sliced inverse regression for dimension reduction. J. Amer. Statist. Assoc. 86 316–342.
  • Li, R., Zhong, W. and Zhu, L. (2012). Feature screening via distance correlation learning. J. Amer. Statist. Assoc. 107 1129–1139.
  • Marron, J. S., Todd, M. J. and Ahn, J. (2007). Distance-weighted discrimination. J. Amer. Statist. Assoc. 102 1267–1271.
  • Meinshausen, N. and Bühlmann, P. (2010). Stability selection. J. R. Stat. Soc. Ser. B Stat. Methodol. 72 417–473.
  • Müller, H.-G. (2016). Peter Hall, functional data analysis and random objects. Ann. Statist. 44 1867–1887.
  • Samworth, R. (2005). Small confidence sets for the mean of a spherically symmetric distribution. J. R. Stat. Soc. Ser. B Stat. Methodol. 67 343–361.
  • Samworth, R. J. (2012). Optimal weighted nearest neighbour classifiers. Ann. Statist. 40 2733–2763.
  • Shah, R. D. and Samworth, R. J. (2013). Variable selection with error control: Another look at stability selection. J. R. Stat. Soc. Ser. B Stat. Methodol. 75 55–80.