The Annals of Applied Statistics

Discussion of: Treelets—An adaptive multi-scale basis for sparse unordered data

Catherine Tuglus and Mark J. van der Laan

Abstract

We would like to congratulate Lee, Nadler and Wasserman on their contribution to clustering and data reduction methods for high-dimensional, small-sample (large p, small n) settings. A composite of clustering and traditional principal components analysis (PCA), treelets is an innovative method for multi-resolution analysis of unordered data. It is an improvement over traditional PCA and an important contribution to clustering methodology. Their paper presents theory and supporting applications addressing the two main goals of the treelet method: (1) uncovering the underlying structure of the data and (2) reducing the data prior to applying statistical learning methods. We organize our discussion into two main parts, addressing their methodology in terms of each of these two goals. We present and discuss treelets both as a clustering algorithm and as an improvement over traditional PCA. We also discuss the applicability of treelets to more general data, in particular to microarray data.

Article information

Source
Ann. Appl. Stat., Volume 2, Number 2 (2008), 489-493.

Dates
First available in Project Euclid: 3 July 2008

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1215118524

Digital Object Identifier
doi:10.1214/08-AOAS137F

Mathematical Reviews number (MathSciNet)
MR2524342

Zentralblatt MATH identifier
05591284

Citation

Tuglus, Catherine; van der Laan, Mark J. Discussion of: Treelets—An adaptive multi-scale basis for sparse unordered data. Ann. Appl. Stat. 2 (2008), no. 2, 489–493. doi:10.1214/08-AOAS137F. https://projecteuclid.org/euclid.aoas/1215118524

