Institute of Mathematical Statistics Collections

Kendall’s tau in high-dimensional genomic parsimony

Pranab K. Sen

Abstract

High-dimensional data models, often with low sample size, abound in many interdisciplinary studies, genomics and large biological systems being the most noteworthy. The conventional assumption of multinormality or linearity of regression may not be plausible for such models, which are likely to be statistically complex owing to a large number of parameters as well as various underlying restraints. As such, parametric approaches may not be very effective. Approaches beyond parametrics, although broader in scope and more robust, may generally be hampered by the low sample size and hence be unable to give reasonable margins of error. Kendall’s tau statistic is exploited in this context, with emphasis on dimensional rather than sample-size asymptotics. The Chen–Stein theorem is thoroughly appraised in this study. Applications of these findings in some microarray data models are illustrated.

Chapter information

Source
Bertrand Clarke and Subhashis Ghosal, eds., Pushing the Limits of Contemporary Statistics: Contributions in Honor of Jayanta K. Ghosh (Beachwood, Ohio, USA: Institute of Mathematical Statistics, 2008), 251-266

Dates
First available in Project Euclid: 28 April 2008

Permanent link to this document
https://projecteuclid.org/euclid.imsc/1209398473

Digital Object Identifier
doi:10.1214/074921708000000183

Subjects
Primary: 62G10: Hypothesis testing; 62G99: None of the above, but in this section
Secondary: 62P99: None of the above, but in this section

Keywords
bioinformatics; Chen–Stein theorem; dimensional asymptotics; FDR; multiple hypotheses testing; nonparametrics; permutational invariance; U-statistics

Rights
Copyright © 2008, Institute of Mathematical Statistics

Citation

Sen, Pranab K. Kendall’s tau in high-dimensional genomic parsimony. Pushing the Limits of Contemporary Statistics: Contributions in Honor of Jayanta K. Ghosh, 251–266, Institute of Mathematical Statistics, Beachwood, Ohio, USA, 2008. doi:10.1214/074921708000000183. https://projecteuclid.org/euclid.imsc/1209398473

