The Annals of Applied Statistics

Gene-level pharmacogenetic analysis on survival outcomes using gene-trait similarity regression

Jung-Ying Tzeng, Wenbin Lu, and Fang-Chi Hsu

Full-text: Open access

Abstract

Gene/pathway-based methods are drawing significant attention due to their usefulness in detecting rare and common variants that affect disease susceptibility. The biological mechanism of drug responses indicates that a gene-based analysis has even greater potential in pharmacogenetics. Motivated by a study from the Vitamin Intervention for Stroke Prevention (VISP) trial, we develop a gene-trait similarity regression for survival analysis to assess the effect of a gene or pathway on time-to-event outcomes. The similarity regression has a general framework that covers a range of survival models, such as the proportional hazards model and the proportional odds model. The inference procedure developed under the proportional hazards model is robust against model misspecification. We derive the equivalence between the similarity survival regression and a random effects model, which further unifies the current variance component-based methods. We demonstrate the effectiveness of the proposed method through simulation studies. In addition, we apply the method to the VISP trial data to identify the genes that exhibit an association with the risk of a recurrent stroke. The TCN2 gene was found to be associated with the recurrent stroke risk in the low-dose arm. This gene may impact recurrent stroke risk in response to cofactor therapy.

Article information

Source
Ann. Appl. Stat., Volume 8, Number 2 (2014), 1232-1255.

Dates
First available in Project Euclid: 1 July 2014

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1404229532

Digital Object Identifier
doi:10.1214/14-AOAS735

Mathematical Reviews number (MathSciNet)
MR3262552

Zentralblatt MATH identifier
06333794

Keywords
Association study; gene/pathway; pharmacogenetics; similarity regression; survival data; proportional odds model; proportional hazards model

Citation

Tzeng, Jung-Ying; Lu, Wenbin; Hsu, Fang-Chi. Gene-level pharmacogenetic analysis on survival outcomes using gene-trait similarity regression. Ann. Appl. Stat. 8 (2014), no. 2, 1232-1255. doi:10.1214/14-AOAS735. https://projecteuclid.org/euclid.aoas/1404229532


