## The Annals of Applied Statistics

### Variance function estimation in quantitative mass spectrometry with application to iTRAQ labeling

#### Abstract

This paper describes and compares two methods for estimating the variance function associated with iTRAQ (isobaric tag for relative and absolute quantitation) isotopic labeling in quantitative mass spectrometry-based proteomics. Measurements generated by the mass spectrometer are proportional to the concentration of peptides present in the biological sample. However, the iTRAQ reporter signals are subject to errors that depend on the peptide amounts. The variance function of the errors is therefore essential for evaluating the results, but estimating it is complicated because the number of nuisance parameters increases with sample size while the number of replicates for each peptide remains small. Two experiments conducted with the sole goal of estimating the variance function and its stability over time are analyzed, and the resulting estimated variance function is used to analyze an experiment targeting aberrant signaling cascades in cells harboring distinct oncogenic mutations. Methods for constructing conservative $p$-values and confidence intervals are discussed.
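The difficulty noted in the abstract, many nuisance parameters with few replicates each, can be illustrated with a minimal simulation. The sketch below is not one of the paper's two methods: it assumes a deliberately simplified setting with a constant variance (rather than a variance function of the peptide amount), Gaussian errors, and a hypothetical range of peptide-specific means. It shows that the naive maximum likelihood estimate of the variance stays biased no matter how many peptides are measured, whereas dividing by the residual degrees of freedom removes the bias in this simple case.

```python
import random


def simulate_variance_estimates(n_peptides=20000, n_reps=2, sigma2=1.0, seed=0):
    """Return (naive_mle, pooled) estimates of a common error variance.

    Each simulated peptide has its own unknown mean (a nuisance parameter)
    and only n_reps replicate measurements. All settings are illustrative
    assumptions, not values taken from the paper.
    """
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5
    ss_total = 0.0
    for _ in range(n_peptides):
        mu = rng.uniform(0.0, 10.0)  # hypothetical peptide-specific mean
        reps = [rng.gauss(mu, sigma) for _ in range(n_reps)]
        mean = sum(reps) / n_reps
        # Within-peptide sum of squares around the estimated mean.
        ss_total += sum((x - mean) ** 2 for x in reps)
    # Naive MLE divides by the total number of observations.
    naive_mle = ss_total / (n_peptides * n_reps)
    # Pooled estimate divides by the residual degrees of freedom.
    pooled = ss_total / (n_peptides * (n_reps - 1))
    return naive_mle, pooled


naive, pooled = simulate_variance_estimates()
print(f"naive MLE: {naive:.3f}, pooled estimate: {pooled:.3f}")
```

With two replicates per peptide, the naive estimate concentrates near half the true variance, because one degree of freedom per peptide is spent on estimating its mean. When the variance depends on the peptide amount, as in the iTRAQ setting, no such simple correction is available, which is one reason dedicated estimation methods are needed.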

#### Article information

Source
Ann. Appl. Stat., Volume 7, Number 1 (2013), 1-24.

Dates
First available in Project Euclid: 9 April 2013

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1365527188

Digital Object Identifier
doi:10.1214/12-AOAS572

Mathematical Reviews number (MathSciNet)
MR3086408

Zentralblatt MATH identifier
06171261

#### Citation

Mandel, Micha; Askenazi, Manor; Zhang, Yi; Marto, Jarrod A. Variance function estimation in quantitative mass spectrometry with application to iTRAQ labeling. Ann. Appl. Stat. 7 (2013), no. 1, 1–24. doi:10.1214/12-AOAS572. https://projecteuclid.org/euclid.aoas/1365527188

#### Supplemental materials

• Supplementary material: Web-based supplementary materials for “Variance function estimation in quantitative mass spectrometry with application to iTRAQ labeling.” Section A: Workflow of the iTRAQ technique. Section B: Estimate of $G_{0}$. Section C: Sensitivity of the EM algorithm to initial values. Section D: Detailed simulation results.