The Annals of Applied Statistics

Network classification with applications to brain connectomics

Jesús D. Arroyo Relión, Daniel Kessler, Elizaveta Levina, and Stephan F. Taylor

Abstract

While statistical analysis of a single network has received a lot of attention in recent years, with a focus on social networks, analysis of a sample of networks presents its own challenges which require a different set of analytic tools. Here we study the problem of classification of networks with labeled nodes, motivated by applications in neuroimaging. Brain networks are constructed from imaging data to represent functional connectivity between regions of the brain, and previous work has shown the potential of such networks to distinguish between various brain disorders, giving rise to a network classification problem. Existing approaches tend to either treat all edge weights as a long vector, ignoring the network structure, or focus on graph topology as represented by summary measures while ignoring the edge weights. Our goal is to design a classification method that uses both the individual edge information and the network structure of the data in a computationally efficient way, and that can produce a parsimonious and interpretable representation of differences in brain connectivity patterns between classes. We propose a graph classification method that uses edge weights as predictors but incorporates the network nature of the data via penalties that promote sparsity in the number of nodes, in addition to the usual sparsity penalties that encourage selection of edges. We implement the method via efficient convex optimization and provide a detailed analysis of data from two fMRI studies of schizophrenia.
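To make the penalty structure described in the abstract concrete, the sketch below shows one simplified way such a classifier could be set up: a logistic loss over the vectorized edge weights, a lasso penalty on individual edges, and a group penalty on the rows of the coefficient matrix so that entire nodes can be zeroed out, fit with a basic proximal gradient (ISTA) loop. This is only an illustration under simplified assumptions (no symmetrization of the coefficient matrix, fixed step size, hypothetical names such as graph_classify and soft_threshold), not the authors' method or the R package provided in Supplement B.

```r
# Illustrative sketch only (not the authors' implementation): logistic loss on
# vectorized edge weights with a lasso penalty on edges plus a group penalty
# on rows of the coefficient matrix B, so whole nodes can be shrunk to zero.

soft_threshold <- function(x, t) sign(x) * pmax(abs(x) - t, 0)

# A: array of connectivity matrices, dim = c(p, p, n_subjects)
# y: binary class labels coded 0/1
graph_classify <- function(A, y, lambda1 = 0.1, lambda2 = 0.1,
                           step = 0.01, n_iter = 500) {
  p <- dim(A)[1]
  n <- dim(A)[3]
  X <- t(apply(A, 3, as.vector))   # n x p^2 matrix of edge weights
  B <- matrix(0, p, p)             # edge coefficients
  for (it in seq_len(n_iter)) {
    prob <- 1 / (1 + exp(-(X %*% as.vector(B))))      # fitted probabilities
    grad <- matrix(crossprod(X, prob - y) / n, p, p)  # logistic loss gradient
    # edge-level proximal step: lasso soft-thresholding
    B <- soft_threshold(B - step * grad, step * lambda1)
    # node-level proximal step: shrink entire rows of B toward zero
    row_norms <- pmax(sqrt(rowSums(B^2)), 1e-12)
    shrink <- pmax(1 - step * lambda2 / row_norms, 0)
    B <- B * shrink                # multiplies row i of B by shrink[i]
  }
  B
}
```

Under this toy formulation, calling graph_classify(A, y) on an array of subject-level connectivity matrices returns a p-by-p coefficient matrix; the rows that survive the group shrinkage indicate the nodes driving the classification, which is the kind of node-level interpretability the abstract emphasizes.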

Article information

Source
Ann. Appl. Stat., Volume 13, Number 3 (2019), 1648-1677.

Dates
Received: January 2017
Revised: January 2019
First available in Project Euclid: 17 October 2019

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1571277767

Digital Object Identifier
doi:10.1214/19-AOAS1252

Mathematical Reviews number (MathSciNet)
MR4019153

Keywords
Graph classification, high-dimensional data, variable selection, fMRI data

Citation

Arroyo Relión, Jesús D.; Kessler, Daniel; Levina, Elizaveta; Taylor, Stephan F. Network classification with applications to brain connectomics. Ann. Appl. Stat. 13 (2019), no. 3, 1648--1677. doi:10.1214/19-AOAS1252. https://projecteuclid.org/euclid.aoas/1571277767


Supplemental materials

  • Supplement A: Algorithms, proofs, and data acquisition and preprocessing details. In this supplementary material, we provide the details of the optimization algorithms, proofs of the theoretical results, and a detailed description of the data acquisition and preprocessing.
  • Supplement B: Code and data. The .zip file contains source code of an R package that implements the methods described in this paper, as well as the post-processed connectomes used in the analysis.