Open Access
March 2022
Manifold valued data analysis of samples of networks, with applications in corpus linguistics
Katie E. Severn, Ian L. Dryden, Simon P. Preston
Ann. Appl. Stat. 16(1): 368-390 (March 2022). DOI: 10.1214/21-AOAS1480

Abstract

Networks arise in many applications, such as in the analysis of text documents, social interactions and brain activity. We develop a general framework for extrinsic statistical analysis of samples of networks, motivated by networks representing text documents in corpus linguistics. We identify networks with their graph Laplacian matrices, for which we define metrics, embeddings, tangent spaces and a projection from Euclidean space to the space of graph Laplacians. This framework provides a way of computing means, performing principal component analysis and regression, and carrying out hypothesis tests, such as testing for equality of means between two samples of networks. We apply the methodology to the set of novels by Jane Austen and Charles Dickens.
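
As a rough illustration of identifying a network with its graph Laplacian and averaging a sample, the following minimal sketch covers only the simplest case, assuming the plain Euclidean (Frobenius) metric. It is not the authors' implementation. The function names and the two small example networks are hypothetical. Under this metric the sample mean of graph Laplacians is again a graph Laplacian, so no projection step is needed here; other metrics of the kind the abstract alludes to would generally require one.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Return L = D - A for a symmetric, non-negative weighted adjacency matrix."""
    a = np.asarray(adjacency, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1]:
        raise ValueError("adjacency must be a square matrix")
    if not np.allclose(a, a.T):
        raise ValueError("adjacency must be symmetric (undirected network)")
    a = a.copy()
    np.fill_diagonal(a, 0.0)  # ignore self-loops
    return np.diag(a.sum(axis=1)) - a


def euclidean_mean(laplacians):
    """Entrywise average of a sample of graph Laplacians (Euclidean metric only)."""
    return np.mean(np.stack(laplacians), axis=0)


def frobenius_distance(l1, l2):
    """Euclidean (Frobenius) distance between two graph Laplacians."""
    return float(np.linalg.norm(l1 - l2, ord="fro"))


def is_graph_laplacian(m, tol=1e-9):
    """Check symmetry, zero row sums and non-positive off-diagonal entries."""
    off_diagonal = m - np.diag(np.diag(m))
    return (
        np.allclose(m, m.T, atol=tol)
        and np.allclose(m.sum(axis=1), 0.0, atol=tol)
        and bool((off_diagonal <= tol).all())
    )


# Two hypothetical 3-node weighted networks (for example, word co-occurrence counts).
A1 = [[0, 2, 0], [2, 0, 1], [0, 1, 0]]
A2 = [[0, 0, 4], [0, 0, 1], [4, 1, 0]]

L1 = graph_laplacian(A1)
L2 = graph_laplacian(A2)
mean_laplacian = euclidean_mean([L1, L2])
distance = frobenius_distance(L1, L2)
```

Here `mean_laplacian` can be read back as a network by negating its off-diagonal entries, which give the averaged edge weights.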

Funding Statement

This work was supported by the Engineering and Physical Sciences Research Council [grant numbers EP/T003928/1 and EP/M02315X/1].

Acknowledgments

The authors are grateful to Michaela Mahlberg, Viola Wiegand and Anthony Hennessey for their help and discussions about the data obtained from https://clic.bham.ac.uk and to the Editor, Associate Editor and two anonymous referees for their very helpful comments.

Citation

Katie E. Severn, Ian L. Dryden, Simon P. Preston. "Manifold valued data analysis of samples of networks, with applications in corpus linguistics." Ann. Appl. Stat. 16(1): 368-390, March 2022. https://doi.org/10.1214/21-AOAS1480

Information

Received: 1 January 2019; Revised: 1 September 2020; Published: March 2022
First available in Project Euclid: 28 March 2022

MathSciNet: MR4400514
zbMATH: 1498.62351
Digital Object Identifier: 10.1214/21-AOAS1480

Keywords: extrinsic mean, graph Laplacian, hypothesis test, regression, Riemannian

Rights: Copyright © 2022 Institute of Mathematical Statistics

Vol. 16 • No. 1 • March 2022