Open Access
A partially functional linear regression framework for integrating genetic, imaging, and clinical data
Ting Li, Yang Yu, J. S. Marron, Hongtu Zhu
Ann. Appl. Stat. 18(1): 704-728 (March 2024). DOI: 10.1214/23-AOAS1808


This paper is motivated by the joint analysis of genetic, imaging, and clinical (GIC) data collected in the Alzheimer’s Disease Neuroimaging Initiative (ADNI) study. We propose a partially functional linear regression (PFLR) framework to map high-dimensional GIC-related pathways for Alzheimer’s disease (AD). We develop a joint model selection and estimation procedure by embedding the imaging data in a reproducing kernel Hilbert space and imposing an ℓ0 penalty on the coefficients of the genetic variables. We apply the proposed method to the ADNI dataset to identify important features from tens of thousands of genetic polymorphisms (reduced from millions by a preprocessing step) and to study the effects of a set of informative genetic variants and the baseline hippocampus surface on 13 future cognitive scores. We also explore the shared and distinct heritability patterns of these cognitive scores. The results suggest that both the hippocampal and genetic data have heterogeneous effects on different scores, with the trend that the values of both hippocampi are negatively associated with the severity of cognitive deficits. Polygenic effects are observed for all 13 cognitive scores. The well-known APOE4 genotype explains only a small part of cognitive function. Shared genetic etiology exists; however, greater genetic heterogeneity exists within disease classifications after accounting for the baseline diagnosis status. These analyses are useful for further investigation of the functional mechanisms of AD progression.
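
For orientation, a generic partially functional linear model with a sparsity constraint can be written as in the sketch below. This is a minimal illustration in assumed notation, not the paper's exact formulation: the symbols Y_i (a cognitive score), X_i (genetic variables), Z_i (the imaging covariate viewed as a function), the RKHS H_K, the tuning parameter lambda, and the sparsity level s are all introduced here for illustration, and the paper may use a penalized rather than constrained form.

```latex
% Illustrative notation only; none of these symbols are taken from the paper.
% Y_i: scalar response, X_i \in \mathbb{R}^p: scalar covariates,
% Z_i(t): functional covariate on a domain \mathcal{T}, \mathcal{H}_K: an RKHS.
\[
  Y_i = X_i^\top \beta + \int_{\mathcal{T}} Z_i(t)\,\alpha(t)\,dt + \varepsilon_i,
  \qquad i = 1, \dots, n .
\]
% Joint estimation: roughness control on \alpha through the RKHS norm,
% variable selection on \beta through the \ell_0 "norm" (number of nonzeros).
\[
  (\hat\beta, \hat\alpha) \in
  \arg\min_{\beta \in \mathbb{R}^p,\ \alpha \in \mathcal{H}_K}
  \frac{1}{n} \sum_{i=1}^{n}
  \Bigl( Y_i - X_i^\top \beta - \int_{\mathcal{T}} Z_i(t)\,\alpha(t)\,dt \Bigr)^2
  + \lambda \,\|\alpha\|_{\mathcal{H}_K}^2
  \quad \text{subject to } \|\beta\|_0 \le s .
\]
```

In this reading, the RKHS norm controls the smoothness of the imaging coefficient function, while the ℓ0 constraint limits how many genetic variables enter the model.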

Funding Statement

Research reported in this publication was partially supported by the National Institute on Aging of the National Institutes of Health under Award Number RF1AG082938. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.


These authors contributed equally: Ting Li and Yang Yu.

The authors would like to thank the referees, the Associate Editor, and the Editor for their constructive comments that improved the quality of this paper.

Hongtu Zhu is also affiliated with the Departments of Statistics and Operations Research, Genetics, and Computer Science, University of North Carolina at Chapel Hill.

Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at:



Received: 1 April 2023; Revised: 1 August 2023; Published: March 2024
First available in Project Euclid: 31 January 2024

Keywords: clinical, genetics, imaging, nonasymptotic error bounds, partially functional linear regression, sparsity

Rights: Copyright © 2024 Institute of Mathematical Statistics
