The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 5, Number 2B (2011), 1534-1552.
Nonparametric Bayesian sparse factor models with application to gene expression modeling
A nonparametric Bayesian extension of Factor Analysis (FA) is proposed where observed data Y is modeled as a linear superposition, G, of a potentially infinite number of hidden factors, X. The Indian Buffet Process (IBP) is used as a prior on G to incorporate sparsity and to allow the number of latent features to be inferred. The model’s utility for modeling gene expression data is investigated using randomly generated data sets based on a known sparse connectivity matrix for E. coli, and on three biological data sets of increasing complexity.
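To make the generative structure in the abstract concrete, the following is a minimal sketch of drawing a sparse loading matrix G from an IBP prior and forming Y = GX plus noise. It is an illustration under stated assumptions, not the authors' implementation: the function names `sample_ibp` and `generate_data`, the one-parameter IBP with concentration `alpha`, the construction of G as an elementwise product of a binary mask Z and standard Gaussian weights, and the isotropic noise level `noise_std` are all choices made here for the example.

```python
import numpy as np


def sample_ibp(num_rows, alpha, rng):
    """Draw a binary matrix Z (num_rows x K) from a one-parameter IBP.

    K is random: the number of columns is determined by the draw itself.
    """
    columns = []  # one list of 0/1 entries per feature, grown row by row
    for i in range(num_rows):
        # Existing features: row i is active in feature k with probability
        # m_k / (i + 1), where m_k counts earlier rows using that feature.
        for col in columns:
            m_k = sum(col)
            col.append(int(rng.random() < m_k / (i + 1)))
        # New features: Poisson(alpha / (i + 1)) columns, active only in row i so far.
        for _ in range(rng.poisson(alpha / (i + 1))):
            columns.append([0] * i + [1])
    if not columns:
        return np.zeros((num_rows, 0), dtype=int)
    return np.array(columns, dtype=int).T


def generate_data(num_dims, num_samples, alpha=2.0, noise_std=0.1, seed=0):
    """Simulate Y = G X + noise with an IBP-sparse loading matrix G (assumed form)."""
    rng = np.random.default_rng(seed)
    Z = sample_ibp(num_dims, alpha, rng)       # binary sparsity pattern
    K = Z.shape[1]                             # number of factors in this draw
    W = rng.normal(size=(num_dims, K))         # assumed Gaussian weights
    G = Z * W                                  # sparse loadings: zero wherever Z is zero
    X = rng.normal(size=(K, num_samples))      # hidden factors
    noise = noise_std * rng.normal(size=(num_dims, num_samples))
    Y = G @ X + noise
    return Y, G, X, Z


if __name__ == "__main__":
    Y, G, X, Z = generate_data(num_dims=20, num_samples=15)
    print("observed data shape:", Y.shape)
    print("number of factors drawn:", Z.shape[1])
```

The number of columns of Z is not fixed in advance, which is the sense in which the number of latent features can be inferred rather than specified.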
First available in Project Euclid: 13 July 2011
Knowles, David; Ghahramani, Zoubin. Nonparametric Bayesian sparse factor models with application to gene expression modeling. Ann. Appl. Stat. 5 (2011), no. 2B, 1534--1552. doi:10.1214/10-AOAS435. https://projecteuclid.org/euclid.aoas/1310562732
- Supplementary material: Graphs of precision and recall for the synthetic data experiment. The graphs show the precision and recall of active elements of the Z matrix achieved by each algorithm (after thresholding for the nonsparse algorithms) on the synthetic data experiment described in Section 5.1. The results are consistent with the reconstruction error.
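As a rough illustration of the evaluation described in the supplementary note, the sketch below computes precision and recall of active elements after thresholding an estimated loading matrix. The function name `precision_recall`, the absolute-value threshold rule, and the assumption that the estimated columns have already been matched to the true ones are choices made for this example and are not taken from the paper.

```python
import numpy as np


def precision_recall(true_Z, est_G, threshold=0.0):
    """Precision and recall of active entries, given a thresholded estimate.

    Assumes est_G has the same shape as true_Z and that its columns are
    already aligned with the true factors (factor order is otherwise arbitrary).
    """
    true_active = np.asarray(true_Z).astype(bool)
    pred_active = np.abs(np.asarray(est_G)) > threshold  # thresholding step
    true_positives = np.sum(true_active & pred_active)
    num_predicted = pred_active.sum()
    num_actual = true_active.sum()
    precision = true_positives / num_predicted if num_predicted else 0.0
    recall = true_positives / num_actual if num_actual else 0.0
    return float(precision), float(recall)


if __name__ == "__main__":
    true_Z = np.array([[1, 0], [0, 1], [1, 1]])
    est_G = np.array([[0.9, 0.02], [0.0, 0.5], [0.01, 0.7]])
    print(precision_recall(true_Z, est_G, threshold=0.1))
```

For a sparse method the threshold can be left at zero, since inactive entries are exactly zero; for a nonsparse method some positive threshold is needed before the comparison is meaningful.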