Open Access
Feature Allocations, Probability Functions, and Paintboxes
Tamara Broderick, Jim Pitman, Michael I. Jordan
Bayesian Anal. 8(4): 801-836 (December 2013). DOI: 10.1214/13-BA823

Abstract

The problem of inferring a clustering of a data set has been the subject of much research in Bayesian analysis, and there currently exists a solid mathematical foundation for Bayesian approaches to clustering. In particular, the class of probability distributions over partitions of a data set has been characterized in a number of ways, including via exchangeable partition probability functions (EPPFs) and the Kingman paintbox. Here, we develop a generalization of the clustering problem, called feature allocation, where we allow each data point to belong to an arbitrary, non-negative integer number of groups, now called features or topics. We define and study an “exchangeable feature probability function” (EFPF)—analogous to the EPPF in the clustering setting—for certain types of feature models. Moreover, we introduce a “feature paintbox” characterization—analogous to the Kingman paintbox for clustering—of the class of exchangeable feature models. We provide a further characterization of the subclass of feature allocations that have EFPF representations.
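The distinction the abstract draws between a clustering and a feature allocation can be illustrated with a small sketch. It is not taken from the paper: the function names `membership_counts` and `is_partition`, the 0-based indexing, and the toy data are assumptions made for illustration. A feature allocation is represented here as a list of non-empty sets of data indices, so the same set may appear more than once.

```python
from collections import Counter


def membership_counts(allocation, n):
    """Return, for each data index 0..n-1, the number of features containing it."""
    counts = Counter()
    for feature in allocation:
        counts.update(feature)  # each index in the feature gains one membership
    return [counts[i] for i in range(n)]


def is_partition(allocation, n):
    """True when the blocks are non-empty and every index lies in exactly one block."""
    blocks_nonempty = all(len(feature) > 0 for feature in allocation)
    return blocks_nonempty and all(c == 1 for c in membership_counts(allocation, n))


# Hypothetical toy data on five points (indices 0..4).
clustering = [{0, 1}, {2}, {3, 4}]    # every index appears exactly once
features = [{0, 1, 3}, {1, 3}, {3}]   # indices 2 and 4 belong to no feature

print(membership_counts(clustering, 5), is_partition(clustering, 5))
print(membership_counts(features, 5), is_partition(features, 5))
```

In this sketch every clustering is also a valid feature allocation, but the converse fails as soon as some index belongs to zero features or to more than one.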

Citation

Tamara Broderick, Jim Pitman, Michael I. Jordan. "Feature Allocations, Probability Functions, and Paintboxes." Bayesian Anal. 8(4): 801-836, December 2013. https://doi.org/10.1214/13-BA823

Information

Published: December 2013
First available in Project Euclid: 4 December 2013

zbMATH: 1329.62278
MathSciNet: MR3150470
Digital Object Identifier: 10.1214/13-BA823

Keywords: beta process, EFPF, feature, feature allocation, feature frequency model, Indian buffet process, paintbox

Rights: Copyright © 2013 International Society for Bayesian Analysis

Vol. 8 • No. 4 • December 2013