Bayesian Analysis

Efficient utility-based clustering over high dimensional partition spaces

Paul E. Anderson, Kieron D. Edwards, Silvia Liverani, Andrew J. Millar, and Jim Q. Smith

Full-text: Open access

Abstract

In model-based clustering, the number of partitions of even a moderately sized dataset is so large that a comprehensive search for the highest scoring (MAP) partition is usually impossible, even when Bayes factors have a closed form. However, when each cluster in a partition has a signature, and it is known that some signatures are of scientific interest whilst others are not, it is possible, within a Bayesian framework, to develop search algorithms that are guided by these cluster signatures. Such algorithms can be expected to find better partitions more quickly. In this paper we develop a framework within which these ideas can be formalized. We then briefly illustrate the efficacy of the proposed guided search on a microarray time course dataset, where the clustering objective is to identify clusters of genes with different types of circadian expression profiles.
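
The scale of the search problem mentioned in the abstract can be made concrete: the number of ways to partition a set of n items is the Bell number B_n, which grows faster than exponentially. The Python sketch below is an illustration only and is not taken from the paper; the function name `bell_numbers` is hypothetical, and the Bell-triangle recurrence is one standard way to compute these counts.

```python
def bell_numbers(n_max):
    """Return [B_0, ..., B_n_max], where B_n counts the partitions of an n-item set.

    Uses the Bell triangle: each row starts with the last entry of the
    previous row, and each later entry adds the entry to its left to the
    entry above that one. B_n is the first entry of row n.
    """
    bells = [1]   # B_0 = 1 (the empty set has exactly one partition)
    row = [1]     # current row of the Bell triangle
    for _ in range(n_max):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells


if __name__ == "__main__":
    counts = bell_numbers(30)
    for n in (5, 10, 20, 30):
        print(f"n = {n:2d}: {counts[n]} partitions")
```

Already at n = 10 there are 115975 partitions, so enumerating and scoring every partition of a dataset with thousands of genes is out of reach, which is the motivation for a guided search.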

Article information

Source
Bayesian Anal., Volume 4, Number 3 (2009), 539-571.

Dates
First available in Project Euclid: 22 June 2012

Permanent link to this document
https://projecteuclid.org/euclid.ba/1340369854

Digital Object Identifier
doi:10.1214/09-BA420

Mathematical Reviews number (MathSciNet)
MR2551045

Zentralblatt MATH identifier
1330.62253

Citation

Liverani, Silvia; Anderson, Paul E.; Edwards, Kieron D.; Millar, Andrew J.; Smith, Jim Q. Efficient utility-based clustering over high dimensional partition spaces. Bayesian Anal. 4 (2009), no. 3, 539-571. doi:10.1214/09-BA420. https://projecteuclid.org/euclid.ba/1340369854

