Random partitions on decomposable graphs

François Caron

INRIA de Bordeaux Sud-Ouest

Probabilistic data clustering has numerous applications in machine learning and statistics. Formally, we associate a latent allocation variable with each data point. Several of these latent variables may share the same value, and together they induce a partition of the data. In a Bayesian setting, the partition is assumed to be random and is assigned a prior distribution. Models with either a fixed or an unknown number of clusters have been considered in the literature. In particular, Dirichlet multinomial allocation and Dirichlet process partition models have become very popular over the past few years. We propose extensions of these models to decomposable graphical models. These models have appealing properties and can be fitted using Markov chain Monte Carlo and sequential Monte Carlo algorithms.
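
To make the link between allocation variables and partitions concrete, the following minimal Python sketch shows how a vector of latent allocations induces a partition, and how allocations can be drawn from the standard Dirichlet process partition prior (the Chinese restaurant process). It illustrates only the classical, non-graphical model and is not the extension proposed here. The function names and the concentration parameter `alpha` are illustrative assumptions, not notation taken from the paper.

```python
import random
from collections import defaultdict


def partition_from_allocations(z):
    """Group data indices by the value of their latent allocation variable."""
    blocks = defaultdict(list)
    for i, label in enumerate(z):
        blocks[label].append(i)
    # Order blocks by their smallest member so the result does not
    # depend on the arbitrary label values used in z.
    return sorted(blocks.values(), key=lambda block: block[0])


def sample_crp_allocations(n, alpha, rng):
    """Draw z_1..z_n from a Chinese restaurant process (assumes alpha > 0)."""
    z, counts = [], []
    for _ in range(n):
        # An existing cluster k is chosen with weight counts[k];
        # a new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        z.append(k)
    return z


if __name__ == "__main__":
    rng = random.Random(0)
    z = sample_crp_allocations(10, alpha=1.0, rng=rng)
    print("allocations:", z)
    print("partition:  ", partition_from_allocations(z))
```

Two allocation vectors that differ only by a relabelling of the clusters, such as `[2, 0, 2, 1, 0]` and `[0, 1, 0, 2, 1]`, yield the same partition, which is why the prior is placed on the partition and not on the labels.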