BayesBinMix: an R Package for Model Based Clustering of Multivariate Binary Data

Abstract:

The BayesBinMix package offers a Bayesian framework for clustering binary data with or without missing values by fitting mixtures of multivariate Bernoulli distributions with an unknown number of components. It allows the joint estimation of the number of clusters and model parameters using Markov chain Monte Carlo sampling. Heated chains are run in parallel and accelerate the convergence to the target posterior distribution. Identifiability issues are addressed by implementing label switching algorithms. The package is demonstrated and benchmarked against the Expectation Maximization algorithm using a simulation study as well as a real dataset.
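To make the model in the abstract concrete, the following is a minimal sketch of the log-likelihood of a mixture of multivariate Bernoulli distributions, written in plain Python rather than R. It is not the BayesBinMix interface: the function name `bernoulli_mixture_loglik` and its arguments are hypothetical. It also assumes that a missing entry (marked `None`) can be skipped, which amounts to marginalising it out when variables are independent within each component.

```python
import math


def bernoulli_mixture_loglik(data, weights, probs):
    """Log-likelihood of binary rows under a mixture of multivariate Bernoullis.

    data:    list of rows, each a list of 0/1 values (None marks a missing entry)
    weights: mixing proportions, one per component, summing to 1
    probs:   probs[k][j] is the success probability of variable j in component k
    """
    total = 0.0
    for row in data:
        comp_logs = []
        for w, theta in zip(weights, probs):
            # log of (weight * product of per-variable Bernoulli terms)
            lp = math.log(w)
            for x, t in zip(row, theta):
                if x is None:
                    # Assumption: a missing entry contributes a factor of 1.
                    continue
                lp += math.log(t) if x == 1 else math.log(1.0 - t)
            comp_logs.append(lp)
        # log-sum-exp over components for numerical stability
        m = max(comp_logs)
        total += m + math.log(sum(math.exp(c - m) for c in comp_logs))
    return total


# Two components, two binary variables; the second row has a missing value.
example_data = [[1, 0], [1, None]]
example_weights = [0.5, 0.5]
example_probs = [[0.9, 0.1], [0.2, 0.8]]
print(bernoulli_mixture_loglik(example_data, example_weights, example_probs))
```

The sketch only evaluates the likelihood for a fixed number of components. Estimating the number of components, running heated chains, and undoing label switching, as the abstract describes, are what the package itself provides.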


Published

May 9, 2017

Received

Sep 30, 2016

DOI

10.32614/RJ-2017-022

Volume

9/1

Pages

403 - 420

CRAN packages used

BayesBinMix, label.switching, foreach, doParallel, coda, FlexMix, flexclust

CRAN Task Views implied by cited packages

Bayesian, Cluster, gR, HighPerformanceComputing

Footnotes

    Reuse

    Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

    Citation

    For attribution, please cite this work as

    Papastamoulis & Rattray, "BayesBinMix: an R Package for Model Based Clustering of Multivariate Binary Data", The R Journal, 2017

    BibTeX citation

    @article{RJ-2017-022,
      author = {Papastamoulis, Panagiotis and Rattray, Magnus},
      title = {BayesBinMix: an R Package for Model Based Clustering of Multivariate Binary Data},
      journal = {The R Journal},
      year = {2017},
      note = {https://doi.org/10.32614/RJ-2017-022},
      doi = {10.32614/RJ-2017-022},
      volume = {9},
      issue = {1},
      issn = {2073-4859},
      pages = {403-420}
    }