MDFS: MultiDimensional Feature Selection in R

Abstract:

Identification of informative variables in an information system is often performed using simple one-dimensional filtering procedures that discard information about interactions between variables. Such an approach may result in removing some relevant variables from consideration. Here we present an R package MDFS (MultiDimensional Feature Selection) that performs identification of informative variables taking into account synergistic interactions between multiple descriptors and the decision variable. MDFS is an implementation of an algorithm based on information theory (Mnich and Rudnicki, 2017). The computational kernel of the package is implemented in C++. A high-performance version implemented in CUDA C is also available. The application of MDFS is demonstrated using the well-known Madelon dataset, in which a decision variable is generated from synergistic interactions between descriptor variables. It is shown that the application of multidimensional analysis results in better sensitivity and ranking of importance.
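The point about one-dimensional filters missing synergistic variables can be illustrated with a toy example. The sketch below is written in Python and is not the MDFS algorithm or the package's API: it only computes a plain empirical information gain on a synthetic XOR problem. The names `entropy` and `information_gain`, the sample size, and the XOR construction are all assumptions made for illustration. Each descriptor alone carries almost no information about the decision, while the pair determines it completely.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(features, decision):
    """Empirical I(Y; X) = H(Y) - H(Y | X); X is a list of hashable tuples."""
    n = len(decision)
    groups = {}
    for x, y in zip(features, decision):
        groups.setdefault(x, []).append(y)
    conditional = sum(len(ys) / n * entropy(ys) for ys in groups.values())
    return entropy(decision) - conditional


# Hypothetical toy data: two binary descriptors and an XOR decision.
rng = random.Random(0)
n = 2000
x1 = [rng.randint(0, 1) for _ in range(n)]
x2 = [rng.randint(0, 1) for _ in range(n)]
y = [a ^ b for a, b in zip(x1, x2)]

# One-dimensional view: each descriptor is scored on its own.
ig_x1 = information_gain([(a,) for a in x1], y)
ig_x2 = information_gain([(b,) for b in x2], y)

# Two-dimensional view: the descriptors are scored jointly.
ig_pair = information_gain(list(zip(x1, x2)), y)

print(f"IG(x1) = {ig_x1:.4f} bits")
print(f"IG(x2) = {ig_x2:.4f} bits")
print(f"IG(x1, x2) = {ig_pair:.4f} bits")
```

A filter that ranks variables by the one-dimensional scores would discard both descriptors here, even though together they explain the decision fully. This is the kind of situation a multidimensional analysis is meant to address.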

Published

Aug. 15, 2019

Received

Dec. 1, 2018

DOI

10.32614/RJ-2019-019

Volume

11/1

Pages

198 - 210

Supplementary materials

Supplementary materials are available in addition to this article. They can be downloaded at RJ-2019-019.zip

CRAN packages used

MDFS, Rfast

    Reuse

    Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

    Citation

    For attribution, please cite this work as

    Piliszek, et al., "The R Journal: MDFS: MultiDimensional Feature Selection in R", The R Journal, 2019

    BibTeX citation

    @article{RJ-2019-019,
      author = {Piliszek, Radosław and Mnich, Krzysztof and Migacz, Szymon and Tabaszewski, Paweł and Sułecki, Andrzej and Polewko-Klim, Aneta and Rudnicki, Witold},
      title = {The R Journal: MDFS: MultiDimensional Feature Selection in R},
      journal = {The R Journal},
      year = {2019},
      note = {https://doi.org/10.32614/RJ-2019-019},
      doi = {10.32614/RJ-2019-019},
      volume = {11},
      issue = {1},
      issn = {2073-4859},
      pages = {198-210}
    }