rmcfs: An R Package for Monte Carlo Feature Selection and Interdependency Discovery

Michał Dramiński, Jacek Koronacki

Main Article Content

Abstract

We describe the R package rmcfs that implements an algorithm for ranking features from high dimensional data according to their importance for a given supervised classification task. The ranking is performed prior to addressing the classification task per se. This R package is the new and extended version of the MCFS (Monte Carlo feature selection) algorithm where an early version was published in 2005. The package provides an easy R interface, a set of tools to review results and the new ID (interdependency discovery) component. The algorithm can be used on continuous and/or categorical features (e.g., gene expression and phenotypic data) to produce an objective ranking of features with a statistically well-defined cutoff between informative and non-informative ones. Moreover, the directed ID graph that presents interdependencies between informative features is provided.

Article Details

Article Sidebar