Nonparametric Machine Learning and Efficient Computation with Bayesian Additive Regression Trees: The BART R Package

Rodney Sparapani; Charles Spanbauer; Robert McCulloch

doi:10.18637/jss.v097.i01

Rodney Sparapani, Charles Spanbauer, Robert McCulloch

Abstract

In this article, we introduce the BART R package which is an acronym for Bayesian additive regression trees. BART is a Bayesian nonparametric, machine learning, ensemble predictive modeling method for continuous, binary, categorical and time-to-event outcomes. Furthermore, BART is a tree-based, black-box method which fits the outcome to an arbitrary random function, f , of the covariates. The BART technique is relatively computationally efficient as compared to its competitors, but large sample sizes can be demanding. Therefore, the BART package includes efficient state-of-the-art implementations for continuous, binary, categorical and time-to-event outcomes that can take advantage of modern off-the-shelf hardware and software multi-threading technology. The BART package is written in C++ for both programmer and execution efficiency. The BART package takes advantage of multi-threading via forking as provided by the parallel package and OpenMP when available and supported by the platform. The ensemble of binary trees produced by a BART fit can be stored and re-used later via the R predict function. In addition to being an R package, the installed BART routines can be called directly from C++. The BART package provides the tools for your BART toolbox.

Files:

Paper R package (BART) Replication materials

Published:

Jan 14, 2021

DOI:

10.18637/jss.v097.i01

Keywords:

Main Article Content

Abstract

Article Details

Article Sidebar