Published by the Foundation for Open Access Statistics
Editors-in-chief: Bettina Grün, Torsten Hothorn, Edzer Pebesma, Achim Zeileis    ISSN 1548-7660; CODEN JSSOBK
Journal: Journal of Statistical Software
Authors: Yukinobu Hamuro, Masakazu Nakamoto, Stephane Cheung, Edward H. Ip
Title: mbonsai: Application Package for Sequence Classification by Tree Methodology
Abstract: Many applications, such as transaction data analysis, require the classification of long sequences. For example, a brand purchase history in customer transaction data takes a form like AABCABAA, where A, B, and C are brands of a consumer product. The decision tree-based package mbonsai is designed to handle sequence data of varying lengths, using one or more variables of interest as predictor variables. The package uses tree growing and pruning strategies adopted from the C4.5 and CART algorithms, and includes new features for handling sequence data and indexing for classification purposes. It provides a simple command line program for learning and prediction, and can generate user-friendly graphics depicting decision trees. The underlying C++ code is designed to efficiently process large data sets in ASCII files. Two examples from transaction data sets illustrate the application of mbonsai.
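
The idea of splitting sequence data in a decision tree can be illustrated with a minimal sketch. It is not the mbonsai implementation: it assumes that a candidate split simply tests whether a short pattern occurs as a contiguous substring of a brand sequence, and it scores splits by entropy-based information gain (the quantity underlying C4.5's gain ratio). The function names, the toy sequences, and the class labels are all hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def information_gain(sequences, labels, pattern):
    """Gain from splitting on 'pattern occurs in the sequence' (assumed split rule)."""
    with_pattern = [y for s, y in zip(sequences, labels) if pattern in s]
    without_pattern = [y for s, y in zip(sequences, labels) if pattern not in s]
    n = len(labels)
    remainder = (len(with_pattern) / n) * entropy(with_pattern) \
        + (len(without_pattern) / n) * entropy(without_pattern)
    return entropy(labels) - remainder


def best_pattern(sequences, labels, max_len=2):
    """Return the substring of length <= max_len with the highest gain."""
    candidates = sorted({
        s[i:i + k]
        for s in sequences
        for k in range(1, max_len + 1)
        for i in range(len(s) - k + 1)
    })
    return max(candidates, key=lambda p: information_gain(sequences, labels, p))


# Hypothetical toy data: brand purchase histories and made-up class labels.
sequences = ["AABCABAA", "AAAA", "BCBC", "CBB"]
labels = ["loyal", "loyal", "switcher", "switcher"]

print(best_pattern(sequences, labels))
print(information_gain(sequences, labels, "AA"))
print(information_gain(sequences, labels, "B"))
```

In this toy data the pattern AA separates the two classes perfectly, while B does not, so a tree grown this way would prefer a split on a pattern such as AA at the root. A full tree learner would apply the same search recursively to each branch and then prune.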

Page views: 730. Submitted: 2016-03-19. Published: 2018-09-03.
Paper: mbonsai: Application Package for Sequence Classification by Tree Methodology (PDF; downloads: 262)
Supplements:
mbonsai.zip: Source code (downloads: 22; 2KB)

DOI: 10.18637/jss.v086.i06

This work is licensed under the following licenses:
Paper: Creative Commons Attribution 3.0 Unported License
Code: GNU General Public License (at least one of version 2 or version 3) or a GPL-compatible license.