Current Volume | Browse | Search | RSSHome | Instructions for Authors | JSS Style Guide | Editorial Board

Authors: Hadley Wickham
Title: The Split-Apply-Combine Strategy for Data Analysis
Reference: Vol. 40, Issue 1, Apr 2011
Submitted 2009-09-24, Accepted 2010-12-27
Type: Article
Abstract:

Many data analysis problems involve the application of a split-apply-combine strategy, where you break up a big problem into manageable pieces, operate on each piece independently and then put all the pieces back together. This insight gives rise to a new R package that allows you to smoothly apply this strategy, without having to worry about the type of structure in which your data is stored.
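The strategy described above can be sketched in a few lines. The following is a minimal illustration in plain Python, not the R package's API: the helper name `split_apply_combine`, its parameters, and the toy `batting` records are all hypothetical, chosen only to show the three steps (split by a key, apply a function to each piece, combine the results).

```python
from collections import defaultdict


def split_apply_combine(records, key, func):
    """Illustrative helper (hypothetical, not part of any package).

    Split `records` into groups by `key`, apply `func` to each group,
    and combine the per-group results into a dictionary.
    """
    # Split: collect the records that share the same key value.
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)

    # Apply and combine: run func on each piece independently,
    # then gather the results under their group keys.
    return {group_key: func(group) for group_key, group in groups.items()}


# Made-up toy data, loosely modeled on per-player batting records.
batting = [
    {"player": "a", "year": 2001, "hits": 10},
    {"player": "b", "year": 2001, "hits": 5},
    {"player": "a", "year": 2002, "hits": 20},
    {"player": "b", "year": 2002, "hits": 7},
]

total_hits = split_apply_combine(
    batting,
    key=lambda r: r["player"],
    func=lambda group: sum(r["hits"] for r in group),
)
print(total_hits)  # {'a': 30, 'b': 12}
```

Because each piece is processed independently, the function applied to a group never needs to know how the data was split or how the results are reassembled, which is the separation of concerns the abstract points to.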

The paper includes two case studies showing how these insights make it easier to work with batting records for veteran baseball players and a large 3d array of spatio-temporal ozone measurements.

Paper: The Split-Apply-Combine Strategy for Data Analysis (application/pdf, 2.6 MB)
Supplements:
plyr_1.4.1.tar.gz: R source package (application/x-gzip, 509.6 KB)
v40i01.R: R example code from the paper (application/octet-stream, 8.1 KB)
ozone-map.R: Supplementary R code for ozone map (application/octet-stream, 1.2 KB)
timings.R: Supplementary R code for timing comparisons (application/octet-stream, 1.4 KB)
This work is licensed under the following licenses:
Paper: Creative Commons Attribution 3.0 Unported License
Code: GNU General Public License (at least one of version 2 or version 3)