Journal of Statistical Software http://www.jstatsoft.org/rss Sat, 23 Aug 2014 13:21:05 GMT
Most recent publications from the Journal of Statistical Software

Nestedness for Dummies (NeD): A User-Friendly Web Interface for Exploratory Nestedness Analysis http://www.jstatsoft.org/v59/c03/paper Vol. 59, Code Snippet 3, Aug 2014

Wed, 13 Aug 2014 07:00:00 GMT http://www.jstatsoft.org/v59/c03
General Purpose Convolution Algorithm in S4 Classes by Means of FFT http://www.jstatsoft.org/v59/i04/paper Vol. 59, Issue 4, Aug 2014

Abstract:

Object orientation provides a flexible framework for the implementation of the convolution of arbitrary distributions of real-valued random variables. We discuss an algorithm which is based on the fast Fourier transform. It directly applies to lattice-supported distributions. In the case of continuous distributions an additional discretization to a linear lattice is necessary and the resulting lattice-supported distributions are suitably smoothed after convolution.
We compare our algorithm, in terms of accuracy and speed, to other approaches that aim at similar generality. In situations where the exact results are known, several checks confirm the high accuracy of the proposed algorithm, which is also illustrated for approximations of non-central χ² distributions.
Through object orientation, this default algorithm is overloaded by more specific algorithms where possible, in particular where explicit convolution formulae are available. Our focus is on the R package distr, which implements this approach, overloading the operator + for convolution; based on this convolution, we define a whole arithmetic of mathematical operations acting on distribution objects, comprising the operators +, -, *, /, and ^.
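The lattice-supported case described in this abstract can be illustrated with a short sketch. This is a minimal Python illustration, not the distr implementation: the function names `fft` and `convolve_pmf` are hypothetical, both distributions are assumed to sit on the lattice 0, 1, 2, ... with unit spacing, and the discretization and smoothing steps for continuous distributions are left out.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two.

    With inverse=True the result is unnormalized (caller divides by n).
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result


def convolve_pmf(p, q):
    """Probability mass function of X + Y for independent X and Y.

    p[i] and q[i] are the probabilities of the value i (assumed lattice).
    """
    out_len = len(p) + len(q) - 1
    size = 1
    while size < out_len:  # zero-pad to a power of two
        size *= 2
    fp = fft([complex(x) for x in p] + [0j] * (size - len(p)))
    fq = fft([complex(x) for x in q] + [0j] * (size - len(q)))
    product = [a * b for a, b in zip(fp, fq)]
    inverse = fft(product, inverse=True)
    # Normalize the inverse transform and clip tiny negative rounding errors.
    return [max(v.real / size, 0.0) for v in inverse[:out_len]]


# Sum of two fair dice: index i holds P(sum == i).
die = [0.0] + [1 / 6] * 6
two_dice = convolve_pmf(die, die)
```

Zero-padding to at least `len(p) + len(q) - 1` entries is what turns the circular convolution computed by the FFT into the ordinary convolution needed for a sum of independent variables.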

Wed, 13 Aug 2014 07:00:00 GMT http://www.jstatsoft.org/v59/i04
ART: A Data Aggregation Program for the Behavioral Sciences http://www.jstatsoft.org/v59/i03/paper Vol. 59, Issue 3, Aug 2014

Abstract:

Today, many experiments in the behavioral sciences are conducted using a computer. While there is a broad choice of computer programs that facilitate conducting experiments, as well as programs for statistical analysis, there are relatively few programs that facilitate the intermediate step of data aggregation. ART has been developed to fill this gap and to provide a computer program for data aggregation with a graphical user interface, so that aggregation can be done more easily and without any programming. All “rules” that are necessary to extract variables can be seen “at a glance”, which helps the user to conduct even complex aggregations with several hundred variables and makes aggregation more resistant to errors. ART runs on Windows XP, Vista, 7, and 8 and it is free. Copies (executable and source code) are available at http://www.psychologie.hhu.de/arbeitsgruppen/allgemeine-psychologie-und-arbeitspsychologie/art.html.

Wed, 13 Aug 2014 07:00:00 GMT http://www.jstatsoft.org/v59/i03
The R Package survsim for the Simulation of Simple and Complex Survival Data http://www.jstatsoft.org/v59/i02/paper Vol. 59, Issue 2, Aug 2014

Abstract:

We present an R package for the simulation of simple and complex survival data. It covers different situations, including recurrent events and multiple events. The main simulation routine allows the user to introduce an arbitrary number of distributions, each corresponding to a new event or episode, with its parameters, choosing between the Weibull (and exponential as a particular case), log-logistic and log-normal distributions.
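To make the simplest case concrete, the following is a minimal Python sketch of simulating single-event survival data with Weibull event times and right-censoring. It does not reproduce the survsim interface: the function name `simulate_survival`, its parameters, and the choice of exponential censoring times are all assumptions made for illustration, and recurrent or multiple events are not covered.

```python
import random


def simulate_survival(n, scale, shape, censor_rate, seed=0):
    """Return n (observed_time, status) pairs; status 1 = event, 0 = censored.

    Event times are Weibull(scale, shape); shape == 1 gives the exponential
    special case. Censoring times are exponential (an assumption here).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        event_time = rng.weibullvariate(scale, shape)
        censor_time = rng.expovariate(censor_rate)
        observed = min(event_time, censor_time)
        status = 1 if event_time <= censor_time else 0
        rows.append((observed, status))
    return rows


sample = simulate_survival(n=200, scale=2.0, shape=1.5, censor_rate=0.1, seed=1)
events = sum(status for _, status in sample)
```

Swapping `weibullvariate` for a log-normal draw (`rng.lognormvariate`) would give the log-normal variant mentioned in the abstract; a log-logistic draw would need its own inverse-transform step.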

Wed, 13 Aug 2014 07:00:00 GMT http://www.jstatsoft.org/v59/i02
cancerclass: An R Package for Development and Validation of Diagnostic Tests from High-Dimensional Molecular Data http://www.jstatsoft.org/v59/i01/paper Vol. 59, Issue 1, Aug 2014

Abstract:

Progress in molecular high-throughput techniques has made comprehensive monitoring of biomolecules in medical samples possible. In the era of personalized medicine, these data form the basis for the development of diagnostic, prognostic and predictive tests for cancer. Because of the high number of features that are measured simultaneously in a relatively low number of samples, supervised learning approaches are sensitive to overfitting and performance overestimation. Bioinformatic methods were developed to cope with these problems, including control of accuracy and precision. However, there is demand for easy-to-use software that integrates methods for classifier construction, performance assessment and development of diagnostic tests. To help fill this gap, we developed a comprehensive R package for the development and validation of diagnostic tests from high-dimensional molecular data. An important focus of the package is a careful validation of the classification results. To this end, we implemented an extended version of the multiple random validation protocol, a previously introduced validation method. The package includes methods for continuous prediction scores. This is important in a clinical setting, because scores can be converted to probabilities and help to distinguish between clear-cut and borderline classification results. The functionality of the package is illustrated by the analysis of two cancer microarray data sets.

Wed, 13 Aug 2014 07:00:00 GMT http://www.jstatsoft.org/v59/i01
runmixregls: A Program to Run the MIXREGLS Mixed-Effects Location Scale Software from within Stata http://www.jstatsoft.org/v59/c02/paper Vol. 59, Code Snippet 2, Aug 2014

Wed, 13 Aug 2014 07:00:00 GMT http://www.jstatsoft.org/v59/c02
deltaPlotR: An R Package for Differential Item Functioning Analysis with Angoff's Delta Plot http://www.jstatsoft.org/v59/c01/paper Vol. 59, Code Snippet 1, Aug 2014

Wed, 13 Aug 2014 07:00:00 GMT http://www.jstatsoft.org/v59/c01
Regularization Paths for Conditional Logistic Regression: The clogitL1 Package http://www.jstatsoft.org/v58/i12/paper Vol. 58, Issue 12, Jul 2014

Abstract:

We apply the cyclic coordinate descent algorithm of Friedman, Hastie, and Tibshirani (2010) to the fitting of a conditional logistic regression model with lasso (ℓ1) and elastic net penalties. The sequential strong rules of Tibshirani, Bien, Hastie, Friedman, Taylor, Simon, and Tibshirani (2012) are also used in the algorithm, and it is shown that these offer a considerable speedup over the standard coordinate descent algorithm with warm starts.

Once implemented, the algorithm is used in simulation studies to compare the variable selection and prediction performance of the conditional logistic regression model against that of its unconditional (standard) counterpart. We find that the conditional model performs admirably on datasets drawn from a suitable conditional distribution, outperforming its unconditional counterpart at variable selection. The conditional model is also fit to a small real-world dataset, demonstrating how we obtain regularization paths for the parameters of the model and how we apply cross-validation for this method, where natural unconditional prediction rules are hard to come by.
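The coordinate-wise update at the heart of cyclic coordinate descent can be sketched briefly. The Python sketch below deliberately swaps in a simpler objective, squared-error loss with a lasso penalty, (1/(2n))·||y − Xβ||² + λ·||β||₁, rather than the conditional logistic likelihood treated in the paper; the function names `soft_threshold` and `lasso_cd` are hypothetical and are not part of clogitL1, and the strong rules and warm starts are omitted.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator: shrink z toward zero by gamma."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_cd(X, y, lam, n_sweeps=100):
    """Cyclic coordinate descent for the lasso with squared-error loss.

    X is a list of rows, y a list of responses, lam the penalty weight.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals y - X @ beta, kept up to date
    for _ in range(n_sweeps):
        for j in range(p):
            col = [row[j] for row in X]
            norm = sum(v * v for v in col) / n
            if norm == 0.0:
                continue
            # Correlation of column j with the partial residual
            # (the residual with coordinate j's contribution added back).
            rho = sum(col[i] * (resid[i] + col[i] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / norm
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= col[i] * delta
                beta[j] = new_beta
    return beta


X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
beta = lasso_cd(X, y, lam=0.2)
```

In this toy example the second coefficient is set exactly to zero, which is the variable-selection effect of the ℓ1 penalty; a regularization path would be obtained by repeating the fit over a decreasing sequence of `lam` values.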

Sat, 05 Jul 2014 07:00:00 GMT http://www.jstatsoft.org/v58/i12
Optimal Asset Pricing http://www.jstatsoft.org/v58/i11/paper Vol. 58, Issue 11, Jul 2014

Abstract:

We describe an R package for determining the optimal price of an asset which is “perishable” in a certain sense, given the intensity of customer arrivals and a time-varying price sensitivity function which specifies the probability that a customer will purchase an asset offered at a given price at a given time. The package deals with the case of customers arriving in groups, with a probability distribution for the group size being specified. The methodology and software allow for both discrete and continuous pricing. The class of possible models for price sensitivity functions is very wide, and includes piecewise linear models. A mechanism for constructing piecewise linear price sensitivity functions is provided.

Sat, 05 Jul 2014 07:00:00 GMT http://www.jstatsoft.org/v58/i11
movMF: An R Package for Fitting Mixtures of von Mises-Fisher Distributions http://www.jstatsoft.org/v58/i10/paper Vol. 58, Issue 10, Jul 2014

Abstract:

Finite mixtures of von Mises-Fisher distributions allow model-based clustering methods to be applied to data of standardized length, i.e., data in which all points lie on the unit sphere. The R package movMF contains functionality to draw samples from finite mixtures of von Mises-Fisher distributions and to fit these models using the expectation-maximization algorithm for maximum likelihood estimation. Special features are the possibility to use sparse matrix representations for the input data, different variants of the expectation-maximization algorithm, different methods for determining the concentration parameters in the M-step, and the option to impose constraints on the concentration parameters over the components.

In this paper we describe the main fitting function of the package and illustrate its application. In addition, we compare the clustering performance of finite mixtures of von Mises-Fisher distributions to spherical k-means. We also discuss the resolution of several numerical issues that arise when estimating the concentration parameters and when determining the normalizing constant of the von Mises-Fisher distribution.
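As an illustration of what estimating a concentration parameter involves, the following Python sketch computes, for a single von Mises-Fisher component, the mean direction and a closed-form approximation of the concentration, κ ≈ r̄(d − r̄²)/(1 − r̄²), where r̄ is the mean resultant length and d the dimension. This approximation is often attributed to Banerjee et al.; whether and how movMF uses it is not claimed here, and the function names `normalize` and `vmf_moment_estimates` are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length so it lies on the unit sphere."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def vmf_moment_estimates(points):
    """Mean direction and approximate concentration for one vMF component.

    Assumes the points are not all identical in direction (r_bar < 1).
    """
    unit = [normalize(p) for p in points]
    n, d = len(unit), len(unit[0])
    mean = [sum(p[k] for p in unit) / n for k in range(d)]
    r_bar = math.sqrt(sum(m * m for m in mean))  # mean resultant length
    mu = [m / r_bar for m in mean]
    kappa = r_bar * (d - r_bar ** 2) / (1.0 - r_bar ** 2)
    return mu, kappa


mu, kappa = vmf_moment_estimates([[2.0, 0.0], [0.0, 3.0]])
```

As r̄ approaches 1 the denominator approaches zero and κ grows without bound, which hints at why the numerical treatment of the concentration parameter needs care.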

Sat, 05 Jul 2014 07:00:00 GMT http://www.jstatsoft.org/v58/i10
Statistical Software (R, SAS, SPSS, and Minitab) for Blind Students and Practitioners http://www.jstatsoft.org/v58/s01/paper Vol. 58, Software Review 1, Jul 2014

Statistical Software (R, SAS, SPSS, and Minitab) for Blind Students and Practitioners, version varies
R, SAS, SPSS, and Minitab

Tue, 01 Jul 2014 07:00:00 GMT http://www.jstatsoft.org/v58/s01
Growth Curve Analysis and Visualization Using R http://www.jstatsoft.org/v58/b02/paper Vol. 58, Book Review 2, Jul 2014

Growth Curve Analysis and Visualization Using R
Daniel Mirman
Chapman & Hall/CRC, 2014
ISBN: 9781466584327

Tue, 01 Jul 2014 07:00:00 GMT http://www.jstatsoft.org/v58/b02
Analyzing Spatial Models of Choice and Judgment with R http://www.jstatsoft.org/v58/b01/paper Vol. 58, Book Review 1, Jul 2014

Analyzing Spatial Models of Choice and Judgment with R
David A. Armstrong III, Ryan Bakker, Royce Carroll, Christopher Hare, Keith T. Poole, Howard Rosenthal
CRC Press, 2014
ISBN: 978-14665-1715-8

Tue, 01 Jul 2014 07:00:00 GMT http://www.jstatsoft.org/v58/b01
copulaedas: An R Package for Estimation of Distribution Algorithms Based on Copulas http://www.jstatsoft.org/v58/i09/paper Vol. 58, Issue 9, Jun 2014

Abstract:

The use of copula-based models in EDAs (estimation of distribution algorithms) is currently an active area of research. In this context, the copulaedas package for R provides a platform where EDAs based on copulas can be implemented and studied. The package offers complete implementations of various EDAs based on copulas and vines, a group of well-known optimization problems, and utility functions to study the performance of the algorithms. Newly developed EDAs can be easily integrated into the package by extending an S4 class with generic functions for their main components. This paper presents copulaedas by providing an overview of EDAs based on copulas, a description of the implementation of the package, and an illustration of its use through examples. The examples include running the EDAs defined in the package, implementing new algorithms, and performing an empirical study to compare the behavior of different algorithms on benchmark functions and a real-world problem.

Mon, 30 Jun 2014 07:00:00 GMT http://www.jstatsoft.org/v58/i09
%HPGLIMMIX: A High-Performance SAS Macro for GLMM Estimation http://www.jstatsoft.org/v58/i08/paper Vol. 58, Issue 8, Jun 2014

Abstract:

Generalized linear mixed models (GLMMs) comprise a class of widely used statistical tools for data analysis with fixed and random effects when the response variable has a conditional distribution in the exponential family. GLMM analysis also has a close relationship with actuarial credibility theory. While readily available programs such as the GLIMMIX procedure in SAS and the lme4 package in R are powerful tools for using this class of models, these programs are not able to handle models with thousands of levels of fixed and random effects. By using sparse-matrix and other high-performance techniques, procedures such as HPMIXED in SAS can easily fit models with thousands of factor levels, but only for normally distributed response variables. In this paper, we present the %HPGLIMMIX SAS macro, which fits GLMMs with a large number of sparsely populated design matrices using the doubly-iterative linearization (pseudo-likelihood) method, in which the sparse-matrix-based HPMIXED is used for the inner iterations with the pseudo-variable constructed from the inverse-link function and the chosen model. Although the macro does not have the full functionality of the GLIMMIX procedure, time and memory savings can be large with the new macro. In applications in which design matrices contain many zeros and there are hundreds or thousands of factor levels, models can be fitted without exhausting computer memory, and a reduction in running time of 90% or more can be observed. Examples with a Poisson, binomial, and gamma conditional distribution are presented to demonstrate the usage and efficiency of this macro.

Mon, 30 Jun 2014 07:00:00 GMT http://www.jstatsoft.org/v58/i08