https://www.jstatsoft.org/index.php/jss/issue/feed
Journal of Statistical Software
2020-03-15T23:53:50+00:00
Editorial Office
editor@jstatsoft.org
Open Journal Systems

https://www.jstatsoft.org/index.php/jss/article/view/v092i01
Most Likely Transformations: The mlt Package
2020-03-02T00:47:11+00:00
Torsten Hothorn
Torsten.Hothorn@uzh.ch
The mlt package implements maximum likelihood estimation in the class of conditional transformation models. Based on a suitable explicit parameterization of the unconditional or conditional transformation function using infrastructure from package basefun, we show how one can define, estimate, and compare a cascade of increasingly complex transformation models in the maximum likelihood framework. Models for the unconditional or conditional distribution function of any univariate response variable are set up and estimated in the same computational framework simply by choosing an appropriate transformation function and parameterization thereof. As it is computationally cheap to evaluate the distribution function, models can be estimated by maximization of the exact likelihood, especially in the presence of random censoring or truncation. The relatively dense high-level implementation in the R system for statistical computing allows generalization of many established implementations of linear transformation models, such as the Cox model or other parametric models for the analysis of survival or ordered categorical data, to the more complex situations illustrated in this paper.
2020-02-18T11:32:27+00:00
Copyright (c) 2020 Torsten Hothorn

https://www.jstatsoft.org/index.php/jss/article/view/v092i02
The Calculus of M-Estimation in R with geex
2020-03-02T00:47:11+00:00
Bradley C. Saul
bradleysaul@gmail.com
Michael G. Hudgens
mhudgens@email.unc.edu
M-estimation, or estimating equation, methods are widely applicable for point estimation and asymptotic inference.
In this paper, we present an R package that can find roots and compute the empirical sandwich variance estimator for any set of user-specified, unbiased estimating equations. Examples from the M-estimation primer by Stefanski and Boos (2002) demonstrate use of the software. The package also includes a framework for finite sample, heteroscedastic, and autocorrelation variance corrections, and a website with an extensive collection of tutorials.
2020-02-18T11:32:27+00:00
Copyright (c) 2020 Bradley C. Saul, Michael G. Hudgens

https://www.jstatsoft.org/index.php/jss/article/view/v092i03
PAFit: An R Package for the Non-Parametric Estimation of Preferential Attachment and Node Fitness in Temporal Complex Networks
2020-03-02T00:47:11+00:00
Thong Pham
thong.pham@riken.jp
Paul Sheridan
no@e-mail.provided
Hidetoshi Shimodaira
no@e-mail.provided
Many real-world systems are profitably described as complex networks that grow over time. Preferential attachment and node fitness are two simple growth mechanisms that not only explain certain structural properties commonly observed in real-world systems, but are also tied to a number of applications in modeling and inference. While there are statistical packages for estimating various parametric forms of the preferential attachment function, there is no such package implementing non-parametric estimation procedures. The non-parametric approach to the estimation of the preferential attachment function allows for comparatively finer-grained investigations of the "rich-get-richer" phenomenon that could lead to novel insights in the search to explain certain nonstandard structural properties observed in real-world networks. This paper introduces the R package PAFit, which implements non-parametric procedures for estimating the preferential attachment function and node fitnesses in a growing network, as well as a number of functions for generating complex networks from these two mechanisms.
The main computational part of the package is implemented in C++ with OpenMP to ensure scalability to large-scale networks. In this paper, we first introduce the main functionalities of PAFit through simulated examples, and then use the package to analyze a collaboration network between scientists in the field of complex networks. The results indicate the joint presence of "rich-get-richer" and "fit-get-richer" phenomena in the collaboration network. The estimated attachment function is observed to be near-linear, which we interpret as meaning that the chance an author gets a new collaborator is proportional to their current number of collaborators. Furthermore, the estimated author fitnesses reveal a host of familiar faces from the complex networks community among the field's topmost fittest network scientists.
2020-02-18T11:32:27+00:00
Copyright (c) 2020 Thong Pham, Paul Sheridan, Hidetoshi Shimodaira

https://www.jstatsoft.org/index.php/jss/article/view/v092i04
Integration of R and Scala Using rscala
2020-03-02T00:47:11+00:00
David B. Dahl
dahl@stat.byu.edu
The rscala software is a simple, two-way bridge between R and Scala that allows users to leverage the unique strengths of both languages in a single project. Scala classes can be instantiated from R and Scala methods can be called. Arbitrary Scala code can be executed on-the-fly from within R and callbacks to R are supported. R packages can be developed based on Scala. Conversely, rscala also enables R code to be embedded within a Scala application. The rscala package is available from the Comprehensive R Archive Network (CRAN) and has no dependencies beyond base R and the Scala standard library.
2020-02-18T11:32:27+00:00
Copyright (c) 2020 David B. Dahl

https://www.jstatsoft.org/index.php/jss/article/view/v092i05
BPEC: An R Package for Bayesian Phylogeographic and Ecological Clustering
2020-03-02T00:47:11+00:00
Ioanna Manolopoulou
ioanna@stats.ucl.ac.uk
Axel Hille
axel.hille@gmx.net
Brent Emerson
bemerson@ipna.csic.es
BPEC is an R package for Bayesian phylogeographic and ecological clustering which allows geographical, environmental and phenotypic measurements to be combined with deoxyribonucleic acid (DNA) sequences in order to reveal geographic structuring of DNA sequence clusters consistent with migration events. DNA sequences are modelled using a collapsed version of a simplified coalescent model projected onto haplotype trees, which subsequently give rise to constrained clusterings as migrations occur. Within each cluster, a multivariate Gaussian distribution of the covariates (geographical, environmental, phenotypic) is used. Inference follows tailored reversible jump Markov chain Monte Carlo sampling so that the number of clusters (i.e., migrations) does not need to be pre-specified. A number of output plots and visualizations are provided which reflect the posterior distribution of the parameters of interest. BPEC also includes functions that create output files which can be loaded into Google Earth. The package commands are illustrated through an example dataset of the polytypic Near Eastern brown frog Rana macrocnemis analyzed using BPEC.
2020-03-02T00:00:00+00:00
Copyright (c) 2020 Ioanna Manolopoulou, Axel Hille, Brent Emerson

https://www.jstatsoft.org/index.php/jss/article/view/v092i06
The MOEADr Package: A Component-Based Framework for Multiobjective Evolutionary Algorithms Based on Decomposition
2020-03-02T00:47:11+00:00
Felipe Campelo
f.campelo@aston.ac.uk
Lucas S. Batista
lusoba@ufmg.br
Claus Aranha
caranha@cs.tsukuba.ac.jp
Multiobjective evolutionary algorithms based on decomposition (MOEA/D) represent a widely used class of population-based metaheuristics for the solution of multicriteria optimization problems.
We introduce the MOEADr package, which offers many of these variants as instantiations of a component-oriented framework. This approach contributes to easier reproducibility of existing MOEA/D variants from the literature, as well as to faster development and testing of new composite algorithms. The package offers a standardized, modular implementation of MOEA/D based on this framework, which was designed to provide researchers and practitioners with a standard way to discuss and express MOEA/D variants. In this paper we introduce the design principles behind the MOEADr package, as well as its current components. Three case studies are provided to illustrate the main aspects of the package.
2020-02-23T00:00:00+00:00
Copyright (c) 2020 Felipe Campelo, Lucas S. Batista, Claus Aranha

https://www.jstatsoft.org/index.php/jss/article/view/v092i07
SQUAREM: An R Package for Off-the-Shelf Acceleration of EM, MM and Other EM-Like Monotone Algorithms
2020-03-02T00:47:11+00:00
Yu Du
ydu10@jhu.edu
Ravi Varadhan
rvaradhan@jhmi.edu
We discuss the R package SQUAREM for accelerating iterative algorithms which exhibit slow, monotone convergence. These include the well-known expectation-maximization algorithm, majorize-minimize (MM), and other EM-like algorithms such as expectation conditional maximization, and generalized EM algorithms. We demonstrate the simplicity, generality, and power of SQUAREM through a wide array of applications of EM/MM problems, including binary Poisson mixture, factor analysis, interval censoring, genetics admixture, and logistic regression maximum likelihood estimation (an MM problem). We show that SQUAREM is easy to apply, and can accelerate any fixed-point, smooth, contraction mapping with linear convergence rate. The squared iterative scheme (SQUAREM) algorithm provides significant speed-up of EM-like algorithms.
The margin of the advantage for SQUAREM is especially large for high-dimensional problems or when the EM step is relatively time-consuming to evaluate. SQUAREM can be used off-the-shelf since there is no need for the user to tweak any control parameters to optimize performance. Given its remarkable ease of use, SQUAREM may be considered as a default accelerator for slowly converging EM-like algorithms. All the comparisons of CPU computing time in the paper are made on a quad-core 2.3 GHz Intel Core i7 Mac computer. R package SQUAREM is available from the Comprehensive R Archive Network (CRAN) at https://CRAN.R-project.org/package=SQUAREM/.
2020-02-23T00:00:00+00:00
Copyright (c) 2020 Yu Du, Ravi Varadhan

https://www.jstatsoft.org/index.php/jss/article/view/v092i08
Computing the Oja Median in R: The Package OjaNP
2020-03-02T00:47:11+00:00
Daniel Fischer
daniel.fischer@luke.fi
Karl Mosler
no@e-mail.provided
Jyrki Möttönen
no@e-mail.provided
Klaus Nordhausen
no@e-mail.provided
Oleksii Pokotylo
no@e-mail.provided
Daniel Vogel
no@e-mail.provided
The Oja median is one of several extensions of the univariate median to the multivariate case. It has many desirable properties, but is computationally demanding. In this paper, we first review the properties of the Oja median and compare it to other multivariate medians. Then, we discuss four algorithms to compute the Oja median, which are implemented in our R package OjaNP. Besides these algorithms, the package also contains functions to compute Oja signs, Oja signed ranks, Oja ranks, and the related scatter concepts.
To illustrate their use, the corresponding multivariate one- and C-sample location tests are implemented.
2020-02-23T00:00:00+00:00
Copyright (c) 2020 Daniel Fischer, Karl Mosler, Jyrki Möttönen, Klaus Nordhausen, Oleksii Pokotylo, Daniel Vogel

https://www.jstatsoft.org/index.php/jss/article/view/v092i09
spBayesSurv: Fitting Bayesian Spatial Survival Models Using R
2020-03-02T00:47:11+00:00
Haiming Zhou
zhouh@niu.edu
Timothy Hanson
tim.hanson2@medtronic.com
Jiajia Zhang
jzhang@mailbox.sc.edu
Spatial survival analysis has received a great deal of attention over the last 20 years due to the important role that geographical information can play in predicting survival. This paper provides an introduction to a set of programs for implementing some Bayesian spatial survival models in R using the package spBayesSurv. The function survregbayes includes the three most commonly-used semiparametric models: proportional hazards, proportional odds, and accelerated failure time. All manner of censored survival times are simultaneously accommodated including uncensored, interval censored, current-status, left and right censored, and mixtures of these. Left-truncated data are also accommodated. Time-dependent covariates are allowed under the piecewise constant assumption. Both georeferenced and areally observed spatial locations are handled via frailties. Model fit is assessed with conditional Cox-Snell residual plots, and model choice is carried out via the log pseudo marginal likelihood, the deviance information criterion and the Watanabe-Akaike information criterion. The accelerated failure time frailty model with a covariate-dependent baseline is included in the function frailtyGAFT. In addition, the package provides two marginal survival models: proportional hazards and linear dependent Dirichlet process mixtures, where the spatial dependence is modeled via spatial copulas.
Note that the package can also handle non-spatial data using non-spatial versions of the aforementioned models.
2020-02-27T00:00:00+00:00
Copyright (c) 2020 Haiming Zhou, Timothy Hanson, Jiajia Zhang

https://www.jstatsoft.org/index.php/jss/article/view/v092i10
bridgesampling: An R Package for Estimating Normalizing Constants
2020-03-02T00:47:11+00:00
Quentin F. Gronau
Quentin.F.Gronau@gmail.com
Henrik Singmann
Henrik.Singmann@warwick.ac.uk
Eric-Jan Wagenmakers
EJ.Wagenmakers@gmail.com
Statistical procedures such as Bayes factor model selection and Bayesian model averaging require the computation of normalizing constants (e.g., marginal likelihoods). These normalizing constants are notoriously difficult to obtain, as they usually involve high-dimensional integrals that cannot be solved analytically. Here we introduce an R package that uses bridge sampling (Meng and Wong 1996; Meng and Schilling 2002) to estimate normalizing constants in a generic and easy-to-use fashion. For models implemented in Stan, the estimation procedure is automatic. We illustrate the functionality of the package with three examples.
2020-02-27T00:00:00+00:00
Copyright (c) 2020 Quentin F. Gronau, Henrik Singmann, Eric-Jan Wagenmakers

https://www.jstatsoft.org/index.php/jss/article/view/v092i11
Algebraic Analysis of Multiple Social Networks with multiplex
2020-03-02T00:47:11+00:00
J. Antonio Rivero Ostoic
multiplex@post.com
multiplex is a computer program that provides algebraic tools for the analysis of multiple network structures within the R environment. Apart from the possibility to create and manipulate multivariate data representing multiplex, signed, and two-mode networks, this package offers a collection of functions that deal with algebraic systems - such as the partially ordered semigroup, and balance or cluster semirings - their decomposition, and the enumeration of bundle patterns occurring at different levels of the network.
Moreover, through Galois derivations between families of the pairs of subsets in different domains it is possible to analyze affiliation networks with an algebraic approach. Visualization of multigraphs, different forms of bipartite graphs, inclusion lattices, and Cayley graphs is also supported through related packages.
2020-03-01T00:00:00+00:00
Copyright (c) 2020 J. Antonio Rivero Ostoic

https://www.jstatsoft.org/index.php/jss/article/view/v092i12
Fitting Prediction Rule Ensembles with R Package pre
2020-03-15T23:53:50+00:00
Marjolein Fokkema
m.fokkema@fsw.leidenuniv.nl
Prediction rule ensembles (PREs) are sparse collections of rules, offering highly interpretable regression and classification models. This paper shows how they can be fitted using function pre from R package pre, which derives PREs largely through the methodology of Friedman and Popescu (2008). The implementation and functionality of pre are described and illustrated through application on a dataset on the prediction of depression. Furthermore, accuracy and sparsity of pre are compared with those of single trees, random forests, lasso regression and the original RuleFit implementation of Friedman and Popescu (2008) in four benchmark datasets. Results indicate that pre derives ensembles with predictive accuracy similar to that of random forests, while using a smaller number of variables for prediction. Furthermore, pre provided better accuracy and sparsity than the original RuleFit implementation.
2020-03-16T00:00:00+00:00
Copyright (c) 2020 Marjolein Fokkema

https://www.jstatsoft.org/index.php/jss/article/view/v092c01
Working with User Agent Strings in Stata: The parseuas Command
2020-03-02T00:47:11+00:00
Joss Roßmann
joss.rossmann@gesis.org
Tobias Gummer
tobias.gummer@gesis.org
Lars Kaczmirek
lars.kaczmirek@univie.ac.at
With the rising popularity of web surveys and the increasing use of paradata by survey methodologists, assessing information stored in user agent strings becomes inevitable.
These data contain meaningful information about the browser, operating system, and device that a survey respondent uses. This article provides an overview of user agent strings, their specific structure and history, how they can be obtained when conducting a web survey, as well as what kind of information can be extracted from the strings. Further, the user-written command parseuas is introduced as an efficient means to gather detailed information from user agent strings. The application of parseuas is illustrated by an example that draws on a pooled data set consisting of 29 web surveys.
2020-02-18T11:32:27+00:00
Copyright (c) 2020 Joss Roßmann, Tobias Gummer, Lars Kaczmirek

https://www.jstatsoft.org/index.php/jss/article/view/v092c02
gdpc: An R Package for Generalized Dynamic Principal Components
2020-03-02T00:47:11+00:00
Daniel Peña
daniel.pena@uc3m.es
Ezequiel Smucler
esmucler@utdt.edu
Victor J. Yohai
vyohai@dm.uba.ar
gdpc is an R package for the computation of the generalized dynamic principal components proposed in Peña and Yohai (2016). In this paper, we briefly introduce the problem of dynamical principal components, propose a solution based on a reconstruction criterion and present an automatic procedure to compute the optimal reconstruction. This solution can be applied to the non-stationary case, where the components need not be a linear combination of the observations, as is the case in the proposal of Brillinger (1981). This article discusses some new features that are included in the package and that were not considered in Peña and Yohai (2016). The most important one is an automatic procedure for the identification of both the number of lags to be used in the generalized dynamic principal components and the number of components required for a given reconstruction accuracy. These tools make it easy to use the proposed procedure in large data sets. The procedure can also be used when the number of series is larger than the number of observations.
We describe an iterative algorithm and present an example of the use of the package with real data.
2020-02-23T00:00:00+00:00
Copyright (c) 2020 Daniel Peña, Ezequiel Smucler, Victor J. Yohai

https://www.jstatsoft.org/index.php/jss/article/view/v092b01
R Graphics (3rd Edition)
2020-03-02T00:47:11+00:00
Jose M. Pavía
pavia@uv.es
2020-02-18T11:32:27+00:00
Copyright (c) 2020 Jose M. Pavía