https://www.jstatsoft.org/index.php/jss/issue/feed
Journal of Statistical Software
2024-02-18T22:58:37+00:00
Editorial Office, editor@jstatsoft.org
Open Journal Systems
The Journal of Statistical Software publishes articles on statistical software along with the source code of the software itself and replication code for all empirical results.

https://www.jstatsoft.org/index.php/jss/article/view/v108i02
The R Package markets: Estimation Methods for Markets in Equilibrium and Disequilibrium
2023-01-16T18:47:28+00:00
Pantelis Karapanagiotis, karapanagiotis@ebs.edu
<p>Market models constitute a significant cornerstone of empirical applications in business, industrial organization, and policymaking macroeconomics. The econometric literature proposes various estimation methods for markets in equilibrium, which entail a market-clearing structural condition, and disequilibrium, which are described based on a structural short-side rule. Nonetheless, maximum likelihood estimations of such models are computationally demanding, and software providing simple, out-of-the-box methods for estimating them is scarce. Therefore, applications rely on project-specific implementations for estimating these models, which hinders research reproducibility and result comparability. This article presents the R package markets, which provides a common interface with generic functionality simplifying the estimation of models for markets in equilibrium and disequilibrium. The package specializes in estimating demanded, supplied, and aggregated market quantities and absolute, normalized, and relative market shortages. Its functionality is exemplified via an empirical application using a classic dataset of United States credit for housing starts. Moreover, the article details the scope and design of the implementation and provides statistical measurements of the computational performance of its estimation functionality gathered via large-scale benchmarking simulations. 
The markets package is free software distributed under the Expat license as part of the R software ecosystem. It comprises a set of estimation and analysis tools that are not directly available from either alternative R packages or other statistical software projects.</p>
2024-02-18T00:00:00+00:00
Copyright (c) 2024 Pantelis Karapanagiotis

https://www.jstatsoft.org/index.php/jss/article/view/v108i03
DoubleML: An Object-Oriented Implementation of Double Machine Learning in R
2021-08-05T21:20:01+00:00
Philipp Bach, philipp.bach@uni-hamburg.de
Malte S. Kurz, malte.kurz@tum.de
Victor Chernozhukov, vchern@mit.edu
Martin Spindler, martin.spindler@uni-hamburg.de
Sven Klaassen, sven.klaassen@uni-hamburg.de
<p>The R package DoubleML implements the double/debiased machine learning framework of Chernozhukov, Chetverikov, Demirer, Duflo, Hansen, Newey, and Robins (2018). It provides functionalities to estimate parameters in causal models based on machine learning methods. The double machine learning framework consists of three key ingredients: Neyman orthogonality, high-quality machine learning estimation and sample splitting. Estimation of nuisance components can be performed by various state-of-the-art machine learning methods that are available in the mlr3 ecosystem. DoubleML makes it possible to perform inference in a variety of causal models, including partially linear and interactive regression models and their extensions to instrumental variable estimation. The object-oriented implementation of DoubleML enables a high flexibility for the model specification and makes it easily extendable. This paper serves as an introduction to the double machine learning framework and the R package DoubleML. In reproducible code examples with simulated and real data sets, we demonstrate how DoubleML users can perform valid inference based on machine learning methods.</p>
2024-02-18T00:00:00+00:00
Copyright (c) 2024 Philipp Bach, Malte S. 
Kurz, Victor Chernozhukov, Martin Spindler, Sven Klaassen

https://www.jstatsoft.org/index.php/jss/article/view/v108i04
gcimpute: A Package for Missing Data Imputation
2023-02-28T14:19:11+00:00
Yuxuan Zhao, yz2295@cornell.edu
Madeleine Udell, udell@stanford.edu
<p>This article introduces the Python package gcimpute for missing data imputation. Package gcimpute can impute missing data with many different variable types, including continuous, binary, ordinal, count, and truncated values, by modeling data as samples from a Gaussian copula model. This semiparametric model learns the marginal distribution of each variable to match the empirical distribution, yet describes the interactions between variables with a joint Gaussian that enables fast inference, imputation with confidence intervals, and multiple imputation. The package also provides specialized extensions to handle large datasets (with complexity linear in the number of observations) and streaming datasets (with online imputation). This article describes the underlying methodology and demonstrates how to use the software package.</p>
2024-02-18T00:00:00+00:00
Copyright (c) 2024 Yuxuan Zhao, Madeleine Udell

https://www.jstatsoft.org/index.php/jss/article/view/v108i05
melt: Multiple Empirical Likelihood Tests in R
2022-11-16T16:02:45+00:00
Eunseop Kim, kim.7302@osu.edu
Steven N. MacEachern, snm@stat.osu.edu
Mario Peruggia, peruggia@stat.osu.edu
<p>Empirical likelihood enables a nonparametric, likelihood-driven style of inference without relying on assumptions frequently made in parametric models. Empirical likelihood-based tests are asymptotically pivotal and thus avoid explicit studentization. This paper presents the R package melt that provides a unified framework for data analysis with empirical likelihood methods. A collection of functions are available to perform multiple empirical likelihood tests for linear and generalized linear models in R. 
The package melt offers an easy-to-use interface and flexibility in specifying hypotheses and calibration methods, extending the framework to simultaneous inferences. Hypothesis testing uses a projected gradient algorithm to solve constrained empirical likelihood optimization problems. The core computational routines are implemented in C++, with OpenMP for parallel computation.</p>
2024-02-18T00:00:00+00:00
Copyright (c) 2024 Eunseop Kim, Steven N. MacEachern, Mario Peruggia

https://www.jstatsoft.org/index.php/jss/article/view/v108i06
PUMP: Estimating Power, Minimum Detectable Effect Size, and Sample Size When Adjusting for Multiple Outcomes in Multi-Level Experiments
2023-01-30T18:24:09+00:00
Kristen B. Hunter, kristen.hunter@unsw.edu.au
Luke Miratrix, lmiratrix@g.harvard.edu
Kristin Porter, kristin.porter@keporterconsulting.com
<p>For randomized controlled trials (RCTs) with a single intervention's impact being measured on multiple outcomes, researchers often apply a multiple testing procedure (such as Bonferroni or Benjamini-Hochberg) to adjust p values. Such an adjustment reduces the likelihood of spurious findings, but also changes the statistical power, sometimes substantially. A reduction in power means a reduction in the probability of detecting effects when they do exist. This consideration is frequently ignored in typical power analyses, as existing tools do not easily accommodate the use of multiple testing procedures. We introduce the PUMP (Power Under Multiplicity Project) R package as a tool for analysts to estimate statistical power, minimum detectable effect size, and sample size requirements for multi-level RCTs with multiple outcomes. PUMP uses a simulation-based approach to flexibly estimate power for a wide variety of experimental designs, number of outcomes, multiple testing procedures, and other user choices. 
By assuming linear mixed effects models, we can draw directly from the joint distribution of test statistics across outcomes and thus estimate power via simulation. One of PUMP's main innovations is accommodating multiple outcomes, which are accounted for in two ways. First, power estimates from PUMP properly account for the adjustment in p values from applying a multiple testing procedure. Second, when considering multiple outcomes rather than a single outcome, different definitions of statistical power emerge. PUMP allows researchers to consider a variety of definitions of power in order to choose the most appropriate types of power for the goals of their study. The package supports a variety of commonly used frequentist multi-level RCT designs and linear mixed effects models. In addition to the main functionality of estimating power, minimum detectable effect size, and sample size requirements, the package allows the user to easily explore sensitivity of these quantities to changes in underlying assumptions.</p>
2024-03-18T00:00:00+00:00
Copyright (c) 2024 Kristen B. Hunter, Luke Miratrix, Kristin Porter

https://www.jstatsoft.org/index.php/jss/article/view/v108i07
Holistic Generalized Linear Models
2022-09-15T11:40:42+00:00
Benjamin Schwendinger, benjaminschwe@gmail.com
Florian Schwendinger, FlorianSchwendinger@gmx.at
Laura Vana, laura.vana.guer@tuwien.ac.at
<p>Holistic linear regression extends the classical best subset selection problem by adding additional constraints designed to improve the model quality. These constraints include sparsity-inducing constraints, sign-coherence constraints and linear constraints. The R package holiglm provides functionality to model and fit holistic generalized linear models. By making use of state-of-the-art mixed-integer conic solvers, the package can reliably solve generalized linear models for Gaussian, binomial and Poisson responses with a multitude of holistic constraints. 
The high-level interface simplifies the constraint specification and can be used as a drop-in replacement for the stats::glm() function.</p>
2024-02-18T00:00:00+00:00
Copyright (c) 2024 Benjamin Schwendinger, Florian Schwendinger, Laura Vana