Data Science in Stata 16: Frames, Lasso, and Python Integration

Anson T. Y. Ho, Kim P. Huynh, David T. Jacho-Chávez, Diego Rojas-Baez

Abstract

Stata is one of the most widely used software packages for data analysis, statistics, and model fitting, used by economists, public policy researchers, epidemiologists, and others. Version 16, released in June 2019, includes an up-to-date methodological library and user-friendly implementations of various cutting-edge techniques. The release introduces several changes and additions, including:
• Lasso
• Multiple data sets in memory
• Meta-analysis
• Choice models
• Python integration
• Bayes-multiple chains
• Panel-data ERMs
• Sample-size analysis for CIs
• Panel-data mixed logit
• Nonlinear DSGE models
• Numerical integration


This review covers the most salient innovations in Stata 16, the first release to include an implementation of machine-learning tools. We consider three innovations: (1) multiple data sets in memory, (2) lasso for causal inference, and (3) Python integration.
