Near-Far Matching in R: The nearfar Package

Joseph Rigdon, Michael Baiocchi, Sanjay Basu


Estimating the causal effect of an intervention from observational data is difficult because of unmeasured confounders. Many analysts use instrumental variables (IVs) to introduce a randomizing element into observational data analysis, potentially reducing the bias created by unobserved confounders. Several persistent problems limit IV analyses. "Weak" IVs, that is, instrumental variables that do not effectively randomize individuals to the intervention or control group, are prevalent and lead to biased and unstable treatment effect estimates. In addition, IV-based estimates are often highly model dependent, require parametric adjustment for measured confounders, and have high mean squared error. To overcome these problems, the study design method of "near-far matching" has been devised, which "filters" data from a cohort by simultaneously matching individuals within the cohort to be "near" (similar) on measured confounders and "far" (different) on levels of an IV. To facilitate the application of near-far matching, we introduce the R package nearfar and illustrate its use on both a classical example and a simulated dataset. We show how the package can be used to "strengthen" a weak IV by adjusting the "near-ness" and "far-ness" of a match, reduce model dependency, enable nonparametric adjustment for measured confounders, and lower the mean squared error of estimated causal effects. We also show how to use the nearfar package with either continuous or binary treatments, how to prioritize variables in the match, and how to calculate F statistics of IV strength with or without adjustment for measured confounders.
