A Recipe for inferference : Start with Causal Inference. Add Interference. Mix Well with R .

In causal inference, interference occurs when the treatment of one subject aﬀects the outcome of other subjects. Interference can distort research conclusions about causal eﬀects when not accounted for properly. In the absence of interference, inverse probability weighted (IPW) estimators are commonly used to estimate causal eﬀects from observational data. Recently, IPW estimators have been extended to handle interference. Tchetgen Tchetgen and VanderWeele (2012) proposed IPW methods to estimate direct and indirect (or spillover) eﬀects that allow for interference between individuals within groups. In this paper, we present inferference , an R package that computes these IPW causal eﬀect estimates when interference may be present within groups. We illustrate use of the package with examples from political science and infectious disease.


Introduction
Interference occurs when the treatment (or exposure) of one subject affects the outcome of other subjects (Cox 1958). Without accounting for interference, measuring only a treatment's direct effect may be misleading. For example, a vaccine's direct effect on an individual in a group with a large proportion of vaccinated individuals can be small. However, the protective, indirect effect from other group members' vaccinations may be large. In this case the vaccine may be judged to be ineffective based on the direct effect despite possibly having great public health utility due to the indirect effect (Clemens, Shin, and Ali 2011). Other areas where interference may be present include criminology (e.g., Sampson 2010; Verbitsky-Savitz and Raudenbush 2012), developmental psychology (e.g., Duncan, Boisjoly, Kremer, Levy, and Eccles 2005;Foster 2010), econometrics (e.g., Sobel 2006;Manski 2013), education (e.g., Hong and Raudenbush 2006;VanderWeele, Hong, Jones, and Brown 2013), imaging (e.g., Luo, Small, shan R. Li, and Rosenbaum 2012), political science (e.g., Sinclair, McConnell, and Green 2012;Bowers, Fredrickson, and Panagopoulos 2013), social media and network analysis (e.g., VanderWeele and An 2013; Toulis and Kao 2013;Eckles, Karrer, and Ugander 2017;Kramer, Guillory, and Hancock 2014), sociology (e.g., Aronow and Samii 2017), and spatial analyses (e.g., Zigler, Dominici, and Wang 2012;Graham, McCoy, and Stephens 2013).
Inverse probability weighted (IPW) methods are often used to estimate causal effects when interference is absent (Rosenbaum 1987;Robins, Hernan, and Brumback 2000;Lunceford and Davidian 2004;Cole and Hernan 2008). Recent developments have extended IPW estimators to estimate causal effects when interference may be present in either randomized or observational studies. Tchetgen Tchetgen and VanderWeele (2012) proposed estimators for observational studies assuming partial interference (Sobel 2006), i.e., individuals can be partitioned into groups where there may be interference between individuals in the same group but not between individuals in different groups. Partial interference could be reasonable, for example, in study of bovine disease where physical separation of herds precludes pathogen transmission between herds. On the other hand, if birds or farm workers could spread the pathogen between herds, then partial interference may be questionable. The Tchetgen Tchetgen and VanderWeele (2012) IPW estimators require a model for the group-level propensity score (i.e., the probability of a group's observed treatment allocation). The large sample properties of these estimators were derived by Perez-Heydrich, Hudgens, Halloran, Clemens, Ali, and Emch (2014).
To date, software for analysis of causal effects in the presence of interference is limited. Without interference, the R (R Core Team 2017) package ipw provides tools to compute IPW estimators (Van der Wal and Geskus 2011). Existing interference-related R packages, including interferenceCI (Rigdon 2015) and blockTools (Moore 2016), were designed for analysis of randomized experiments. In this paper, we present the R package inferference (Saul 2017) which computes the Tchetgen Tchetgen and VanderWeele (2012) IPW estimators and the large sample variance estimators developed by Perez-Heydrich et al. (2014). Package inferference is available from the Comprehensive R Archive Network (CRAN) at https://CRAN.R-project.org/package=inferference. The outline for the remainder of this paper is as follows. The next section provides background on interference and an overview of the mathematical concepts and notation. Section 3 describes the package's main features. Sections 4 and 5 demonstrate the software with examples from public health and political science. The example in Section 5 shows advanced features of the package. We discuss computational issues with IPW estimators when groups have large numbers of individuals in Section 6. We conclude with a brief discussion and future directions in Section 7.

A brief history of interference
Much of causal inference assumes that the exposure of one individual does not affect the outcomes of other individuals, i.e., there is no interference between individuals. Rubin (1980) bundled no interference with the assumption that treatments for all units are comparable (no hidden forms of treatment) into the "stable unit treatment value assumption" (SUTVA). Despite sporadic efforts, the research community gave little attention to this assumption until the early 2000s. One approach to relaxing the no interference assumption is to assume partial interference. Under this assumption, space, time, and/or social network groupings preclude interference between individuals in different groups, but interference may occur within a group. Sobel (2006), Halloran (2008), andVanderWeele (2012) developed methods based on the assumption of partial interference to estimate causal effects in the presence of interference. When an experimenter randomizes units -by design -at the group and individual levels, Hudgens and Halloran (2008) defined estimators that, under certain assumptions, are unbiased for a treatment's direct and indirect (or spillover) effects.
Observational studies complicate estimation of interference effects. Tchetgen Tchetgen and VanderWeele (2012) proposed IPW estimators of causal effects based on group-level propensity scores for non-randomized treatment allocation. They showed these estimators to be unbiased when the propensity score is known. Perez-Heydrich et al. (2014) derived the large sample properties of these estimators when the propensity scores are unknown but correctly modeled. They applied these results to draw inference about the direct and indirect effects of cholera vaccination in Matlab, Bangladesh.

Basic partial interference setup
Consider N individuals partitioned into m groups, each with n i individuals for i = 1, . . . , m. The triplet (Y ij , A ij , X ij ) represents the observed outcome, treatment, and baseline covariate vector, respectively, for individual j in group i. We let capitalized letters denote random variables, and lowercase letters (e.g., (y ij , a ij , x ij )) denote observed or realized values. Let X i and A i be the matrix of baseline covariates and vector of treatment allocations for members of group i. Let A i,−j = (A i1 , . . . , A ij−1 , A ij+1 , . . . , A in i ) represent a group's treatment allocation excluding the jth subject. Let Y ij (a ij , a i,−j ) = Y ij (a i ) be the potential outcome for individual j in group i if, possibly contrary to fact, group i received a i . By causal consistency, Y ij = Y ij (A i ) (Pearl 2010). Let Y i be the vector of potential outcomes for group i. By assuming no interference between groups, an individual's potential outcome may depend only on the treatment allocation of its group. The set A(n i ) contains all of group i's possible treatment vectors. With a binary treatment, A ij ∈ {a 1 , a 2 }, this set has 2 n i elements.

Estimands
Without interference, researchers often estimate an average treatment effect, which contrasts the average outcome for two treatment allocations: the entire population treated versus the entire population untreated. With interference, causal estimands may be defined in terms of the continuum of treatment allocation strategies between those extremes. In inferference, we consider Bernoulli-type allocation strategies proposed by Tchetgen Tchetgen and VanderWeele (2012), where individuals independently receive treatment with probability α. For this allocation strategy, the probability of a group's treatment vector is de- The analyst may compute the estimators described below over a range of α's to explore hypothetical underlying treatment allocations. In their analysis of a cholera vaccine study, Perez-Heydrich et al. (2014) examined strategies between α = 0.3 and α = 0.6 because 75% of the groups had observed vaccine coverages in that range. Define an individual's average potential outcome when assigned treatment a under strategy In words, Y ij (a; α) is a weighted average of individual j's potential outcomes under possible treatment vectors of the other n i − 1 subjects in group i weighted by the probability of each treatment vector. Similarly, define the marginal individual average potential outcome by Here, the weighted average of individual j's potential outcomes is across all group treatment vectors in A(n i ).
A simple mean of individual average potential outcomes within a cluster defines group average potential outcomes. Then group-level estimands are averaged to make population-level esti- is the population-level average outcome when individuals receive treatment a and their group adopts allocation strategy α.
is the population-level average outcome when groups adopt allocation strategy α.
Contrasts of the population average potential outcomes define causal effects. Hudgens and Halloran (2008) describe four causal effects: direct, indirect, total, and overall (see also Tchetgen Tchetgen and VanderWeele 2012). The direct (or unit-level treatment) effect compares average potential outcomes within a single allocation strategy: An indirect effect compares a treatment's average potential outcomes under different allocation strategies: For a binary treatment, there are two indirect effects for a fixed (α, α ) pair: one for a 1 and one for a 2 . If interference is not present, then the indirect effect equals zero. The total effect accounts for both the direct and indirect effects: The overall effect contrasts the marginal average potential outcomes for two allocation strategies: See Hudgens and Halloran (2008) and Tchetgen Tchetgen and VanderWeele (2012) for further discussion of these causal estimands.

IPW estimation
Tchetgen Tchetgen and VanderWeele (2012) proposed IPW estimators of the causal estimands defined above assuming partial interference. Their estimator weights an individual's outcome by the inverse of the group-level propensity score, P(A i |X i ), the probability of a group's treatment allocation given the covariates of the group's individuals. Tchetgen Tchetgen and VanderWeele (2012) showed the IPW estimators to be unbiased when the group-level propensities are known, under the following assumptions: The true propensity scores are not generally known in observational studies and must be estimated. Tchetgen Tchetgen and VanderWeele (2012) suggested estimating P(A i |X i ) using a generalized mixed effects model. We denote these models as where θ x represents fixed effects parameters and θ s a group random effect parameter. Model parameters may be estimated by maximum likelihood methods, which we denote θ = (θ x ,θ s ). For a binary treatment, a model for the group's propensity score might be: is the density of a normal random variable with mean 0 and variance θ s . This is the default group-level propensity score model in inferference; the examples below show how the user can modify the default group propensity score model. Validity of inferences drawn using the methods described below requires correct specification of the group propensity score model. Therefore, it is important in practice to conduct diagnostics to assess the fit of the model employed. For example, if (1) is assumed, then the Tchetgen Tchetgen and Coull (2006) diagnostic test can be used to assess whether the random effects are normally distributed.
The IPW estimator for the group-level average potential outcomes is a straightforward weighted sum, as is the estimator for group-level marginal potential outcomes, From (2) and (3), one constructs population-level average potential outcome and marginal population-level average potential outcome estimators by Estimators for direct, indirect, total, and overall effects simply contrast population-level estimators, Perez-Heydrich et al. (2014) derived asymptotic distributions of the IPW estimators using standard estimating equation theory. Briefly, the IPW estimators above are consistent and asymptotically normal as the number of groups m tends to infinity. When the group-level propensity scores are known (as in the case of simulation or randomized studies), a large sample estimator of the variance of DE(α) is

IPW variance estimation
Results for IE, TE, and OE are analogous. As explained below, when variance_estimation = "naive" in the interference function, this formula is used to compute standard errors and Wald-type confidence intervals.
When the propensity scores are unknown and instead estimated using a parametric model, computing variance estimators is more complicated and involves derivatives of the group propensity with respect to each parameter and derivatives of the propensity model's loglikelihood. The supplementary materials in Perez-Heydrich et al. (2014) contain the mathematical details, and this method is available with the variance_estimation = "robust" option. The "robust" option computes consistent variance estimates which account for the estimation of the weights, whereas the "naive" option computes variance estimates described in the preceeding paragraph which ignore estimation of the weights and are conservative (i.e., tend to be too large).

User's guide
To start, install the package from CRAN using install.packages("inferference"). The list below details the arguments for interference, the primary function in inferference. Special attention should be given to the propensity_integrand and formula arguments.
• formula: formula used to define the causal model. formula has a minimum of 4 parts, separated by | and~in a specific structure: outcome | exposure~covariates | group. The order matters, and the pipes (|) split the data frame into corresponding pieces (Zeileis and Croissant 2010). The exposure~covariates piece is passed as a single formula to the chosen model_method (defined below) used to estimate or fix propensity parameters.
-The following includes a random effect for the group: outcome | exposurec ovariates + (1 | group) | group. In this instance, the group variable appears twice.
-If the study design includes a "participation" variable (as in both examples below), this is easily added to the formula: outcome | exposure | participationc ovariates | group.
• propensity_integrand: a function, which may be created by the user, used to compute the IP weights. This defaults to the function logit_integrand(), which calculates the product of inverse logits for individuals in a group: is a N (0, θ s ) density, and r is a known constant. In an observational study typically r = 1. The examples below include individual randomized experiments in which case r denotes the randomization probability among trial participants. logit_integrand() is the integrand of (1) where h ij (b i ) is scaled by a constant r term. If no random effect is included in the formula, logit_integrand() ignores the random effect. IP weights are computed by numerically integrating propensity_integrand over the random effect distribution using stats::integrate() to which arguments may be passed via ... (see below). The default logit_integrand() also takes the following argument that can be passed via the ... argument in interference(): randomization: a scalar. This is the r in the formula just above. It defaults to 1 in the case that a participation vector is not included. The vaccine study example in Section 4 demonstrates use of this argument.
• loglihood_integrand: a function, which may be created by the user, that defines the log-likelihood of the propensity score model. This should generally be the same function as propensity_integrand, which is the default. Increasing the number of elements of the allocation vector increases computation time; however, a larger number of allocations will make plotted effect estimates smoother. A minimum of two allocations is required. • data: the analysis data frame. This must include all the variables defined in the formula. • model_method: the method used to estimate or set the propensity model parameters.
Must be one of "glm", "glmer", or "oracle". For a fixed effects only model use "glm", or to include random effects use lme4's "glmer" (Bates, Mächler, Bolker, and Walker 2015). logit_integrand only supports a single random effect for the grouping variable, corresponding to b i . When the propensity parameters are known (as in simulations) or if estimating parameters for the propensity model outside of interference, use the "oracle" option. See model_options for details on how to pass the oracle parameters. Defaults to "glmer". • model_options: a list of options passed to the function in model_method. Defaults to list(family = binomial(link = "logit")). When model_method = "oracle", the list must have two elements, fixed.effects and random.effects. If the model does not include random effects, set random.effects = NULL. • causal_estimation_method: currently only supports and defaults to "ipw".
• causal_estmation_options: a list with a single item variance_estimation, which is either "naive" or "robust". See Section 2.3 for details. Defaults to "robust". • conf.level: level for confidence intervals. Defaults to 0.95. • rescale.factor: a scalar multiplication factor by which to rescale outcomes and effects.
Defaults to 1. • integrate_allocation: indicator of whether the integrand function uses the allocation parameter. Defaults to TRUE. • runSilent: If FALSE, prints status of computations. Defaults to TRUE. • ...: used to pass additional arguments to internal functions such as numDeriv::grad() or stats::integrate(). Arguments can also be passed to the propensity_integrand and loglihood_integrand functions.

The 'interference' object
An interference() call results in an S3 object of class 'interference' which contains: • estimates: a data frame of causal effect estimates; • models$propensity_model: the object returned by glm or glmer; • summary: a list of objects summarizing the causal model such as the number of groups, number of allocations, and the formula used in the interference call; • weights: (# of groups) × (# of allocations) matrix of group-level weights: If variance_estimation = "robust", then the object also includes: • weightd: (# of groups) × (# of allocations) × (# of parameters) array of weights computed using derivatives of the propensity function with respect to each parameter; • scores: (# of groups) × (# of parameters) matrix of derivatives of the log-likelihood.

Utility functions
The package includes tools to extract effect estimates of interest from the S3 object. The functions direct_effect, indirect_effect (or ie), total_effect (or te), and overall_effect (or oe) select appropriate records from the estimates data frame in the 'interference' object. Section 4 shows an example.

Example: Vaccine study
This section illustrates the use of inferference with an example drawn from vaccine research. The package includes a single dataset based on the same set of parameters used in the simulation study by Perez-Heydrich et al. (2014). The vaccinesim dataset consists of 3000 units in 250 groups and contains two covariates (X1 = age in decades and X2 = distance to river), a vaccination indicator (A), a participation indicator (B), a binary outcome (Y) indicating cholera infection (1 yes, 0 no), and the unit's group. Like the original study (Ali et al. 2005) that inspired the simulation, individuals were randomized to vaccine with a known probability of 2/3, but subjects could opt to not participate in the trial. In essence, there are both experimental and observational aspects to the data. The interference function handles this design when logit_integrand's randomization argument is used and a participation variable is included in the formula.
The print.interference function provides an overview of the causal effect estimates, estimated standard errors, and Wald-type confidence intervals. In the output, alpha1 and alpha2 refer to α and α , while trt1 and trt2 refer to a 1 and a 2 , respectively.

Plotting effect estimates
Plots of effect estimates over a range of α levels may be helpful in summarizing results. Perez-Heydrich et al. (2014) present several such graphical displays. Here we demonstrate how to generate similar plots of effect estimates using inferference.
First, we estimate the effects over a dense sequence of allocations so that lines will be smooth.

Diagnostics
IPW estimators are known to be unstable if the weights range greatly. The package includes a basic utility to check the performance of the group-level weights, w i,k , for multiple allocations. The function diagnose_weights plots histograms of weights for chosen allocation levels. If the allocations argument is left NULL, the function plots histograms for five allocation levels used in the interference call. Figure 2 shows the resulting histogram for a single allocation. The analyst should examine groups with extreme weights, which may unduly influence population-level estimates.

Example: Voting experiment
The preceding example used the default logit_integrand function to define the group-level propensities. The following example demonstrates how to customize the propensity score function. Nickerson (2008) reported an experiment on voter behavior to examine peer-to-peer indirect effects on voting participation. The experiment randomized households with only two registered voters in Denver and Minneapolis to receive one of three assignments: voting encouragement, recycling encouragement, or nothing. Canvassers knocked on doors of households randomized to the voting or recycling groups a week before the 2002 primary. If a registered voter answered the door, the canvassers delivered a scripted message about voting (treatment) or recycling (control). The researchers used voter turnout records to determine if each member of the household then voted in the election. Nickerson was interested in the potential spillover effect of the voting encouragement to the untreated individual via the treated individual. From analysis of the observed data, he concluded there was a "secondary effect" where the household members not contacted by the canvassers voted more often in the treatment groups compared to the control groups.
The dataset voters contains information for 3861 households, 2549 in Denver and 1312 in Minneapolis, including covariates such as age, gender, previous voting history, and party affiliation. Our estimand of interest involves average voting outcomes when households receive voting encouragement compared to when households receive the recycling message, hence we exclude households not contacted by canvassers. We also exclude the single household where a canvasser appears to have contacted both registered voters.

Household-level propensity
Unlike the vaccine study example, in this data set randomization occurred at the group level but individual level treatment was not randomized. With only two subjects, A i = (A i1 , A i2 ) is the treatment allocation for group i and X i = (X i1 , X i2 ) is the matrix of individuals' covariate matrices for group i. Let B i = (B i1 , B i2 ) be indicators of being reached by a canvasser in group i. Since we only consider households where someone answered the door, B i ∈ {(1, 0), (0, 1)} and P(B i1 = 1|X i ) + P(B i2 = 1|X i ) = 1. Let h ij = P(B ij = 1|X i ; θ x ) = logit −1 (X i θ x ). Let G i ∈ {0, 1} be the indicator that group i is randomized to treatment (1) or control (0). By design, P(G i = 1) = 0.5. Since can arbitrarily be defined in terms of either household member. By convention we use the first subject (subject one) of each group found in the dataset. Among treated groups, the probability of subject one being treated is the probability that a canvasser reached subject one. That is, . Thus, the group-level propensity can be expressed: Thus, h i1 is sufficient to determine the group-level propensity. If we know whether or not the first subject was reached by a canvasser, then we know if the second was. Therefore, we can estimate parameters for h i1 with a dataset that includes only subject one from each group.
To do this, we must estimate the parameters outside of inferference and use model_method = "oracle". We include centered age (in decades) in the propensity model for demonstration purposes.

Coding the propensity function
Custom propensity_integrand and loglihood_integrand functions must have at least one argument: • b: the first argument is the variable for which the integrate function computes the integral. As in this example, the function can be written so that the integral evaluates to 1 and has no effect.
For example, the following function will fix the group-level propensity to 0.5 for all groups when variance_estimation = "naive": R> fixed_propensity <-function(b) return(0.5 * dnorm(b)) For more realistic models, additional arguments may be passed to the custom function: • X: the covariate matrix (determined by the formula) for the ith group; • A: the vector of treatment indicators for the ith group; • parameters: vector of estimated parameters from the model_method; • allocation: the allocation level for which the propensity is currently being estimated; • ...: other arguments can be passed via the ellipsis in interference.

Evidence of a peer influence effect
The influence of the door opener on the non-door opener's voting behavior corresponds to an indirect effect. Though the Bernoulli-type parametrization of the estimands allows us to look at a range of allocations, IE(0.5, 0) makes the sensible comparison between a world where individuals receive a voting message with probability 0.5 to a world where individuals have zero probability of receiving the voting message.
For comparison, suppose that a flip of a fair coin determined which registered voter opened the door. We exclude age as a covariate and instead set h i1 = 0.5. Here we assume to know the propensity score, so we use variance_estimation = "naive".

Computational issues with IPW estimators
We show in this section how computation of the group-level weights may affect estimation as the number of individuals in groups grows. To illustrate, consider the IPW estimator of the overall effect, which weights individual outcomes in group i with: While mathematically equivalent, these weights may be computationally dissimilar. In the case of w 1i , the product term within the integral entails multiplying probabilities and thus will tend to 0 as n i increases, causing the denominator of w 1i to get arbitrarily large. In contrast, the product term in w 2i entails multiplying values which may be less than or greater than 1 and thus tends to be less susceptible to numerical instability. Summing log(h ij /α) or log(1 − h ij )/(1 − α)) and then exponentiating the result may provide greater numerical stability. Internally, inferference uses w 3i .
When group sizes are small, the differences between these weights tend to be infinitesimal, but as group sizes grow the differences become important. To be specific, consider the code below comparing w 1i , w 2i , and w 3i for increasing group sizes where α = 0.5, all units are treated, there is no random effect, and h ij is fixed at 0.5.

Discussion
The R package inferference computes inverse probability weighted estimators of causal effects in the presence of interference. The package currently supports the IPW methods of Tchetgen Tchetgen and VanderWeele (2012) and Perez-Heydrich et al. (2014). These methods require a model for the group-level propensity scores. The package provides useful defaults for the propensity models but allows for non-standard models.
Development and application of statistical methods for inferring causal effects in the presence of interference is an active area of research. Future versions of inferference may incorporate other estimation methods, such as doubly robust methods and stratification. Also, additional methods for estimating variances and effect bounds may be incorporated into the software.