ipw : An R Package for Inverse Probability Weighting

We describe the R package ipw for estimating inverse probability weights. We show how to use the package to ﬁt marginal structural models through inverse probability weighting, to estimate causal eﬀects. Our package can be used with data from a point treatment situation as well as with a time-varying exposure and time-varying confounders. It can be used with binomial, categorical, ordinal and continuous exposure variables.


Introduction
We describe the R (R Development Core Team 2011) package ipw, for estimating inverse probability weights. These weights are typically used to perform inverse probability weighting (IPW) to fit a marginal structural model (MSM). The package is available from the Comprehensive R Archive Network at http://CRAN.R-project.org/package=ipw . MSMs are used to estimate causal effects from observational data, by correcting for confounding. When using IPW to fit an MSM, there is minimal risk of (1) adjusting away part of the effect (Robins 1997;Robins, Greenland, and Hu 1999), (2) non-collapsibility, (Greenland, Robins, and Pearl 1999), or (3) Berksons bias (Hernán, Hernández-Díaz, and Robins 2004). In contrast, when using conditioning to correct for confounding (1) and (3) can occur in a longitudinal study, and (2) can occur with any statisticical model that does not have a linear or log-linear link function.
The use of IPW to fit an MSM was described in detail, e.g., in , Hernán and Robins (2006) and Cole and Hernán (2008). Currently available software to fit MSMs includes CausalGAM, an R package for the estimation of causal effects with generalized additive models in a point treatment with a binary exposure (Glynn and Quinn

Inverse probability weighting
As was shown by Robins (1998), the parameters of MSMs can be estimated using inverse probability weighting (IPW) to correct both for confounding (illustrated in the examples below) and for forms of selection bias such as informative censoring (illustrated in the example in Section 4.2). This amounts to the fitting of a model regressing the outcome of interest on the exposure of interest using observational data, with each observation weighted by the inverse of the probability of the observed exposure level given the observed value of the confounders.

IPW in a point treatment
In a point treatment situation we can adjust for a set of confounders C when estimating the effect of discrete exposure A by weighting observations i by the inverse probability weights . (1) We indicate the observed exposure and confounder status with a and c, respectively. The denominator of (1) contains the probability of the observed exposure level given the observed values of covariates C. When C includes all relevant confounders, and we estimate P (A i = a i |C i = c i ) using a correctly specified exposure allocation model, weighting by w i creates a pseudopopulation in which C no longer predicts A and in which the causal association between A and the outcome of interest is the same as in the original study population 1 . Weighting observations i by w i , one can fit a causal model, for instance the MSM with a continuous outcome Y . The response variable Y a is the potential outcome that could have been observed in a unit under study, when that unit would have received, perhaps contrary to the fact, a specific treatment level a . The expectation E(Y a ) 1 Note that with unsaturated exposure allocation models, IPW estimators are less efficient than likelihoodbased estimators (Clayton, Spiegelhalter, Dunn, and Pickles 1998), and may be unstable when certain strata defined by C have low response probabilities (Little and Rubin 1987;Cole and Hernán 2008;Lefebvre, Delaney, and Platt 2008).
is the mean response, when all units under study would have received a specific treatment level a. Parameter β 1 then quantifies the causal effect of A on Y .
To increase statistical efficiency and attain better coverage of confidence intervals, it is recommended to use stabilized weights Cole and Hernán 2008), e.g., The numerator of (3) contains the probability of the observed exposure level, which is just the observed frequency. This introduces an association between the numerator and denominator, which means that on average the difference between the numerator and denominator becomes smaller, as compared to unstabilized weights. This means in turn that stabilized weights will have a narrower distribution than unstabilized weights. To increase the association between the numerator and denominator, further stabilizing the weights, one can condition both in the numerator and denominator of (3) on a set of time-fixed covariates V that are related to A. For instance, when a researcher believes that sex does not influence the outcome of interest, but the distribution of exposure level varies between both sexes, sex could be included in V .
It must be noted that confounding caused by baseline covariates that are used as stabilization factors, is not adjusted for (Cole and Hernán 2008). Adjustment for such covariates could be made by including them in the MSM, at the cost of possibly inducing non-collapsibility. Also, note that stabilization can be done not only by including a linear term of V , but that more complex functions can also be used, when that would improve the estimation of P (A = a).
With a continuous exposure variable A, one can use the stabilized weights where f (a i ) is the marginal density function of A, evaluated at the observed value in unit i, a i , and f (a i |c i ) is the conditional density function of A given C, evaluated at the observed values in unit i, {a i , c i }. With a continuous exposure variable, unstabilized weights cannot be used, since they would have infinite variance .
The denominators of (1), (3) and (4) can be estimated by using exposure allocation models regressing A on C. Similarly, the numerators of (3) and (4) can be estimated by using exposure allocation models regressing A on the constant only. When using additional stabilization variables V , those variables V can be included in the exposure allocation models as well.

IPW in a longitudinal study
Suppose that a discrete exposure A may change over time, and a decision to allocate a certain exposure level is made and recorded within each unit i at time points t ij . Timevarying confounders for the effect of A ij on the outcome of interest, measured right before each time point t ij in each unit i, are contained in C ij . In addition, C ij can also contain time-fixed confounders. Let A ij and C ij indicate the observed longitudinal history, i.e., all measurements up to time point t ij within unit i, of A and C respectively. V i are measured time-fixed covariates, that are not confounders but that are associated with the exposure. One can adjust for time-varying confounders C by weighting observations at t ij by the stabilized weights Equation (5) is a product over all time points from baseline up to time point t ij , within each unit i. The factors in the numerator of (5) contain the probability of the observed exposure status at each time point, a ik , given the observed exposure history up to the previous time point, a ik−1 , and the observed time-fixed covariates, v i . The factors in the denominator of (5) contain the probability of the observed exposure status at each time point, given the observed exposure history up to the previous time point, the observed history of time-varying confounders up to each time point, c ik , and the observed time-fixed covariates. Note that time-fixed covariates V are included both in the numerator and denominator of (5), to further stabilize the weights. To estimate the causal effect of A on the exposure of interest, one can fit an MSM to the observations made at time points t ij , weighted by sw ij , as was done e.g., by Hernán et al. (2000).
With a continuous exposure A, one can use the stabilized weights analoguously to (4), as described in Cole and Hernán (2008). The numerator f (a ik |a ik−1 , v i ) is the conditional density function of A at time point t ik given the history of A up to the previous time point and the time-fixed covariates, evaluated at the observed values in unit The elements in the denominator of (5) and (6) can be estimated by using exposure allocation models regressing time-varying exposure A ij on follow-up time t ij , the history of A up to but not including t ij , A i(j−1) , the observed history of confounders C ij up to and including t ij , C ij , and the time-fixed covariates V i . The elements in the numerator of (5) and (6) can be estimated from similar models, not including C.
Note that when the effects of A i(j−1) and C ij on A ij are fully expressed through A i(j−1) and C ij , only the latter need to be included in the exposure allocation models. Often, A ij will be constant after a certain switch is made, e.g., after a switch from exposure level 0 to exposure level 1 subjects will always remain on exposure level 1. When A ij is deterministically constant after such a switch, the elements in the numerator and denominator of (5) and (6) can be set to 1 after the switch. Time to event models can then be used as exposure allocation models.
Note that a continuously varying exposure A(t) such as disease status can change at any time, not just at certain time points t ij . With such a continuously time-varying exposure A(t), it is necessary to choose the time points that are used to fit an MSM. A logical choice is either to (1) use time points at which changes in exposure status are observed, and time points at which the outcome is observed (e.g., the event times with a survival outcome) or (2) use regularly spaced intervals, with a sufficiently fine discretization. In both cases, the value of time-varying confounders right before each time point may need to be imputed from a longitudinal model that was fitted on the original measurements (e.g., see Section 4.3).

Inference
When using IPW, observations can have weights unequal to each other, which introduces clustering in the weighted dataset. When this is not taken into account, the standard error of the causal effect estimate could be underestimated. Therefore, when using IPW to fit an MSM, it is necessary to use a robust standard error estimator for inference .

The R package ipw
The R package ipw comes with a namespace. It contains the following functions: ipwpoint, for estimating inverse probability weights in a point treatment situation.
ipwtm, for estimating inverse probability weights for a time-varying exposure with timevarying confounders.
ipwplot, to plot the distribution of inverse probability weights.
tstartfun, to compute the starting time for intervals of follow-up, when Cox proportional hazards models are used to model the exposure allocation.
We describe these functions below. Package ipw also contains the simulated datasets haartdat, basdat and timedat, which are described and analyzed in the examples given in Section 4.

Function ipwpoint
The function ipwpoint can be used to estimate inverse probability weights similar to (1), (3) and (4), to fit MSMs in a point treatment situation. The exposure of interest can be binomial, multinomial, ordinal or continuous. Both stabilized and unstabilized weights can be estimated. It is used as: ipwpoint(exposure, family, link, numerator = NULL, denominator, data, trunc = NULL, ...) and takes the following arguments: exposure is a vector, representing the exposure variable of interest. Both numerical and categorical variables can be used. A binomial exposure variable should be coded using values 0 and 1.
family is used to specify a family of link functions, used to model the relationship between the variables in numerator or denominator and exposure, respectively. Alternatives are "binomial", "multinomial", "ordinal" and "gaussian". A specific link function is then chosen using the argument link, as explained below. Regression models are fitted using the R functions glm (stats, see R Development Core Team 2011), multinom (nnet, see Venables and Ripley 2002), polr (MASS, see Venables and Ripley 2002) or glm, respectively.
numerator is a formula, specifying the right-hand side of the model used to estimate the elements in the numerator of the inverse probability weights. When left unspecified, unstabilized weights with a numerator of 1 are estimated.
denominator is a formula, specifying the right-hand side of the model used to estimate the elements in the denominator of the inverse probability weights. This typically includes the variables specified in the numerator model, as well as confounders for which to correct.
data is a dataframe containing exposure and the variables used in numerator and denominator.
trunc is an optional truncation fraction for the weights (between 0 and 0.5). E.g. when trunc = 0.01, the left tail is truncated to the 1st percentile, and the right tail is truncated to the 99th percentile. When specified, both un-truncated and truncated weights are returned.
... are further arguments passed to the function that is used to estimate the numerator and denominator models (the function is chosen using family).
With numerator specified, stabilized weights are computed, otherwise unstabilized weighs with a numerator of 1 are computed. With a continuous exposure, using family = "gaussian", weights are computed using the ratio of predicted densities. Therefore, for family = "gaussian" only stabilized weights can be used, since unstabilized weights would have infinite variance . The output returned by ipwpoint is a list containing the following elements: ipw.weights is a vector containing inverse probability weights for each unit under observation. This vector is returned in the same order as the measurements contained in data, to facilitate merging.
weights.trunc is a vector containing truncated inverse probability weights for each unit under observation. This vector is only returned when trunc is specified.
call is the original function call to ipwpoint.
num.mod is the numerator model, only returned when numerator is specified.
den.mod is the denominator model.
Currently, the exposure variable and the variables used in numerator and denominator should not contain missing values.

Function ipwtm
The function ipwtm can be used to estimate inverse probability weights to fit MSMs, with a time-varying exposure and time-varying confounders. Within each unit under observation i (e.g., patients), this function computes inverse probability weights at each time point t ij during follow-up, similar to (5) and (6). The exposure can be binomial, multinomial, ordinal or continuous. Both stabilized and unstabilized weights can be estimated. It is used as: ipwtm(exposure, family, link, numerator = NULL, denominator, id, tstart, timevar, type, data, corstr = "ar1", trunc = NULL, ...) and takes the following arguments: exposure is a vector, representing the exposure of interest. As in ipwpoint, both numerical and categorical variables can be used. A binomial exposure variable should be coded using values 0 and 1.
family is used to specify a family of link functions, used to model the relationship between the variables in numerator or denominator and exposure, respectively. Alternatives are "binomial", "survival", "multinomial", "ordinal" and "gaussian".
link is the specific link function between the variables in numerator or denominator and exposure, respectively. For family = "binomial" (fitted using glm) alternatives are "logit", "probit", "cauchit", "log" and "cloglog". For family = "survival" this argument is ignored, and Cox proportional hazards models are always used (fitted using coxph). For family = "multinomial" this argument is ignored, and multinomial logistic regression models are always used (fitted using multinom). For family = "ordinal" (fitted using polr) alternatives are "logit", "probit", "cauchit", and "cloglog". For family = "gaussian" this argument is ignored, and GEE models with an identity link are always used (fitted using geeglm).
numerator is a formula, specifying the right-hand side of the model used to estimate the elements in the numerator of the inverse probability weights. When left unspecified, unstabilized weights with a numerator of 1 are estimated.
denominator is a formula, specifying the right-hand side of the model used to estimate the elements in the denominator of the inverse probability weights.
id is a vector, uniquely identifying the units under observation within which the longitudinal measurements are taken.
tstart is a numerical vector, representing the starting time of follow-up intervals, using the counting process notation. This argument is only needed when family = "survival", otherwise it is ignored. The Cox proportional hazards models are fitted using longitudinal data, coded using the counting process notation. When modeling exposure allocation at the start of follow-up (for timevar = 0), tstart should be negative. For this first interval, the particular value of tstart is not important, only that it is smaller than zero.
timevar is a numerical vector, representing follow-up time, starting at 0. This variable is used as the end time of follow-up intervals, using the counting process notation, when family = "survival".
type specifies the type of exposure. Alternatives are "first" and "all". With type = "first", weights are estimated up to the first switch from the lowest exposure value (typically 0 or the first factor level) to any other value. After this switch, weights will then be constant. Such a weight is e.g., used when estimating the causal effect of the initiation of highly active anti-retroviral therapy (HAART) on mortality (see Sections 4.2 and 4.3). With type = "all", all time points are used to estimate weights. Currently, only "first" is implemented for "survival", "multinomial" and "ordinal" families.
Only "all" is implemented for the "gaussian" family. Both type = "first" and type = "all" are implemented for the "binomial" family.
data is a dataframe containing exposure, the variables used in numerator and denominator, and variables id, tstart and timevar.
trunc is an optional truncation fraction for the weights (between 0 and 0.5). E.g. when trunc = 0.01, the left tail is truncated to the 1st percentile, and the right tail is truncated to the 99th percentile. When specified, both un-truncated and truncated weights are returned.
... are further arguments passed to the function that is used to estimate the numerator and denominator models (the function is chosen using family).
With numerator specified, stabilized weights are computed, otherwise unstabilized weights with a numerator of 1 are computed. As in ipwpoint, with a continuous exposure, using family = "gaussian", weights are computed using the ratio of predicted densities at each time point. Therefore, for family = "gaussian" only stabilized weights can be used, since unstabilized weights would have variance . The output returned by ipwtm is a list containing the following elements: ipw.weights is a vector containing inverse probability weights for each observation. This vector is returned in the same order as the observations contained in data, to facilitate merging.
infinityweights.trunc is a vector containing truncated inverse probability weights for each observation. This vector is only returned when trunc is specified.
call is the original function call to ipwtm.
selvar is a selection variable. With type = "first", selvar = 1 within each unit under observation, up to and including the first time point at which a switch from the lowest value of exposure to any other value is made, and selvar = 0 after the first switch. For type = "all", selvar = 1 for all measurements. The numerator and denominator models have been fitted only on observations with selvar = 1. This vector is returned in the same order as the observations in data, to facilitate merging.
num.mod is the numerator model, only returned when numerator is specified.
den.mod is the denominator model.
Currently, the exposure variable, the variables used in numerator and denominator, and variables id, tstart and timevar should not contain missing values.

Function ipwplot
The function ipwplot can be used to plot inverse probability weights. and takes the following arguments: weights is a numerical vector of inverse probability weights to plot.
timevar is a numerical vector representing follow-up time. When specified, boxplots within strata of follow-up time are displayed. When left unspecified, a density plot is displayed.
binwidth is a numerical value indicating the width of the intervals of follow-up time; for each interval a boxplot is made. Ignored when timevar is not specified.
logscale is a logical value. If TRUE, weights are plotted on a logarithmic scale.
xlab is the label for the horizontal axis.
ylab is the label for the vertical axis.
main is the main title for the plot.
ref is a logical value. If TRUE, a reference line is plotted at y = 1.
... are additional arguments passed to boxplot (when timevar is specified) or plot (when timevar is not specified).
See the examples below for actual plots (Figures 1, 2 and 3).

Function tstartfun
Function tstartfun can be used to compute the starting time for intervals of follow-up, when using the counting process notation. Within each unit under observation, this function computes a starting time equal to: the time of the previous record, when there is a previous record.
-1 for the first record.
The function is used as:

tstartfun(id, timevar, data)
and takes the following arguments: id is a numerical vector, uniquely identifying the units under observation, within which the longitudinal measurements are taken.
timevar is a numerical vector, representing follow-up time, starting at 0.
data is a dataframe containing id and timevar.

Examples
In the following examples we will illustrate the use of functions ipwpoint, ipwtm, ipwplot and tstartfun, contained in ipw. We also describe the datasets haartdat, basdat, and timedat, contained in ipw, which are used in the examples. The following three examples are ordered by increasing complexity.

Point treatment example
We will illustrate the use of IPW in a point treatment using simulated data. First we will simulate point treatment data with measurements made in 1000 individuals on a continuous confounder L, a dichotomous exposure A and a continuous outcome Y , using: L ∼ N (10, 5), logitP (A = 1) = −10 + L, Y = 10A + 0.5L + N (−10, 5).

Causal effect of HAART use on mortality in HIV-infected patients
Dataset haartdat is a simulated dataset, with survival data measured in 1200 HIV-infected patients. Start of follow-up is HIV seroconversion. Each row corresponds to a 100 day period of follow-up time. Patients can initiate highly active anti-retroviral therapy (HAART) during follow-up. We will estimate the causal effect of HAART on mortality using this dataset, while adjusting both for possible confounding by CD4 count, and for informative censoring due to the effect of CD4 count on dropout, using IPW. In this example, CD4 count is a time-varying covariate. We load the package ipw and dataset haartdat, and look at the first 10 rows of haartdat: Dataset haartdat contains the following variables: patient is the patient ID, tstart is the starting time for each interval of follow-up, measured in days since HIV seroconversion; note that the first interval of follow-up is (−100, 0], this is used to allow for the modeling of the initiation of HAART at t = 0 (as explained in Section 3.2), fuptime is the end time for each interval of follow-up measured in days since HIV seroconversion, haartind is an indicator for the initiation of HAART therapy at the end each interval (0 = HAART not initiated/1 = HAART initiated), event is an indicator for death at the end of the interval (0 = alive/1 = died), sex is sex (0 = male/1 = female), age is age at the start of follow-up (years), cd4.sqrt is the square root of CD4 count, measured at the end of each interval, but before haartind. Note that in each row, corresponding to time point j in individual i, cd4.sqrt has an effect on haartind in the same row, including at time 0.
dropout is an indicator for dropout of the study, at the end of the interval (0 = did not drop out/1 = dropped out).
To adjust for confounding by time-varying CD4 count, we estimate the stabilized inverse probability weights similar to (5), with H ij and L ij indicating HAART status and the square root of CD4 count in patient i at measurement j, respectively. V i is the vector of time-fixed covariates, containing sex and age. h ij , l ij and v i are the observed values of variables H ij , L ij and V i , respectively. For time points after the initiation of HAART within each patient, the elements in the numerator and denominator of (9) are set to 1. For time points up to the time point of the initiation of HAART within each patient, we estimate the elements in the denominator of (9) using the Cox proportional hazards model with t follow-up time. We estimate the numerator of (9) using a model similar to (10) but without L(t) as a predictor. We can estimate, and examine sw ij using: R> temp <-ipwtm(exposure = haartind, family = "survival", + numerator =~sex + age, denominator =~cd4.sqrt + sex + age, + id = patient, tstart = tstart, timevar = fuptime, type = "first", + data = haartdat) R> summary(temp$ipw.weights)
In this example, CD4 count also has an effect on dropout from the study. Since CD4 count has an effect on mortality, this can cause informative censoring. Note that in this example, censoring for other reasons than dropout is regarded as non-informative. We can estimate inverse probability of censoring weights sw ij , to correct for the effect of CD4 count on dropout, similarly to (10), but replacing H ij with an indicator for dropout, as: Figure 2: Weights distribution plot for the inverse probability weights that are used to adjust for confounding in example 2, made using ipwplot.
We can now use the inverse probability weights sw ij and inverse probability of censoring weights sw ij to fit an MSM, quantifying the causal effect of the initiation of HAART on mortality. To combine the adjustment for confounding and for informative censoring, the observations indexed by ij are weighted by the product sw ij × sw ij . Similarly to Hernán et al. (2000), we fit the MSM

Causal effect of tuberculosis on mortality in HIV-infected patients
Our third example is similar to example in Section 4.2 but with measurements made at irregular intervals of follow-up time. We estimate the causal effect of active tuberculosis (TB) on mortality in HIV-positive individuals, adjusted for possible confounding by time-varying CD4 count using IPW. We smooth time-varying CD4 using a random effects model, because it is the underlying "true" CD4, separate from short-term fluctuations and measurement error, that is a confounder for the effect of TB. The simulated datasets basdat and timedat are used in this example. We load package ipw and the datasets, and explore the datasets: R> library("ipw") R> data("basdat") R> data("timedat") R> basdat[1:4,] id id is the patient ID, fuptime is follow-up time, in days since HIV seroconversion, cd4count is CD4 count, measured at fuptime.
Note that these data were simulated using the algorithm described in Van der Wal, Prins, Lumbreras, and Geskus (2009). Therefore, CD4 count at a certain time point is affected by the TB status right before that time point. TB status at a certain time point is affected by CD4 count at that specific time point.
Some processing of the original data is necessary. We check if there is more than one CD4 measurement taken on the same day within each patient: R> table(duplicated(timedat[, c("id", "fuptime")]))

FALSE 6291
which is not the case. Because of skewness, we compute the square root of CD4 count: timedat$cd4.sqrt <-sqrt(timedat$cd4count) Add the time of first active TB to timedat, and compute tb.lag, the time-varying TB status one day before the measurement time (which is necessary for reasons that are explained below): R> timedat <-merge(timedat, basdat[,c("id","Ttb")], by = "id", all.x = TRUE) R> timedat$tb.lag <-ifelse(with(timedat, !is.na(Ttb) & fuptime > Ttb), 1, 0) To be able to impute CD4 count at time points other than the measurement times, which is necessary when fitting the MSM (see below), and to smooth the original measurements, we fit the random effects model with t follow-up time (days since HIV seroconversion), L(t) CD4 count, and A(t) time-varying TB status. Random effects ξ i and η i are assumed to be normally distributed with mean β = (β 0 , β 1 ) and covariance matrix Σ = V 1 V 12 V 12 V 2 . The model includes a fixed effect for TB, β 2 . Because CD4 is affected by the TB status right before t, we include A(t − 1), the TB status one day before t in (12). We can fit model (12) using: R> cd4.lme <-lme(cd4.sqrt~fuptime + tb.lag, random =~fuptime | id, + data = timedat) We will construct a new dataframe startstop, which will be used to estimate inverse probability weights and to fit an MSM, to quantify the causal effect of TB on mortality. Let T T B be all time points at which the TB-status switches, in any individual. Let T end be all individual end times. Then, to (1) be able to compute inverse probability weights similar to (5) using a Cox proportional hazards model and (2) be able to fit the MSM, the dataframe startstop should contain, for each individual, rows corresponding to both T T B and T end . For each individual we include these time points only up to his or her individual end time. We also sort the time points chronologically within each individual. The dataframe construction is done as follows: R> times <-sort(unique(c(basdat$Ttb, basdat$Tend))) R> startstop <-data.frame( + id = rep(basdat$id, each = length(times)), + fuptime = rep(times, nrow(basdat))) R> startstop <-merge(startstop, basdat, by = "id", all.x = TRUE) R> startstop <-startstop[with(startstop, fuptime <= Tend), ] We compute the starting time for each interval of follow-up using tstartfun: R> startstop$tstart <-tstartfun(id, fuptime, startstop) Then we compute tb, the TB status at each time point for each individual, and tb.lag, the time-varying TB status one day before each time point for each individual. We also compute event, an indicator for death, and impute time-varying CD4 count cd4.sqrt, using (12) Note that for each row in startstop, cd4.sqrt contains imputed CD4 count that predicts tb in the same row. To correct for confounding by time-varying CD4 count, we can estimate the stabilized inverse probability weights For time points up to the time point of the first instance of active TB within each patient, we estimate the elements in the denominator of (13) using the Cox proportional hazards model We estimate the numerator of (13) using a model similar to (14) but only including the constant. Therefore, we can estimate sw ij using: Figure 3: Weights distribution plot for example 3, made using ipwplot.

*100))
To estimate the marginal causal effect of TB on mortality, we fit the MSM λ T a (t) = λ 0 (t) exp{β 1 a(t)}, adjusted for confounding by CD4 count using IPW, and using a using a robust variance estimator, as follows: R> summary(coxph(Surv(tstart, fuptime, event)~tb + cluster(id), + data = startstop, weights = temp$ipw.weights)) The estimated hazard ratio corresponding to the causal effect of TB on mortality is 2.25 (95% CI 1.35-3.75). Note that the estimate from an unadjusted model of 4.46 (95% CI 3.13-6.36) is an overestimate, since both TB and death are more likely at lower CD4 counts. The estimate from the conditional model of 1.28 (95% CI 0.79-2.06) is an underestimate, since the indirect effect of TB through CD4 count is "conditioned away", as explained e.g., in Robins (1997) and Robins et al. (1999).

Conclusion
We have demonstrated how IPW can be performed to fit MSMs using our R package ipw, both in point treatment studies and in longitudinal studies, correcting for confounding and informative censoring. The package can accomodate for a wide range of exposure allocation models. Our package is easily used and does not involve extensive programming. We have also demonstrated how robust standard errors can be used for inference when fitting an MSM using IPW. Our contribution of the package ipw will make the MSM methodology more accessible to applied researchers. The package will also be useful to those using inverse probability weighting for other purposes such as missing data problems (see e.g., Rao, Sigurdson, Doody, and Graubard 2005) or correcting for informative censoring (see e.g., Hernán et al. 2000 and. In future updates of the package, the functions will also be applicable to other situations in which IPW is used, such as estimation and inference in competing risks survival analysis (Geskus 2011).