frailtyEM : An R Package for Estimating Semiparametric Shared Frailty Models

When analyzing correlated time to event data, shared frailty (random eﬀect) models are particularly attractive. However, the estimation of such models has proved chal-lenging. In semiparametric models, this is further complicated by the presence of the nonparametric baseline hazard. Although recent years have seen an increased availabil-ity of software for ﬁtting frailty models, most software packages focus either on a small number of distributions of the random eﬀect, or support only on a few data scenarios. frailtyEM is an R package that provides maximum likelihood estimation of semiparametric shared frailty models using the expectation-maximization algorithm. The implementation is consistent across several scenarios, including possibly left truncated clustered failures and recurrent events in both calendar time and gap time formulation. A large number of frailty distributions belonging to the power variance function family are supported. Several methods facilitate access to predicted survival and cumulative hazard curves, both for an individual and on a population level. An extensive number of summary measures and statistical tests are also provided.


Introduction
Time-to-event data are very common in medical applications. Often, these data are characterized by incomplete observations. For example, the phenomenon of right censoring occurs when the actual event time is not observed, but the only thing that is known is that the event has not taken place by the end of follow-up. Sometimes, individuals enter the data set only if they have not experienced the event before a certain time point. This is known as left truncation, which, if not accounted for correctly, leads to bias. Regression models for such data have been developed in the field of survival analysis. The most popular is the Cox proportional hazards model (Cox 1972), which is semiparametric in nature: The effect of the covariates is assumed to be time-constant and fully parametric, while the time-dependent probability of observing an event arises from the nonparametric baseline hazard. Cox regression has been the standard in survival analysis for a few reasons. First, it does not require any a priori assumptions about the baseline hazard. Second, under the proportional hazards assumption, maximum likelihood estimation can be carried out efficiently using Cox's partial likelihood. Nowadays, such models may be estimated with most statistical software, such as R (R Core When individuals belong to clusters, or may experience recurrent events, the observations are correlated. In this case the Cox model is not appropriate for modeling individual risk. A natural extension is represented by random effect "shared frailty" models. Originating from the field of demographics (Vaupel, Manton, and Stallard 1979), these models traditionally assume that the proportional hazards model holds conditional on the frailty, a random effect that acts multiplicatively on the hazard. The variance of the frailty is usually indicative of the degree of heterogeneity in the data. This makes the choice of the random effect distribution relevant. However, the simplicity that made the Cox model so popular does not carry over to such models.
Arguably the most popular way of fitting semiparametric shared frailty models is via the penalized likelihood method (Therneau, Grambsch, and Pankratz 2003), available for the gamma and log-normal frailty distributions. This is the standard in the survival package (Therneau and Grambsch 2000;Therneau 2019) in R, in the PHREG command in SAS and the streg procedure in Stata. This method has the advantage that it is generally fast and the Cox model is contained as a limiting case when the variance of the frailty is 0. However, this algorithm can not be used for estimating other frailty distributions or left truncated data, and the provided standard errors are presented under the assumption that the estimated parameters of the frailty distribution are fixed. Log-normal frailty models may also be estimated in R via Laplace approximation in coxme (Therneau 2015), h-likelihood in frailtyHL (Do Ha, Noh, and Lee 2012) or Monte Carlo expectation-maximization in phmm (Donohue and Xu 2013;Vaida and Xu 2000;Donohue, Overholser, Xu, and Florin 2011). Parametric and spline based shared frailty models are implemented for the gamma and log-normal distributions in the frailtypack package (Rondeau, Mazroui, and Gonzalez 2012;Rondeau and Gonzalez 2005).
In Hougaard (2000), the power variance function (PVF) family was proposed for modeling the frailty distribution. This family of frailty distributions includes the gamma, positive stable (PS), inverse Gaussian (IG) and compound Poisson distributions with mass at 0. Each choice of the distribution for the frailty implies a different marginal model, with some emphasizing early dependence of the observations (IG) and others late dependence (gamma). Of particular interest is the PS distribution: With assumed proportional hazards conditional on the frailty, the PS implies proportional hazards also unconditional on the frailty. This is unlike the other distributions which imply non-proportional hazards at the marginal level. Therefore, this is the only distribution where the potential violation of the proportional hazards is not confounded with a frailty effect.
The software implementation of the the PVF family of distributions so far has been limited. At this time, two R packages incorporate a larger number of distributions from this family: The frailtySurv package (Monaco, Gorfine, and Hsu 2017;Gorfine, Zucker, and Hsu 2006) implements the above mentioned distributions except the PS via a pseudo full likelihood ap-proach and the parfm package (Munda, Rotolo, and Legrand 2012) estimates fully parametric gamma, IG, PS and log-normal frailty models.
In this paper we present frailtyEM (Balan and Putter 2019b), an R package which uses the general expectation-maximization (EM) algorithm (Dempster, Laird, and Rubin 1977) for fitting semiparametric shared frailty models and which is available from the Comprehensive R Archive Network (CRAN) at https://CRAN.R-project.org/package=frailtyEM. This implementation comes to complete the landscape of packages that may be used for such models, with support for the whole PVF family of distributions for the scenarios of clustered failures, clustered failures with left truncation and recurrent events data. In the latter case, different time scales are supported, such as calendar time (time since origin of the recurrent event process) and gap time (time since previous recurrent event). Point estimates for regression coefficients are provided with confidence intervals that take into account the estimation of the frailty distribution parameters, and plotting methods facilitate the visualization of both conditional and marginal survival or cumulative hazard curves with 95% confidence bands, marginal covariate effects, and empirical Bayes estimates of the random effects. A comparison with respect to functionality between frailtyEM and other R packages is provided in Table 1.
The rest of this paper is structured as follows. In Section 2 we present a brief overview the semiparametric shared frailty model, and the implications of left truncation. In Section 3 we discuss the estimation method and its implementation. In Section 4 we illustrate the usage of the functions from the frailtyEM package on three classical data sets available in R.

Shared frailty models
In frailtyEM, the general framework is of I clusters with J i individuals within cluster i, i = 1, . . . , I. The event history of individual j from cluster i is represented by a counting process N ij , with N ij (t) representing the number of events observed until time t. The "atrisk" process Y ij (t) is defined as 1 when individual (ij) is under observation and 0 otherwise, and a vector of possibly time dependent covariates is denoted as x ij (t).
The clustered failures scenario is represented when the N ij (t) ≤ 1 and Y ij (t) = 0 after an event or right censoring. The data in cluster i consists of J i possibly right censored survival times. If N ij (t) exceeds 1, the case of recurrent events is obtained. In this scenario, it is considered that each cluster contains only one individual (J i ≡ 1, with the corresponding counting process N i ). Calendar time (also known as Andersen-Gill) models, when the time scale is "time since origin" and gap time models, where the time scale is "time since the previous event" are commonly employed (Cook and Lawless 2007). When subject i is no longer under observation, the last time point is typically considered right censored.
The intensity of N ij (or hazard, in the clustered failure scenarios) is specified as where Z i is an unobserved random effect common to all observations from cluster i (the "shared frailty"), β a vector of unknown regression coefficients and λ 0 (t) ≥ 0 an unspecified baseline intensity function. It is assumed that the Z i are i.i.d. random variables with a frailtyEM survival coxme frailtySurv frailtyHL frailtypack parfm phmm  Table 1: Comparison of R packages for frailty models. Versions: frailtyEM 0.8.3, survival 2.40-1, coxme 2.2-5, frailtyHL 1.1, frailtypack 2.10.5, parfm 2.7.1, phmm 0.7-5. The recurrent events here refer, more precisely, to the calendar time (or Andersen-Gill) formulation. Recurrent events in gap time are supported by all packages. The distinction between the two is detailed in Section 2. distribution referred to as Z, and that event times are independent given Z i . A stratified model (1) may also be specified by specifying different baseline intensities for different groups of observations. In this case, if individual (i, j) belongs to strata s, λ 0 (t) is replaced by λ 0s (t).
We consider the general case where the Z follows a distribution with positive support from the infinitely divisible family, i.e., they are i.i.d. realizations of a random variable described by the Laplace transform with α > 0 and γ > 0. This formulation includes several distributions, such as the gamma, positive stable, inverse Gaussian and compound Poisson distributions. This so-called powervariance-function (PVF) family of distributions have been extensively studied in Hougaard (2000). As detailed in Appendix A, we assume that an identifiability constraint is imposed on the parameters α and γ and that the distribution of Z is indexed by a scalar parameter θ.

Likelihood
Henceforth, we consider the problem of estimating β, λ 0 and θ via maximum likelihood. This is achieved by maximizing the marginal likelihood, based on the observed data and obtained by integrating over the random effect. For simplicity, we omit potential strata in this section. From model (1), the marginal likelihood is obtained as the product over clusters of expected marginal contributions, i.e., The first part reduces to a product of contributions from the observed event times of the counting processes from cluster i. Denote the kth observed time corresponding to the counting process N ij as t ijk and δ ijk = 1 if t ijk is an event time and 0 otherwise.
the number of observed events in cluster i. The marginal likelihood can be written as By using (2), the last term may be expressed in terms of the n i th derivative of the Laplace transform, i.e., In frailtyEM, the Breslow estimator is employed for the baseline hazard, i.e., λ 0 (t) ≡ λ 0t for t an event time, and 0 otherwise. This is equivalent with estimating t 0 λ 0 (s)ds as a step function with "jumps" of size λ 0t at event times.

Ascertainment and left truncation
The problem of ascertainment with random effect time-to-event data is usually difficult. If Z i is the distribution of the frailty of cluster i and A i denotes the event of selecting the observations in cluster i, the random effect distribution of cluster i given the ascertainment is of the form Z i |A i . The Laplace transform of Z i |A i follows from Bayes' rule as Expressing P(A i |Z i ) depends on the type of the study at hand and on the way the data were collected.
In frailtyEM an option is included to deal with the scenario of left truncation for clustered failures. Consider that from a cluster of sizeJ i , J i ≤J i individuals are are selected and A i is the event "selecting J i individuals with left truncation times t L,i = {t L,i1 . . . t L,iJ i }". Then A i can be expressed as A hidden assumption here is that the true cluster sizeJ i does not depend on the frailty. For example, if a high frailty is associated with both a high rate of events and smaller cluster sizes, then the distribution ofJ i |Z must also be considered (Jensen, Brookmeyer, Aaby, and Andersen 2004).
Assume that, given Z i , the left truncation times t L,i are independent. In this case, A difficulty here is that the values of the covariate vector and of the baseline intensity must be known prior to the entry time in the study. Therefore, only cases when x i is time constant are considered.
The marginal likelihood may be obtained from (3), (4) and (5) as It can also be seen that, if the frailty distribution is degenerate and has no variability (i.e., E θ may be removed), then the contribution ofΛ L,i cancels out. In particular, under left truncation, the Laplace distribution of Z|A i is given by This distribution is often referred to as the frailty distribution of the survivors (Hougaard 2000). If Z is from the PVF family, it can be shown that Z|A is also in the PVF family. As a result, if Z is gamma distributed, then also Z|A is gamma distributed.
Note that, in general, the ascertainment scheme does not have a simple description and P(A i |Z i ) may or may not be available in closed form. For example, in family studies, the families may be selected only when a number of individuals live long enough (Rodríguez-Girondo, Deelen, Slagboom, and Houwing-Duistermaat 2018). In this case, (5) does not hold.
In the case of registry data on recurrent events, individuals (clusters) may be selected only if they have at least one event during a certain time window (Balan, Jonker, Johannesma, and Putter 2016b). These specific cases are not currently accommodated by frailtyEM.

Inference
In frailtyEM, inference from the likelihood (3) is based on the non-parametric information matrix. This is obtained by treating each λ 0 (t) ≡ λ 0t as a finite-dimensional parameter. Even though its dimension grows with the number of event time points in the data, this has been shown to lead to consistent variance estimators (Andersen, Klein, Knudsen, and Tabanera y Palacios 1997).
For assessing whether the frailty model is a better fit than the Cox proportional hazards model, the likelihood ratio test may be used. With the parametrizations described in Appendix A, this is a problem of testing on the edge of the parameter space, and the test statistic under the null hypothesis follows asymptotically a mixture of χ 2 (0) and χ 2 (1) distribution (Zhi, Grambsch, and Eberly 2005). This test is provided as standard output in frailtyEM.
The Commenges-Andersen score test for heterogeneity (Commenges and Andersen 1995) is implemented in frailtyEM. It may be applied to a proportional hazards model as fitted by the coxph function or automatically calculated when estimating a frailty model. If the null hypothesis of no unobserved heterogeneity is not rejected, it might be preferable to employ simpler Cox-type models.

Marginal and conditional quantities
Several quantities are of interest in the context of frailty models. For a group of individuals with covariate vector x ij (t) and frailty Z i , the cumulative intensity (hazard) is defined as The survival function for such individual is given by S ij (t|Z i ) = exp (−Λ ij (t|Z i )). These quantities are conditional on the random effect Z i .
The population-level, or marginal quantities may be obtained by integrating out the frailty from the conditional ones. The marginal survival is given by The marginal cumulative intensity is then given by Λ ij (t) = − log S ij (t). The "baseline" intensities or survival refer to an individual with x ij (t) ≡ 0.
In the simple case of only one binary covariate, we assume that there are two groups, the baseline with x = 0 and "treatment" group with x = 1. In this case, the estimated β may be interpreted as the conditional intensity ratio (hazard ratio) between two individuals with the same frailty. Under a frailty model, the observed hazard ratio between these two groups is typically attenuated in time (Aalen, Borgan, and Gjessing 2008, Chapter 6). This marginal intensity ratio is calculated as the ratio of the corresponding marginal cumulative intensities Λ ij (t).
Several measures of dependence are implemented in frailtyEM. The first is the variance of the estimated frailty distribution Z, which is useful for the gamma and the PVF family. The variance of log Z is also useful for the positive stable distribution for which the variance is infinite. Other measures of association include Kendall's τ and the median concordance. A thorough discussion and comparison of these measures can be found in Hougaard (2000).

Goodness of fit
Given a large choice of distributions for the frailty, the question comes in selecting the most suitable one. A comparison of the PVF family of frailty distributions can be found in Hougaard (2000, Chapter 7.8). In frailtyEM, all the frailty distributions depend on a positive parameter θ (see Appendix A). Given that all the distributions are part of the same family (with gamma and positive stable being limiting cases in the PVF family), the likelihood of different models is comparable across distributions. This argument suggests that it makes sense, within the PVF family, to select the model with the distribution that has the highest likelihood.
An explicit assumption of model (1) is that the censoring is non-informative on the frailty. This assumption is usually difficult to test. In frailtyEM, a correlation score test is implemented for the gamma distribution, following Balan, Boonk, Vermeer, and Putter (2016a). This can also be used, for example, for testing whether a recurrent event event process and a terminal event are associated.
Martingale residuals have been used to assess goodness of fit in terms on functional form of the covariates (Therneau, Grambsch, and Fleming 1990;Lin, Wei, and Ying 1993). These are provided by the residuals() function. For Cox models, there are several methods for assessing the proportional hazards assumption (Therneau and Grambsch 2000, Chapter 6). Graphical methods involve plotting estimated survival or cumulative intensity curves. The plotting capabilities of frailtyEM are discussed in Section 3.4. A second method is based on Schoenfeld residuals (Grambsch and Therneau 1994). In R, this is implemented for Cox models in the cox.zph function from the survival package. In frailtyEM, this is provided as part of the output and may be used to test whether the conditional proportional hazards model (1) holds. This is detailed in Section 3.5.

Syntax
The main model fitting function in frailtyEM is emfrail: emfrail(formula, data, distribution, control, ...) The formula argument contains a 'Surv' object as left hand side and a + cluster() statement on the right hand side, specifying the column of data that defines the different clusters (this is common to other packages such as frailtypack). This formulation, that is common to most survival analysis packages, allows for the representation of clustered failures with left truncation, recurrent events in both calendar time and gap time, time dependent covariates and discontinuous intervals at risk (Therneau and Grambsch 2000, Chapters 3.7 and 8). Two other statements may be used in the right hand side: + strata() for defining a column with a stratifying variable, and + terminal() for defining an event status column for dependent censoring (e.g., a terminal event in the case of recurrent events; this triggers the score test for dependent censoring described Section 2.5).
The distribution argument determines the frailty distribution. It may be generated by emfrail_dist(): R> str(emfrail_dist(dist = "gamma", theta = 2)) List of 4 $ dist : chr "gamma" $ theta : num 2 $ pvfm : num -0.5 $ left_truncation: logi FALSE -attr(*, "class")= chr "emfrail_dist" where dist may be one of "gamma", "stable" or "pvf". For "pvf", the m parameter determines the precise distribution: m = −1/2 for the IG, m ∈ (−1, 0) for the so-called Hougaard distribution and m > 0 for a compound Poisson distribution with mass at 0. The theta parameter determines the starting value of the optimization. The left_truncation argument, if TRUE, leads to the calculation described in Section 2.3. The control argument may be generated by the emfrail_control() function and regulates parameters regarding to the estimation.

Profile EM algorithm
In frailtyEM, a general full likelihood estimation procedure is implemented for the gamma, positive stable and PVF frailty models, using a semiparametric Breslow estimator for the baseline intensity. The goal is to find θ, β, λ 0 (·) that maximize L(θ, β, λ 0 (·)) (3). This can be achieved in two steps, as The profile EM algorithm refers to using a two-stage maximization procedure: the "inner problem" which involves calculating L(θ) (maximizing L(β, λ 0 |θ) for fixed θ with the EM algorithm), and the "outer problem", maximizing the profile likelihood L(θ) over θ.
The inner problem. Maximizing the likelihood for fixed θ has been proposed for the gamma frailty in Nielsen, Gill, Andersen, and Sørensen (1992) and Klein (1992), and generalizations are discussed in Hougaard (2000). The crucial observation is that the E step involves calculating the empirical Bayes estimates of the frailties z i = E[Z i |data]. This expectation is taken with respect to the "posterior" distribution of the random effect. This is detailed in Appendix B. The M step involves estimating a proportional hazards model with the log z i as offset for each cluster. This is done via the agreg.fit() function in the survival package, which obtains estimates of β via Cox's partial likelihood. Subsequently, λ 0 andΛ i (andΛ L,i , in the case of left truncation) are calculated.
The EM algorithm is guaranteed to increase L(β, λ 0 |θ) with every iteration and to converge to a local maximum. Convergence is achieved when the difference in L(β, λ 0 |θ) between two consecutive iterations is smaller than ε.
The outer problem. The "outer" problem involves maximizing L(θ). For this, a general purpose Newton-type algorithm is employed (nlm from the stats package).

Standard errors and confidence intervals
The non-parametric information matrix is not directly obtained by the estimation procedure described in Section 3.2. From the inner problem, the standard error of the estimates for β and λ 0 (·) are calculated with Louis' formula (Louis 1982), under the assumption that θ is fixed to the maximum likelihood estimate. The standard errors obtained in this way are included in the output as se and are comparable to the ones from other semiparametric frailty models (survival or coxme packages) that assume that θ is fixed. However, this leads to an underestimate of the variability of β and λ 0 (·).
In frailtyEM, adjusted standard errors, presented in the column adj se, are calculated by "propagating" the uncertainty from the estimation of θ to β, λ 0 (·). This is described in more detail in Appendix C.
From the outer problem, standard errors for θ (more precisely, of log θ, since the maximization takes place on the log-scale for numerical stability) are directly obtained from the numeric Hessian calculated by nlm. The delta method, as implemented in the msm package (Jackson 2011), is employed for calculating the standard errors for θ and the measures of dependence that are detailed in Appendix A.
Two types of confidence intervals for θ (and for the frailty variance, which, in the cases where it exists, is 1/θ) are provided. The first are derived from symmetric confidence intervals on the log-scale. The resulting asymmetric confidence interval has been shown to provide good coverage (Balan et al. 2016b). The second, more computationally intensive, are referred to as "likelihood-based confidence intervals". Under the null hypothesis, the likelihood ratio test statistic follows a χ 2 (0) + χ 2 (1) distribution. The critical value associated with this test statistic is approximately 1.92. Based on L(θ), a one-dimensional search is performed to find the confidence interval around the maximum likelihood estimate θ within which log L(θ) ≥ log L( θ) − 1.92. The advantage of this type of confidence interval is that it is transformation invariant (with the same coverage for all derived dependence measures) and it has a 1-to-1 correspondence with the likelihood ratio test.
The summary() method returns an object of class 'emfrail_summary()', the printing of which contains general fit information, covariate estimates and distribution-specific measures of dependence and goodness of fit, discussed in Section 2.5. Arguments to summary() may be used to show confidence intervals based on either the likelihood function or the delta method, as described in Section 3.3. Other arguments control the amount of information that is printed and may be used when less output is desirable.
The method for prediction of survival curves and cumulative intensity curves is implemented in predict(). Both conditional and marginal curves defined in Section 2.4 may be produced. The prediction is made for individuals with covariate values specified in a data frame (via the newdata argument) or for a fixed linear predictor (via the lp argument). For stratified models, the strata must also be specified. By default, the predict function creates predictions for each row of newdata or for each value of lp separately. With the individual argument, predicted curves may be produced for individuals with specific at-risk patterns (for example, if an individual is not at risk during a certain time frame), or for individuals with time dependent covariates. After x ij (t) is specified to predict(), Λ ij (t|Z = 1) is calculated as in (7) and from this the other quantities are derived, including the conditional survival, the marginal survival (8) and the marginal cumulative intensity. Confidence bands are based on the asymptotic normality of the estimated λ 0 , and are produced both adjusted and unadjusted for the uncertainty of θ.

Plotting and additional features
Two plot methods are provided based on both graphics package via plot() and the ggplot2 package (Wickham 2016), via autoplot(), both with identical syntax. Behind the scenes, they use calls to predict(). The type argument determines the type of plot: • type = "hist" for a histogram of the posterior estimates of the frailties; • type = "pred" for plotting marginal and conditional cumulative hazard or survival curves; • type = "hr" for plotting marginal or conditional estimated hazard ratios between two groups of individuals. The marginal hazard ratio is determined as the ratio of the marginal intensities, as described in Section 2.4; • type = "frail" for a scatter plot of the ordered posterior estimates of the frailties (also called a "caterpillar plot"). For the gamma distribution, quantiles of the posterior distribution are also included. Only available with the autoplot() method.
The Commenges-Andersen score test for heterogeneity is by default calculated every time emfrail is called and is part of the standard output. A separate function ca_test() is also provided, that may be used independently on Cox models produced by coxph() from the survival package. While martingale residuals may be obtained with the residuals() method, the test for conditional proportional hazards, based on Schoenfeld residuals described in Section 2.5 may be accessed in the $zph field of the fit. This is an object of class 'cox.zph' borrowed from the survival package and equivalent to calling cox.zph on a Cox model with the estimated log-frailties as offset. The structure and plot methods are described in ?cox.zph.
An additional function is provided to calculate the marginal log-likelihood for a vector of values of θ, emfrail_pll(), without actually performing the outer optimization. This may be useful for visualizing the profile log-likelihood or when debugging (e.g., to see if the maximum likelihood estimate of θ lies on the boundary).

Illustration
The features of the package will now be illustrated with three well-known data sets available in R: the CGD data set (recurrent events, calendar time), the kidney data set (recurrent events, gap time) and the rats data set (clustered failures).

CGD
The data are from a placebo controlled trial of gamma interferon in chronic granulotomous disease (CGD) and is available in the survival package. It contains the time to recurrence of serious infections observed, from randomization until end of study for each patient (i.e., the time scale is calendar time). For the purpose of illustration, we will use treat (treatment or placebo) and sex (female or male) as covariates, although a larger number of variables are recorded in the data set.

.326 Confidence intervals based on the likelihood function
The first two parts of this output, about regression coefficients and fit summary, exist regardless of the frailty distributions. The last part, "Frailty summary", provides a different output according to the distribution.
Both the Commenges-Andersen test for heterogeneity and the one-sided likelihood ratio test deems the random effect highly significant. This is also suggested by the confidence interval for the frailty variance, which does not contain 0.
The cumulative hazard in this case can be interpreted as the expected number of events at a certain time. It can be seen that the frailty "drags down" the marginal hazard. This is a well-known effect observed in frailty models, as described in Aalen et al. (2008, Chapter 7). All prediction results could also be obtained directly: R> dat_pred <-data.frame(sex = c("male", "male"), + treat = c("rIFN-g", "placebo")) R> predict(gam, dat_pred) For a hypothetical individual that changes treatment from placebo to rIFN-g at time 200, predictions may also be obtained: R> dat_pred_b <-data.frame(sex = c("male", "male"), + treat = c("placebo", "rIFN- This plot is shown in Figure 2. A positive stable frailty model can also be fitted by specifying the distribution argument. R> stab <-emfrail(Surv(tstart, tstop, status)~sex + treat + cluster(id), + data = cgd, distribution = emfrail_dist(dist = "stable"))

R> summary(stab)
Call: emfrail(formula = Surv(tstart, tstop, status)~sex + treat + cluster(id), data = cgd, distribution = emfrail_dist(dist = "stable")) The coefficient estimates are similar to those of the gamma frailty fit. The "Frailty summary" part is quite different. For the positive stable distribution, the variance is not defined. However, Kendall's τ is easily obtained, and in this case it is smaller than in the gamma frailty model. Unlike the gamma or PVF distributions, the positive stable frailty predicts a marginal model with proportional hazards where the marginal hazard ratios are an attenuated version of the conditional hazard ratios shown in the output. The calculations are detailed in Appendix A.
The conditional and marginal hazard ratios from different distributions can also be visualized easily. We also fitted an IG frailty model on the same data, and plots of the hazard ratio between two males from different treatment arms created below are shown in Figure 3.

Kidney
The kidney data set is also available in the survival package. The data, presented originally in McGilchrist and Aisbett (1991), contains the time to infection for kidney patients using a portable dialysis equipment. The infection may occur at the insertion of the catheter and at that point, the catheter must be removed, the infection cleared up, and the catheter reinserted. Each of the 38 patients has exactly 2 observations, representing recurrence times from insertion until the next infection (i.e., the time scale is gap time). There are 3 covariates: sex, age and disease (a factor with 4 levels). A data analysis based on frailty models is described in Therneau and Grambsch (2000, Chapter 9.5.2). For the purpose of illustration, we do not include the disease variable here.

R> summary(m_gam, print_opts = list(frailty = FALSE))
Call: emfrail(formula = Surv(time, status)~age + sex + cluster(id), data = kidney, control = zph_t) However, the LRT is not significant for the positive stable frailty model (which does not have a defined frailty variance, for comparison). Furthermore, the estimated regression coefficients are different.

R> summary(m_ps, print_opts = list(frailty = FALSE))
Call: emfrail(formula = Surv(time, status)~age + sex + cluster(id), data = kidney, distribution = emfrail_dist("stable"), control = zph_t) Commenges-Andersen test for heterogeneity: p-val 0.00238 no-frailty Log-likelihood: -184.657 Log-likelihood: -184.657 LRT: 1/2 * pchisq(0), p-val>0.5 The test for proportional hazards described in Section 2.5 reveals an insight into how the two models work. The gamma frailty model specifies conditional proportional hazards and marginal non-proportional hazards, while the positive stable model specifies proportional hazards at both levels. Therefore, the gamma frailty model appears to explain the marginal non-proportionality, while the positive stable frailty model does not. Such a phenomenon may be observed if, for example, the PS marginal model is a bad fit for the data. Further research is being carried out on this topic (Balan and Putter 2019a).

Rats data
These is an example of clustered failure data from Mantel, Bohidar, and Ciminera (1977) Three rats were chosen from each of 100 litters, one of which was treated with a drug (rx = 1) and the rest with placebo (rx = 0), and then all followed for tumor incidence. The data are also available in the survival package.
R> data("rats", package = "survival") R> head(rats) While often used to illustrate frailty models, the gamma frailty fit shows a relatively large, yet not significant frailty variance R> summary( + emfrail(Surv(time, status)~rx + sex + cluster(litter), data = rats)) Call: emfrail(formula = Surv(time, status)~rx + sex + cluster(litter), data = rats) The 'Surv' object takes two arguments here: time of event and status. This implicitly assumes that each row of the data (in this case, each rat) is under follow-up from time 0 to time. This is very similar to the representation of the recurrent events in gap-time, where each recurrent event episode is "at risk" from time 0 (time since the previous event).
We artificially simulated left truncation from an exponential distribution with mean 50, which is now an entry time to the study. The rats with a follow-up smaller than the entry time are removed.
R> m1 <-emfrail(Surv(time, status)~rx + cluster(litter), data = rats_lt) The second model, m2, is what happens when the at-risk indicator is correctly adjusted for, with the entry time also present. Refering back to Section 2.3, this is equivalent to considering P(Z) instead of P(Z|A).
R> m2 <-emfrail(Surv(tstart, time, status)~rx + sex + cluster(litter), + data = rats_lt) As may be seen from Equation 6, this is correct only if there is in fact no left truncation, or if there is no variability in Z (i.e., Z is degenerate at 1). Therefore, this formulation is correct, for example, when the 'Surv' object represents recurrent events in calendar time, as is the case in Section 4.1. This is, for example, what is returned by the frailty models in the survival package.
The third model, m3, specifies the correct time at risk but also the fact that the distribution of the frailty must be taken conditional on the entry time. Under this (artificial) left truncation problem, this would be the correct way of analyzing this data.
R> m3 <-emfrail(Surv(tstart, time, status)~rx + sex + cluster(litter), + data = rats_lt, distribution = emfrail_dist(left_truncation = TRUE)) In this case, the output shows little difference between models. This is because the frailty, even in the complete data set, is not significant. In this case, the frailty distribution is also not significant in either m2 or m3 and they lead to estimates very close to each other. In a limited unpublished simulation study, we have seen that applying the correction in m3 leads to approximately unbiased estimates of the regression coefficients, unlike m1 or m2.

Conclusion
In the current landscape for modeling random effects in survival analysis, frailtyEM is a contribution that focuses on implementing classical methodology in an efficient way with a wide variety of frailty distributions. We have shown that the EM based approach has certain advantages in the context of frailty models. First of all, it is semiparametric, which means that it is a direct extension of the Cox proportional hazards model. In this way, classical results from semiparametric frailty models (for example, based on the data sets in Section 4) can be replicated and further insight may be obtained by fitting models with different frailty distributions. Until now, the Commenges-Andersen test, positive stable and PVF family, have not all been implemented in a consistent way in an R package. Another advantage of the EM algorithm is that, by its nature, it is a full maximum likelihood approach, and the estimators have well known desirable asymptotic properties.
To our knowledge, no other statistical package provides similar capabilities for visualizing conditional and marginal survival curves, or the marginal effect of covariates. Since this is implemented across a large number of distributions, this might come to the aid of both applied and theoretical research into shared frailty models. While the question of model selection with different random effect distributions is still an open one, the functions included frailtyEM may be useful for further research in this direction.
Evaluating goodness of fit for shared frailty models is still a complicated issue, particularly in semiparametric models. However, tests based on martingale residuals, such as that of Com-menges and Rondeau (2000), should be now possible by extracting the necessary quantities from an emfrail fit.
Regarding the left truncation implementation in frailtyEM, it is very similar to that from the parfm package. However, performing a larger simulation study to assess the effects of left truncation in clustered failure data with semiparametric frailty models is now possible. In a limited simulation study we have seen that correctly accounting for this phenomenon leads to unbiased estimates. The scenario of time dependent covariates and left truncation is not supported at this time. This is because this would require also specifying values of these covariates from time 0 to the left truncation time, which would likely involve some speculation.
Technically, extending the package to other distributions is possible, as long as their Laplace transform and the corresponding derivatives may be specified in closed form. An interesting extension would be to choose discrete distributions from the infinitely divisible family for the random effect, such as the Poisson distribution. The newest features will be implemented in the development version of the package at https://github.com/tbalan/frailtyEM.

A. Results for the Laplace transforms
We consider distributions from the infinitely divisible family (Ash 1972, Chapter 8.5) with the Laplace transform L Y (c) = exp(−αψ(c; γ)).
We now consider how α and γ can be represented as a function of a positive parameter θ.
Kendall's τ is then τ = 1 − θ θ+1 and the median concordance is κ = 2 2−2 θ θ+1 − 1. Furthermore, In the case of the PS distribution, the marginal hazard ratio is an attenuated version of the conditional hazard ratio. If the conditional log-hazard ratio is β, the marginal hazard ratio is equal to β θ θ+1 .
The PVF distributions. For Y a PVF distribution with fixed parameter m ∈ R, m > −1 and m = 0, where sign(·) denotes the sign. This is the same parametrization as in Aalen et al. (2008). The derivatives of ψ are The expectation of this distribution can be calculated as minus the first derivative of the Laplace transform calculated in 0, i.e., The second moment of the distribution can be calculated as the second derivative of the Laplace transform at 0, For identifiability, we set EY = 1. The distribution is parametrized through a parameter θ > 0 which is determined by γ = (m + 1)θ and α = sign(m) m+1 m θ. This results in VARY = θ −1 . A slightly different parametrization is presented in Hougaard (2000), dependent on the parameter η H . The correspondence is obtained by setting η H = (m + 1)θ.
The PVF family of distributions includes the gamma as a limiting case when m → 0. When γ → 0 the positive stable distribution is obtained. When m = −1 the distribution is degenerate, and with m = 1 a non-central gamma distribution is obtained. Of special interest is the case m = −0.5, when the inverse Gaussian distribution is obtained. With m > 0, the distribution is compound Poisson with mass at 0. In this case, P(Y = 0) = exp(− m+1 m θ). For m < 0, closed forms for Kendall's τ and median concordance are given in Hougaard (2000, Section 7.5).
For the gamma distribution, we havẽ ψ(c; γ, Λ L ) = log(γ + Λ L + c) − log(γ + Λ L ), which implies that the frailty of the survivors is still gamma distributed, but with a change in the parameter γ.
For the positive stable we haveψ which is not a positive stable distribution any more.
For the PVF distributions, we havẽ which is not PVF any more (however, it stays in the same infinitely divisible family).

A.2. Closed forms
The gamma distribution leads to a Laplace transform for which the derivatives can be calculated in closed form. It can be seen that L(c; α, γ) = γ α (γ + c) −α .
This can be exploited also in the case of left truncation, since the gamma frailty is preserved, as shown in the previous section.
The kth derivative of this can be written as where K is the modified Bessel function of the second kind.
The function emfrail() uses the closed form formulas when possible, by default.

B. The E step
For the E step β and λ 0 are fixed, either at their initial values or at the values from the previous M step. Let n i = j,k δ ijk be the number of events in cluster i. The conditional distribution of Z i given the data is described by the Laplace transform The E step reduces to calculating the expectation of this distribution, i.e., the derivative of (9) in 0: The marginal (log-)likelihood is also calculated at this point to keep track of convergence of the EM algorithm. It can be seen that (3) involves the denominator of (9) in addition to a straight-forward expression of β and λ 0 .
The E step is generally the expensive operation of the EM algorithm. In a few scenarios (10) may be expressed in a closed form: for the gamma and the inverse gaussian distributions. In these scenarios, the E step is calculated with the fast_estep() routine. For all other cases, the E step is calculated via a recursive algorithm with an internal routine which is described here. For easing the computational burden, this is implemented in C++ and is interfaced with R via the Rcpp package (Eddelbuettel and François 2011;Eddelbuettel 2013).
We implemented a recursive algorithm in C++ which resides in emfrail_estep.cpp which loops through these partitions, calculates the corresponding derivatives of ψ and the coefficients.

C. Standard errors
Considering the vector of parameters η = (β, λ 0 (·)), and consider that for a given θ, η θ is the maximizer of the "inner problem" described in Section 3.2, i.e., η(θ) = argmax η L(η|θ). Further, for a given θ, the variance-covariance matrix VAR(η(θ)) is obtained with Louis' formula (Louis 1982). The resulting standard errors for η are underestimated because they do not factor in the uncertainty in estimating θ, as is noted also in Therneau and Grambsch (2000, Section 9.5). Below is the sketch of how this is addressed in frailtyEM, following Hougaard (2000, Appendix B.3).
Confidence intervals for the conditional cumulative hazard are obtained from the part of the variance-covariance matrix corresponding to λ 0 (·), and confidence intervals for Λ 0 (t) = s≤t λ 0 (t) are obtained with the usual formula. For confidence intervals, the delta method is used to calculate a symmetric confidence interval for log Λ 0 (t) for all t, which is then exponentiated.