pexm: a JAGS module for applications involving the piecewise exponential distribution

In this study, we present a new module built for users interested in a programming language similar to BUGS to fit a Bayesian model based on the piecewise exponential (PE) distribution. The module is an extension to the open-source program JAGS by which a Gibbs sampler can be applied without requiring the derivation of complete conditionals and the subsequent implementation of strategies to draw samples from unknown distributions. The PE distribution is widely used in the fields of survival analysis and reliability. Currently, it can only be implemented in JAGS through methods to indirectly specify the likelihood based on the Poisson or Bernoulli probabilities. Our module provides a more straightforward implementation and is thus more attractive to the researchers aiming to spend more time exploring the results from the Bayesian inference rather than implementing the Markov Chain Monte Carlo (MCMC) algorithm. For those interested in extending JAGS, this work can be seen as a tutorial including important information not well investigated or organized in other materials. Here, we describe how to use the module taking advantage of the interface between R and JAGS. A short simulation study is developed to ensure that the module behaves well and a real illustration, involving two PE models, exhibits a context where the module can be used in practice.


Introduction
The piecewise exponential (PE) modeling is a simple approach extensively used in survival analysis and reliability to approximate the distribution of event-time data; possibly investigating the association of the time response with explanatory variables. Applications related to clinical data can be easily found in the literature; for example: leukemia (Breslow, 1974), gastric cancer (Gamerman, 1991), kidney infection (Sahu et al., 1997;Ibrahim et al., 2001), breast cancer (Sinha et al., 1999), melanoma (Demarqui et al., 2014) and hospital mortality (Clark and Ryan, 2002). In the context of reliability, consider the maintenance analysis in Rigdon and Basu (2000) and the telecommunication study in Kim and Proschan (1991) and Gamerman (1994), among many other references.
The PE models are classified as semiparametric due to their flexibility to represent the hazard function without making strong and restrictive parametric assumptions regarding the shape of this component; this point makes the PE distribution very attractive for survival analysis. Its semiparametric nature allows considerable generality and enough structure for useful interpretations in any application. The popularity of semiparametric methods for univariate survival data started with Cox (1972) on the proportional hazards model. Breslow (1972) and Breslow (1974) proposed the use of the PE distribution to replace the baseline term and Kalbfleisch and Prentice (1973) investigated the grid configuration. In this paper, we are focused on PE models under the Bayesian approach for inference, which has been widely considered in the literature. See Ibrahim et al. (2001) for a broad presentation of Bayesian survival models and Sinha and Dey (1997) for a review on Bayesian semiparametric frameworks (including the PE) based on either the hazard or the intensity function. Implementing a PE model is an obstacle for those interested in a Bayesian analysis and having limited programming skills. In many cases, the main source of this difficulty is the use of an indirect method to draw samples from the unknown joint posterior distribution. The Markov Chain Monte Carlo (MCMC) algorithm Gibbs sampling (see Geman and Geman, 1984;Gelfand and Smith, 1990;Gamerman and Lopes, 2006) is widely applied for this task, and its implementation can be very demanding due to the sequential simulations from complete conditional posterior distributions. This is particularly true when the conjugate analysis is not possible leading to a Gibbs sampler whose steps depend on other methods -e.g., Metropolis-Hastings (Metropolis et al., 1953;Hastings, 1970), Adaptive Rejection Sampling (Gilks and Wild, 1992;Gilks, Gilks), Adaptive Rejection Metropolis Sampling (Gilks et al., 1995) and Slice Sampling (Neal, 2003).
As an alternative to simplify the Gibbs sampler implementation for a given model, one can take advantage of programs based on a dialect of the BUGS language (Thomas et al., 1992;Lunn et al., 2012) using the mathematical formalism of directed acyclic graphs to define joint densities. These programs build the whole structure of complete conditionals and their corresponding sampling strategies requiring from the user only the essential information related to the model fit (priors, likelihood and the MCMC setup). In this category, WinBUGS (Spiegelhalter et al., 2003), OpenBUGS (Spiegelhalter et al., 2014) and JAGS (Plummer, 2003) are three well known options. In particular, JAGS is an appealing case since it is open-source (released under a free copyleft license) and extensible allowing users with C++ knowledge to create functions, monitors, distributions and samplers for a new module. These features enable the potential formation of a future JAGS contributor community similar to that of the software R (Team, 2020).
Other alternatives are available as a support for the Bayesian computing. A well known case is the platform Stan (mc-stan.org), having an interface with R through the package rstan (Stan Development Team, 2019). The MCMC method applied here is called No-U-Turn Sampling (Hoffman and Gelman, 2014), being an extension of the Hamiltonian Monte Carlo method. Implementing new distributions in Stan is straightforward and it does not require the installation of new modules. However, due to the Hamiltonian dynamics to improve sampling, one cannot set discrete prior distributions. This is perhaps the greatest weakness of Stan. Another possibility to deal with Bayesian modeling in R is the package BayesX (Brezger et al., 2005). However, this alternative was designed for the framework of generalized linear models, where the distribution of the response variable belongs to the exponential family. This is not the case for survival models assuming the PE distribution. Bayesian computing can also be developed via INLA (Rue et al., 2009); further details in r-inla.org. This option performs approximate Bayesian inference on the class of latent Gaussian models. It handles non-Gaussian likelihoods, but the presence of non-Gaussian latent components can be a limitation; for example, assuming a discrete or a multimodal distribution for a random effect would be problematic.
The main contribution of this work is to present a new JAGS module (pexm) for Bayesian analyses involving the PE distribution. An R package, having the same name, was developed to install and load the tool. The proposed module was created to fill a gap existing in JAGS related to the absence of the PE distribution. The pexm allows a friendlier and cleaner model implementation, being attractive for many users. Computational speed is also an advantage of the proposal with respect to the usual implementation based on the Poisson zeros-trick. Another important contribution is a description to clarify some aspects related to the topic "extending JAGS", which is not well documented in the literature. There are no formal tutorials exploring the technical elements detailed in Appendix A. The JAGS manuals (Plummer, 2010(Plummer, , 2017 do not currently contain the mentioned information, however, these materials can be updated in the future. The outline of this paper is as follows: in Section 2, we describe the elements of the PE distribution implemented in the proposed JAGS module. In Section 3, we present an overview on how the module was created, explain how to use it from within R, discuss the main simplifications with respect to an implementation without the module and develop a simulation study to verify the performance. Section 4 shows results (with and without the module) in a real application involving two frailty models to fit a kidney infection data (Ibrahim et al., 2001); the rates and log-rates in adjacent intervals are correlated via gamma and normal priors, respectively. Finally, Section 5 presents the conclusions and final remarks.

The piecewise exponential distribution
The PE distribution requires the specification of a grid τ = {a 1 , a 2 , . . . , a m } partitioning the time axis into m intervals. Let a 1 = 0, a m < +∞ and a m+1 = +∞; the latter is defined only for notation purposes and it is not expressed in τ . Conditional on this time grid, the failure rate λ j is assumed constant inside the corresponding interval: I j = (a j , a j+1 ] for j = 1, . . . , m − 1 and I m = (a m , +∞) for j = m. This step function formulation provides a discrete version of the true unknown hazard function in a survival model.
Assume that T is a continuous non-negative random variable representing the survival times of subjects in a population. Let f (t) denote the probability density function (pdf) of T , F (t) is the associated cumulative distribution function (cdf), S(t) = 1 − F (t) is the survival function, h(t) is the hazard function and, finally, H(t) represents the cumulative hazard function. If T ∼ PE(λ, τ ) with λ = {λ 1 , . . . , λ m }, then: (1) Naturally, the quality of the approximation to the real hazard function depends on a suitable choice of the time grid τ . In Breslow (1972) the author suggests the use of the observed failure times as the limits of the intervals I j ; however, depending on the sample size, this can lead to a non-parsimonious model with many rates λ j to be estimated. In contrast, Kalbfleisch and Prentice (1973) indicates that the grid should be selected independently of the data. The study explores different interval sizes configuring a regular grid. When choosing τ without using the data, one might consider two aspects: small m leads to a poor discretization and a large m determines a non-parsimonious model. Irregular grids are also allowed in the analysis and this could be a reasonable strategy in applications assuming higher frequency of failures in certain parts of the time axis; i.e., wider intervals would be specified for time regions where few events are expected to be observed.
The cumulative hazard function is expressed as follows: (2) We can use (2) to evaluate the survival function at a time point t through the well known equality S(t) = exp{−H(t)}. The pdf of T can be easily identified by simply taking the derivative of The quantile is an information of interest in many data analyses. In the context of the PE distribution, this quantity can be calculated via the relationship H(t) = − ln(1 − F (t)). Given a probability p, the corresponding quantile is the value t * such that H(t * ) = − ln(1 − p). Let w(p) = − ln(1 − p) and recall that H(a m+1 ) = +∞. Then, the calculation of t * can be expressed by: Note that we can easily compute the median of the PE distribution by setting p = 0.5 in (3). Those readers interested in details about the first two moments of the PE distribution can refer to Barbosa et al. (1996).
3 The JAGS module.
JAGS ("Just Another Gibbs Sampler"; see Plummer, 2003) is a general-purpose, open-source, cross-platform graphical modeling program using a set of MCMC methods for stochastic simulations to generate samples from the joint posterior distribution of the parameters in a Bayesian model. These samples are further used for inferences regarding the target parameters and the model fit. A recent version of JAGS can be downloaded from http://mcmc-jags.sourceforge.net and it is also available as a package for different Linux distributions. A special feature of JAGS is that it can be modular and extensible with new functionalities, which can be loaded whenever required. A module can be created to accommodate all types of probability distributions (continuous or discrete, univariate or multivariate). In order to write these extensions, the knowledge of C++ and object oriented programming is necessary; however, we emphasize that installing the extensions can be extremely automatic depending on the operational system.
Technical manuals providing a comprehensive coverage on how to build a JAGS module were not available in the literature at the time of writing this material. We follow the steps indicated in Wabersich and Vandekerckhove (2014), which can be seen as a short tutorial exploring the key aspects to add custom distributions to JAGS. Those aspects not covered in this reference, but important to implement the PE distribution, are given careful attention in Appendix A. For instance, the authors in Wabersich and Vandekerckhove (2014) work with two examples: the Bernoulli distribution for didactic purposes and the Wiener diffusion first-passage time distribution. In both cases the input values are scalar quantities; i.e., the paper does not explore distributions with vectors as arguments, which is the case for the PE(λ, τ ). This issue can be easily addressed by simply modifying the parent class type expressed in the code from scalar to vector (see Appendix A).
The PE distribution implementation described in Appendix A is in fact a more complete example displaying the use of important resources available to create a JAGS module. Besides implementing the log-density and the sample generating process allowing the MCMC simulations in a Gibbs sampler, we have built auxiliary functions returning the pdf, cdf, quantile, hazard and cumulative hazard of the PE(λ, τ ) at a given time t or probability p. In addition, the module includes a routine to check whether the values in λ and τ are consistent with the specifications shown in Section 2. Hence, Appendix A contains essential complementary information (with respect to Wabersich and Vandekerckhove, 2014) to those readers interested in extending JAGS.
3.1 Using the module in the R environment.
In this section, we discuss how to use the proposed module in association with R. This is achieved through the package rjags (Plummer, 2016) providing an interface between R and JAGS. Installing the new module is straightforward as indicated in Appendix A. The procedure basically involves the installation of an R package called pexm. The package was designed for this task and also contains tools to help loading the module into the R environment. The pexm source can be downloaded from two repositories: GitHub and Sourceforge (see more details in Appendix A.3). Hereafter, we assume that a recent version of JAGS, R, rjags and the proposed pexm module are correctly installed in the computer. A confirmation that the module is ready to be used in R can be obtained by simply typing the following commands in the console: In order to illustrate how to use the module, we have developed a simple example taking into account the R package msm (Jackson, 2011) to generate the data. This package provides functions to fit continuoustime Markov and hidden Markov multi-state models and, in particular, includes the function rpexp to draw samples from the PE distribution. We can also use dpexp, ppexp and qpexp to explore the pdf, cdf and quantiles.
Let T i be the survival time of the i-th individual, i = 1, . . . , n. In our illustration, we set the real rates λ = (0.3, 0.6, 0.8, 1.3) , the time grid τ = {0, 2, 3, 5}, n = 1,000 and assume T i ∼ PE(λ, τ ) to generate the data. The following objects are loaded into the R workspace: t is the observed vector (T 1 , . . . , T n ) , n is the length of t, tau is the grid vector, m is the length of tau, pq is a vector of probabilities for which the quantile should be evaluated, and nq is the length of pq.
The content of the JAGS model script Model_pex.R is presented in the Code chunk CC.2. This code indicates the likelihood T i ∼ PE(λ, τ ) and the prior specifications λ j ∼ Ga(0.01, 0.01) for j = 1, . . . , m (mean 1 and variance 100). Note that in each iteration of the MCMC, the survival function St[i] and the log-density loglik[i] are evaluated for each t i together with the quantile q[k] for the probabilities in pq. In addition, the cummulative hazard function Ht100 and the hazard function ht100, related to the time point t 100 , are monitored in this example. As it can be seen, the elements defined in Model_pex.R explore all functionalities implemented for the PE distribution in the new module.

Strategies to indirectly specify the likelihood.
Without the proposed pexm module, one can use either the Poisson or the Bernoulli distribution to indirectly specify the likelihood function. These strategies are commonly known as zeros-trick (Lunn et al., 2012) and ones-trick (Spiegelhalter et al., 2003), respectively. Let l i = ln[f (t i |λ, τ )], then the likelihood can be written as: where X i ∼ Pois(−l i ). In order to ensure a positive Poisson mean, we can select a constant C > 0 and use: We must choose C such that −l i + C > 0 for all i = 1, . . . , n. Note that adding C does not affect the inference results, since it is equivalent to multiplying the posterior distribution by a constant term. This strategy can be applied in JAGS using the syntax shown in Code chunk CC.3. Note that the loglikelihood of the PE(λ, τ ) for the i-th observation must be specified for the object l[i]. This task for the PE context is not simple compared to other distributions. A possible approach is to replace the vector t and tau by other objects accounting for the position of each observed time point in the grid. These objects must be created in the R environment and passed to JAGS as a data argument (see data in the Code chunk CC.1) together with zero[i] = 0 for all i. They must allow the calculation of H(t) in JAGS, which in turn can be used to obtain ln[f (t)] = λ j − H(t) (if t ∈ I j ) and S(t) = exp{−H(t)}. After the MCMC simulation, the quantiles can be calculated in R using the posterior samples of λ and the function qpexp (package msm).
Again, we call attention to the fact that the Bernoulli distribution can also be used in a similar strategy expressing the model likelihood as . Although both approaches have the same goal, the Poisson-zeros option is more popular in the literature.
The main conclusion of this section is that the pexm module provides a simpler and more attractive implementation of a Bayesian PE model in JAGS, since it avoids the use of alternative strategies to indirectly specify the likelihood. In the next section, we develop two types of studies using artificial data sets. The first one compares the results from the model in Code chunk CC.2 and those obtained via the msm package. The second study is intended to evaluate the performance of pexm with respect to the Poisson-zeros strategy. In Section 4, the analysis "pexm versus zeros-trick" is based on a real data set and it accounts for scripts (zeros-trick case) suggested in Ibrahim et al. (2001).

Simulation study.
The first goal of the present section is to verify whether the proposed JAGS module is correctly implemented and working as expected. In order to carry out this type of analysis, we consider the configuration described in Section 3.1 to generate a sample of 1,000 time points from the PE distribution; the function rpexp (package msm) is applied here. In each interval defined by τ , we have observed the following number of time points: 443 in I 1 , 247 in I 2 , 245 in I 3 and 65 in I 4 .
The R script shown in Code chunk CC.1 is used to run the MCMC for the simple PE model indicated in Code chunk CC.2. The argument data is set to evaluate nq = 23 different quantiles corresponding to probabilities in pq ranging from 0.01 to 0.99. A good behavior is confirmed, if all posterior estimates for λ are close to their corresponding real values (see Section 3.1) and the chain built for each log-density, survival function and quantile is centered around the true value calculated through one of the existing PE functions of the package msm. Figure 1 (a) clearly indicates this mentioned configuration. The three graphs show all true values (grey points) inside the 95% HPD (Highest Probability Density) intervals obtained based on the results from JAGS. In this paper, all HPD intervals are calculated through the R package coda. Other interesting aspects can be observed in Panel (a): (i) the shape and decay speed of the survival function; (ii) the log density formed by four different parts corresponding to each interval I j ; (iii) the higher posterior uncertainty expressed in the log-density related to I 4 . The behavior indicated in (iii) is justified by the fact that only 65 time-points belong to the last grid interval, meaning less information to estimate λ 4 . Figure 1 (b) shows the chains generated for each λ j after the burn-in period. As it can be seen, the convergence to the real values (horizontal lines) is visually confirmed and low autocorrelation is observed for all cases (recall that lag = 1). Here, all real values can be found within the corresponding 95% HPD intervals, and the posterior mean and median approximate well their targets. In terms of variability, the standard deviations can be ordered as follows: 0.013 (λ 1 ) < 0.037 (λ 2 ) < 0.050 (λ 3 ) < 0.165 (λ 4 ). Again, a larger number of observations in I j leads to a smaller posterior variability for λ j .
The Code chunk CC.2 contains the pexm functions hpex and hcpex, which are alternatives to calculate during the MCMC run the hazard function (1) and the cumulative hazard function (2), respectively. Note that this JAGS model script is set to evaluate these functions for t 100 = 3.483. Since τ = {0, 2, 3, 5}, we have t 100 ∈ I 3 and thus h(t 100 ) = λ 3 . The posterior sample related to the object ht100 provides the mean 0.778 (the true λ 3 is 0.8). The posterior mean of the cumulative hazard Ht100 is 1.529 (the true value is 1.587). This average can also be obtained using the relationship H(t) = − ln[S(t)]. A final remark related to these functions is that the present example can be extended to the survival regression setting. One might be interested in evaluating h(t) or H(t) under the effect of some fixed configuration of covariates. This can be easily adapted to hpex and hcpex by including the covariates effects in the argument lambda; Section 4 discusses two models having a regression setting. Monte Carlo scheme.
The second investigation developed in this section is based on a Monte Carlo (MC) scheme with 100 replications, i.e., we evaluate 100 data sets generated under the same conditions. The main aim here is to compare the performances of a Bayesian model implemented with the module pexm and another version using the Poisson-zeros strategy (zeros-trick). We emphasize that the difference here is only the JAGS syntax. The model to be fitted is a simplification of the one exhibited in the Code chunk CC.2. We will not save results related to the quantiles, survival, hazard and cumulative hazard functions. In the pexm case, we only specify the likelihood via t[i] dpex(lambda[], tau[]) and the prior lambda[j] dgamma(0.01, 0.01). On the other hand, the Poisson-zeros case is defined as indicated in Code chunk CC.3, where the log-likelihood related to the PE distribution must be written for the object l[i]; we also set λ j ∼ Ga(0.01, 0.01). Using this simple model, we evaluate "pexm versus zeros-trick" based on the MCMC outputs associated to the rates λ j 's.
For both implementations, the MCMC is configured with burn-in = 1,000, lag = 1 and 2,000 posterior samples collected after the warm-up period. Two chains are obtained for each parameter. These chains are combined to calculate all descriptive statistics; except the effective sample size (using only the first chain). The starting values in the algorithm are also the same for both strategies; we highlight that the arguments .RNG.name and .RNG.seed, see the rjags documentation, were fixed to impose the same settings of random number generators in JAGS. The algorithms for all combinations of scenarios and sample sizes were executed in the same computer (Intel Core i7 processor and 16.00 GB memory) dedicated exclusively for this task. shows the computational times required by JAGS to run the MCMC (Column 1 indicates n = 100, Column 2 indicates n = 1,000). Panel (b) presents the effective sample sizes of the chains generated for each λ j (Column 1 indicates n = 100, Column 2 indicates n = 1,000). In both panels, the scenarios S1, S2 and S3 are represented in the rows 1, 2 and 3, respectively. Figure 2 shows boxplots summarizing some of the most interesting results from the MC scheme. Thys type of graph is a good option in studies involving replications, since it indicates the variability of repeated measurements (Monte Carlo error) through the length of the boxes. Panel (a) presents the computational times (in seconds) needed to run the Gibbs sampling via JAGS. Theses values were determined through the existing function proc.time in R. The graphs in Panel (b) indicate the effective sample size of the chains obtained for the λ j 's. This quantity was calculated using the function effectiveSize available in the R package coda. The light blue color represents the implementation based on the module pexm; light red is related to the Poisson-zeros strategy. Given that the same data set is being fitted by the same model with different implementations, this is done in a controlled setting for reproducibility, we have observed the exact same results from the corresponding MCMC runs. In other words, the outcomes of pexm and zeros-trick are the same in the comparison of bias, coverage percentages and autocorrelation. This aspect is clearly noted in Panel (b).
In many studies, the computational time is measured in terms of the processor time used by each implementation to achieve a given effective sample size. This alternative cannot be applied here due to the mentioned compatibility of results. Instead, we focus on the required time to run 3,000 MCMC iterations for each artificial data set. Figure 2 (a) presents the main difference between pexm and the zeros-trick. Naturally, the computational times related to n = 100 are smaller than those for fitting n = 1,000; please, note the distinct scales in the y-axis to allow visualization. For all combinations (scenario, n), the boxplots associated with pexm are located in a lower level than those for the zeros-trick. This suggests that the computational speed is an advantage of the module. Figure 2 also indicates that the differences between the scenarios S1, S2 and S3 are more evident for the case n = 100. Note that the variability exhibited by the graphs are higher in (S3, n = 100). Few distinctions are detected between the scenarios when n = 1,000. Panel (b) suggests that the effective sample size related to λ 1 tend to be higher than those for the other rates. Recall, from the previous study, that the true λ in S1 provides few time points generated in the last interval of the grid (t i > 5). This feature also occurs for the choices of λ in S2 and S3. When comparing the number of cases t i > 5, we have the following order S1 ¿ S2 ¿ S3. The small number of observations in the last interval (especially for S3) explains the worst performance of λ 4 in terms of effective sample size.

Real application.
The main goal of this section is to explore the pexm module in a Bayesian analysis via JAGS for a real data set. Two frailty PE models are considered here; they were proposed in Sahu et al. (1997) and reported in Ibrahim et al. (2001) for the kidney catheter data in McGilchrist and Aisbett (1991). The comparison "pexm versus zeros-trick" is also developed in this analysis.
The kidney catheter is a well known data set often used to illustrate survival models with random effects; it is also available in the R package survival (Therneau, 2015). The response variable is the time to infection from the insertion of a catheter in a patient using portable dialysis equipment. The time of first and second infections are registered for each patient; there are 38 subjects and thus 76 time measurements for the analysis (18 of them are right-censored). After the occurrence (or censoring) of the first event, enough time is allowed to cure the infection before starting the second insertion. The data set includes an indicator variable identifying the event status (1 for infection, 0 for censoring) and three covariates: gender, age in years and disease type (with four categories). Following Ibrahim et al. (2001), the covariate "disease type" is not included in our study.
Let T ik be the random variable representing the infection time for the i-th patient and the k-th insertion; i = 1, . . . , 38 and k = 1, 2. In addition, consider the fixed covariate vector x ik = (x sex i , x age ik ) where x sex i is the gender (1 = female, 0 = male) and x age ik is the age of the i-th patient in the k-th insertion. Given x ik and the unobserved positive random variable z i (frailty), the conditional hazard function of T ik in a shared frailty model can be written as: with β = (β sex , β age ) and h 0 (.) being the unknown baseline hazard function for all patients. The noninformative censoring mechanism is assumed in this application. Note that the random effect z i establishes a dependency between the first and second infection times measured for the same patient. This modeling structure is common in the literature and it can be seen as an extension to the Cox proportional hazards model (Cox, 1972). Frailty models were first motivated by Vaupel et al. (1979) and then developed by Clayton and Cuzick (1985), Oakes (1986) and Oakes (1989). The PE distribution, discussed in Section 2, can be used to represent the baseline hazard in (4); that is h 0 (t ik ) = λ j for t ik ∈ I j , j = 1, . . . , m. For simplicity, one can assume a prior configuration treating the λ j 's as independent; however, Sahu et al. (1997) and Aslanidou et al. (1998) argue that in a frailty model the ratio of marginal hazards is similar to the ratio of baseline hazards for near time points (given the same covariates). In this case, it would be more appropriate to consider a prior that correlates the λ j 's in adjacent intervals. Following Ibrahim et al. (2001), we shall explore two approaches: • Model I: (λ j |λ j−1 ) ∼ Ga(α j , α j /λ j−1 ) with λ 0 = 1; • Model II: ξ j = ln(λ j ) and (ξ j |ξ j−1 ) ∼ N (ξ j−1 , ν) with ξ 0 = 0.
The authors in Ibrahim et al. (2001) also discuss a sensitivity analysis comparing m = 5, 10 and 20 intervals partitioning the time axis in the PE distribution. They indicate that m = 5 provides the worst model fit, and the results for m = 20 are not substantially better than m = 10; therefore, for parsimony, m = 10 is the choice selected for this analysis.
Due to the presence of right-censored observations, we use the standard strategy in JAGS based on the existing dinterval distribution to input the missing infection times; other details can be found in Section A.1, Appendix A. The JAGS scripts based on pexm for the models I and II are presented in the next three code chunks identified as CC.4 (a), (b) and (c). The code in (a) shows the main script containing: the time grid (line 1), the likelihood function (lines 3-12) and the prior distributions (line 13-18) of the frailties and the regression coefficients. Recall that the precision is the second parameter of the normal distribution, according to the JAGS parameterization. Note that the code in line 1 provides τ = {0, 56.2, 112.4, 168.6, 224.8, 281.0, 337.2, 393.4, 449.6, 505.8}, which is an equally spaced time grid built with respect to the largest infection time (562)   We can handle right-censored observations in JAGS by defining three variables to be inserted as data: (i) a binary indicator isCensored with 1 for censored values, (ii) a time response vector (say t_res) with "NA" replacing the censored cases and (iii) a censored time vector (say t_cen) having 0 for the non-censored subjects. The core of the JAGS model for censored data is represented by the two commands: isCensored Here, JAGS automatically imputes a random value for a missing observation t_res[i] = NA. The corresponding response isCensored[i] = 1 determines that the observation to be imputed must be greater than t_cen[i]; this is implied by the existing JAGS distribution dinterval (see Plummer, 2003). In this example, the new observation is then generated from the PE(λ, τ ) with lower bound t_cen[i] and unchanged upper bound +∞.
The MCMC configuration for the present study is as follows: burn-in = 10,000 iterations, lag = 1, posterior sample composed by 10,000 observations and two chains being generated for each parameter. In terms of initial values, we set for both models the arguments: .RNG.name = "base::Super-Duper" and .RNG.seed = 5 for the first chain and .RNG.name = "base::Wichmann-Hill" and .RNG.seed = 2 for the second chain. See the rjags documentation (Plummer, 2016) for further details about starting values in this tool.  Table 1: Posterior estimates and 95% HPD intervals for regression coefficients (β) and variance (κ) of the frailties. Comparisons: "Model I vs. II" and "pexm vs. zeros-trick". Table 1 presents the posterior mean, median, standard deviation and 95% HPD credible intervals for β sex , β age and κ. Here, we can compare the results from models I and II implemented with pexm and the Poisson-zeros strategy. In contrast with the simulation study, we do not observe here the exact same result when comparing the same model and parameter for both implementation types. However, the estimates obtained in this comparison are very similar; in some cases the distinction occurs in the third decimal place. This small difference can be justified by numerical approximation errors affecting the computations related to the more complex models under investigation in this section. Another interesting aspect to be highlighted is the similarity between the estimates from models I and II. As a result, one cannot choose one of the models based on a more reasonable interpretation of a parameter. Note that all credible intervals for β age include zero, indicating that the age has no significant effect over the infection time. On the other hand, all intervals for β sex are below zero, meaning that a female patient has lower infection risk than a male patient for a given time point.
According to Ibrahim et al. (2001), the posterior means of κ in Table 1 should be considered high, providing evidence of a strong positive association between two infection times for the same patient (T i1 and T i2 ). A large κ suggests that a major part of the variability in the data is explained by the clusters (the patients) rather than the insertions for the same individual.
The similarity of the estimates in Table 1, for the comparison "pexm versus zeros-trick", suggests that the chains are converging to the same region in the parameter space and exhibiting equivalent variability. This behavior is expected since we fit the same model. The trace plots in Figure 3 confirm this interpretation and also indicate that the level of autocorrelation obtained via pexm tends to reflect the one produced via the Poisson-zeros strategy. The effective sample sizes of the chains, closely related to the autocorrelations, are shown at the top of the graphs. The values are not high, suggesting some strong autocorrelation that can be reduced by setting lag ¿ 1 in the MCMC configuration. Although small ESS are reported, great discrepancies are not detected when comparing the implementations. The three horizontal lines, included in each graph, represent the mean and the 95% credible interval reported in Ibrahim et al. (2001) for the corresponding parameter. Note that all chains are converging to the same region indicated in the literature.    Figure 4 shows the estimated kernel densities obtained in R for the posterior samples generated in this real application. Note that the curves in red (PE module) and black (Poisson-zeros) can be considered similar in all cases, which reinforces the idea of two algorithms working as expected to fit the same model. Very few discrepancies are detected in these graphs, especially around their modes; however, this can be significantly reduced when assuming a larger posterior sample size (say 100,000 iterations after the burn-in). In Figure 5, we compare the estimated baseline hazard functions h 0 (t ik ) obtained via pexm (red) and the zeros-trick (black). As it can be seen, the results from both cases tend to agree. A slightly difference is detected in the time interval related to λ 10 . Such small distinction almost disappears when evaluating the posterior medians.  Table 2: Posterior standard deviations for each λ j . Comparison: "Model I vs. II" and "pexm vs. zeros-trick". Table 2 presents the posterior standard deviations of each λ j . Note that the variability tend to be higher when the index j is large. This aspect is related to the choice of the time grid τ . Assuming m = 10 and the equally spaced configuration, the number of observed time points (18 censored cases not included) in each interval I j are: 30, 5, 9, 5, 1, 3, 0, 2, 0 and 3 for j = 1, . . . , 10, respectively. Therefore, the last intervals (and their neighbors) have fewer observations, which determines higher posterior uncertainty for the corresponding rates.
The module pexm allows a simpler implementation of a Bayesian PE model, but this is not the only advantage observed in this paper. The MC study in Section 3.3 clearly indicates that the computational speed is another positive aspect of the proposed tool. The gain in terms of processor time can also be noted when running the MCMC related to the models I and II in the present real illustration. Using the same computer mentioned in the MC study (Section 3.3), the Poisson-zeros strategy is again considerably slower than the pexm implementation. The function proc.time in R provides the following times (

Conclusions.
The PE distribution has been widely used in semiparametric settings to model data from applications in survival analysis and reliability. Implementing a PE model is a difficult task for anyone aiming a Bayesian analysis without the necessary programming skills to write all MCMC steps. In order to circumvent this issue, the programs based on the BUGS language are seen as attractive alternatives by many users. However, none of these programs has a module designed to deal with the PE distribution. In this case, a well known implementation strategy is to consider the Poisson or the Bernoulli distributions to indirectly specify the likelihood, which again poses some programming challenges.
JAGS deserves a special attention in the group of programs similar to BUGS, since it is free and extensible. The main goal of this paper was to present a new JAGS module (pexm) to fit Bayesian models accounting for the PE distribution. We have provided a clear picture of how the new module can be used from within the R environment (integrated with JAGS), which is important to guide the interested users. The discussion emphasizes that the module simplifies the JAGS model script.
A study based on artificial data was developed to compare results using pexm and those from an R package (msm) containing utilities for the PE distribution. The investigation concludes that the module works well. An MC simulation scheme with 100 replications, using a simple Bayesian model, was also conducted to evaluate the behavior of pexm with respect to the usual zeros-trick implementation. The main conclusion here is that the module provides the exact same results with the advantage of having higher computational speed.
In a real data application, we have considered a data set with two covariates and the response "time to infection" for patients using a portable dialysis equipment. Two PE survival frailty models in the literature were explored in this part. They assume different structures of association between adjacent intervals in the time grid. The presence of right censored data allowed a more in depth evaluation of the PE module; censoring was not considered in the simulation study. Discussions were again devoted to the comparison "pexm versus zeros-trick". As expected for two algorithms fitting the same model, their results were quite similar. The small estimation distinctions are possibly due to numerical approximation errors related to complexity of the tested models. In addition, the posterior estimates of the parameters resembles the ones reported in the literature using the same data set.
One last contribution of this work is the compilation of important information for those readers (with some C++ knowledge) interested in extending JAGS. The obstacles, solutions and other aspects related to the module implementation are fully described in the Appendix.
The first lines of this file references with #include the header files containing other functions to be used by the module. The first one is the existing JAGS header file Module.h. The second one is the PE distribution class file DPex.h. The remaining cases indicate function class files required to compute the density (DPexFun.h), cdf (PPexFun.h), quantile (QPexFun.h), hazard function (HPexFun.h) and cumulative hazard function (HCPexFun.h) given a time point t or a probability p of interest. These distribution and function class files are described ahead.
Note that we have chosen different names for the distribution class file (DPex.h) and the function class file to compute the pdf (DPexFun.h). This distinction is necessary in the module implementation, otherwise the system will indicate errors when creating the installation files. This is an important aspect to be clarified, because these class files are related to two JAGS model script commands having the same name but applied in different situations. As an illustration, consider: • T ∼ dpex(lambda, tau) is related to DPex.h and indicates that T ∼ PE(λ, τ ); • Z <-dpex(t, lambda, tau) is related to DPexFun.h and it allocates to the object Z the pdf of the PE(λ, τ ) evaluated at the time point t. This equal name configuration is not mandatory, but it is in accordance with the naming standards adopted for the existing distributions available in JAGS.

A.1 The distribution class files.
The file PEXM_PATH/src/distributions/DPex.h contains the class prototype of the PE distribution. Near the top of the code, the JAGS parent class VectorDist is specified as the base class. This choice is appropriate for those distributions parameterized by vectors, which is the case of the PE(λ, τ ). The examples in Wabersich and Vandekerckhove (2014) are associated with distributions taking scalar quantities as input arguments and thus related to the alternative parent class ScalarDist. The process of building a JAGS module is easier for scalar valued distributions as the RScalarDist class can be inherited from, which partly automates the availability of d/p/q functions.
The following specific functions were implemented in the pexm C++ classes: • logDensity: calculates the log-density; • randomSample: generates random samples; • typicalValue: returns a typical value in the support of the distribution; • checkParameterValue: indicates if {λ, τ } are in the allowed parameter space; • canBound: indicates if the distribution can be bounded; • isDiscreteValued: indicates if the PE distribution is discrete (false); • isSupportFixed: indicates if the upper/lower limits of the PE distribution support are fixed (true, its support does not depend on λ or τ ); • checkParameterLength: indicates if λ and τ have the same length; • support: returns the unbounded support (0, +∞) of the PE distribution.
Some of these functions are not required when using the simpler ScalarDist parent class; see Wabersich and Vandekerckhove (2014) for this and other details related to the code.
The file PEXM_PATH/src/distributions/DPex.cc contains the actual implementation of the PE distribution. It includes the codes for the nine mentioned specific functions. In the beginning, note that we quote the name dpex and the number of parameters (i.e. 2) to be used in any JAGS script. The implementation of logDensity, randomSample and typicalValue takes into account the elements described in Section 2. In particular, typicalValue returns the median of the PE distribution, which is simpler to implement and uses the same structure presented ahead for the quantile function.
The function randomSample has two conditions related to the lower and upper bounds of the PE distribution. The default support is (0, +∞), however, this domain can be bounded to allow applications involving censored data (a very common situation in survival analysis and reliability). Given the existence of new boundaries, the commands implemented within if(lower){...} and if(upper){...} will replace the default interval (0, 1) by a shorter one, where a random number is uniformly generated and then transformed into a time point via cdf inversion. The specific function canBound is necessary to enable this restriction.
The function checkParameterValue allows JAGS to alert the user about the wrong specification of λ or τ . The following points are verified: (i) λ j ≥ 0 for all j, (ii) a 1 = 0 in τ , (iii) a 1 < a 2 < . . . < a m in τ . When running JAGS, a warning message will be displayed if at least one of these items is false.

A.2 The function class files.
The additional functions related to the PE(λ, τ ) are located in PEXM_PATH/src/functions within the project. The PE class prototype files associated with the pdf, cdf, quantile, hazard and cumulative hazard functions are denoted by DPexFun.h, PPexFun.h, QPexFun.h, HPexFun.h and HCPexFun.h, respectively. Their structures are quite similar, changing only the names from one case to the other. Near the top of each code, we set the JAGS function base ScalarVectorFunction.h allowing the simultaneous specification of one scalar (time t or probability p) and two vectors (λ and τ ) as input arguments.
The directory PEXM_PATH/src/functions also contains the ".cc" files (they are DPexFun.cc, PPexFun.cc, QPexFun.cc, HPexFun.cc and HCPexFun.cc), where the actual implementations of the mentioned functions are found. The JAGS script names to call these functions (dpex, ppex, qpex, hpex and hcpex) and their number of input arguments (three) are defined at the beginning of each code. As an example, in a JAGS model script the user can allocate the corresponding outcome to the generic objects Z1, Z2, Z3, Z4 and Z5 as follows: The Code chunk CC.6 shows the JAGS script for the zeros-trick implementation of the Bayesian models discussed in Section 4. The prior specification for the rates λ j 's is supposed to be included in last line; consider the Code chunk CC.4 (b) for Model I and CC.4 (c) for Model II. Line 2 specifies the time grid, the object d [i, k, j] indicates whether the event-time belongs to the interval I j and the object le [i, k, j] receives the length of the intersection between the intervals (0, t ik ) and I j .
Code chunk CC.