SmoothHazard : An R Package for Fitting Regression Models to Interval-Censored Observations of Illness-Death Models

The irreversible illness-death model describes the pathway from an initial state to an absorbing state either directly or through an intermediate state. This model is fre-quently used in medical applications where the intermediate state represents illness and the absorbing state represents death. In many studies, disease onset times are not known exactly. This happens for example if the disease status of a patient can only be assessed at follow-up visits. In this situation the disease onset times are interval-censored. This article presents the SmoothHazard package for R . It implements algorithms for simultaneously ﬁtting regression models to the three transition intensities of an illness-death model where the transition times to the intermediate state may be interval-censored and all the event times can be right-censored. The package parses the individual data structure of the subjects in a data set to ﬁnd the individual contributions to the likelihood. The three baseline transition intensity functions are modelled by Weibull distributions or alternatively by M -splines in a semi-parametric approach. For a given set of covariates, the estimated transition intensities can be combined into predictions of cumulative event probabilities and life expectancies.


Introduction
The irreversible illness-death model is a multi-state model which has many applications in various areas of research, for example in the medical field. The model describes the transitions from an initial state (e.g., alive and disease-free) to an absorbing state (e.g., death) either directly or via an intermediate state (e.g., disease, Figure 1). The transition intensities α 01 , α 02 , and α 12 are positive functions of time which can also depend on covariates. In some applications it happens for some or all subjects that the transition times from the initial state to the intermediate state are interval-censored. This occurs for example when the status of the intermediate state can only be determined at a sequence of visit times. In this case, if a subject is diagnosed as diseased at one of the visit times, say R, then it is only known that the subject was last seen disease-free at the previous visit time, say L, and hence the time of the onset of the disease is interval-censored between L and R for this subject. Furthermore, both the process of visit times and the observation of the time of the transition into the absorbing state are usually right-censored, i.e., limited to the individual follow-up period of the subjects. This yields a rather complex general observational pattern, because for a subject who died without being diagnosed as diseased at earlier visit times, it may or it may not be possible to determine retrospectively if and when the subject became diseased between the last visit time and the time of death.
The SmoothHazard package (Touraine, Joly, and Gerds 2017) provides estimates of the baseline transition intensities and of covariate effects when the data fall into one of the 6 cases that are displayed in Figure 2. Thus, the case of left-truncated event times (delayed entry) is covered, as well as the case where for some subjects the transition time into the intermediate state is observed exactly and for others it is interval-censored. Finally, the special case is covered where for some or all subjects no intermediate information is available about the disease status such that it is only known whether or not the subjects became diseased between the start and the end of follow-up. The latter occurs in Figure 2 when E = L and R = min(T, C) in cases 3 or 6.
The aim is to estimate covariate effects on the three transition intensities. To achieve this regression models are implemented which assume proportional transition intensities and a non-homogeneous Markov process. The user chooses between a fully parametric model where each of the baseline intensities is described by the parameters of a Weibull distribution and a semi-parametric model where the baseline intensities are left unspecified and approximated by M -splines. For the parametric model, the regression coefficients and Weibull parameters are estimated by maximizing the likelihood; for the semi-parametric model, the coefficients of the M -splines and the regression coefficients are estimated by maximizing a penalized likelihood. The SmoothHazard package then allows predictions of transition probabilities, cumulative The letters E and C denote the start and end of follow-up, respectively, and the letters L and R the visit times between which the transition into the intermediate happened.
probabilities of event and life expectancies to be obtained for a given set of covariates, based on estimated baseline transition intensities and on estimated covariate effects.
The SmoothHazard package is available from the Comprehensive R Archive Network (CRAN) at https://CRAN.R-project.org/package=SmoothHazard.

Comparison to other packages
If the exact transition times of an illness-death model are observed, standard procedures can be used to estimate transition intensities, regression coefficients and functionals thereof (Putter, Fiocco, and Geskus 2007). In particular, in R (R Core Team 2017) the packages survival (Therneau and Grambsch 2000;Therneau 2017) and rms (Harrell 2017) estimate the regression coefficients using the Cox partial likelihood (Cox 1975) without the need to model the baseline intensities. Besides, the package simPH (Gandrud 2015) is useful to simulate and plot quantities of interest and their uncertainty for coefficient estimates from Cox models (even interactive and nonlinear effects). Moreover, a number of packages facilitate the use of multi-state models and illness-death models. The packages etm (Allignol, Schumacher, and Beyersmann 2011) and mstate (de Wreede, Fiocco, and Putter 2010, 2011Putter et al. 2007) can be used to estimate transition probabilities for the Aalen-Johansen estimator but the etm package does not account for the influence of covariates. The p3state.msm package (Meira-Machado and Roca-Pardiñas 2011) obtains non-Markov estimates for the transition probabilities. The TPmsm (Araújo, Meira-Machado, and Roca-Pardiñas 2014) package implements several estimators for the transition probabilities, including the Aalen-Johansen estimator and estimators that are consistent even without Markov assumption or in case of dependent censoring.
However, when transition times to the intermediate event are interval-censored, it is then generally not possible to arrive at consistent estimates with the software provided by the packages listed above. Indeed, a common approach for handling subjects who died with unknown disease status consists of artificially ending their follow-up at the last time they were seen without disease and subsequently treating them as right-censored. However, this approach can lead to a systematic bias in the estimates of transition intensities and of regression coefficients (Joly, Commenges, Helmer, and Letenneur 2002;Leffondré, Touraine, Helmer, and Joly 2013). The bias will be especially pronounced if the risk of death is higher for diseased subjects than the risk of death for disease-free subjects.
To our knowledge, there is only one package that focus on fitting illness-death models to interval-censored data, the coxinterval package (Boruvka and Cook 2015) that implements a sieve estimator for the Cox regression model. The msm package (Jackson 2011) is able to fit Markov multi-state models to panel data where the status of the subjects is known at a finite series of inspection times. As a special case the setting includes the illness-death model and it can be used with interval-censored disease times and exact death times. However, in this package the likelihood is calculated using the Kolmogorov differential equations that relate the transition probabilities and the transition intensities and in order for there to be an analytic solution a time-homogeneity assumption is made where all transition intensities are constant or piecewise-constant between two successive observation times.
The SmoothHazard package fits also survival models to data that might be interval-censored. However, other packages can be used in this case. The available packages include the ICsurv package (McMahan and Wang 2014) that implements a semi-parametric proportional hazard regression model, the intcox package that implements an algorithm for extending the Cox regression model (Pan 1999), the MIICD package (Delord 2016) that implements a multiple imputation approach to Cox regression and the coarseDataTools package that implements the parametric accelerated failure time model (Reich, Lessler, Cummings, and Brookmeyer 2009;Reich, Lessler, and Azman 2016). The interval package (Fay and Shaw 2010) implements tests for comparing survival distributions for interval-censored data.

Outline of SmoothHazard
To sum up, SmoothHazard fits survival models or illness-death models to data observed in either continuous times (leading to exactly observed or right-censored data) or in discrete times (leading to interval-censored or right-censored data). In the illness-death model, only • idm: for fitting illness-death regression models based on possibly interval-censored disease times and right-censored times.
A fitted illness-death model as produced by idm can be used to calculate predictions with: • S3 predict method for 'idm' objects: for estimating transition probabilities, cumulative event probabilities and life expectancies for a given set of covariates.
In this paper, we focus on the main goal of the package which is fitting illness-death models to interval-censored data. Section 2 presents the model and the likelihood. Section 3 presents the estimation methods. Section 4 briefly presents predictions that can be made in an illnessdeath model. Section 5 provides some examples illustrating SmoothHazard. We discuss limitations and further extensions in Section 6.

Model and likelihood
We consider an illness-death process X = (X(t), t ≥ 0) which takes values in {0, 1, 2} (Figure 1). Subjects are initially disease-free (X(0) = 0) and may become diseased (transition 0 → 1) and die (transition 1 → 2), or die directly without disease (transition 0 → 2). For more in-depth treatment of the topic, see Andersen and Keiding (2012) and Keiding (2014). X is assumed to be a non-homogeneous Markov process which means that the future evolution of the process {X(t), t > s} depends on the current time s and only on the current state X(s). Thus, the distribution of X is fully characterized by the set of transition probabilities: The transition probabilities are related to the instantaneous transition intensities α hl shown in Figure 1 by the relation: We introduce covariate effects separately for each transition through proportional transition intensities regression models which are a natural extension of the Cox proportional hazard model: Here α 0,hl are baseline transition intensities, Z hli are covariate vectors for subject i and β hl are vectors of regression parameters for transition h → l.
In the situation where the time to disease and the time to death are not interval-censored but either observed exactly or right-censored, the regression coefficients could be estimated by the partial likelihood method without the need to specify and estimate the baseline transition intensity functions α 0,hl (t). For interval-censored transition times to the intermediate state, the situation is more complex. It turns out that we have to estimate all parameters simultaneously and that we need a model for the baseline transition intensity functions. This can be seen by inspecting the likelihood function.
For subject i, denote the conditional disease-free survival function by where A hl (·|Z hli ) is the conditional cumulative intensity function of transition h → l: Note that the conditional probability of surviving time t given a transition into the intermediate state at time s is given by exp{−A 12 (t|Z 12i ) + A 12 (s|Z 12i )}.
We allow that the event times are left-truncated, i.e., that subjects enter the study at the delayed entry time E > 0. The left-truncation condition X(E i ) = 0 implies that subject i has survived in state 0 until time E i .

In addition to the covariate vectors
is the minimum between the transition time into the absorbing state T i and the right censoring time C i and δ 2i = 1{T i ≤ C i }. Also, δ 1i = 1 if we know for sure that subject i was diseased between E i andT i and δ 1i = 0 otherwise. The visit times L i and When the transition time into the intermediate state is observed exactly, we have δ 1i = 1 and L i = R i . In the latter case we also denote I i for the transition time into the intermediate state.
We now detail the likelihood contributions according to the different observational patterns shown in Figure 2 in the special case where there is no left-truncation i.e., E i = 0. Lefttruncated event times are taken into account by simply dividing the above likelihood contributions by the term S(E i |Z 01i , Z 02i ).

Estimation
The idm function computes estimates for the three baseline transition intensities and for the regression parameters using the Levenberg-Marquardt's algorithm (Levenberg 1944;Marquardt 1963) to maximize the (penalized) likelihood. The algorithm is a combination of a Newton-Raphson algorithm and a gradient descent algorithm (also known as the steepest descent algorithm). It has the advantage of being more robust than the Newton-Raphson algorithm while preserving its fast convergence property.

Parametric estimation
A Weibull parametrization for the baseline transition intensities is assumed in idm's default estimation method: where a hl and 1 b hl are shape and scale parameters. The Weibull parameter estimatesâ hl andb hl and the vectors of regression parameter estimatesβ hl are obtained simultaneously by maximizing the likelihood which is the product over the subjects' contributions according to (2): Confidence intervals for the regression parameters are obtained using standard errors estimated by inverting the Hessian matrix of the log-likelihood, that is the matrix of the second partial derivatives of log L given in (3). Pointwise confidence intervals for the baseline transition intensities are obtained using a simulation-based approach explained in Section 4.1.

Semi-parametric estimation
In situations where it is suspected that the Weibull distribution does not fit the data very well one can think of extending the model and leaving the baseline intensity functions completely unspecified, as in the Cox regression model. Unfortunately, in interval-censored data there is no direct analogue to the partial likelihood and the Breslow estimator of the Cox model in right-censored data. The function idm implements a semi-parametric model where the three baseline transition intensities are approximated by linear combinations of M -splines. In this section we explain the basic steps of this approach.

The penalized likelihood
To control the smoothness of the estimated intensity functions, we penalize the log-likelihood by a term which specifies the curvature of the intensity functions. It is given by the square of the second derivatives. The penalized log-likelihood (pl) is defined as: where l is the log-likelihood and κ 01 , κ 02 and κ 12 are three positive parameters which control the trade-off between the data fit and the smoothness of the functions. It is proposed that the penalization parameters are chosen by maximizing a cross-validated likelihood score. Here, leave-one-out is appealing as the result does not depend on the random seed as it would, e.g., for 10-fold cross-validation. However, since leave-one-out requires as many maximizations of the likelihood as there are subjects in the data set, this can be computationally very expensive. To avoid extremely long run times we have implemented the following algorithm: Step 1. We ignore the covariates and use a grid search method to find the values for the parameters (κ 01 , κ 02 , κ 12 ) based on an approximation of the leave-one-out log-likelihood score. This approximate score requires the computation of the Hessian matrix and one step of the Newton-Raphson algorithm. Thus it reduces the number of calculations considerably. The approximate leave-one-out log-likelihood score is similar to an Akaike information criterion (for more details, see Commenges, Joly, Gégout-Petit, and Liquet 2007). This approach was proposed by O'Sullivan (1988) for survival models and studied by Joly et al. (2002) in an illness-death model with interval-censored data.
Step 2. We use the results of Step 1, i.e., the optimized value of (κ 01 , κ 02 , κ 12 ) to maximize the penalized likelihood (see Equation 4) with covariates. The parameters being maximized are the regression coefficients and the coefficients of the linear combinations of the M -splines defined below.

M -splines
We where a hl,i are unknown parameters. The n hl M -splines are integrated in order to produce a family of monotone splines, these are called I-splines. Thus, with each M -spline M hl,i we associate an I-spline I hl,i : For given values of the parameters a hl,i , we can approximate the cumulative baseline transition intensities A hl by a linear combination of I-splines: Because M -splines are non-negative, the positivity constraint on (a hl,i ) 2 ensures thatÃ 0,hl is monotone increasing.
Confidence intervals of the regression parameters are obtained using estimated standard errors which are obtained by inverting the Hessian matrix of the penalized log-likelihood.
Confidence intervals for the baseline transition intensities α 0,hl (t) are obtained using the Bayesian approach proposed in O' Sullivan (1988) for survival analysis where the standard errors are estimated by M hl (t) H −1 hl M hl (t), where H hl denotes the specific part for transition from h to l of the Hessian matrix of the penalized log-likelihood.

Predictions
Often in illness-death models the functions of interest are the transition intensities. However, other quantities (transition probabilities, cumulative probabilities and life expectancies) which can be expressed in terms of the transition intensities (Touraine, Helmer, and Joly 2016) may provide additional information and have a more natural interpretation.
For example, given a set of covariates Z 01,i , Z 02,i , Z 12,i for a subject i who is diseased at time s, one could be interested in the probability that the subject is still alive at some time t > s, or the subject's life expectancy. Given a set of covariates Z 01,j , Z 02,j , Z 12,j for a subject j who is disease-free at time s, one could be interested in lifetime risk of disease or in healthy life expectancy (expected remaining sojourn time in the healthy state). Since these quantities can be written in terms of the transition intensities, SmoothHazard provides estimates of them using estimates of the transition intensities. Confidence intervals are calculated using the simulation-based method described in the following.

Confidence regions
A simulation based approach (Mandel 2013) is used to calculate confidence intervals for the transition intensities α hl (t) in the parametric approach and for the other quantities of interest in both parametric and semi-parametric approaches. To briefly outline how it works, we generically denote by θ the vector of all the parameters that characterize the likelihood and byθ the maximum (penalized) likelihood estimator. θ contains the Weibull parameters in the parametric model, the spline parameters in the semi-parametric model and the regression parameters in both models.
We assume asymptotic normality for the estimatorθ and denote byVθ the estimated covariance matrix ofθ. We consider a multivariate normal distribution with the parameter estimates as expectation andVθ as covariance matrix. We generate n vectors (n = 2000 in practice) from this distribution: θ (1) , . . . , θ (n) . Based on them, we can calculate n values for the transition intensities: α (1) hl (t), . . . , α (n) hl (t), and therefore n values for any quantity of interest written in terms of the transition intensities. The n values reflecting the sample variation (Aalen, Farewell, De Angelis, Day, and Gill 1997), we order them and the 2.5th and the 97.5th empirical percentiles are then used as lower and upper confidence bounds for 95% confidence intervals. This procedure can be repeated for any t, so we can obtain pointwise confidence limits for α hl (·). Table 1 shows how the program interprets the structure of the data set. In all cases, L i may be equal to the entry time. Some more details are necessary to distinguish the case where the ill status is known at the last follow-up time for death from the case where this is not possible.

How to prepare the data
• In case 2, if L i < C i then it is assumed that the subject may become ill between L i and C i . If L i = C i it is assumed that the subject is disease-free at time C i . In the latter case the integral in the second term of the likelihood equals zero.
Case Description δ 1 δ 2 L R T Remark 1 Exact ill time, right-censored death time.
No illness observed, right-censored death time.
Interval-censored ill time, right-censored death time.
Interval-censored ill time, death time observed.  • In case 5, if L i < T i then it is assumed that the subject may become ill between L i and T i . If L i = T i it is assumed that the subject is disease-free at time T i . In the latter case the integral in the second term of the likelihood equals zero.

Paquid study
In order to illustrate the functionality of the package we provide a random subset containing data from 1000 subjects that were enrolled in the Paquid study (Letenneur, Gilleron, Commenges, Helmer, Orgogozo, and Dartigues 1999), a large cohort study on mental and physical aging.

R> data("Paq1000", package = "SmoothHazard")
The population consists of subjects aged 65 years and older living in Southwestern France. The event of interest is dementia and death without dementia is a competing risk. Furthermore, the time to dementia onset is interval-censored between the diagnostic visit and the previous one and demented subjects are at risk of death. Thus, subjects who died without being diagnosed as demented at their last visit may have become demented between last visit and death.
In this subset 186 subjects are diagnosed as demented and 724 died from whom 597 without being diagnosed as demented before. Because of interval-censoring more than 186 should have been demented, more than 127 should have been dead with dementia and less than 597 should have been dead without dementia (see Figure 3).
Age is chosen as the basic time scale and subjects are dementia-free (and alive) at entry into study. Consequently, we need to deal with left-truncated event times. Each row in the data corresponds to one subject. The variables dementia and death are δ 1 and δ 2 , the status variables for dementia and death. The variable e contains the age of subjects at entry into the study. The variables l and r contain the left and right endpoints of the censoring intervals. For demented subjects, r is the age at the diagnostic visit and l is the age at the previous one. For non-demented subjects, l and r are the age at the latest visit without dementia (l = r). The variable t is the age at death or at the latest news on the vital status. There are two binary covariates: certif for primary school diploma (762 with diploma and 238 without diploma) and gender (578 women and 422 men).
The function idm computes estimates for the three transition intensities α 01 (·), α 02 (·), α 12 (·) which represent age-specific incidence rates of dementia, age-specific mortality rate of dementiafree subjects and age-specific mortality rate of demented subjects, respectively. Proportional transition intensities regression models allow for covariates on each transition. Covariates are specified independently for the regression models of the three transition intensities by the right hand side of the respective formula formula01, formula02 and formula12.
Interval-censoring and left-truncation must be specified at the left side of the formula arguments using the Hist function. For left-truncated data, the entry argument of Hist must contain the vector of delayed entry times. For interval-censored data, the time argument of Hist must contain a list of the left and right endpoints of the intervals. The data argument contains the data frame in which to interpret the variables of formula01, formula02 and formula12. The left side of formula12 argument does not need to be filled because all the data information is already contained in formula01 and formula02. The left side of formula12 argument is required only if we want the covariates impacting transition 1 → 2 different from those impacting transition 0 → 2.

Fitting the illness-death model based on interval-censored data
The main function idm computes estimates for the three baseline transition intensities and for the regression parameters of an illness-death model. The method argument by specifying the form of the transition intensities allows to select either the parametric or a semi-parametric estimation method: • With the default value "Weib", a Weibull distribution is assumed for the baseline transition intensities and the parameters are estimated by maximizing the log-likelihood.
• With the "Splines" value, the baseline transition intensities are approximated by linear combinations of M -splines and the parameters are estimated by maximizing the penalized log-likelihood.
We stop the iterations of the maximization algorithm when the differences between two consecutive parameter values, log-likelihood values, and gradient values is small enough. The default convergence criteria are 10 −5 , 10 −5 and 10 −3 , respectively and can be changed by means of the eps argument.
We now illustrate how to fit the illness-death model to the Paq1000 data set, based on interval-censored dementia times and exact death times.
In the following call, a Weibull parametrization is used for the three baseline transition intensities and we include two covariates on the transition to dementia, one covariate on the transition from no dementia to death and no covariates on the transition from dementia to death. Note that in case of missing formula12 argument the covariates on the 1 → 2 transition are taken to be the same as the ones specified in the formula02 argument.
The three baseline transition intensity functions can be displayed as functions of time, i.e., functions of age in our illustrative example (Figure 4).

Semi-parametric estimation method: Choice of smoothing parameters
Some optional arguments are specific to the semi-parametric approach (when using the option method = "Splines"): • n.knots contains a vector (by default c(7, 7, 7)) specifying the number of knots on the 0 → 1, 0 → 2 and 1 → 2 transitions, respectively; • knots contains the choice of the knots placement (equidistant by default or quantilebased placement) or a list of sequences of knots for transitions 0 → 1, 0 → 2 and 1 → 2 respectively, to be specified by the user; • CV (FALSE by default) is set to TRUE for using approximate leave-one-out cross-validation score to choose the smoothing parameters κ 01 , κ 02 , κ 12 ; • kappa contains the smoothing parameters if CV = FALSE (arbitrary choice of the smoothing parameters κ 01 , κ 02 , κ 12 ); the initial smoothing parameters for the grid search method which maximize the approximate leave-one-out cross-validation score if CV = TRUE.
By default the function idm selects equidistant sequences of 7 knots between the minimal and maximal event times (e, l and r for Paq1000). There must be a knot before or at the first time from which there are subjects at risk and after or at the last time of transition. The current implementation of our program requires a minimum of 5 knots for each transition intensity.
Consequently, the semi-parametric approach requires much more information than the parametric one to achieve convergence. The number of parameters to be estimated is larger, and enough observation times on each transition are required to fit the splines. In particular, in data sets where few 1 → 2 transitions times are observed, we do not recommend this approach. Increasing the number of knots does not deteriorate the estimates of the transition intensities: This is because the degree of smoothing in the penalized likelihood method is tuned by the smoothing parameters κ 01 , κ 12 and κ 02 . On the other hand, once a sufficient number of knots is established, there is no advantage in adding more. Moreover, the more knots, the longer the running time. Some numerical problems can arise, particularly for a large number of knots. So it is recommended to start with a small number of knots (e.g., 5 or 7) and increase the number of knots until the graph of the transition intensities function remains unchanged (from our own experience rarely more than 12 knots).
The default values for the smoothing parameters κ 01 , κ 02 , κ 01 , are suitable for the Paq1000 data set. However, these values can be expected to be very different depending on time scale, number of subjects and number of knots. The cross-validation option can be used to find appropriate smoothing parameters. However, the running time with cross-validation is very long and an empirical technique might be preferred. It consists of repeatedly running idm trying different smoothing parameters. After each estimation, the transition intensities are plotted. If the curves seem too smooth, it may be useful to reduce the smoothing parameter. Similarly, if the curves are too wiggly, the smoothing parameter may be increased.

Making predictions
An object returned from idm can be used as an argument to the predict function in order to obtain transition probabilities, cumulative probabilities of event and life expectancies with confidence intervals. For example, the following call gives predictions over a 10-year horizon for a 70-year-old male subject who has a primary school diploma: R> pred <-predict(fit.weib, s = 70, t = 80, + newdata = data.frame(certif = 1, gender = 1)) R> pred Predictions of an irreversible illness-death model with states (0,1,2). The output attributes are: • for a dementia-free 70-year-old subject: the probability of being still alive and dementia-free 10 years later p 00 (70, 80); the probability of being still alive but demented 10 years later p 01 (70, 80); the probability of dying in the next 10 years p 02 (70, 80) having been demented before (p 1 02 (70, 80)) or not (p 0 02 (70, 80)); -the absolute risk of dementia in the next 10 years (10 years later, the subject may be dead or not) F 01 (s, t); the absolute risk of exit from the no dementia state in the next 10 years F 0• (s, t) (due to either dementia or death); • for a 70-year-old demented subject: the probability of dying in the next 10 years p 12 (70, 80) or not p 11 (70, 80).
The following calls give life expectancies regarding an 80-year-old female subject who has a primary school diploma based on the transition intensities estimates from respectively the parametric approach and the semi-parametric approach: R> LE.weib <-predict(fit.weib, s = 80, + newdata = data.frame(certif = 1, gender = 0), lifeExpect = TRUE) R> LE.weib Predictions of an irreversible illness-death model with states (0,1,2). The output attributes of the predict function with lifeExpect = TRUE are: • for an 80-year-old dementia-free subject: the life expectancy in state 0 (healthy life expectancy); the life expectancy; • for an 80-year-old demented subject: the life expectancy.
The confidence intervals calculation using the simulation-based method may take time, especially using the splines estimates of the transition intensities. To suppress this calculation, the CI argument must be set to FALSE (see above). Note that to reduce the computation time of the confidence intervals, the number of simulations is 200 by default but, to improve precision, it can be modified using the nsim argument.

Discussion
In this article, we have described the methods implemented in the R package SmoothHazard for fitting illness-death models to interval-censored transition times to the intermediate state and exact transition times to the absorbing state. Note that the package also fits simple survival models (two-state models) and also models where the transition times are rightcensored but not interval-censored.
We have also explained and illustrated the actual use of SmoothHazard and note that several extensions are in the development phase. One extension is a model which assumes equality or proportionality between the two transition intensities α 02 and α 12 . Another is a model which assumes the same covariate effects for these transition intensities. At the moment, the illness-death model implemented in the package assumes a Markov process. However, several implementations are currently tested of semi-Markov models which allow that the transition intensity α 12 depends on both the current time and the time spent in the intermediate state.
We also plan to implement regression models for interval-censored observations of other multistate models, for example the progressive disease model. The development version of the package is available at https://github.com/tagteam/SmoothHazard.