Building a Nomogram for Survey-Weighted Cox Models Using R

Nomograms have become very useful tools among clinicians as they provide individualized predictions based on the characteristics of the patient. For complex design survey data with survival outcome, Binder (1992) proposed methods for ﬁtting survey-weighted Cox models, but to the best of our knowledge there is no available software to build a nomogram based on such models. This paper introduces an R package, SvyNom , to accomplish this goal and illustrates its use on a gastric cancer dataset. Validation and calibration routines are also included


Introduction
A nomogram is a graphical representation of a mathematical model involving several predictors to predict a particular endpoint based on traditional statistical methods such as Cox proportional hazards model for survival data or logistic regression for binary outcome (Kattan 2003a;Iasonos et al. 2008;Shariat et al. 2008, among others). Nomograms have been widely used for cancer prognosis, primarily because they are designed to provide estimates of the probability of an event, such as death or recurrence, which are tailored to the profile of individual patients. For survival data, the underlying model on which the nomogram is based is typically the Cox proportional hazards model which models the relationship between a set of covariates and the hazard function of a particular failure time. The model parameters are estimated using partial likelihood and most statistical software packages implement the Cox model making it a very attractive tool for survival data.
For survey data with a complex sampling design, fitting the standard partial likelihood method which ignores the design of the survey can lead to seriously misleading results (Lin 2000). Binder (1992) proposed a method for fitting the Cox proportional hazards model that takes into account the complex design of the survey sample. He derives weighted estimators for the Cox regression coefficients and their estimates of variance. Although there is available software for building nomograms based on the standard Cox model (see Harrell's rms package, Harrell, Jr 2001;Harrell 2014), to the best of our knowledge there are no available tools to build a nomogram in the context of survey-weighted Cox models. This article introduces the R (R Core Team 2014) package SvyNom (Gönen and Capanu 2015) that addresses this problem. Section 2 describes the Cox model and the survey-weighted Cox models while Section 3 summarizes the use of nomograms and introduces a procedure to build a nomogram for survey-weighted Cox models. Section 4 presents the function available in package SvyNom to accomplish this task and illustrates the method using a gastric cancer dataset. Section 5 concludes with a discussion.

Cox proportional hazards model
This section provides a brief overview of the Cox proportional hazards model in the context of non-survey and survey survival data.

Non-survey survival data
The Cox (1972) proportional hazards model assumes that the hazard function of the failure time T satisfies the relationship h(t|X) = h 0 (t)e β X(t) , where X is a vector of observable, possibly time dependent covariates, h 0 (·) is an unspecified baseline hazard function, and β is the vector-valued unknown regression parameter representing the log hazard ratio. Using similar notation to that of Binder (1992) and Lin (2000), let C denote the censoring time and let T = min(T, C), ∆ = I(T ≤ C) and Y (t) = I( T ≤ t), where I(·) is the indicator function. If { T i , ∆ i , X i (·)}, i = 1, . . . , N is a random sample from the joint distribution of { T , ∆, X(·)}, then β can be estimated by determining B to maximize the partial likelihood function so that where The Cox regression model can be implemented in most statistical software packages. The functions coxph in package survival (Therneau and Grambsch 2000;Therneau 2014) or cph in rms can be used to fit Cox models in R, although only the latter can be used to build a nomogram.

Survey-weighted Cox models
In the context of survey data, the sample is drawn from a finite population via a complex survey design. The partial likelihood function in Equation 2 is no longer suitable for estimating β as ignoring the survey design can result in misleading inferences. Binder (1992) developed a procedure for fitting proportional hazards models from survey data. Specifically, if we assume that a sample of size n is drawn from a survey population of size N using a complex design and denote the sampling weights by w i scaled so that w i = 1, then Binder's procedure estimates β from the estimating equation Binder (1992) also derived a design-based variance for the weighted estimatorB by assuming . . , N as fixed. This procedure can be implemented using the R package survey (Lumley 2004(Lumley , 2014.

Nomograms
Nomograms have become very popular tools among clinicians. A nice step by step guide for building, interpreting, and using nomograms to estimate cancer prognosis or other outcomes can be found in Iasonos et al. (2008). Briefly, nomograms create a simple graphical representation of a statistical predictive model mapping each predictor to a point scale. A clinician can then obtain the predicted probability of the event for a patient by accumulating the total points corresponding to the specific configuration of covariates for that patient. Nomograms have been shown to have high accuracy and discriminating ability for predicting outcomes in patients with cancer (Kattan 2003a,b;Shariat et al. 2006;Sternberg 2006;Chun et al. 2007;Shariat et al. 2008, among others). A nomogram's performance is usually evaluated in terms of discriminative ability and calibration. Discrimination refers to the ability to distinguish high-risk patients from low-risk patients and is commonly quantified via a concordance index which measures the level of concordance between the order of predicted probabilities and the order of the events of interest. One such index is Harrell's c index which, for survival data, is defined as the proportion of all pairs of subjects whose survival time can be ordered such that the subject with the higher predicted survival is the one who survived longer (see Harrell, Jr 2001, page 493). Calibration refers to whether the predicted probabilities agree with the observed probabilities and are usually assessed using calibration plots. To prevent over-fitting, validation methods such as cross-validation, bootstrap validation, or external validation are employed. This ensures that the nomogram will perform well when it is used in a new patient cohort.

Cox regression based nomograms
With independent sampling and time to event outcome, Cox proportional hazards model is the typical statistical model used to construct the nomogram. This can be easily done in R using the commands R> library("rms") R> phmodel <-cph(Surv(time, event)~formula(predictors), x = TRUE, + y = TRUE, surv = TRUE, se.fit = TRUE, time.inc = 24) to fit the Cox model and store the 2-year survival and standard error, followed by a call to the function that builds the nomogram R> mynom <-nomogram(phmodel, fun = surv2y, fun.at = ss, lp = TRUE) where R> surv <-Survival(phmodel) R> surv2y <-function(x) surv(2 * 12, lp = x) contains the 2-year predicted survival probabilities and R> ss <-c(0.05, 0.2, 0.4, 0.6, 0.7, 0.8, 0.9, 0.95, 0.99) indicates the probabilities to be listed on the horizontal axis of the graph (Harrell, Jr 2001

Building the nomogram
In the setting of complex design survey data -to the best of our knowledge -there is no software package available for building a nomogram. In this section we present the key steps in building a nomogram for survey-weighted Cox models. These steps will be further detailed in the next section when the implementation is illustrated on a real dataset. Note that the R package survey is needed for fitting the survey-weighted Cox model.
Step 1. Specify the complex survey design: With survey data, before one can fit the survey-weighted Cox model and then build the nomogram, one has to first specify the sampling design of the survey using the svydesign function of the survey design. Without going into details, for example, in calling dstr <-svydesign(id~1, strata, prob, fpc, data), id~1 indicates no clusters present, strata specifies the different strata, prob supplies the sampling probabilities, while fpc is specified as the total population size in each stratum or as the fraction of the total population that has been sampled. A particular specification is presented in Section 4 for the gastric cancer dataset, along with functions that perform these tasks automatically as part of the SvyNom package.
Note that this section is not intended to provide a comprehensive review of how to specify complex design surveys, but merely as an example to illustrate the methodology for building a nomogram in the setting of survey data. For more detailed information on survey design specification, the reader should consult the documentation of the survey package (Lumley 2014).
Step 2. Fit the survey-weighted Cox model: This is similar to fitting a regular Cox model except that now the survey design is accounted for via the design option.
Step 3. Obtain the linear predictors from the survey-weighted Cox model fitted above: R> pred_lp_cox <-predict(svy.cox.fit) Step 4. Since there is no link between the svycoxph function in the survey package and the nomogram function in the rms package, we have to create this link using the function ols in the rms package. This approximates the model by fitting ordinary least squares to regress the linear predictors on the same predictors used to fit the survey-weighted Cox model (for more details see Harrell, Jr 2001, Section 14.10). Note that the argument sigma = 1 is included to prevent numerical problems resulting from the mean squared error of zero: R> f <-ols(pred_lp_cox~formula(predictors), sigma = 1, x = TRUE, + y = TRUE) Step 5. Build the nomogram (at the desired time point chosen to be 2 years for this example): R> surveyNomogram <-nomogram(f, fun = surv2y, fun.at = ss3, + funlabel = "Prob of 2 year OS", lmgp = 0, lp = TRUE) Details on the computation of the 2-year survival predicted probabilities are provided in the next section.

Bootstrap validation
For each bootstrap sample (constructed by sampling with replacement from the original data) we fit the survey-weighted Cox model following the steps outlined above. We calculate Harrell's c index based on the normalized linear predictors from the model fitted on the bootstrap data and obtain the bias by subtracting the c index for the observed data.

Calibration
Once the predicted survival probabilities at 2 years are sorted, they can be grouped into a specific number of groups (usually 4 or 5 groups) and then the median of the 2-year predicted survival probabilities computed for each of the groups. The calibration graph plots these median estimates versus the 2-year survey weighted Kaplan-Meier estimate (obtained using the svykm function in the survey package) in each of the groups. Points close to the diagonal line indicate good calibration.

Application to gastric cancer data
We have implemented this methodology to build a nomogram that predicts 2-year survival for patients with metastatic gastric cancer (Power, Capanu, Kelsen, and Shah 2011). This study comprises all patients with metastatic gastric/gastroesophageal junction (GEJ) adenocarcinoma who received chemotherapy at Memorial Sloan-Kettering Cancer Center from January 1999 to July 2007. The majority of patients with metastatic gastric cancer die within one year of diagnosis, and fewer than 15 per cent survive for 2 or more years. The goal of this study was to better characterize these patients with exceptional survival. As obtaining the necessary information for all of these patients would require going through their medical records which would have been too time consuming, a random sample was drawn instead. To maximize the population of interest, amongst the cohort of patients meeting eligibility criteria (total of 985 patients), all patients surviving 2 years or longer (total of 132 patients, denoted as group ≥ 24) were included for detailed analysis and approximately 30 per cent of patients surviving less than 2 years were randomly selected (among the remaining 853 patients) for a total of 253 patients (denoted as group < 24). All patients had at least 2-year follow-up. To account for the sampling design we employed the survey-weighted Cox regression model as described in Section 2.2. Inverse probability weights were used. The final regression model underlying the nomogram was chosen based on the clinical and statistical significance of the predictors in univariate survey-weighted Cox models. More details on the statistical analysis are provided in Power et al. (2011). Section 4.1 presents the functions in package SvyNom that one can use for an automated implementation of the method, while Section 4.2 describes the step by step manual implementation.

Generic functions
We now present the general functions, that can automatically perform the steps outlined in Section 3 to create the nomogram, validate it using bootstrap as well as produce the calibration plots. These functions are packaged in the R package, SvyNom, which is available from the Comprehensive R Archive Network (CRAN) at http://CRAN.R-project.org/package= SvyNom. Note that the user needs to provide as input a survey design (as described in Step 1 of Section 3.2). The user also needs to ensure that the dataset does not include any missing values and this can be achieved using the na.omit command in R. Another requirement is to run the datadist option to store the distribution summaries for all potential variables and insure adequate plotting ranges.

R> library("SvyNom")
Function to build the nomogram The function svycox.nomogram which builds the nomogram is invoked as svycox.nomogram (.design, .model, .data, pred.at, fun.lab) and its arguments are described below: .design represents a survey design object; .model indicates a Cox model specification; .data contains the data on which the model is to be fit (is not allowed to contain NAs); pred.at specifies the time point at which the nomogram prediction axis will be drawn; fun.lab designate the label of the prediction axis.
The function returns a nomogram object which can be plotted using the function plot. Note that this nomogram object should be saved as it is required for the subsequent validation and calibration functions. To build the nomogram for the gastric cancer data one would call this function as below R> mynom <-svycox.nomogram(.design = dstr2, .model = + Surv(survival, surv_cens)~ECOG + liver_only + Alb + Hb + Age + + Differentiation + Gt_1_m1site + lymph_only, .data = noNA, pred.at = 24, + fun.lab = "Prob of 2 Yr OS") and plot it using plot(mynom$nomog). The resulting nomogram is displayed in Figure 1.

Function to validate the nomogram
The function svycox.validate validates the nomogram using bootstrap and uses as arguments .boot.index which includes a matrix of bootstrap sample indicators with the number of rows the same as the number of rows in the data and the number of columns being the number of bootstrap samples; .nom contains a nomogram object returned by svycox.nomogram, and .data is the dataset on which the validation will take place. The function prints the estimated optimism and returns the vector of optimism values for each bootstrap sample which can be used to summarize the validation with the measure of choice. Note that generating the bootstrap sample is design dependent and thus we did not make it part of the function svycox.validate. The user has to generate the bootstrap samples consistent with the design used. For example, to validate the nomogram for the gastric cancer data, we first use stratified bootstrap to generate the bootstrap data for the matched case-control design.  the Points axis to determine how many points toward the predicted probability of a 2-year OS that the patient receives for his or her ECOG level. Repeat this process for the other predictors, each time drawing a line straight up to the Points axis. Sum the points achieved for each predictor and locate this sum on the Total Points axis. Draw a line straight down to the Prob of 2 year OS axis to determine the patient's probability of surviving for 2 years. Variables with the greatest discriminatory value are those with the widest point range in the nomogram. For example, an ECOG performance status of 2 (55 points) and baseline Albumin of 2-3 g/dl (55-35 points) provide substantial discrimination for 2-year survival probability versus the alternative of an ECOG performance of 1 (0 points) and a normal serum Albumin of 4 g/dl or higher (<20 points).

Function to check calibration
The function svycox.calibrate uses the arguments .nom which stores a nomogram object from svycox.nomogram, .timept indicating the time point at which calibration will take place (it defaults to the time value of the prediction axis in the nomogram), and .ngroup specifying the number of groups to be formed for validation purposes. The function returns a calibration plot. To obtain the calibration plot in Figure 2 one can use the syntax R> svycox.calibrate(mynom)

Step by step implementation
The key steps in the implementation of this analysis in R follow the outline from Section 3.2 and are provided below to facilitate understanding.

Building the nomogram
First we load the R packages needed: rms, survival, survey and SvyNom.
R> library("rms") R> library("survival") R> library("survey") R> library("SvyNom") Then we load the dataset: R> data("noNA", package = "SvyNom") In this example dataset the observations with missing values are excluded. Alternatively one can impute the missing values, but we will not consider this here. The next step specifies the complex survey design which in this example is a stratified independent sampling design, as the sampling was conducted stratified based on whether the patient survived longer than 24 months or not.
R> dstr2 <-svydesign(id =~1, strata =~group, prob =~inv_weight, + fpc =~ssize, data = noNA) The sampling probabilities specified by prob are equal to 1 for the patients in the group of long term survivors (group ≥ 24) and are all equal to 253/853 for those that survived less than 2 years. The fpc option specifies the total population that has been sampled in each stratum and is equal to 132 in group ≥ 24 and is equal to 853 for patients in group < 24. To fit the survey-weighted Cox model containing the age of the patients, the ECOG status, the Albumin and Hemoglobin levels, the tumor differentiation, the presence of more than one metastatic sites, the presence or absence of metastasis only in the liver, the presence or absence of metastasis only in the lymph nodes and accounting for the stratified independent sampling design we use R> svy.cox.fit <-svycoxph(Surv(survival, surv_cens)~ECOG + liver_only + + Alb + Hb + Age + Differentiation + Gt_1_m1site + lymph_only, x = TRUE, + design = dstr2) We also store the predicted survival for each patient as R> pred_survey_cox <-predict(svy.cox.fit, type = "curve", newdata = noNA) As described in Step 4 of Section 3.2, we approximate the model using the ols function.

Validation of the nomogram
Harrell's c index on the original data is obtained after normalizing the linear predictors from the survey-weighted Cox model on which the nomogram was built on R> lp_normalized <-svy.cox.fit$x %*% as.matrix(svy.cox.fit$coefficients) -+ mean(svy.cox.fit$x %*% as.matrix(svy.cox.fit$coefficients)) R> cindex.orig <-with (noNA,+ Surv(survival,surv_cens) and equals R> cindex.orig [1] 0.7575879 We perform bootstrap validation using 200 bootstrap datasets constructed by sampling with replacement from the original data but in such a way to maintain the same ratio of long term survivors relative to the number of patients surviving for less than 2 years (stratified bootstrap). More precisely, among the long term survivors, we sampled with replacement as many long term survivors as in the observed data (132), and similarly we sampled with replacement 253 patients of the total 253 patients surviving less than 2 years; the two groups together formed the bootstrap sample. We then repeated this process 200 times. As illustrated in Section 4. An unbiased measure of the concordance probability is then obtained by subtracting the optimism estimate from the concordance probability of the original data.

Calibration of the nomogram
As illustrated in Section 4.1, the svycox.calibrate function is the main vehicle for achieving the calibration of the nomogram. Once called, it goes through the following steps to produce a calibration plot. First the data is split into 5 groups (note that the number of groups is chosen by the user). followed by estimation of the lower and upper bounds of the 95 per cent confidence intervals assuming normal approximation for the log of the survival function.

R> .loc
The graph is given in Figure 2 and could also be obtained using function svycox.calibrate from package SvyNom.

Discussion
Nomograms are widely used among clinicians as they provide estimates of the probability of an event, such as death or recurrence, tailored to the profile of individual patients, thus facilitating cancer prognosis. In the presence of complex design survey data with survival outcome, survey-weighted Cox models have been proposed but to the best of our knowledge there is no available software for building nomograms in this context. This article introduces the R package SvyNom to build, validate, and check the calibration of a nomogram for surveyweighted Cox models. The tool is illustrated step by step on a real dataset of gastric cancer and generic functions are also provided. It requires the user to prespecify the complex survey design using existent functions in the survey package. For the validation of the nomogram, the user has to modify the generation of the bootstrap samples to fit the specific survey design. The software is easy to use and can be easily extended to binary endpoints.