QoLR: An R Package for the Longitudinal Analysis of Health-Related Quality of Life in Oncology

Health-related quality of life has become increasingly important in clinical trials over the past two decades. The R package QoLR is a recently developed package for the longitudinal analysis of health-related quality of life in oncology. This package contains the scoring of most of the European Organisation for Research and Treatment of Cancer quality of life questionnaires and some programs to analyze the time to a health-related quality of life score deterioration as a modality of longitudinal analysis in oncology.


Introduction

Health-related quality of life
Health-related quality of life (HRQOL) is a subjective clinical endpoint that has become increasingly important in clinical trials in oncology over the past two decades (Osoba 2011). Although overall survival is still considered the primary objective and the primary endpoint in many studies, most clinical trials now integrate HRQOL as an endpoint in order to investigate the clinical benefit for the patient. It even seems that HRQOL could become the primary endpoint in metastatic settings.

Assessment
HRQOL is mainly studied through validated questionnaires filled out by the patients at several timepoints during the study. In oncology, the European Organisation for Research and Treatment of Cancer (EORTC) has developed a core questionnaire, namely the QLQ-C30, to assess HRQOL in cancer patients with 30 questions (items) and some supplementary modules for disease-specific treatment measurements (Aaronson et al. 1993).
The EORTC HRQOL questionnaires are composed of a set of items with several categories of responses, usually constructed on a Likert scale (e.g., "Not at all"/"a little"/"quite a bit"/"very much", coded 1/2/3/4). These questions allow the estimation of one or more HRQOL dimensions through the calculation of scores. Generally, a score is the weighted sum of the patient's item responses for one dimension and can be calculated if at least half of the items are answered. The scores are often normalized on a 0-100 scale in order to facilitate comparison with one or more supplementary HRQOL questionnaires.
For most of the HRQOL questionnaires, a high score corresponds to a high level of functioning for a functional scale, and to a high presence of symptoms for a symptomatic scale. Conversely, a low score corresponds to a low level of functioning for a functional scale and to a low presence of symptoms for a symptomatic scale. In this way, patients with a high level of HRQOL should have high scores on functional scales and low scores on symptomatic scales. For questionnaires that include an item about global health or global HRQOL (as for the EORTC QLQ-C30), the score obtained is coded like a functional scale.
The method to calculate the scores of the EORTC HRQOL questionnaires is defined in the EORTC scoring manual (Fayers, Aaronson, Bjordal, Curran, and Grønvold 1999). Briefly, all the scores of the EORTC HRQOL questionnaires are obtained with the same procedure. Let $I_1, \ldots, I_n$ be the answers to the $n$ items of the studied dimension.
The first step, common to both functional and symptomatic scales, is the calculation of a raw score (RS), i.e., the mean of the items:
$$RS = \frac{I_1 + I_2 + \cdots + I_n}{n}.$$
If there are some missing items, the denominator equals the number of non-missing items.
If more than half the items are missing, the raw score cannot be calculated and then the dimension score is missing.
The second step is a linear transformation of the scores to a 0-100 scale to obtain a score $S$, where range is the difference between the maximum and the minimum possible answer to an item:
• for functional scales: $$S = \left(1 - \frac{RS - 1}{\text{range}}\right) \times 100$$
• for symptomatic scales: $$S = \frac{RS - 1}{\text{range}} \times 100$$
• for global health status: $$S = \frac{RS - 1}{\text{range}} \times 100$$
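As an illustration, the two scoring steps can be sketched in a few lines of R. This is a minimal sketch and not the QoLR implementation: the helper name score_dimension and its arguments are hypothetical, and it assumes the linear transformation of the EORTC scoring manual, in which the rescaled raw score is reversed for functional scales and kept as is for symptomatic scales and global health status.

```r
# Hypothetical helper (not part of QoLR): score one dimension for one patient.
# items:      numeric vector of item answers, NA for a missing item
# item_range: maximum possible answer minus minimum possible answer
#             (3 for items coded 1 to 4, 6 for items coded 1 to 7)
# type:       "functional", "symptom" or "global"
score_dimension <- function(items, item_range,
                            type = c("functional", "symptom", "global")) {
  type <- match.arg(type)
  n_answered <- sum(!is.na(items))
  # The score is missing when more than half of the items are missing
  if (n_answered < length(items) / 2) return(NA_real_)
  rs <- mean(items, na.rm = TRUE)   # raw score: mean of the answered items
  scaled <- (rs - 1) / item_range   # raw score rescaled to the 0-1 interval
  if (type == "functional") (1 - scaled) * 100 else scaled * 100
}

# Five-item functional scale with one missing answer
score_dimension(c(1, 2, NA, 2, 1), item_range = 3, type = "functional")
# Two-item global health status scale coded 1 to 7
score_dimension(c(7, 6), item_range = 6, type = "global")
```

The missing-item rule is applied before the mean is taken, so a dimension with too few answers returns NA rather than a score based on one or two items.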

Longitudinal analysis
The longitudinal analysis of HRQOL is a major challenge due to different parameters that have to be taken into account in the analysis.
Firstly, the longitudinal assessment of HRQOL may be compromised by the presence of missing data. Two types of missing data can occur: intermittent missing data (e.g., if a patient forgot to complete one or more questions at one measurement time) or monotone missing data (e.g., if a patient dropped out before the end of the study). The presence of missing data can be informative or non-informative of the patient's health status. For example, if a patient dropped out of the study due to disease progression, monotone missing data occur and can provide information regarding the patient's HRQOL level: the patient's HRQOL level is likely to have decreased. In that case, the missing data are missing not at random. Regarding intermittent missing data, in some circumstances, we can suppose that the patient's HRQOL level remains unchanged between two available measures (missing at random or completely at random).
Secondly, self-assessment of HRQOL is subjective, i.e., it is dependent on the patient's internal standards and definition of HRQOL (Wiklund 2004; Bullinger 2002; Ubel, Peeters, and Smith 2010). However, patients' adaptation to treatment toxicity and their acceptance of the disease may mean that they do not necessarily assess their HRQOL with the same criteria at all measurement times. These changes can be reflected by a response shift effect (Gibbons 1999; Sprangers and Schwartz 1999; Korfage, de Koning, and Essink-Bot 2007) and can bias the interpretation of longitudinal HRQOL analysis if not taken into account in the study design or at least in the analysis method (Ahmed, Mayo, Corbiere, Wood-Dauphinee, Hanley, and Cohen 2005).
Response shift is defined as "a change in the meaning of one's self-evaluation of a target construct as a result of: (a) a change in the respondent's internal standards of measurement (i.e., scale recalibration); (b) a change in the respondent's values (i.e., the importance of component domains constituting the target construct) or (c) a redefinition of the target construct (i.e., reconceptualization)" (Sprangers and Schwartz 1999). In this way, the choice of the reference level of HRQOL used to qualify a change as a deterioration can be a major concern: the baseline score is not systematically the reference score.
Finally, the longitudinal analysis of HRQOL involves the question of the minimal clinically important difference (MCID). Indeed, there is a difference between the notion of "statistically significant" and "clinically significant". The MCID was defined by Jaeschke, Singer, and Guyatt (1989) as the smallest change/difference in HRQOL scores perceived as clinically important. Beyond statistical significance alone, the MCID is a key component that makes it possible to judge the clinical meaning of the results. Terwee et al. (2010) have reported that many methods have been proposed to determine the MCID, but no standard has been established at this time. These methods are generally grouped in two categories: anchor-based and distribution-based methods. The anchor-based methods focus on the relation between the HRQOL scores and an external criterion that has clinical relevance (Lydick and Epstein 1993). This anchor can be a criterion of disease progression, for example, but can also be a patient's self-assessment of the global HRQOL change (Jaeschke et al. 1989). Distribution-based methods are based on the distribution of the HRQOL scores in the observed sample. For example, the MCID can then be defined as half a standard deviation (Norman, Sloan, and Wyrwich 2003) or the standardized response mean (Wyrwich, Nienaber, Tierney, and Wolinsky 1999a; Wyrwich, Tierney, and Wolinsky 1999b).
For the EORTC HRQOL questionnaires, Osoba, Rodrigues, Myles, Zee, and Pater (1998) have shown that a change of 5 to 10 points is perceived as small, a change between 10 and 20 points as moderate, and a change of more than 20 points as very large. These thresholds are used for illustration purposes but they are not a gold standard. Moreover, the MCID can also be questionnaire-specific. More recent and scale-specific guidelines are available for clinical significance for the EORTC QLQ-C30 (Cocks, King, Velikova, St-James, Fayers, and Brown 2011).
Due to the complexity of the longitudinal assessment of HRQOL, there is still no gold standard to analyze HRQOL over time in oncology. Moreover, another challenge of statistical methods for analyzing longitudinal HRQOL is to propose meaningful results for the clinicians. There is a need to develop statistical methods adapted for the decision-makers. Longitudinal results should be translatable into information that decision-makers find understandable and compelling. Thus, different methods to analyze longitudinal HRQOL have been proposed (Pan, Chen, Chung, Wang, Chen, and Hsiung 2012; Hunger, Döring, and Holle 2012; Penar-Zadarko, Binkowska-Bury, Wolan, Gawelko, and Urbanski 2012; Cnaan, Laird, and Slasor 1997). The most widely used is the linear mixed model (Diggle 1988). Survival analysis approaches, like the time to deterioration in a HRQOL score, have recently been proposed as a modality of longitudinal HRQOL analysis in cancer patients (Bonnetain et al. 2010; Hamidou et al. 2011).
The linear mixed model is optimal for a study design with 2 to 5 measurement times (Fairclough 2010). In this model, time is considered as a categorical variable. Moreover, these models are only suited to studies in which HRQOL assessments are performed in set periods with little amplitude within patients. In case of a missing not at random profile for missing data, a pattern mixture model should be applied in order to produce robust results (Pauler, McCoy, and Moinpour 2003). However, these models are rarely applied, mainly because of the complexity of the construction of the patterns. Moreover, at this time, these models do not deal with the occurrence of a response shift effect. Finally, these models cannot provide results that are easy to understand for clinicians who are not familiar with the beta change and the mixed model approach.
Contrary to the linear mixed model, "time to deterioration" (TTD) models can propose clinically meaningful results with hazard ratios and log-rank tests (Bonnetain et al. 2010; Hamidou et al. 2011). Moreover, these models can handle the presence of missing data: when intermittent missing data occur for one patient, we can consider that the patient's HRQOL level remains unchanged since the previous available measure, whereas monotone missing data can be due to a deterioration of the patient's health status in an advanced/metastatic setting. Finally, these models can take into account the occurrence of the recalibration component of the response shift effect by choosing different scores as the reference score to qualify the deterioration (Anota et al. 2013).
The aim of this paper is to present the R (R Core Team 2017) package QoLR, which is available from the Comprehensive R Archive Network at https://CRAN.R-project.org/package=QoLR. This package makes it possible to calculate the scores of the EORTC QLQ-C30 and most of its modules and to determine the time to deterioration in a HRQOL score as a modality of longitudinal analysis.

Time to deterioration definitions
To date, several definitions of TTD in a HRQOL score have been proposed depending on event definition and censoring rules. The event can be defined according to the reference score, the MCID, the handling of missing scores, and whether death is included or not. The most intuitive definition is the time from inclusion in the study to a first deterioration of the score with a MCID of at least k points as compared to the baseline score (Hamidou et al. 2011). Patients with no deterioration before drop-out from the study are censored at the time of last HRQOL questionnaire completion or last follow-up. According to the construction of the score, deterioration corresponds to an increase (e.g., for symptomatic scales of the EORTC questionnaires) or a decrease (e.g., for functional scales of the EORTC questionnaires) of the score. Between two available HRQOL scores, the level of HRQOL is supposed to be constant.
The notion of "deterioration" requires a reference score. Generally, the reference score is the baseline score (before randomization or at inclusion in the absence of randomization). However, in order to take into account the occurrence of a response shift effect, the reference can also be:
• the best level of HRQOL already experienced by the patient (i.e., the best previous score);
• or the previous HRQOL score for the patient (i.e., the immediately preceding score).
The value of these scores can change over time according to the patient's experience of treatment and disease progression. Using these changing references instead of the baseline measurement could be considered as an alternative way to take into account the occurrence of the recalibration component of the response shift effect.
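To make the three possible reference scores concrete, the following minimal R sketch computes, for one patient, the reference score that would apply at each assessment. The helper reference_score is hypothetical and not part of QoLR. It assumes a functional scale (the best score is the highest one) and skips missing scores when looking for the best or the immediately preceding score.

```r
# Hypothetical helper (not part of QoLR).
# scores: HRQOL scores of one patient in chronological order, NA when missing
# ref:    "baseline", "best" (best previous score) or "previous"
#         (immediately preceding available score)
reference_score <- function(scores, ref = c("baseline", "best", "previous")) {
  ref <- match.arg(ref)
  out <- rep(NA_real_, length(scores))   # no reference at the first assessment
  for (i in seq_along(scores)[-1]) {
    past <- scores[seq_len(i - 1)]
    past <- past[!is.na(past)]
    if (length(past) == 0) next          # no earlier score available yet
    out[i] <- switch(ref,
      baseline = scores[1],
      best     = max(past),              # functional scale: higher is better
      previous = past[length(past)])
  }
  out
}

x <- c(60, 70, NA, 55, 65)
reference_score(x, "baseline")
reference_score(x, "best")
reference_score(x, "previous")
```

For a symptomatic scale, where a lower score is better, max() would be replaced by min() in this sketch.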
Furthermore, for event definition, we can consider deterioration as definitive (i.e., absorbing state) or not depending on the therapeutic setting. This induces two concepts: the TTD and the time until definitive deterioration (TUDD). Several definitions of TUDD have been proposed, according to the notion of "definitive deterioration". The TUDD has been defined as:
1. the time from inclusion in the study to a first deterioration of at least k points as compared to the reference score, (a) with no further improvement of more than k points as compared to the reference score, (b) or if the time of the deterioration corresponds to the last observed HRQOL score, i.e., the patient dropped out (no more HRQOL data available) just after the deterioration was observed, resulting in missing data (Bonnetain et al. 2010).
2. the time from inclusion in the study to a first deterioration of at least k points as compared to the reference score, (a) thereafter maintaining this deterioration of at least k points for all following scores, i.e., the deterioration is observed for all the scores following the first deterioration, (b) or if the patient dropped out just after the deterioration was observed, resulting in missing data.
3. the time from inclusion in the study to a first deterioration of at least k points as compared to the reference score, (a) with no further improvement of more than k points as compared to the score qualifying the deterioration (i.e., the score at the time of the first deterioration observed), (b) or if the patient dropped out just after the deterioration was observed, resulting in missing data.
To illustrate this, the following conditions correspond to the three definitions of TUDD for a score $X$ with a $k$-point MCID, where the deterioration is observed at time $i$ as compared to the reference score $X_{ref}$, assuming that $X$ represents a functional scale (i.e., a deterioration is observed when the score decreases). In all three definitions, the deterioration observed at time $i$ satisfies $X_i \le X_{ref} - k$, and it is definitive if, for every later time $j > i$ with an available score:
1. $X_j \le X_{ref} + k$ (definition 1);
2. $X_j \le X_{ref} - k$ (definition 2);
3. $X_j \le X_i + k$ (definition 3).
For example, if a patient has a reference score equal to 60 points for a functional score ($X_{ref} = 60$), then a deterioration with a 5-point MCID as compared to the reference score is observed at time $i$ if $X_i \le 55$ ($X_i \le X_{ref} - 5$). For example, let $X_i$ equal 40. Then:
1. if at a time $j$ ($j > i$), $X_j = 66$ or more, then the deterioration observed at time $i$ is not definitive according to the first definition of TUDD;
2. if at a time $j$ ($j > i$), $X_j = 56$ or more, then the deterioration observed at time $i$ is not definitive according to the second definition of TUDD;
3. if at a time $j$ ($j > i$), $X_j = 46$ or more, then the deterioration observed at time $i$ is not definitive according to the last definition of TUDD.
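The worked example with a reference score of 60, a deteriorated score of 40 and a 5-point MCID can be checked with a short R sketch. The helper is_definitive is hypothetical and not taken from QoLR. It assumes a functional scale, assumes that the score at position i already qualifies as a deterioration, and treats a deterioration followed only by missing data as definitive.

```r
# Hypothetical helper (not part of QoLR).
# x:     scores of one patient in chronological order (NA when missing)
# i:     position of the first observed deterioration
# x_ref: reference score, k: MCID, def: TUDD definition (1, 2 or 3)
is_definitive <- function(x, i, x_ref, k, def = 1) {
  later <- x[seq_along(x) > i]
  later <- later[!is.na(later)]
  # Drop-out just after the deterioration: definitive under all definitions
  if (length(later) == 0) return(TRUE)
  switch(as.character(def),
    "1" = all(later - x_ref <= k),    # no improvement of more than k vs reference
    "2" = all(later - x_ref <= -k),   # deterioration maintained at all later scores
    "3" = all(later - x[i] <= k))     # no improvement of more than k vs deteriorated score
}

# Reference score 60, deterioration to 40 at the second assessment, MCID of 5
is_definitive(c(60, 40, 66), i = 2, x_ref = 60, k = 5, def = 1)
is_definitive(c(60, 40, 56), i = 2, x_ref = 60, k = 5, def = 2)
is_definitive(c(60, 40, 46), i = 2, x_ref = 60, k = 5, def = 3)
```

The three calls correspond to the three thresholds of the example (66, 56 and 46), each of which makes the deterioration non-definitive under the matching definition.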
Table 1 indicates the event definition according to the definition of TTD or TUDD retained.
The time until definitive HRQOL score deterioration is mainly applied in advanced or palliative settings. All-cause death can also be considered as an event if the patient did not experience deterioration before death. In this way, TUDD or death could be redefined as "HRQOL deterioration-free survival".
Table 1: Event definitions at time $T_i$ according to the definition of time to deterioration of score $X$ as compared to the reference score $X_{ref}$.
Patients with no score available are excluded from the time-to-deterioration analyses. Patients with no baseline score are usually censored at baseline and those with no follow-up scores but with a baseline score are censored one day after baseline. As for other analyses of HRQOL, in the TTD analyses, some sensitivity analyses could be performed:
• Considering patients with no baseline score as events;
• Considering patients with no follow-up score as events;
• Varying the MCID.
For example, let us choose a MCID of 10 points instead of the 5 points initially fixed. Since 5 < 10, any deterioration of at least 10 points is also a deterioration of at least 5 points, so that P(TUDD ≥ 5) ≥ P(TUDD ≥ 10), i.e., the probability that a patient presented a definitive deterioration with a 5-point MCID is greater than or equal to the probability that a patient presented a definitive deterioration with a 10-point MCID.
Regarding the TUDD, if a patient experienced a definitive deterioration with a 10-point MCID but not with a 5-point MCID for any one of the proposed definitions, then this patient must be considered to have also presented a definitive deterioration with a 5-point MCID, and the event time will be the time of the 10-point deterioration. Indeed, for patients experiencing a TUDD with both a 5-point MCID and a 10-point MCID, the TUDD with a 5-point MCID must be the time of the first deterioration observed (5-point or 10-point MCID). In this way, if the deterioration with a 10-point MCID occurs first, we have to impose that the time of the 5-point deterioration equals the time of the 10-point MCID deterioration.
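The constraint between MCID values can be expressed as a running minimum. The sketch below is only an illustration under stated assumptions: the helper enforce_mcid_nesting is hypothetical, event times are ordered by increasing MCID, and Inf stands for "no definitive deterioration observed with this MCID".

```r
# Hypothetical helper (not part of QoLR).
# times: times of definitive deterioration for one patient, ordered by
#        increasing MCID (e.g., 5 then 10 points); Inf = no event observed
enforce_mcid_nesting <- function(times) {
  # The event time for a smaller MCID can never be later than the event
  # time for a larger MCID: take the running minimum from the largest MCID
  rev(cummin(rev(times)))
}

# No 5-point TUDD found, 10-point TUDD at day 100: both are set to day 100
enforce_mcid_nesting(c(Inf, 100))
# 5-point TUDD at day 50, 10-point TUDD at day 100: nothing changes
enforce_mcid_nesting(c(50, 100))
```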
Figure 1 summarizes the different proposed definitions of TTD and TUDD. To have a valid observation, we need a date of HRQOL measure and a HRQOL score. When the reference score is the best previous score or the immediately preceding score, patients with no baseline score but with a post-baseline score are kept in the TTD analyses if they have at least one follow-up score available after the reference score. They are not censored at baseline. The first reference score is the first available score.
As TTD analyses belong to survival analyses, the TTD estimation can be calculated using the Kaplan-Meier or actuarial method and described using the median and 95% confidence interval (CI). The Kaplan-Meier method is based on the intuitive idea that to be alive at time T, one has to be alive just before time T and not die at time T (Goel, Khanna, and Kishore 2010). Contrary to the Kaplan-Meier method, in the actuarial method probabilities are estimated for fixed time intervals, not determined by the date of observed death. Both methods can handle the presence of censored data, i.e., patients still alive at the end of the study.
In time to deterioration analyses, the event is "HRQOL score deterioration".
The Kaplan-Meier estimation of the probability of remaining free of deterioration up to time $t$ is given by the following formula:
$$\hat{S}(t) = \prod_{i:\, t_i \le t} \frac{n_i - m_i}{n_i}, \qquad n_i = n_{i-1} - m_{i-1} - c_{i-1},$$
where
• $n_i$ is the number of subjects at risk at time $i$, i.e., the number of patients still in the study and who do not present a deterioration until time $i - 1$;
• $m_i$ is the number of events observed at time $i$, i.e., the number of patients experiencing a HRQOL score deterioration at time $i$;
• $c_i$ is the number of censored patients at time $i$, i.e., the number of patients who drop out at time $i$ and who did not experience a HRQOL deterioration before.
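As a hedged illustration, the Kaplan-Meier estimate can be obtained with survfit() from the survival package and compared with the product of the ratios (n_i - m_i) / n_i computed by hand. The six times and event indicators below are invented for the example and do not come from the paper.

```r
library(survival)

# Invented toy data: time to deterioration or censoring, and event indicator
time  <- c(2, 3, 3, 5, 6, 8)
event <- c(1, 1, 0, 1, 0, 1)

# Kaplan-Meier estimate from the survival package
fit <- survfit(Surv(time, event) ~ 1)
km_package <- summary(fit)$surv          # estimate at each event time

# Same estimate by hand
event_times <- sort(unique(time[event == 1]))
n_i <- sapply(event_times, function(t) sum(time >= t))               # at risk
m_i <- sapply(event_times, function(t) sum(time == t & event == 1))  # events
km_manual <- cumprod((n_i - m_i) / n_i)

cbind(time = event_times, n_i, m_i, km_manual, km_package)
```

A patient censored at the same time as an event (here at time 3) is still counted as at risk at that time, which is the usual Kaplan-Meier convention.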
TTD can then be compared according to treatment arm in case of randomized clinical trials using the log-rank test and univariate Cox analyses to produce hazard ratios with 95% CI. Multivariate Cox regression can be applied to identify independent factors associated with TTD.
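The log-rank test is a non-parametric test to compare the survival distributions of two samples A and B. The null hypothesis $H_0$ is that the survival distributions in both groups A and B are equal, i.e., the expected probability of the event at time $i$ is the same in both groups. Under this null hypothesis, the theoretical probability of the event at time $i$ is:
$$p_i = \frac{E_{Ai} + E_{Bi}}{E_{Ai} + E_{Bi} + V_{Ai} + V_{Bi}},$$
where:
• $E_{Ai}$ is the number of observed events in group A at time $i$;
• $E_{Bi}$ is the number of observed events in group B at time $i$;
• $V_{Ai}$ is the number of patients in group A who do not present the event at time $i$;
• $V_{Bi}$ is the number of patients in group B who do not present the event at time $i$.
Then the log-rank statistic equals:
$$\chi^2_{exp} = \frac{(E_A - T_A)^2}{T_A} + \frac{(E_B - T_B)^2}{T_B},$$
where $E_A = \sum_i E_{Ai}$ is the total number of events observed in group A, $T_A = \sum_i p_i (E_{Ai} + V_{Ai})$ is the number of events expected in group A under $H_0$, and $E_B$ and $T_B$ are defined in the same way for group B. Under the null hypothesis, $\chi^2_{exp}$ is distributed according to a $\chi^2$ distribution with one degree of freedom.
Table 2: HRQOL scores obtained at 5 timepoints $T_1$ to $T_5$ for 10 patients and time events according to each definition of TTD and TUDD (1: definition 1; 2: definition 2; 3: definition 3) with a 5-point MCID.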
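The Cox regression model links the instantaneous risk of event $\lambda$ at time $t$ with the covariates $X_1, \ldots, X_n$ as follows:
$$\lambda(t \mid X_1, \ldots, X_n) = \lambda_0(t) \exp(\beta_1 X_1 + \cdots + \beta_n X_n),$$
where $\lambda_0(t)$ is the baseline risk, i.e., the instantaneous risk of event at time $t$ when all covariates are equal to 0, and $\beta_1, \ldots, \beta_n$ are the regression coefficients.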
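In practice, both the log-rank test and the univariate Cox model are available in the survival package through survdiff() and coxph(). The following sketch uses an invented two-arm dataset (the data frame ttd and its values are assumptions made for the example) to show how a hazard ratio with its 95% CI and a log-rank p value could be obtained.

```r
library(survival)

# Invented toy data: six patients per treatment arm
ttd <- data.frame(
  time  = c(2, 3, 3, 5, 6, 8, 1, 2, 2, 4, 4, 7),
  event = c(1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1),
  arm   = rep(c(0, 1), each = 6)
)

# Log-rank test comparing the two arms (one degree of freedom)
lr <- survdiff(Surv(time, event) ~ arm, data = ttd)
p_value <- 1 - pchisq(lr$chisq, df = 1)

# Univariate Cox model: hazard ratio of arm 1 versus arm 0 with 95% CI
cox <- coxph(Surv(time, event) ~ arm, data = ttd)
hr <- exp(coef(cox))
ci <- exp(confint(cox))

c(chisq = lr$chisq, p = p_value, HR = unname(hr), lower = ci[1], upper = ci[2])
```

Note that survdiff() uses the hypergeometric variance of the observed minus expected events, so its statistic may differ slightly from a hand computation based on the sum of (observed - expected)^2 / expected.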

Illustrations of the TTD and TUDD definitions
Event definition and censoring rules depend on the definition of deterioration used, as illustrated in both Figure 1 and Table 1. Table 2 summarizes several situations for patients:
• the variable id is the patient's identification number;
• $T_1$ to $T_5$ correspond to five HRQOL assessments;
• a deterioration corresponds to a score decrease.
There are some missing scores for several measures, illustrating the issue of intermittent or monotone missing data in HRQOL studies.
Table 3: Time to deterioration compared to the baseline score with a 5-point MCID for the patients of Table 2 and sensitivity analysis considering patients with no baseline score or with no follow up score as events.
Patients with no deterioration are censored at the last available HRQOL measure. For example, for the TTD as compared to the baseline score, patients 4 and 6 do not present a deterioration and are censored at $T_5$ and $T_4$ respectively. Patient 2 presents a deterioration as compared to the baseline score at $T_3$, but not definitive as compared to the score qualifying the deterioration (TUDD.3). Patient 9 presents a deterioration as compared to the best score at $T_3$: at $T_3$, the previous best score equals 62 and the difference between this score and the score at $T_3$ is 6, i.e., greater than the MCID.
However, the HRQOL level of the patient goes up to 61 and then the deterioration is not definitive as compared to the deteriorated score 56. At the last HRQOL assessment, the HRQOL score equals 57 and the previous best score is still 62. In this way, the deterioration observed at $T_3$ is definitive as compared to the deteriorated score. We recall that if a deterioration is followed by missing data (patient dropped out), the deterioration is definitive, whatever the definition of TUDD retained.
Two variables have then been created:
• a dummy variable event indicating if the patient has deteriorated (event = 1) or not (event = 0);
• a variable time equal to the time between the baseline date and the date of deterioration or censoring.
Table 3 below illustrates these two variables for the 10 patients of Table 2 and one definition of time to deterioration.
To illustrate, patient 1 is in deterioration as compared to the baseline score (event = 1) and the variable time equals the time between the baseline date and the date of the deterioration.
Patient 8 has no baseline score:
• in the primary analysis, this patient is censored at baseline (event = 0);
• in the sensitivity analysis, this patient is in deterioration at baseline (event = 1).
In both cases, the time to deterioration equals: time = $T_1 - T_1 = 0$.
In the same way, patient 10 has only a baseline score, and no follow-up score:
• in the primary analysis, this patient is censored one day after baseline (event = 0);
• in the sensitivity analysis, this patient is in deterioration one day after baseline (event = 1).
In both cases, the time to deterioration equals: time = $T_1 + 1 - T_1 = 1$.
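These rules can be summarized in a minimal R sketch for the simplest definition (first deterioration as compared to the baseline score, functional scale). The helper ttd_baseline is hypothetical and is not the QoLR implementation. Its arguments no_baseline and no_follow only mimic the primary and sensitivity analyses described above, and time is returned in days.

```r
# Hypothetical helper (not part of QoLR), for one patient and a functional scale.
# dates:  Julian dates of the assessments, baseline first
# scores: HRQOL scores at these dates (NA when missing)
ttd_baseline <- function(dates, scores, mcid = 5,
                         no_baseline = "censored", no_follow = "censored") {
  if (is.na(scores[1])) {
    # No baseline score: censored (or event) at baseline, time = 0
    return(c(time = 0, event = as.numeric(no_baseline == "event")))
  }
  follow <- which(!is.na(scores[-1])) + 1
  if (length(follow) == 0) {
    # No follow-up score: censored (or event) one day after baseline
    return(c(time = 1, event = as.numeric(no_follow == "event")))
  }
  deteriorated <- follow[scores[follow] <= scores[1] - mcid]
  if (length(deteriorated) > 0) {
    c(time = dates[deteriorated[1]] - dates[1], event = 1)
  } else {
    # No deterioration: censored at the last available HRQOL measure
    c(time = dates[max(follow)] - dates[1], event = 0)
  }
}

dates <- c(0, 50, 100, 150)
ttd_baseline(dates, c(60, 58, 54, 70))                        # deterioration
ttd_baseline(dates, c(NA, 50, 40, 30))                        # no baseline
ttd_baseline(dates, c(60, NA, NA, NA), no_follow = "event")   # no follow-up
```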

QoLR package
The QoLR package was developed to allow the longitudinal analysis of HRQOL according to the time to deterioration approach. QoLR depends on two R packages (survival, Therneau 2017, and zoo, Zeileis and Grothendieck 2005).

Package structure
The QoLR package contains a set of functions to calculate the scores of the EORTC QLQ-C30 and most of its modules and two other programs to determine the time to deterioration in a HRQOL score as a modality of longitudinal analysis, whatever the definition of deterioration used. Other programs make it possible to print all of the results or to output them to a CSV file according to treatment arm and to perform all sensitivity analyses according to one reference score. A last program was created to obtain the TTD curves calculated according to the Kaplan-Meier estimation method, with the option of displaying some information (number of patients at risk, cumulative number of events, hazard ratio and log-rank test if two treatment arms are compared).
For the convenience of the reader, we summarize all the main functions, with arguments and descriptions of our package QoLR in Table 4. Hereafter, we describe some of the package functions in detail.

Function scoring.QLQC30()
The first application of the QoLR package is the estimation of the scores of most of the EORTC HRQOL questionnaires, such as the QLQ-C30 cancer specific questionnaire.
The first argument of the function scoring.QLQC30() and of the other scoring functions is the name of the dataset with the items comprising the answers to the questionnaire (X parameter). The patient's identification number can be specified in the id parameter. A variable identifying the HRQOL assessment number can also be specified in the time parameter in case of a longitudinal HRQOL assessment.
The items must be named q1 to qi for the QLQ-C30 (i = 30), QLQ-C15-PAL (i = 15) and IN-PATSAT32 (i = 32) questionnaires. For all other supplementary modules, items must be named q31 to qi, because these modules have to be administered in conjunction with the QLQ-C30 core questionnaire. For example, items must be named q1 to q30 for the QLQ-C30 and then q31 to q50 for the QLQ-BN20. Moreover, the order of items has to be respected in the dataset. The result is a data frame: each variable corresponds to a HRQOL score. The score names are the same as those used in the EORTC scoring manual (Fayers et al. 1999). If a patient's identification number was specified in the id parameter, then this variable is replicated in the data frame obtained. Moreover, if a variable identifying the HRQOL assessment number was specified in the time parameter, then this variable is also replicated in the data frame obtained.
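Because the scoring relies on the item names and on their order, it can be useful to check a dataset before scoring it. The check below is a hypothetical base R sketch, not a QoLR function: items_in_order and the toy data frame are assumptions made for the example.

```r
# Hypothetical check (not part of QoLR): are the item columns named
# q1, q2, ..., q<n_items> and stored in that order?
items_in_order <- function(X, n_items) {
  item_cols <- grep("^q[0-9]+$", names(X), value = TRUE)
  identical(item_cols, paste0("q", seq_len(n_items)))
}

# Toy QLQ-C30 dataset: 2 patients, 30 items all answered "1"
core <- as.data.frame(matrix(1, nrow = 2, ncol = 30,
                             dimnames = list(NULL, paste0("q", 1:30))))
core <- cbind(Id = 1:2, time = 0, core)

items_in_order(core, n_items = 30)

# Same data with q1 and q2 swapped: the order is no longer respected
swapped <- core[, c("Id", "time", "q2", "q1", paste0("q", 3:30))]
items_in_order(swapped, n_items = 30)
```

For a dataset combining the QLQ-C30 and the QLQ-BN20, the same check would be run with n_items = 50.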

Function TTD()
This function is used to estimate the time to deterioration of a HRQOL score. To apply this function, the dataset must respect a general structure. The dataset X must be in long format with the following variables in this order:
1. Patient's identification number;
2. Variable identifying the HRQOL assessment number;
3. Date of HRQOL measure;
4. HRQOL scores;
5. Other variables like the date of death or the treatment arm.
The dataset must also be sorted by patient identification number and HRQOL measurement time. Dates must be in Julian format (i.e., number of days since a reference time point).
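The sorting and date requirements can be met with base R alone. In the sketch below, the data frame visits and its column names are invented for the example. Dates are converted to a number of days, first since R's origin for the Date class (1970-01-01) and then since each patient's own baseline date.

```r
# Invented toy dataset in long format, deliberately unsorted
visits <- data.frame(
  Id   = c(2, 1, 1, 2),
  time = c(0, 1, 0, 1),
  date = as.Date(c("2015-03-02", "2015-03-20", "2015-01-29", "2015-04-21"))
)

# Sort by patient identification number, then by assessment number
visits <- visits[order(visits$Id, visits$time), ]

# Julian date: number of days since 1970-01-01 (origin of the Date class)
visits$julian <- as.numeric(visits$date)

# Alternative: number of days since each patient's baseline date
visits$days <- visits$julian - ave(visits$julian, visits$Id, FUN = min)

visits
```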
All definitions of TTD presented in this paper are implemented. Table 5 summarizes the arguments of this function and their possible values. According to the definition of deterioration retained, you must specify the following:
• The name of your dataset (X);
• The name of the HRQOL scores studied (score = " ");
• The value of the MCID (MCID = 5, for example);
• The reference score to qualify the deterioration (see Table 5 for the possible values);
• Whether the deterioration corresponds to a decrease (order = 1) or an increase of the score (order = 2);
• Whether patients with no baseline score are excluded (no_baseline = "excluded"), censored (no_baseline = "censored") or are in deterioration (no_baseline = "event") since baseline;
• Whether patients with no follow-up score are censored (no_follow = "censored") or are in deterioration (no_follow = "event") one day after baseline;
• If death is considered as an event, the name of the variable in your dataset X which contains the date of death for patients who died during the study (with missing values for patients still alive at the end of the study).
An option (sensitivity = TRUE) makes it possible to perform all sensitivity analyses available in one application of the TTD() function.
• Sensitivity analysis 1: patients with no baseline score and those with no follow-up score are considered in deterioration.
If, in addition to these parameters, the death argument is set to the variable corresponding to the date of death in your dataset, then four analyses are performed:
• A first analysis censoring patients with no baseline, those with no follow-up and those who died without experiencing deterioration before dying;
• Sensitivity analysis 1: considering patients with no baseline and those with no follow-up score in deterioration;
• Sensitivity analysis 2: considering death as an event;
• Sensitivity analysis 3: considering simultaneously patients with no baseline, those with no follow-up score and death as an event.
The result of this function is a data frame with:
• the patient's identification number;
• a dummy variable called event, equal to 1 if the patient has deteriorated and 0 if the patient is censored;
• a variable called time, equal to the time to censoring or the time to deterioration in months.

Function TUDD()
This function allows the estimation of the time until definitive deterioration according to the definition retained. All definitions of TUDD presented in this paper are implemented. The syntax of this function is nearly the same as for TTD(). The definition of definitive deterioration is selected with the ref.def argument:
• with no further improvement of more than k points as compared to the reference score (ref.def = "def1");
• maintaining this deterioration of at least k points for all following scores, i.e., the deterioration is observed for all the following scores (ref.def = "def2");
• with no further improvement of more than k points as compared to the score qualifying the deterioration (ref.def = "def3").
Moreover, in this function, you can perform a sensitivity analysis according to the MCID: the MCID parameter can therefore be a vector rather than a scalar.

Function plotTTD()
The QoLR package also contains a program called plotTTD() to obtain the TTD curves calculated according to the Kaplan-Meier estimation method for all patients or by treatment arm (only two groups are allowed).The time parameter is a vector equal to the time to deterioration or the time to censoring, and the event parameter is a dummy vector equal to 1 if the patient is deteriorated and 0 if not.
Other information can also be added using options, e.g., at regular time points t for all patients or by treatment arm:
• number of patients at risk (nrisk = T);
• cumulative number of events (nevents = T).
In the case of TTD curves by treatment arm, you must give the name of the group variable in the group parameter and the label of each group as you would like it to print in the group.names parameter. The hazard ratio with 95% confidence interval and log-rank test can also be added on the graph (info = TRUE) at a determined position specified by the user (pos.info = c()).
xlab and ylab correspond to the names of the horizontal and vertical axes respectively. Table 7 summarizes the arguments of this function.
This function makes it possible to plot the TTD curves with additional information useful for researchers to easily obtain standard curves for presentation or scientific publications.
Functions write.TTD() and write.TUDD() These outputs can also be obtained for all definitions of TTD or TUDD with write.TTD() and write.TUDD() respectively, in the QoLR package.These programs create a comma-separated values (CSV) file with the results of the TTD or TUDD analyses performed in one or more scores according to one main deterioration definition.All sensitivity analyses according to this primary definition are also performed, with one or more MCID, for all patients or by group (e.g., treatment arm effect).The results produced are as follows: • the number of patients initially at risk and the number of events; • the median time to deterioration with 95% confidence interval; and in the case of analyses performed by group (only two groups are allowed): • the results of the log-rank test; • the univariate hazard ratio with 95% confidence interval.
The arguments of these functions are almost the same as those of the TTD() and TUDD() functions. An additional parameter gives the name of the file in which to print the results (file = ""). The user can also specify the directory for this file in the same parameter.
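As a rough illustration of the kind of per-group summary that ends up in such a CSV file, the sketch below counts patients and events by arm and writes them with base R's write.csv(). The data frame ttd, its values, and the output file name are invented for the example; the actual layout of the file produced by write.TTD() is not reproduced here.

```r
# Invented TTD-like result: one row per patient
ttd <- data.frame(Id       = 1:6,
                  Arm      = c(0, 0, 0, 1, 1, 1),
                  event.QL = c(1, 0, 1, 1, 1, 0),
                  time.QL  = c(1.4, 4.9, 1.8, 0.9, 2.1, 3.3))

# Number of patients and number of events in each arm
summary_tab <- data.frame(
  Arm      = sort(unique(ttd$Arm)),
  n        = as.vector(table(ttd$Arm)),
  n.events = as.vector(tapply(ttd$event.QL, ttd$Arm, sum))
)

# Write the summary to a CSV file in a temporary directory
out_file <- file.path(tempdir(), "ttd_summary.csv")
write.csv(summary_tab, file = out_file, row.names = FALSE)
```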

Applications of the QoLR package
To apply one or more of these functions, you need to load the QoLR package in the current R session with the following command:
R> library("QoLR")
We shall demonstrate the use of the five main functions of QoLR on one dataset, dataqol, corresponding to a randomized clinical trial with the answers to the 30 items of the QLQ-C30 questionnaire for 40 patients with longitudinal HRQOL assessment. Patients were randomly allocated to one treatment group corresponding to the variable Arm (dichotomous variable equal to 0 or 1).
This dataset is available in the QoLR package and can be loaded into R via the command data("dataqol").
This dataset contains the following information:
• the patient identification number in the Id variable;
• the treatment group to which each patient was allocated, in the Arm variable (dichotomous variable equal to 0 or 1);
• the theoretical HRQOL assessment number in the time variable (equal to 0 for the baseline measure);
• the date of HRQOL assessment in the date variable;
• the answers to the 30 items of the QLQ-C30 in the variables q1 to q30;
• the date of death for patients who died during the study in the death variable (missing for all patients who were still alive at the end of the study).
This dataset is in long format. The variable date corresponds to the date of the HRQOL measure in Julian format. The baseline date was set to 0 and all the following dates correspond to the time between the measure and the baseline date (the baseline date is the date of origin).
HRQOL was evaluated about every 50 days. In the same way, death corresponds to the time between the baseline date and the date of death if the patient died during the study. Moreover, some items and/or measurement times are missing in order to illustrate how missing data are treated in the scoring and then in the time to deterioration analysis.
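The long format described above can be pictured with a small invented data frame. This is a sketch only: the object toy and all its values are hypothetical, a single item (q29) stands in for q1 to q30, and it is not the dataqol dataset shipped with QoLR.

```r
# Invented toy data mimicking the long format (one row per patient and visit)
toy <- data.frame(
  Id    = rep(1:3, each = 3),
  Arm   = rep(c(0, 1, 0), each = 3),
  time  = rep(0:2, times = 3),             # assessment number, 0 = baseline
  date  = rep(c(0, 50, 100), times = 3),   # days since the baseline date
  q29   = c(5, 4, 3, 6, 6, NA, 4, 2, 2),   # one item only; NA = missing answer
  death = rep(c(NA, 130, NA), each = 3)    # days from baseline to death, NA if alive
)

# Quick structural checks: visits per patient and presence of a baseline row
n_per_patient <- table(toy$Id)
has_baseline  <- tapply(toy$time, toy$Id, function(x) any(x == 0))
print(n_per_patient)
print(has_baseline)
```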

Time to HRQOL score deterioration
Time to deterioration. To apply the TTD() function, we first have to modify the score_dataqol data frame so that it matches the format required by the function.
As a reminder, this dataset must contain the following information in the following order:
1. Patient identification number;
2. Variable identifying the HRQOL assessment number with value 0 for the baseline measure;
3. Date of HRQOL measure;
4. HRQOL scores;
5. Other variables, such as the date of death or the treatment arm.
The date of HRQOL assessment as well as the treatment arm and date of death are available in the dataqol dataset. We thus merged the score_dataqol data frame with the relevant variables of the dataqol data frame as follows:
R> info <- dataqol[, c("Id", "time", "date", "death", "Arm")]
R> dataqol_final <- merge(score_dataqol, info, by = c("Id", "time"))
Since HRQOL was measured at several measurement times for each patient, we can study the time to deterioration of the HRQOL scores.
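The merge step, followed by a reordering that puts the date in third position as the required format asks, can be tried on a small stand-in example. The data frames scores and raw and their values are invented; the column reordering shown is one possible base R approach, not necessarily the one used by the authors.

```r
# Invented stand-ins for score_dataqol and dataqol
scores <- data.frame(Id = c(1, 1, 2, 2), time = c(0, 1, 0, 1),
                     QL = c(66.7, 50.0, 83.3, 83.3))
raw <- data.frame(Id = c(1, 1, 2, 2), time = c(0, 1, 0, 1),
                  date = c(0, 48, 0, 52), death = c(NA, NA, 120, 120),
                  Arm = c(0, 0, 1, 1))

# Merge the scores with the date, death and arm variables
merged <- merge(scores, raw[, c("Id", "time", "date", "death", "Arm")],
                by = c("Id", "time"))

# Reorder columns: Id, time, date, then scores, then other variables
first  <- c("Id", "time", "date")
merged <- merged[, c(first, setdiff(names(merged), first))]
print(names(merged))
```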
As a reminder, for global health status (variable QL) and other functional scales (variables PF for physical functioning to SF for social functioning), a high score reflects a high level of HRQOL or functioning. For symptomatic scales (variables FA for fatigue to FI for financial difficulties), a high score reflects a high symptomatic level.
To begin, we can study the reference definition of TTD for the QL score, i.e., a deterioration as compared to the baseline score (ref.init = "baseline") with at least a 5-point MCID (MCID = 5), patients with no baseline or no follow-up measure being censored at baseline or just after baseline (no_baseline = "censored" and no_follow = "censored"). The score QL corresponds to a measure of global HRQOL, so a deterioration corresponds to a decrease of the score (order = 1). Since the values of these parameters are the default values (except for the MCID), we do not need to specify them and the function can be applied as follows:
R> ttd1 <- TTD(dataqol_final, score = "QL", MCID = 5)
R> head(ttd1)
  Id event.QL    time.QL
1  1        1 1.41273101
2  2        0 4.89527721
3  3        1 1.77412731
4  4        0 0.03285421
5  5        0 0.00000000
6  6        1 1.70841889
The result is a data frame with the identification number of patients (Id), the time to deterioration or to censoring in months (time.QL) and a dummy variable (event.QL) indicating whether the patient has deteriorated (event.QL = 1) or not (event.QL = 0). The suffix "QL" corresponds to the name of the score analyzed. According to this definition, 24 of the 40 patients have deteriorated in terms of the QL score:
R> sum(ttd1$event.QL, na.rm = TRUE)
[1] 24
If we want to consider patients with no baseline or no follow-up as events, we have to set the parameters no_baseline and no_follow to "event" as follows:
R> ttd2 <- TTD(dataqol_final, score = "QL", order = 1, MCID = 5,
+    no_baseline = "event", no_follow = "event")
R> head(ttd2)
  Id event.QL    time.QL
1  1        1 1.41273101
2  2        0 4.89527721
3  3        1 1.77412731
4  4        1 0.03285421
5  5        1 0.00000000
6  6        1 1.70841889
R> sum(ttd2$event.QL, na.rm = TRUE)
[1] 33
Nine patients with no baseline or no follow-up measure are thus added to the events.
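The baseline-reference rule can be sketched for a single patient in a few lines of base R. This is a simplified illustration, not the QoLR implementation: the helper ttd_baseline() is hypothetical, it handles only the case where a decrease is a deterioration (order = 1), it censors patients with no baseline or no follow-up at time 0 for simplicity, and the conversion from days to months by dividing by 30.4375 is an assumption.

```r
# Hypothetical helper (not part of QoLR): time to first deterioration of at
# least `mcid` points below the baseline score, for one patient.
ttd_baseline <- function(time, date, score, mcid = 5) {
  base <- score[time == 0]
  # No usable baseline score: censored at time 0 (simplification)
  if (length(base) == 0 || is.na(base)) return(c(event = 0, days = 0))
  follow <- time > 0 & !is.na(score)
  # No follow-up score: censored at time 0 (simplification)
  if (!any(follow)) return(c(event = 0, days = 0))
  o <- order(date[follow])
  d <- date[follow][o]
  s <- score[follow][o]
  worse <- which(base - s >= mcid)        # assessments with a deterioration
  if (length(worse) > 0) {
    c(event = 1, days = d[worse[1]])      # first deterioration
  } else {
    c(event = 0, days = d[length(d)])     # censored at last assessment
  }
}

# Invented patient: deterioration of 8.4 points at the first follow-up
res1 <- ttd_baseline(time = 0:3, date = c(0, 43, 98, 151),
                     score = c(66.7, 58.3, 66.7, 50.0))
# Invented patient: never 5 points below baseline, censored at last visit
res2 <- ttd_baseline(time = 0:2, date = c(0, 50, 100),
                     score = c(50, 50, 48))

print(res1)
print(res1[["days"]] / 30.4375)   # assumed days-to-months conversion
print(res2)
```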
To consider death as an event, we have to specify the value of the death parameter:
R> ttd3 <- TTD(dataqol_final, score = "QL", MCID = 5, death = "death")
R> head(ttd3)
  Id event.QL  time.QL
1  1        1 1.412731
2  2        1 8.903491
3  3        1 1.774127
4  4        1 1.018480
5  5        0 0.000000
6  6        1 1.708419
Finally, we can directly obtain all the sensitivity analyses along with the primary analysis in one call to the function TTD() by specifying sensitivity = TRUE:
R> ttd4 <- TTD(dataqol_final, score = "QL", MCID = 5, death = "death",
+    sensitivity = TRUE)
R> head(ttd4)
When all sensitivity analyses are performed, some additional variables are created. Variables event.QL and time.QL still correspond to the results of the primary analysis (TTD as compared to the baseline score in the present case). The variable event.SA1.QL is the dummy event variable obtained when patients with no baseline or no follow-up measure are considered as deteriorated, whereas they were censored in the primary analysis (Sensitivity Analysis 1).
Since the corresponding times are the same as in the primary analysis, no new time variable was created. Variables event.SA2.QL and time.SA2.QL are the results for the analysis adding death as an event (Sensitivity Analysis 2). Finally, event.SA3.QL corresponds to the event variable with death and patients with no baseline or no follow-up as events (Sensitivity Analysis 3).
[1] 26
In this case, 26 patients experienced a deterioration as compared to the best previous QL score.
The function TTD() can handle several scores simultaneously, functional and/or symptomatic. You must define the names of the scores studied in the score parameter as well as the order to consider (i.e., decrease or increase): order = 1 for global health status or functional scales and order = 2 for symptomatic scales. Variables event and time are then created for each score with the score name as a suffix. The following example applies the TTD() function as compared to the baseline score with a 5-point MCID for QL, PF (with order = 1 for both scores) and FA (with order = 2):
R> ttd6 <- TTD(dataqol_final, score = c("QL", "PF", "FA"), order = c(1, 1, 2),
+    MCID = 5)
R> head(ttd6)
Time until definitive deterioration. The TUDD is studied with the TUDD() function, which is quite similar to the TTD() function. By default, the deterioration is defined as a deterioration with a k-point MCID as compared with the baseline score, with no further improvement of more than k points as compared to the baseline score (Bonnetain et al. 2010). The result of this function is fairly similar to that of TTD(). However, for TUDD(), the value of the MCID is also specified in the names of the variables time and event.
The deterioration can also be definitive as compared to the deterioration observed, i.e., with no further improvement of 5-point MCID as compared to the score obtained at the time of the first deterioration. This definition is applied by setting the parameter ref. As for the TTD, all sensitivity analyses can be performed simultaneously with the primary definition of TUDD. Moreover, several MCIDs can be specified. In fact, as defined in Section 2, we need a dependence between the sensitivity analyses varying the MCID for the TUDD. An indicator of the MCID value is added as a suffix of the resulting variables event and time.
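The difference between a first deterioration and a definitive one can be illustrated with a short base R sketch. It rests on one reading of the default definition, stated here as an assumption: a deterioration is taken as definitive from the first follow-up assessment at which the score is at least k points below baseline and remains so at every later observed assessment. The helper tudd_baseline() is hypothetical, expects follow-up scores sorted by date with no missing values, and is not the QoLR implementation.

```r
# Hypothetical helper (not part of QoLR), for a score where a decrease is a
# deterioration. Follow-up vectors must be sorted by date and contain no NA.
tudd_baseline <- function(score_base, follow_scores, follow_dates, mcid = 5) {
  worse <- (score_base - follow_scores) >= mcid
  n <- length(worse)
  # definitive[i] is TRUE when assessments i, i + 1, ..., n are all deteriorated
  definitive <- rev(cumprod(rev(worse)) == 1)
  idx <- which(definitive)
  if (length(idx) > 0) {
    c(event = 1, days = follow_dates[idx[1]])
  } else {
    c(event = 0, days = follow_dates[n])   # censored at last assessment
  }
}

# Invented patient: deteriorates at day 50, recovers at day 100,
# then deteriorates again from day 150 onwards
res_def <- tudd_baseline(score_base = 70,
                         follow_scores = c(60, 70, 62, 58),
                         follow_dates  = c(50, 100, 150, 200))
print(res_def)
```

With these invented values the first deterioration occurs at day 50, but it is followed by a recovery, so under the assumed reading the definitive deterioration is dated at day 150.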
Time to deterioration curves. Figure 2 corresponds to the TUDD of the QL score as compared to the baseline score with a 5-point MCID according to treatment arm (Arm parameter).
In this graph, we printed the number of patients still at risk at each time point according to treatment arm (nrisk = T). Moreover, the result of the log-rank test and the hazard ratio of Arm 2 vs. Arm 1 are also printed (info = T, pos.info = c(6, 0.8)). The hazard ratio (Arm 2 vs. Arm 1) equals 2.86 with 95% confidence interval (1.16-7.09) and the result of the log-rank test is p = 0.018.
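For readers who want to see what the two-group log-rank test computes, here is a self-contained base R sketch of the standard statistic (observed minus expected events in one group, with the hypergeometric variance). The helper logrank_test() and the toy data are hypothetical and are not taken from QoLR or from the trial data; the hazard ratio, which would require a Cox model, is not covered.

```r
# Hypothetical helper (not part of QoLR): two-group log-rank test
logrank_test <- function(time, event, group) {
  g <- sort(unique(group))
  stopifnot(length(g) == 2)
  t_event <- sort(unique(time[event == 1]))
  obs <- 0; expd <- 0; vr <- 0
  for (t in t_event) {
    n  <- sum(time >= t)                        # at risk, both groups
    n1 <- sum(time >= t & group == g[1])        # at risk, first group
    d  <- sum(time == t & event == 1)           # events, both groups
    d1 <- sum(time == t & event == 1 & group == g[1])
    obs  <- obs + d1
    expd <- expd + d * n1 / n
    if (n > 1) {
      vr <- vr + d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    }
  }
  chisq <- (obs - expd)^2 / vr
  c(chisq = chisq, p = pchisq(chisq, df = 1, lower.tail = FALSE))
}

# Invented toy data: all four patients deteriorate, group 0 earlier
lr <- logrank_test(time  = c(1, 2, 3, 4),
                   event = c(1, 1, 1, 1),
                   group = c(0, 0, 1, 1))
print(lr)
```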
Table 8 is an extract of this file for a 5-point MCID.

Conclusion and outlook
The QoLR package is the first R package dedicated to the analysis of HRQOL. The implementation of the time to deterioration definitions for an HRQOL score allows the dissemination of these approaches in order to achieve the goal of standardization of longitudinal HRQOL analysis in oncology clinical trials.
QoLR will be updated as new modules are developed by the EORTC HRQOL group. The package will be completed over time, for example with algorithms simulating longitudinal HRQOL data with intermittent or monotone missing data that are Missing Completely At Random or Missing Not At Random. Other functions will make it possible to print the results of univariate and multivariate Cox analyses in a CSV file. Moreover, an ongoing project is investigating the presence of competing risks between events. The current package will then be expanded by the addition of competing risks models. Finally, the time to deterioration approach can be implemented using SAS software (SAS Institute Inc. 2011) to allow researchers not familiar with R software to apply this longitudinal analysis method.

Figure 1: Flowchart of the choice of definition of time to a health-related quality of life score deterioration.

Figure 2: Kaplan-Meier survival curve for the time until definitive QL score deterioration of at least 5-point MCID as compared to the baseline score.
Table column headers: Reference score X_ref at T_i; Event definition at T_i

Table 2 illustrates different scenarios according to the event definition. This table summarizes, for each definition of TTD and TUDD, the time of the events for patients experiencing a deterioration with a 5-point MCID. In other cases, patients are censored at the time of the

Table 4: Summary of the functions in the QoLR package.

Table 7: Arguments of the plotTTD function.
time QL PF RF EF CF SF FA NV PA DY SL AP CO DI FI date death Arm
Then we reorganized the obtained dataqol_final dataset so that the date variable appeared in third position:
In order to integrate the response shift effect, we can choose the best previous HRQOL score or the immediately preceding score as the reference score by specifying ref.init = "best" or ref.init = "previous" respectively. In the following example, the reference score is the best previous QL score:

Table 8: The CSV file created with the application of the write.TTD function.