plotROC: A Tool for Plotting ROC Curves

Plots of the receiver operating characteristic (ROC) curve are ubiquitous in medical research. Designed to simultaneously display the operating characteristics at every possible value of a continuous diagnostic test, ROC curves are used in oncology to evaluate screening, diagnostic, prognostic and predictive biomarkers. I reviewed a sample of ROC curve plots from the major oncology journals in order to assess current trends in usage and design elements. My review suggests that ROC curve plots are often ineffective as statistical charts and that poor design obscures the relevant information the chart is intended to display. I describe my new R package that was created to address the shortcomings of existing tools. The package has functions to create informative ROC curve plots, with sensible defaults and a simple interface, for use in print or as an interactive web-based plot. A web application was developed to reach a broader audience of scientists who do not use R.


About ROC curves
The receiver operating characteristic (ROC) curve is used to assess the accuracy of a continuous measurement for predicting a binary outcome. In medicine, ROC curves have a long history of use for evaluating diagnostic tests in radiology and general diagnostics. ROC curves originated in the field of signal detection theory.
For a continuous measurement that I denote as M, convention dictates that a test positive is defined as M equaling or exceeding some threshold c: M ≥ c. The goal of ROC analysis is to evaluate the classification accuracy of M in reference to the binary outcome D, which takes possible values 0 (negative) or 1 (positive). It is implicitly assumed that the subpopulation having D = 1 tends to have larger values of M compared to the subpopulation with D = 0. The classification accuracy of M can be evaluated by considering the confusion matrix (Table 1). The confusion matrix cross-classifies the predicted outcome M ≥ c versus the true outcome D. The four cells of the matrix correspond to the possible classification outcomes: a true positive, a false positive, a true negative, and a false negative. ROC analysis assesses the trade-off between the test's fraction of true positives and its fraction of false positives as c varies over the range of M.
Formally, for a fixed cutoff c, the true positive fraction is the probability that the test is positive in the diseased population: TPF(c) = P{M ≥ c | D = 1}, and the false positive fraction is the probability that the test is positive in the healthy population: FPF(c) = P{M ≥ c | D = 0}.
Since the cutoff c is not fixed in advance, one can plot the TPF against the FPF for all possible values of c. This is exactly what the ROC curve is: a plot of FPF(c) on the x axis and TPF(c) on the y axis as c varies. A useless test that is not informative at all about the disease status has TPF(c) = FPF(c) for all c. The ROC plot of a useless test is thus the diagonal line. A perfect test that is completely informative about disease status has TPF(c) = 1 and FPF(c) = 0 for at least one value c. If the assumption that the support of M | D = 1 is greater than that of M | D = 0 holds, then the ROC curve will lie in the upper left quadrant above the diagonal line; however, this may not be the case in a particular sample.
Given a sample of test and disease status pairs, $(M_1, D_1), \ldots, (M_n, D_n)$, one can estimate the ROC curve by computing proportions in the diseased and healthy subgroups separately. Specifically, given a fixed cutoff c, an estimate of TPF(c) is the proportion of test positives among the diseased subgroup:

$$\widehat{TPF}(c) = \frac{\sum_{i=1}^{n} 1\{M_i \geq c\} \cdot 1\{D_i = 1\}}{\sum_{i=1}^{n} 1\{D_i = 1\}},$$

where 1{·} is the indicator function that equals 1 when the condition inside the braces is true and 0 otherwise. An estimate for FPF(c) is the proportion of test positives among the healthy subgroup and is given by a similar expression with $D_i = 1$ replaced with $D_i = 0$. Calculating these proportions for c equal to each unique value of the observed $M_i$ yields what is known as the empirical ROC curve estimate. The empirical estimate is a step function.
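The empirical estimate described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the helper name `empirical_roc` and the toy data are hypothetical and are not part of plotROC.

```r
# Empirical ROC estimate: TPF and FPF at each unique observed marker value.
# `empirical_roc` is a hypothetical helper for illustration only.
empirical_roc <- function(M, D) {
  stopifnot(length(M) == length(D), all(D %in% c(0, 1)))
  cutoffs <- sort(unique(M), decreasing = TRUE)
  # Proportion of test positives (M >= cutoff) within each subgroup
  TPF <- vapply(cutoffs, function(thr) mean(M[D == 1] >= thr), numeric(1))
  FPF <- vapply(cutoffs, function(thr) mean(M[D == 0] >= thr), numeric(1))
  # Prepend the point (0, 0), reached by any cutoff above the largest value
  data.frame(cutoff = c(Inf, cutoffs), FPF = c(0, FPF), TPF = c(0, TPF))
}

# Toy data: six subjects, three diseased (D = 1) and three healthy (D = 0)
M <- c(0.2, 0.9, 1.4, 1.8, 2.5, 3.1)
D <- c(0,   0,   1,   0,   1,   1)
roc <- empirical_roc(M, D)
print(roc)
```

Each row is one step of the step function: lowering the cutoff past an observed value moves the curve up (a diseased subject) or to the right (a healthy subject).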
Other methods exist to estimate the ROC curve, such as the binormal parametric estimate which can be used to get a smooth curve. There are also extensions that allow for estimation with time-to-event outcomes subject to censoring. For a more thorough reference on the methods and theory surrounding ROC curves, interested readers may look into Pepe (2003).
A common way to summarize the value of a test for classifying disease status is to calculate the area under the ROC curve (AUC). The greater the AUC, the more informative the test. The AUC summarizes the complexities of the ROC curve into a single number and therefore is widely used to facilitate comparisons between tests and across populations. It has been criticized for the same reason because it does not fully characterize the trade-offs between false and true positives.
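As a rough illustration of the AUC, the area under a set of ROC points can be approximated with the trapezoidal rule. The helper `trapezoid_auc` below is a hypothetical name used only for this sketch; it is not the calculation used by any particular package.

```r
# Trapezoidal area under ROC points (FPF on the x axis, TPF on the y axis).
# `trapezoid_auc` is a hypothetical helper for illustration only.
trapezoid_auc <- function(FPF, TPF) {
  ord <- order(FPF, TPF)
  x <- FPF[ord]
  y <- TPF[ord]
  # Sum of trapezoid areas between consecutive points
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# A useless test lies on the diagonal; a perfect test passes through (0, 1)
auc_useless <- trapezoid_auc(c(0, 0.5, 1), c(0, 0.5, 1))  # 0.5
auc_perfect <- trapezoid_auc(c(0, 0, 1),   c(0, 1, 1))    # 1
print(c(useless = auc_useless, perfect = auc_perfect))
```

The two extremes show why the AUC is read on a scale from 0.5 (no information) to 1 (complete information), and also why two curves with different shapes can share the same AUC.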

Design considerations
The main purpose of visually displaying the ROC curve is to show the trade-off between the FPF and TPF as the cutoff c varies. This can be useful for aiding viewers in choosing an optimal cutoff for decision making, for comparing a small number of candidate tests, and for generally illustrating the performance of the test as a classifier. In practice, once the FPF and TPF are computed for each unique observed cutoff value, they can be plotted as a simple line chart or scatter plot using standard plotting tools. This often leads to the unfortunate design choice of obscuring the critical and useful third dimension, the range of cutoff values c.
Another key design element is the use of a diagonal guideline for comparison. It allows observers to roughly estimate the area between the diagonal and the estimated ROC curve, which serves as a proxy for the value of the test for classification over that of a useless test. Likewise, grid lines inside the plotting region and carefully selected axes allow for accurate assessment of the TPF and FPF at particular points along the ROC curve. Many medical studies use ROC curves to compare a multitude of candidate tests to each other. In those cases, curves need to be distinguished by using different colors or line types combined with a legend, or direct labels inside the plotting region.
In the medical literature, FPF and TPF are usually referred to in terms of the jargon sensitivity and specificity. Sensitivity is equivalent to the true positive fraction. Specificity is 1 − FPF, the true negative fraction. Sometimes, the FPF and TPF are incorrectly referred to as rates, using the abbreviations FPR and TPR. These are probabilities and their estimates are proportions; therefore, the term fraction is preferred over rate.

Existing plotting software
The ROC curve plot is, at the most basic level, a line graph. Therefore, once the appropriate statistics are estimated, existing plotting functions can be used to create a ROC curve plot. Viewers can identify ROC plots through context, by observing the shape of the line, and through the addition of axis labels, titles, legends, and so on. There are several R packages related to ROC curve estimation that contain dedicated plotting functions. The ROCR package (Sing, Sander, Beerenwinkel, and Lengauer 2005) plots the FPF versus TPF, as usual, and then takes the interesting approach of encoding the cutoff values as a separate color scale along the ROC curve itself. A legend for the color scale is placed along the vertical axis on the right side of the plotting region. The pROC package (Robin, Turck, Hainard, Tiberti, Lisacek, Sanchez, and Müller 2011) provides an option for plotting cutoff labels (print.thres = TRUE) and is mainly focused on estimating confidence intervals and regions for restricted ranges of the ROC curve. The plotting methods therein

Literature review
Anyone giving a cursory look at any of the major medical journals is likely to find at least one ROC curve plot. I sought to assess the usage of ROC curve plots and to evaluate the design choices made in the current oncology literature by conducting a small literature review. I searched PubMed for clinical trials or observational studies in humans reported in major oncology journals for the past 10 years for the terms "ROC Curve" OR "ROC Analysis" OR "Receiver operating characteristic curve". I conducted the search on October 8

Figure 1: Illustration of design choices in plotting ROC curves. Panel A shows a sparse ROC curve, with no design additions inside the plotting region. The plot results in more white space than anything else. It is difficult to accurately determine values without reference lines. Panel B shows a plot comparing 2 curves, with different line types and a legend. AUC values are also given in the legend. Panels C and D add grid lines, diagonal reference lines, and direct labels.

plot manually for the following design features: the number of curves displayed, the type of axis labels (sensitivity/1 − specificity or true/false positive fractions), presence or absence of grid lines, presence or absence of a diagonal guide line, whether any cut points were indicated, the type of curve label (legend or direct label), and presence of other textual annotations such as the AUC. The numerical results of the survey are summarized in Table 2.
Only a small minority of the figures make any attempt to indicate the values of the test cutoffs, which are an integral component of the ROC curve. I conjecture that this is mainly due to the use of default plotting procedures in statistical software. The software, by default, treats the ROC curve as a 2-dimensional object, obscuring the cutoff dimension. Grid lines and direct labels are also somewhat out of the ordinary. The absence of these features makes accurate determination and comparison of the values more difficult. Many of the plots included large tables containing estimates and inference for AUC values, while the ROC curves themselves, numerous and without clear labels or reference lines, merely served as decoration. I aim to solve some of these problems by providing an easy-to-use plotting interface for the ROC curve that provides sensible defaults.
The panels of Figure 1 illustrate the most common styles of ROC curve plots, and the associated design elements. I favor the use of grid lines and a diagonal reference line to facilitate accurate readings from the axes. Direct labels are preferred over legends because they omit the additional cognitive step of matching line types or colors to labels. The plotROC package additionally provides plotting of cutoff values, which are displayed interactively with the web-based output option, and direct labels for print use. Exact confidence regions for points on the ROC curve are optionally calculated and displayed. Additionally, the use of axis scales adjusted to be denser near the margins 0 and 1 facilitates accurate reading of FPF and TPF values. In medical applications, it is often necessary to have a very low FPF (less than 10%, for instance); therefore, the smaller scales are useful for accurately determining values near the margins. The next section details the usage of the plotROC R package and these features.
The vast majority of the figures that were reviewed looked more like those in panels A and B than those in C and D. While plots like these do technically display the trade-offs between false and true positives, they minimize the amount of useful information that can be displayed in the space needed to plot a ROC curve. The plots created by plotROC attempt to increase the amount of information displayed in ROC charts that would otherwise be mostly white space. This is useful not only for print media, where space is limited, but also during data analysis. The analyst can quickly and easily view information that would otherwise be obscured by standard plotting software. The interactive features take this one step further, enhancing the plots with high density and easily accessible supplementary information.

shiny application
A shiny application (Chang, Cheng, Allaire, Xie, and McPherson 2017) was created in order to make the features more accessible to non-R users. A limited subset of the functions of plotROC can be performed on an example data set or on data that users upload to the website. Resulting plots can be saved to the users' machine as a PDF or as a stand-alone HTML file. It can be used in any modern web browser with no other dependencies at the website here: http://sachsmc.shinyapps.io/plotROC.

Installation and loading
Users can install the latest version of plotROC from CRAN. It can also be downloaded or installed from https://github.com/sachsmc/plotROC, where active development will take place. plotROC also requires at least version 2.0.1 of ggplot2, also available from CRAN.

R> install.packages(c("ggplot2", "plotROC"))
R> library("plotROC")

Quick start
After installing, the interactive shiny application can be run locally by calling the shiny_plotROC function.

Command line basic usage
I start by creating an example data set. There are 2 markers, one that is moderately predictive and one that is not as predictive.

I have implemented the empirical ROC curve estimate and the calculation of exact confidence regions as statistical transformations: stat_roc and stat_rocci, respectively. I have also defined geometric layers for the ROC curve and confidence regions for the ROC curve: geom_roc and geom_rocci, respectively. For further discussion and details of the grammar of graphics as implemented in ggplot2, I refer readers to Wickham (2010) and the ggplot2 vignettes.
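An example data set of the kind described at the start of this section could be simulated as follows. This is a sketch under assumptions: the seed, the sample size, and the normal distributions are illustrative choices; only the object name test and the columns D, M1, and M2 are taken from the surrounding text.

```r
# Simulated example data: one binary outcome and two continuous markers.
# Seed, sample size, and distributions are assumptions made for illustration.
set.seed(2529)
n <- 200
D.ex <- rbinom(n, size = 1, prob = 0.5)

# M1 separates the two groups moderately well; M2 is noisier
M1 <- rnorm(n, mean = D.ex, sd = 0.65)
M2 <- rnorm(n, mean = D.ex, sd = 1.5)

test <- data.frame(
  D = D.ex,
  D.str = c("Healthy", "Ill")[D.ex + 1],  # character coding of the same outcome
  M1 = M1,
  M2 = M2,
  stringsAsFactors = FALSE
)
head(test)
```

The character column D.str is included only to show a disease status that is not coded as 0/1.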
To use the ROC geometric layer, I use the ggplot function to define the aesthetic mappings, and the geom_roc function to add a ROC curve layer. The geom_roc function requires that the named aesthetics d for disease status, and m for marker be present in the aes function call inside ggplot. By default, the ROC geom and stat are linked, so that when geom_roc is called, stat_roc does the computation, and when stat_roc is called, geom_roc is used to plot the layer. The disease status need not be coded as 0/1, but if it is not, stat_roc assumes (with a warning) that the lowest value in sort order signifies disease-free status.
R> basicplot <- ggplot(test, aes(d = D, m = M1)) + geom_roc()

The geom_roc layer includes the ROC curve line combined with points and labels to display the values of the biomarker at the different cut points. It accepts the argument n.cuts to define the number of cut points to display along the curve. Labels can be suppressed by using n.cuts = 0 or labels = FALSE; however, points will still be displayed in the latter case. The size of the labels and the number of significant digits can be adjusted with labelsize and labelround, respectively.
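The options just described might be combined as in the following sketch. The simulated data and the particular argument values are assumptions chosen for illustration; the argument names are those named in the text.

```r
library("ggplot2")
library("plotROC")

# Small simulated data set (an assumption made so the example is self-contained)
set.seed(1)
test <- data.frame(D = rbinom(100, size = 1, prob = 0.5))
test$M1 <- rnorm(100, mean = test$D, sd = 0.65)

# Five labelled cut points, with larger labels rounded to 2 digits
p_few <- ggplot(test, aes(d = D, m = M1)) +
  geom_roc(n.cuts = 5, labelsize = 5, labelround = 2)

# Curve only: no points and no labels
p_none <- ggplot(test, aes(d = D, m = M1)) +
  geom_roc(n.cuts = 0)

# Points without labels
p_points <- ggplot(test, aes(d = D, m = M1)) +
  geom_roc(labels = FALSE)
```

Printing any of these objects displays the corresponding figure.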

Confidence regions and the rocci geom
It is common to compute confidence regions for points on the ROC curve using the Clopper and Pearson (1934) exact method. Exact confidence intervals are calculated for the FPF and the TPF separately; following Pepe (2003), the cross-product of these intervals yields a 100 · (1 − α) percent rectangular confidence region for the pair. This is implemented in stat_rocci and displayed as a geom_rocci layer. These both require the same aesthetics as the ROC geom, d for disease status and m for marker. By default, a set of 3 evenly spaced points along the curve are chosen to display confidence regions. Points corresponding to the confidence regions are distinguished from the others with a different symbol. You can select points by passing a vector of values in the range of m to the ci.at argument. By default, the significance level α is set to 0.05; this can be changed using the sig.level option. An example is shown in Figure 2.
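The construction of a rectangular region can be sketched in base R with binom.test, which returns Clopper-Pearson intervals. The helper `roc_region` is hypothetical, and the per-interval level sqrt(1 − α) is an assumption of this sketch: it is the level at which the product of two independent intervals covers the pair with probability at least 1 − α. This is not necessarily how stat_rocci is implemented.

```r
# Rectangular confidence region for (FPF, TPF) at a single cutoff, built from
# two exact binomial intervals. `roc_region` is a hypothetical helper.
roc_region <- function(M, D, cutoff, sig.level = 0.05) {
  # Assumption: each interval is computed at level sqrt(1 - alpha)
  conf <- sqrt(1 - sig.level)
  pos_d <- sum(M[D == 1] >= cutoff)  # test positives among the diseased
  n_d <- sum(D == 1)
  pos_h <- sum(M[D == 0] >= cutoff)  # test positives among the healthy
  n_h <- sum(D == 0)
  tpf_ci <- binom.test(pos_d, n_d, conf.level = conf)$conf.int
  fpf_ci <- binom.test(pos_h, n_h, conf.level = conf)$conf.int
  data.frame(
    FPF = pos_h / n_h, FPF.lower = fpf_ci[1], FPF.upper = fpf_ci[2],
    TPF = pos_d / n_d, TPF.lower = tpf_ci[1], TPF.upper = tpf_ci[2]
  )
}

# Simulated data (an assumption made so the example is self-contained)
set.seed(1)
D <- rbinom(200, size = 1, prob = 0.5)
M <- rnorm(200, mean = D, sd = 0.65)
region <- roc_region(M, D, cutoff = 0.5)
print(region)
```

The four limits are the corners of the rectangle drawn around the point (FPF, TPF) for that cutoff.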

Styles and labels
Objects such as basicplot, with roc and/or rocci layers, can be treated like any other 'ggplot' objects. They can be printed to display the figure, and other layers can be added to the plot. I provide the function style_roc(), which is a layer containing a theme, modified grid lines, and axes. Adding the style_roc() layer to the 'ggplot' object creates a plot with sensible defaults for use in print. This function has options for the number and location of major and minor breaks, addition of the diagonal guideline, and axis labels; any theme object created for use with ggplot2 can also be supplied.
The direct_label function takes a 'ggplot' object as an argument and annotates the figure with a direct label with an automatically chosen location. It attempts to intelligently select an appropriate location for the label, but the location can be adjusted with nudge_x, nudge_y and label.angle. If the labels argument is NULL, it will take the name from the mapped aesthetic. A simple example with the default options is shown in Figure 3.
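A brief sketch of how the styling and labeling functions described above might be used together follows. The simulated data, the label text, and the nudge value are assumptions for illustration.

```r
library("ggplot2")
library("plotROC")

# Simulated data (an assumption made so the example is self-contained)
set.seed(1)
test <- data.frame(D = rbinom(100, size = 1, prob = 0.5))
test$M1 <- rnorm(100, mean = test$D, sd = 0.65)
basicplot <- ggplot(test, aes(d = D, m = M1)) + geom_roc()

# Print-ready styling with axis labels in the sensitivity/specificity jargon
styled <- basicplot +
  style_roc(xlab = "1 - Specificity", ylab = "Sensitivity")

# Direct label with custom text, shifted slightly downwards
labelled <- direct_label(basicplot, labels = "Biomarker", nudge_y = -0.1) +
  style_roc()
```

With labels left at its default of NULL, the label text is taken from the mapped aesthetic instead.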

Interactive ROC plots
The basicplot object, which is of class 'ggplot', can be used to create an interactive plot and display it in the RStudio viewer or default web browser by passing it to the plot_interactive_roc function. Give the function an optional path to an HTML file as an argument called file to save the interactive plot as a complete web page. A screen shot of an interactive plot is shown in Figure 4. Hovering over the display shows the cutoff value at the point nearest to the cursor. Clicking makes the cutoff label stick until the next click, and if confidence regions are available, clicks will also display those as grey rectangles. By default, plot_interactive_roc removes any existing rocci geom and adds a high-density layer of confidence regions. This can be suppressed by using the add.cis = FALSE option. The points and labels layer of the roc geom can be hidden by using the hide.points = TRUE option. Then, points and labels will be displayed only when the mouse is hovering over the plotting region. Also by default, the style_roc function is applied; the settings can be modified by passing a call to that function as the style argument, or by setting style to NULL.

R> plot_interactive_roc(basicplot)
R> plot_interactive_roc(basicplot, hide.points = TRUE)
R> plot_interactive_roc(basicplot, style = style_roc(theme = theme_bw))

Users can export an interactive ROC plot by using the export_interactive_roc function, which returns a character string containing the necessary HTML and JavaScript. The JavaScript source can be omitted by using the omit.js = TRUE option. Users may wish to do this when there are multiple interactive figures in a single document; the source only needs to be included once. The character string can be copy-pasted into an HTML document, or better yet, incorporated directly into a dynamic document using knitr (Xie 2017). In a knitr document, it is necessary to use the cat function on the results and use the chunk options results = "asis" and fig.keep = "none" so that the interactive plot is displayed correctly. For documents that contain multiple interactive plots, it is necessary to assign each plot a unique name using the prefix argument of export_interactive_roc. This is necessary to ensure that the JavaScript code manipulates the correct SVG elements. For examples of interactive plots and how to incorporate them into knitr documents, see the package vignette (vignette("examples", package = "plotROC")) or the web page https://sachsmc.github.io/plotROC/. The next code block shows an example knitr chunk that can be used in an .Rmd document to display an interactive plot.
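A chunk along the following lines would satisfy the requirements listed above. It is a sketch: the chunk label int-roc and the prefix value "a" are arbitrary choices, and basicplot is assumed to have been created in an earlier chunk.

````
```{r int-roc, results = "asis", fig.keep = "none"}
cat(export_interactive_roc(basicplot, prefix = "a"))
```
````

A second interactive plot in the same document would use a different prefix, and could add omit.js = TRUE since the JavaScript source is already included by the first.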

Multiple ROC curves
If you have grouping factors in your data set, or you have multiple markers measured on the same subjects, you may wish to plot multiple ROC curves on the same plot. plotROC fully supports faceting and grouping as done by ggplot2. These data are in wide format, with the 2 markers going across 2 columns. ggplot2 requires long format, with the marker result in a single column, and a third variable identifying the marker. I provide the convenience function melt_roc to perform this transformation. The arguments are the data frame, a name or index identifying the disease status column, and a vector of names or indices identifying the markers. Optionally, the names argument gives a vector of names to assign to the marker, replacing their column names. The result is a data frame in long format.
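The wide-to-long transformation and the resulting multi-curve plot might look as follows. The simulated data are an assumption; the column names D, M, and name in the long data frame are what melt_roc is expected to return and should be checked against the installed version.

```r
library("ggplot2")
library("plotROC")

# Simulated wide-format data with two markers (assumption for illustration)
set.seed(1)
D <- rbinom(100, size = 1, prob = 0.5)
test <- data.frame(
  D = D,
  M1 = rnorm(100, mean = D, sd = 0.65),
  M2 = rnorm(100, mean = D, sd = 1.5)
)

# Long format: one row per subject and marker
longtest <- melt_roc(test, "D", c("M1", "M2"))
head(longtest)

# One ROC curve per marker, distinguished by color
multiplot <- ggplot(longtest, aes(d = D, m = M, color = name)) +
  geom_roc() +
  style_roc()
```

Mapping name to color is one option; faceting on the same variable with facet_wrap would instead place each marker in its own panel.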

Themes and annotations
plotROC uses the ggplot2 package to create the objects to be plotted. Therefore, users can add themes and annotations in the usual ggplot2 way. A figure with a new theme, title, axis label, and AUC annotation is shown in Figure 6. plotROC provides the convenience function calc_auc that takes a 'ggplot' object that has a roc layer, extracts the data, and calculates the AUC.
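A figure of this kind could be assembled roughly as shown here. The simulated data, title text, and annotation position are assumptions for illustration; calc_auc is assumed to return a data frame with an AUC column.

```r
library("ggplot2")
library("plotROC")

# Simulated data (an assumption made so the example is self-contained)
set.seed(1)
test <- data.frame(D = rbinom(100, size = 1, prob = 0.5))
test$M1 <- rnorm(100, mean = test$D, sd = 0.65)
basicplot <- ggplot(test, aes(d = D, m = M1)) + geom_roc()

# Extract the AUC from the roc layer
auc <- calc_auc(basicplot)$AUC

# New theme, title, and a text annotation placed in the lower right corner
annotated <- basicplot +
  style_roc(theme = theme_grey, xlab = "1 - Specificity") +
  ggtitle("Themes and annotations") +
  annotate("text", x = 0.75, y = 0.25,
           label = paste("AUC =", round(auc, 2)))
```

The lower right corner is usually empty for an informative test, which makes it a convenient place for the annotation.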

Other estimation methods
By default calculate_roc computes the empirical ROC curve. There are other estimation methods, as summarized in the introduction. Any estimation method can be used, as long as it returns the cutoffs together with the corresponding TPF and FPF estimates; these can then be plotted with geom_roc using stat = "identity". The example is shown in Figure 7. Interactive plots with stat = "identity" are not currently supported. Another potential use of this approach is for plotting time-dependent ROC curves for time-to-event outcomes estimated as described in Heagerty, Lumley, and Pepe (2000), for example with the survivalROC package (Heagerty and Saha-Chaudhuri 2013).
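As an illustration of plotting an externally estimated curve, the sketch below computes a simple binormal estimate by fitting a normal distribution to each subgroup and passes the result to geom_roc with stat = "identity". The simulated data, the grid of 50 cutoffs, and the plug-in fitting approach are assumptions of this sketch, not the method used for the figure in this paper.

```r
library("ggplot2")
library("plotROC")

# Simulated data (an assumption made so the example is self-contained)
set.seed(1)
D <- rbinom(200, size = 1, prob = 0.5)
M <- rnorm(200, mean = D, sd = 0.65)

# Plug-in binormal estimate: normal fit within each subgroup
mu1 <- mean(M[D == 1]); sd1 <- sd(M[D == 1])
mu0 <- mean(M[D == 0]); sd0 <- sd(M[D == 0])
cutoffs <- seq(min(M), max(M), length.out = 50)

binorm.roc <- data.frame(
  c = cutoffs,
  FPF = pnorm(cutoffs, mean = mu0, sd = sd0, lower.tail = FALSE),
  TPF = pnorm(cutoffs, mean = mu1, sd = sd1, lower.tail = FALSE)
)

# With stat = "identity", the x, y, and label aesthetics are supplied directly
binormplot <- ggplot(binorm.roc, aes(x = FPF, y = TPF, label = c)) +
  geom_roc(stat = "identity") +
  style_roc()
```

The same pattern applies to any estimator that yields a data frame of cutoffs, FPF, and TPF values.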

How it works
plotROC makes use of ggplot2 (Wickham 2009), gridSVG (Murrell and Potter 2017), and d3.js (Bostock, Ogievetsky, and Heer 2011) to create interactive plots. The first step in the process is to create 'ggplot' objects with roc and/or rocci layers. They can be plotted and inspected in the R console. These form the basis for both the print versions and the interactive versions of the plots. Basic styling and labeling can be added with the style_roc and direct_label functions.
plotROC makes interactive plots by first converting the 'ggplot' object into a scalable vector graphic (SVG) object with the gridSVG::grid.export function. This function maps each element of the plot to a corresponding element of the SVG markup language. Interactivity is then added using d3.js and JavaScript to manipulate those SVG elements in response to users' input. The main interactive feature is to display the cutoff labels at the points on the ROC curve closest to the mouse cursor.
There are many ways to solve this with d3.js, but Voronoi polygons are a convenient and efficient way to map the cursor location to approximately the nearest point on the ROC curve. For the set of cutoff points along the ROC curve, the d3.geom.voronoi function chain computes a set of polygons overlaying the plotting region such that the area of each polygon contains the region of the plot closest to its corresponding cutoff point. Hover events are bound to the polygons so that when the mouse cursor moves around the plotting region, the closest point on the ROC curve is made visible. Similarly, click events are bound to the polygons so that the appropriate confidence region is made visible upon clicking. The SVG code and all necessary JavaScript code is returned in the character string provided by export_interactive_roc. This approach is similar to what is done in the gridSVG grid.animate function, which uses the SVG <animate /> tags. However, d3.js provides a much richer set of features.

There are several other R packages that aim to create interactive figures. The authors of animint (Hocking, VanderPlas, Sievert, Ferris, Tsai, and Khan 2017) created an extensive JavaScript library that creates plots in a similar way to ggplot2. A set of interactive features can be added to plots using d3.js. ggvis (Chang and Wickham 2016), rCharts (Reinholdsson, Russell, and Vaidyanathan 2014), and the more recently released htmlwidgets (Vaidyanathan, Cheng, Allaire, Xie, and Russell 2016) all leverage existing charting libraries written in JavaScript. qtlcharts (Broman 2017) uses a set of custom JavaScript and d3.js functions to visualize data from genetic experiments. Their general approach is to manipulate the data and record plotting options in R, and then pass those objects to the charting libraries or functions that handle the rendering and interactivity.
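The lookup that the Voronoi polygons encode can be illustrated in R: a cursor position lies in the polygon of a given cutoff point exactly when that point is its nearest neighbor. The sketch below computes the nearest neighbor directly; `nearest_cutoff` and the three ROC points are hypothetical, and this is not the package's JavaScript implementation.

```r
# Nearest-point lookup equivalent to finding the Voronoi cell that contains
# the cursor. `nearest_cutoff` is a hypothetical helper for illustration.
nearest_cutoff <- function(roc, x, y) {
  d2 <- (roc$FPF - x)^2 + (roc$TPF - y)^2  # squared Euclidean distances
  roc[which.min(d2), ]
}

# Three points on a hypothetical ROC curve
roc <- data.frame(
  cutoff = c(2.0, 1.0, 0.5),
  FPF = c(0.05, 0.20, 0.60),
  TPF = c(0.40, 0.80, 0.95)
)

# Cursor at (0.25, 0.70) is closest to the point with cutoff 1
hit <- nearest_cutoff(roc, x = 0.25, y = 0.70)
print(hit$cutoff)
```

Precomputing the polygons once means the browser only has to react to hover and click events on the polygons, rather than recomputing distances on every mouse movement.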
plotROC lets R do the rendering, allowing the figures to be consistent across print and web-based media, and retaining the distinctive R style. This also allows users to manipulate the figures directly in R to suit their needs, using tools that are more accessible and familiar to most R users. Then, the JavaScript adds a layer of interactivity to the rendered figures.

Discussion
Here I have illustrated the usage and described the mechanics of a new R package for creating ROC curve plots. The functions are easy to use, even for non-R users via the web application, yet have sufficient flexibility to meet the needs of power users. My approach to creating interactive plots differs from other interactive charting packages. I found that existing approaches did not meet the highly specialized needs of plotting ROC curves. While ROC curve plots can technically be created with even the most basic plotting tools, I find that specialized functions make the results clearer and more informative. The functions are integrated into the existing and popular ggplot2 package, so that all the benefits and features therein can be used effectively.

Computational details
This manuscript is completely reproducible using the source files. The output below indicates the R packages and versions used. Compiling the PDF output also requires pandoc version 1.13.1 and pdfLaTeX.