netmeta : An R Package for Network Meta-Analysis Using Frequentist Methods

Network meta-analysis compares different interventions for the same condition, by combining direct and indirect evidence derived from all eligible studies. Network meta-analysis has been increasingly used by applied scientists and it is a major research topic for methodologists. This article describes the R package netmeta , which adopts frequentist methods to fit network meta-analysis models. We provide a roadmap to perform network meta-analysis, along with an overview of the main functions of the package. We present three worked examples considering different types of outcomes and different data formats to facilitate researchers aiming to conduct network meta-analysis with netmeta


Introduction
Network meta-analysis (NMA), also known as multiple-treatments meta-analysis or mixedtreatment comparison, is an extension of pairwise meta-analysis, pooling information from all randomized trials among a set of different interventions for the same medical condition (Salanti 2012). It takes into account and synthesizes both direct and indirect evidence in a single analysis: An estimate of the difference in effects between two given treatments may, in fact, derive from studies directly comparing them (direct evidence), but also from studies included in paths connecting the two treatments via one or more intermediate comparators (indirect evidence). As in pairwise meta-analysis, studies included in an NMA are assumed to be independent. NMA additionally requires that the transitivity assumption holds. Transitivity suggests that the underlying true relative treatment effect of each comparison is the same across all, observed or not, comparisons. Examination of the similarity of the distribution of effect modifiers across comparisons is often made to judge upon the plausibility of transitivity in a network. Transitivity implies consistency, which means that direct and indirect evidence are in agreement (Salanti 2012). An NMA provides network estimates for the relative effects between any pair of interventions included in the network, and a ranking thereof, with respect to the outcome of interest. The latter is not possible when performing a series of pairwise meta-analyses, i.e., one per treatment comparison. Given the advantages of NMA and the ease to fit models using common statistical software, its use in applied projects has been increasing rapidly in the last decade Petropoulou et al. 2017) and it represents a major research topic for methodologists (Efthimiou et al. 2015).
R package netmeta (Rücker et al. 2023) is the only specialized R package for frequentist network meta-analysis. The package adopts the approach by Rücker (2012) and builds on meta (Schwarzer 2007;Schwarzer, Carpenter, and Rücker 2015;Balduzzi, Rücker, and Schwarzer 2019), a general R package for pairwise meta-analysis. Frequentist NMA can also be fitted using R packages metafor (Viechtbauer 2010) and mixmeta (Sera, Armstrong, Blangiardo, and Gasparrini 2019), which provide functions for multivariate and multilevel meta-analysis. In fact, netmeta internally calls rma.mv() from metafor to calculate the maximum likelihood and the restricted maximum likelihood estimators for the between-study variance. R package mixmeta implements various meta-analytical models, both standard and non-standard, through a unified mixed-effects framework. Bayesian methods for NMA play an important role both in methodological research and applications (Lu and Ades 2004;Salanti, Higgins, Ades, and Ioannidis 2008;Welton, Caldwell, Adamopoulos, and Vedhara 2009). This is reflected in the large number of R packages for Bayesian NMA: gemtc (Van Valkenhoef and Kuiper 2021), bnma (Seo and Schmid 2022), pcnetmeta (Lin, Zhang, Hodges, and Chu 2020), multinma (Phillippo 2022), nmaINLA (Guenhan, Friede, and Held 2018), bayesmeta (Röver 2020), BUGSnet (Béliveau, Boyne, Slater, Brenner, and Arora 2019). Package netmeta (Rücker et al. 2023) is available from the Comprehensive R Archive Network (CRAN) at https://CRAN.R-project.org/package=netmeta. According to cranlogs (Csárdi 2019), R package netmeta is the most popular R package for network meta-analysis (3 645 monthly downloads between November 2021 and October 2022) followed by gemtc (1 781 downloads) and bayesmeta (919 downloads). Main reasons for its popularity are presumably the large number of implemented statistical methods and the easy application of these methods. More details on these aspects are provided in this article which describes R package netmeta in detail. The structure of the paper is as follows. Three example data sets are introduced in Section 2: Models implemented in netmeta, i.e., the standard NMA model (Rücker 2012;Rücker and Schwarzer 2014), methods specific for rare binary outcomes (Efthimiou, Rücker, Schwarzer, Higgins, Egger, and Salanti 2019), and component NMA (Welton et al. 2009, Rücker, Petropoulou, andSchwarzer 2020b), are described in Section 3. A roadmap to perform an NMA, along with an overview of the main functions of netmeta is provided in Section 4. The three worked examples are presented in Section 5.

Data sets
We will use three data sets which are available in R package netmeta. All three examples come from a database of NMAs ) made available by R package nmadb (Papakonstantinou 2019) which includes a set of functions for accessing the data sets and main characteristics. We extracted the data from nmadb and saved them in common data formats for NMA: one row per study, called wide arm-based format and one row per treatment-arm per study, called long arm-based format. The first two data sets consider continuous and binary outcomes and are available in wide and long arm-based format. The third data set, available in long arm-based format, was chosen as an example of a component NMA. Armbased in either wide or long format means that data on each treatment arm is available, as opposed to contrast-based format, where data are given for each contrast of treatments.

Data set 1: Adjuvant treatments in Parkinson's disease
The first data set (record no. 480851 in nmadb) contains data from a Cochrane review assessing efficacy and safety of three drug classes as adjuvant treatment to levodopa therapy in patients with Parkinson's disease and motor complications (Stowe et al. 2010). The authors conducted three pairwise meta-analyses comparing dopamine agonists, catechol-Omethyl transferase inhibitors (COMTI), and monoamine oxidase type B inhibitors (MAOBI) with placebo. The primary outcome was the mean reduction of the time spent in a relatively immobile "off" phase (mean off-time), calculated in hours per day. Relative treatment effects were expressed as mean difference. Data on this outcome were available for 5 331 patients from 28 studies comparing an active treatment with placebo and one three-arm study comparing two active treatments with placebo.
The full data set -in wide arm-based format -is part of R package netmeta and made available using the following command.
R> data("Stowe2010", package = "netmeta") The data set Stowe2010 has 29 rows, each containing information regarding a single study. To have a look at the format of this data set, we display some rows including the three-arm study. All studies with a single pairwise comparison provide information for two means (y1, y2), two standard deviations (sd1, sd2), two group sample sizes (n1, n2), along with two treatment labels (t1, t2). Information for a third treatment (t3, y3, sd3, n3) is not available for twoarm studies and thus coded as missing. The three-arm LARGO study provides information for all variables. A disadvantage of this wide arm-based format is that we would have to add four more variables (t4, y4, sd4, n4) for a four-arm study.

Data set 2: Antithrombotics to prevent strokes
The second data set (record no. 501250 in nmadb) comes from a systematic review aiming to estimate the effects of eight antithrombotic treatments in reducing the incidence of major thrombotic events in patients with non-valvular atrial fibrillation (Dogliotti, Paolasso, and Giugliano 2014). The review included 20 studies (79 808 participants), four of which were three-arm studies. The primary outcome is stroke reduction.
The full data set -in long arm-based format -is also part of R package netmeta.
R> data("Dogliotti2014", package = "netmeta") R> Dogliotti2014 %>% head (7) study id treatment stroke total 1 AFASAK-I 1989 1 VKAs 9  The first three rows pertain to the three-arm AFASAK-I study. For each treatment arm, the treatment label (treatment), the number of strokes (stroke), and the group sample size (total) is available. The subsequent rows belong to the two-arm BAATAF and CAFA studies. An advantage of this long arm-based format is that a four-arm study could be easily added by four additional rows.

Data set 3: Treatments for chronic obstructive pulmonary disease
The third data set (record no. 501201 in nmadb) comes from a systematic review of randomized controlled trials on pharmacologic treatments for chronic obstructive pulmonary disease (COPD, Baker, Baker, and Coleman 2009). The primary outcome, occurrence of one or more episodes of COPD exacerbation, is binary (yes/no). For this outcome, five drug treatments (fluticasone, budesonide, salmeterol, formoterol, tiotropium) and two combinations (fluticasone + salmeterol, budesonide + formoterol) were compared to placebo. The authors considered the two combinations as separate treatments instead of evaluating the individual components which is not surprising as the first methodological publication on how to evaluate individual components in NMA was published in the same year (Welton et al. 2009).
In this NMA, 38 studies (29 two-arm, three three-arm, six four-arm studies with a total of 28 216 patients) were included. The data set is again in long arm-based format. For each treatment arm, the treatment label (treatment), the number of patients with COPD exacerbations (exac), and the group sample size (total) is available. We can see that the Mahler study has four arms, comparing fluticasone, salmeterol, and their combination to placebo.

Network meta-analysis in theory
The notation in this paper follows Schwarzer et al. (2015, Chapter 8). Let n be the number of treatments of interest in a network (also called nodes or vertices). Each study contributes a number of pairwise comparisons, for example three-arm studies contribute three pairwise comparisons and four-arm studies contribute six pairwise comparisons. The sum of all pairwise comparisons across studies is denoted as m. Let k be the number of independent studies. If there are only two-arm studies, m corresponds to k, while m is greater than k if at least one study evaluates more than two treatments.

Standard NMA model
The common (fixed) effects and random effects NMA model developed by Rücker (2012) is based on graph and electrical network theory (Bailey 2007). In order to fit this model, treatment estimates of all pairwise comparisonsθ = (θ 1 , . . . ,θ m ) ⊤ and corresponding standard errors s i , i = 1, . . . , m, must be available. A multi-arm study with p treatments will contribute p(p − 1)/2 comparisons, meaning that treatment effects and standard errors need to be provided for each of them, e.g., a three-arm study will contribute three pairwise comparisons and a four-arm study will contribute six pairwise comparisons. As is common in meta-analysis, standard errors are assumed to be known and fixed. For multi-arm studies, adjusted standard errors are utilized (Rücker and Schwarzer 2014). In addition, a m × n design matrix X is required for the analysis which is defined by the network geometry, i.e., the way that m pairwise comparisons connect the n treatments. The elements of X are 0, 1 or −1 and represent the treatments (columns) of the comparison in the respective row.
The first step to get network treatment estimatesθ NMA is the calculation of the Moore-Penrose pseudoinverse matrix (Albert 1972). To this aim, we define the n × n Laplacian matrix L as where W is a diagonal matrix of dimension m × m whose diagonal elements are the inverse variance weights (1/s 2 1 , . . . , 1/s 2 m ). Matrix L has rank n − 1 and is thus singular. However, its Moore-Penrose pseudoinverse L + (Rao and Mitra 1971;Gutman and Xiao 2004) is defined and can be calculated by where J is a n × n matrix whose elements are all 1.
Once we have L + , we can finally calculate the network treatment estimatesθ NMA aŝ where H is typically called the hat matrix. From (2) we can see that elements ofθ NMA are linear combinations of elements fromθ (the observed estimates), with coefficients coming from rows of H.
The variance-covariance matrix ofθ NMA can be calculated as XL + X ⊤ which can be used to estimate all treatment contrasts and associated standard errors. These estimates are the same as obtained by weighted maximum likelihood (Yates 1940;Paterson 1983;Senn, Gavini, Magrez, and Scheen 2013).
A random effects model can be defined assuming a common heterogeneity variance τ 2 for each pairwise treatment comparison. To fit a random effects model, an estimate of τ 2 is added to the variance of each comparison, s 2 i +τ 2 , i = 1, . . . , m before calculating L in (1). A special case of the generalized DerSimonian-Laird estimate is a common estimator for τ 2 (Jackson, White, and Riley 2013;Rücker and Schwarzer 2014). Maximum likelihood and restricted maximum likelihood estimates for τ 2 are also available (Jackson, Riley, and White 2011;White 2015).

Ranking of treatments
Several ranking metrics exist to summarize the results from NMA and produce a treatment hierarchy. One can calculate the probability of each treatment being at each possible rank. This is traditionally done in a Bayesian framework drawing from the posterior distribution of network treatment estimates, but is also possible in a frequentist environment using resampling techniques. Surface under the cumulative ranking curve (SUCRA) values provide a summary of the rankograms (Salanti, Ades, and Ioannidis 2011). The name originates from the fact that SUCRA is calculated as the surface under the cumulative ranking probability curve. P-scores are a frequentist analogue to SUCRA values without resampling methods . SUCRA and P-score express the extent of certainty that a treatment is better than another treatment, averaged over all competing treatments.

Evaluating heterogeneity and inconsistency
The overall heterogeneity/inconsistency statistic Q total is defined as This statistic measures the total heterogeneity/inconsistency in the network. When all studies are two-arm studies, Q total follows a χ 2 distribution with k − (n − 1) degrees of freedom under the null hypothesis of no heterogeneity, conditional on the (assumed to be known) comparisonspecific variances. More general, each p-arm study contributes p − 1 degrees of freedom to Q total . The sum of the degrees of freedom contributed by each study minus n − 1 (the number of treatments minus 1, which is the dimension of the consistent subspace) gives the total degrees of freedom denoted by df in the following.
A generalized I 2 statistic (Higgins and Thompson 2002) can be defined as Defining a design as a unique set of treatments compared in an individual study (Higgins, Jackson, Barrett, Lu, Ades, and White 2012), the maximum number of designs is 2 n − n − 1 for n treatments. This leads to four possible designs for three treatments A, B, and C: A:B, A:C, B:C, A:B:C; or 11 designs for four treatments. Usually not all possible designs are present in an NMA. In a pairwise meta-analysis, all trials have the same design, e.g., A:B.
In some cases, inconsistency between designs might be likely, when they differ systematically: for example, when all two-arm studies in a network are older, while three-arm studies are more recent, design is a potential source of inconsistency.
Through an appropriate decomposition of the Q total statistic ) we can determine the heterogeneity of study results within the same design, Q W , and the inconsistency in treatment effects between different designs, Q B . A decomposition of Q W into parts coming from each study and a decomposition of Q B into parts coming from each design is also possible. Thus, for assessing the global inconsistency in a random effects model, the Q B statistic can be calculated based on a full design-by-treatment interaction random effects model (Higgins et al. 2012), considering the "method of moments" estimate for τ 2 (Jackson, White, and Riley 2012).
A method to evaluate local inconsistency in each treatment comparison separately is by using the separate indirect from direct evidence (SIDE) method (Dias, Welton, Caldwell, and Ades 2010). According to SIDE, each network estimate is split into the direct and the indirect estimate which are then tested against each other. A z test of the difference between direct and indirect estimate indicates potential evidence for inconsistency for each comparison in the network. A similar method, originally implemented in a Bayesian framework is backcalculation (Dias et al. 2010). Using this method, an indirect estimate with its variance is derived from the respective quantities derived by pairwise and network meta-analysis. Similarly to SIDE, the difference between direct and indirect estimate is used to construct a test statistic to estimate inconsistency.

Specific NMA methods for rare binary outcomes
When the outcome of interest is binary and studies provide number of events and sample sizes in each treatment group as in Dogliotti et al. (2014), we can estimate relative treatment effects (e.g., log odds ratios) and corresponding standard errors from each study. Subsequently, we can perform a standard NMA as described in Section 3.1.
The estimation of study-specific odds or risk ratios is based on approximations that do not perform well when events are rare either because event rates are low or group sample sizes are small. Especially treatment estimates can assume infinite values in studies with one or more zero event numbers. This can be circumvented via a "continuity correction"; this, however, may lead to low model performance. Conversely, there are two NMA models explicitly designed for rare binary outcomes currently supported by netmeta, the non-central hypergeometric (NCH-NMA) and the Mantel-Haenszel (MH-NMA) model.
The NCH-NMA model was proposed by Stijnen, Hamza, and Ozdemir (2010). It models the likelihood of events in each study arm conditional on the total number of events in the study via a noncentral hypergeometric distribution. In netmeta this is done using the so-called Breslow approximation, which is only valid when the numbers of events are small relative to the group sizes. Of note, netmeta provides only a common effects version of the NCH-NMA model and maximizes the combined log-likelihood of all studies to estimate all log odds ratios in the network, as well as their variance-covariance matrix. Studies with zero events in all treatment arms do not contribute to the likelihood.
The second method, MH-NMA, was proposed by Efthimiou et al. (2019) and generalizes the Mantel-Haenszel method for pairwise meta-analysis (Mantel and Haenszel 1959) which is a common effects method. The analysis is conducted in three stages. At stage one, studies with total zero events are removed and studies are grouped afterwards by design. Within each design, treatments with zero events in all studies are removed and designs left with only one treatment are removed.
At stage two, log odds ratios are estimated within each design. For two-arm studies, we employ the usual MH estimator (Mantel and Haenszel 1959). For a design d including p treatments, we need p − 1 odds ratios. To calculate them, we define with i indicating the study index, a Xi and b Y i being the number of events and non-events within treatment arms X and Y respectively and t (+i) being the total sample size of study i. We define as C XY the sum of all c XY i over all studies of the same design and we then define L XY = ln (C XY /C Y X ). The log odds ratio for XY for design d is then estimated where L X+ = p J=1 L XJ , likewise for other log odds ratios from this design. The expressions for the variance-covariance matrix are provided elsewhere (Greenland 1989;Efthimiou et al. 2019). At the end of this stage, we obtain a vector of log odds ratios, and corresponding variance-covariance matrix for design d.
Finally, at the third stage we synthesize the design-specific estimates across the network assuming consistency. All treatment contrasts versus an arbitrarily chosen reference treatment are the basic parameters of the model. The network treatment effects for all comparisons in the network and their variance-covariance matrix are then estimated using weighted least squares regression.

Component NMA
Treatments in NMA may sometimes be combinations of several interventions, i.e., may be composite sharing one or more common components, see Baker et al. (2009) for an example. The standard approach in analyzing such data sets is an NMA where all existing (simple or composite) treatments in the network are considered as different nodes in the network. Methods for such a standard analysis are described in Section 3.1.
However, we might be interested in estimating the individual effect of each component, to help us identify the best combination of components. If some treatments are combinations of common components, an additive component NMA (CNMA) model can be used to evaluate the effect of each component (Welton et al. 2009). This model assumes that the effect of a treatment combination is the sum of the effects of its components, which implies that common components cancel out in comparisons, and do not impact on the relative treatment effects.
The design matrix of the additive CNMA model is the n × c matrix Matrix X combines information on the structure of the network (B with m rows, representing the pairwise comparisons, and n columns, representing the treatments) and information on how the n treatments are composed of the c components (n × c matrix C). The elements of X are 0, 1 or −1 and represent the components (columns) of the comparison in each row. The additive model (common effects version) is given bŷ whereθ is the vector of observed relative effects from the studies, X the design matrix given in (3), and β a parameter vector of length c, representing the components, which is estimated using weighted least squares regression. For details of the estimation see Rücker et al. (2020b). Interaction CNMA models are readily available by additional columns in the components matrix C, thus allowing for synergistic or antagonistic effects when combining two components. In some cases CNMA models can also be applied to networks that are disconnected (i.e., they consist of two or more separate subnetworks).

Network meta-analysis in practice: Overview of netmeta
R package netmeta implements the (C)NMA approaches introduced in the previous section. In this section we describe a workflow to perform a (C)NMA, along with the related functions of netmeta ( Figure 1). After installing netmeta, we make it available for the current R session.

R> library("netmeta")
Loading required package: meta Loading 'meta' package (version 6.2-1). Type 'help(meta)' for a brief overview. Readers of 'Meta-Analysis with R (Use R!)' should install older version of 'meta' package: https://tinyurl.com/dt4y5drs Loading 'netmeta' package (version 2.8-1). Type 'help("netmeta-package")' for a brief overview. Readers of 'Meta-Analysis with R (Use R!)' should install older version of 'netmeta' package: https://tinyurl.com/kyz6wjbb We can see from the printout that R package meta (Balduzzi et al. 2019) is also loaded, which provides some basic functionality, and that brief overviews of methods and general hints on meta and netmeta are available in the R help system. The following command specifies that treatment estimates and confidence interval limits should be printed with two digits and standard errors with three digits. Available example data sets R> settings.meta(digits = 2, digits.se = 3) These settings are recognized by all print and plot functions for (network) meta-analysis objects in the current R session.

Organize data and perform main analysis
The main function of netmeta for standard NMA is likewise called netmeta(). It needs the data to be in contrast-based format, which means that each row of the data set contains information from one pairwise comparison. For multi-arm studies, all pairwise comparisons must be provided, e.g., a four-arm study must contribute six pairwise comparisons (see Section 3.1). Furthermore, netmeta() expects that the estimated treatment effects (argument TE) and corresponding standard errors (argument seTE) are available for all pairwise comparisons as well as information on the two treatment groups (arguments treat1 and treat2). Finally, netmeta() must know which rows belong to each multi-arm study, as standard errors are adjusted to take the correlation between pairwise comparisons into account (Rücker and Schwarzer 2014). Accordingly, argument studlab is mandatory for a network with at least one multi-arm study and study labels must be identical. For example, "LARGO", "Largo", and "Largo" for the pairwise comparisons of the three-arm study in Stowe2010 would be considered as three independent studies.
Typically, a data set is not in the contrast-based format expected by netmeta() and must be pre-processed accordingly. Auxiliary function pairwise() can be used to transform a data set from wide or long arm-based format to the contrast-based format and to calculate all pairwise treatment comparisons. The function can be used with binary, continuous, generic outcomes or incidence rates by internally calling metabin(), metacont(), metagen(), or metainc() from R package meta to calculate the treatment estimates and standard errors. An R object created with pairwise() contains all information to conduct an NMA and can be used as single input to netmeta().
R function netmeta() has several additional arguments, some of which are described below. By default, both common and random effects NMA are conducted. This can be changed using the logical arguments common and random. The method to estimate the between-study variance can be specified by method.tau. Finally, a reference treatment can be specified with argument reference.group; this only affects the way results are presented, but not the analysis itself. The first treatment in the network is used as reference if argument reference.group was not specified in pairwise() or netmeta().
The function netconnection() can be used to obtain information on the network structure and to determine whether a given network is fully connected or consists of subnetworks. Again, an R object created with pairwise() can be used as single input to netconnection(). For disconnected networks, use of netmeta() will result in an error with the recommendation to use netconnection() for further information.
R functions pairwise() and netconnection() can also be used in connection with rare binary outcomes or when conducting a component NMA. Likewise, many methods described in the next two sections are also relevant to rare binary outcomes and CNMAs.

Presentation of network structure, results and rankings
In general, the first step of an NMA is to construct a network graph in order to get an overview of the network structure. After conducting the NMA, results can be summarized using forest plots or league tables. A ranking of all treatments evaluated in an NMA provides additional information on the merits of individual treatments.

Network graphs
A graphical presentation of the network can be obtained through netgraph() which can be used with an R object created with netconnection() or netmeta(). Each treatment is represented by a point (node) in the plane and treatments are connected by a line (edge), if at least one direct pairwise comparison exists. By default, multi-arm studies are not highlighted in a network graph. If argument multiarm is TRUE, three-arm studies are indicated by a different color for the area enclosing the respective nodes. Studies with more than three treatments cannot be visually marked in the network graph. The standard layout is a circle presentation with equally distanced treatments on the circle. Using argument seq = "optimal" (which can be abbreviated), the number of line crossings is aimed to be minimized. Other layouts are available with iterate = TRUE (Rücker and Schwarzer 2016) which optimize the distance between treatment nodes using a stress majorization algorithm (Kamada and Kawai 1989;Hu 2012).
R function netgraph() has several additional arguments; three of them, more commonly used, are the following: edges of the network have a 3-D look if argument plastic = TRUE (default) and the number of studies contributing to each pairwise comparison can be printed on the edges (number.of.studies = TRUE). Finally, the width of the edges is determined by argument thickness. By default, edges are proportional to the number of studies directly comparing treatments.

Forest plots
Forest plots can be created using the generic function forest() or plot() which can be used, amongst others, with 'netmeta' objects. Basic settings include specifying whether results of the common or random effects model should be shown, and which treatment(s) should be used as reference (argument reference.group). In general, the settings from 'netmeta' objects are used as defaults. Results of the random effects model are shown in the forest plot if both a common and random effects NMA were conducted (can be changed using argument pooled). The first treatment in the network meta-analysis is used as reference if argument reference.group was not specified in netmeta() and forest().
Other important arguments include sortvar to change the order of treatments in the forest plot and the logical arguments drop.reference.group and baseline.reference to not print a line with the reference group or to switch the order of reference and other treatments, respectively.

League tables
A concise way to present NMA results is a league table (Hutton et al. 2015) which is a square matrix showing all pairwise comparisons in an NMA. For a single NMA, network estimates are typically shown in the lower triangle and estimates from direct pairwise comparisons in the upper triangle. It is also common to show results from two NMAs in a single league table, e.g., network estimates from an efficacy outcome in the lower triangle and a safety outcome in the upper triangle.
R function netleague() can be used to create league tables for 'netmeta' objects. Either one or two NMAs can be provided as input. Furthermore, the user can specify whether to construct a league table for common or random effects network estimates (arguments common and random), and how to sort the league table (argument seq). Finally, league tables can be easily exported, e.g., as Excel sheets, to be further edited for a publication.

Ranking of treatments
R function rankogram() can be used to calculate the probabilities of each treatment being at each possible rank, and SUCRA values for a 'netmeta' object. A pivotal argument is small.values determining whether small values in the outcome are beneficial ("good") or harmful ("bad"). If argument small.values is not provided in the netrank() command, it is taken from the 'netmeta' object which defaults to "good". Counter-intuitive ranking results are typically due to using the wrong value for argument small.values. R function plot.rankogram() plots these probabilities in a separate graph for each treatment, commonly referred to as rankograms.
R function netrank() can be used to generate a treatment hierarchy for a 'netmeta' or 'rankogram' object using P-score or SUCRA. The ranking metric to be calculated can be selected using argument method. The meaning of the small.values argument is the same as in the rankogram() function.
Additional functions are available to (partially) rank treatments for more than one outcome (Rücker and Schwarzer 2017). R function netposet() creates a partial order of treatment ranks. Input to netposet() can be any number of NMAs created with netmeta() or an existing ranking matrix. Function plot.netposet() can be used to create a scatter plot for rankings of two outcomes and hasse() can be used to generated a Hasse diagram (Carlsen and Bruggemann 2014) for any number of outcomes.

Evaluation of heterogeneity and inconsistency
The decomposition of the overall Q statistic described in Section 3.3 can be performed with decomp.design() for 'netmeta' objects.
The function netsplit() provides two methods to compare direct and indirect evidence for each pairwise comparison. The first method (argument method = "Back-calculation"), separate indirect from direct evidence (SIDE), was introduced by Dias et al. (2010) and a corresponding method using back-calculation was described by . The second method ("SIDDE"), separate indirect from direct design evidence (SIDDE) was described in Efthimiou et al. (2019). For the random effects model, the direct treatment estimates are based on the common between-study variance τ 2 from the NMA. By default, SIDE is used for 'netmeta' objects and SIDDE for 'netmetabin' objects. The output for SIDE or SIDDE can be very extensive for large networks. Accordingly, argument show = "both" can be used in print.netsplit() and forest.netsplit() to only show pairwise comparisons contributing both direct and indirect evidence.
The net heat plot  implemented in netheat() visualizes hot spots of inconsistency in an NMA. Note that the net heat plot is a visualization of the "betweendesigns" part of function decomp.design(). The rows and columns of the plot correspond to pairwise comparisons within designs. Not all comparisons are given in the net heat plot, only those that can contribute to inconsistency. The colors on the diagonal represent the inconsistency contribution of the corresponding design (red = large). The colors on the offdiagonal are associated with the change in inconsistency between direct and indirect evidence in a network estimate in the row after relaxing the consistency assumption for the effect of one design in the column. A blue-colored element indicates that the evidence of the design in the column supports the evidence in the row, whereas a red-colored element indicates that the evidence of the design in the column contrasts to the evidence in the row ).

Additional analyses
The function netbind() can combine different (C)NMA objects. This is useful when the aim is to display the results of several (C)NMA in a forest plot.
The function funnel.netmeta() generates a "comparison-adjusted" funnel plot to assess funnel plot asymmetry in an NMA (Chaimani and Salanti 2012;Chaimani, Higgins, Mavridis, Spyridonos, and Salanti 2013). The user must specify a meaningful treatment order (argument order) which should reflect the (a priori) expected direction of small study effects in each pairwise comparison. See Chaimani and Salanti (2012) for more details on specifying the order.
The function netimpact() evaluates the importance of individual studies in an NMA by calculating the reduction of the precision when the study is removed from the network (Rücker, Nikolakopoulou, Papakonstantinou, Salanti, Riley, and Schwarzer 2020a). The function netcontrib() gives the percentage contribution of each direct comparison (or study) to each network estimate using an algorithm based on flow decomposition (Papakonstantinou et al. 2018) or a random walk approach (Davies, Papakonstantinou, Nikolakopoulou, Rücker, and Galla 2022).
The function netmeasures() provides measures for quantifying the direct evidence proportion (i.e., the contribution of direct effect estimates combined for two-arm and multi-arm studies to each network estimate), the mean path length, and other measures ).
Finally, netpairwise() conducts separate pairwise meta-analyses for all comparisons with direct evidence. In contrast to netmeta() and netsplit() unadjusted standard errors are used in the calculations and the between-study heterogeneity variance is allowed to differ between comparisons.

Specific NMA methods for rare binary outcomes: netmetabin()
The netmetabin() function can be used to perform an NMA for rare binary outcomes, as described in Section 3.4. Like netmeta(), the function requires data to be supplied in contrast-based format, where each row contains the number of events and sample sizes from the respective pairwise comparison. This data format can be obtained using pairwise(). Three NMA methods for binary outcomes are available: NCH-NMA (method = "NCH"), MH-NMA ("MH", default), and the standard NMA model ("Inverse") by Rücker (2012). For NCH-NMA and MH-NMA, only the odds ratio can be used as effect measure (argument sm).
For the inverse variance method, all effect measures provided by metabin() from meta can be used, however, this method is typically not suitable for rare binary outcomes. Of note, pairwise() and netmeta() are called internally for the inverse variance method.
By default, netmetabin() excludes studies with zero events in all arms and does not use a continuity correction for studies with zeros in some, but not all treatment arms. This may lead to designs having only one treatment left. These designs are excluded from the network meta-analysis, which may result in network connectivity issues, i.e., the network may become disconnected. A continuity correction can be used in such cases with argument cc.pooled = TRUE which uses a continuity correction by adding a small increment (as defined by argument incr) to all events and non-events in a study with some zero events. However, as noted in Section 3.4, using a continuity correction may lead to low model performance.
4.6. Component NMA: netcomb(), discomb() CNMA models, described in Section 3.5, are implemented in netcomb(). The main argument of netcomb() is an R object created with netmeta(). A common separator must be used for all combined treatments in the network and by default the plus sign is used in netcomb() (argument sep.comps = "+"). Accordingly, for combinations named in the form A + B (with treatment components A and B), netcomb() automatically conducts a CNMA and estimates the effects of all treatment components and combined treatments in the network, based on the additive model. The separator must be a single character, for example, setting argument sep.comps = "and" will result in an informative error message. Finally, specifying a separator that is not available in any treatment label will give the same results as netmeta() as each treatment combination will be handled as a separate treatment.
A special property of CNMA is that it is possible to analyze disconnected networks (i.e., networks that consist of two or more separate subnetworks) if subnetworks share at least one common treatment component. For example, if one subnetwork contains treatment A + B and another subnetwork contains treatment A+C, then A is a common component of the two networks. A CNMA analysis of a disconnected network can be conducted using discomb().
As standard NMA cannot be applied to a disconnected network, discomb() needs to start from scratch instead of using a network meta-analysis object. Accordingly, the function shares many arguments with netmeta() and netcomb() and can be used with a 'pairwise' object as single input (Rücker et al. 2021).
Functions netcomplex() and netcomparison() can be used to calculate the treatment effect for any complex intervention, i.e., combination of treatment components, or comparisons of complex interventions. Input is an object created with netcomb() or discomb().

Example 1: Adjuvant treatments in Parkinson's disease
This data set is in wide arm-based format and includes several two-arm studies and a single three-arm study. Accordingly, we can use pairwise() to transform the data set into contrastbased format and to estimate treatment effects and corresponding standard errors.
R> pw1 <-pairwise(treat = list(t1, t2, t3), n = list(n1, n2, n3), + mean = list(y1, y2, y3), sd = list(sd1, sd2, sd3), + studlab = study, data = Stowe2010) For a continuous outcome, we have to specify all variables containing information on group sample sizes (argument n), means (mean), and standard deviations (sd). Furthermore, we have to provide the treatment labels (treat) and -as we have multi-arm studies -the study labels (studlab). For a data set in wide arm-based format, the information for these quantities with exception of the study labels is distributed over several variables, e.g., t1, t2, and t3 for treatment labels. Therefore, we are using list() to collate all respective variables. Finally, we specify the data set in argument data. By default, the mean difference is used as effect measure for continuous outcomes (argument sm = "MD"). Alternatively, we could use standardized mean differences ("SMD") or ratio of means ("ROM"). For the standardized mean difference, the method by Crippa and Orsini (2016) is used to guarantee consistent SMDs and standard errors for multi-arm studies.
R object pw1 is a data frame with 31 rows, each corresponding to a pairwise comparison (28 rows from 28 two-arm studies and 3 rows from the three-arm study), and 25 columns (variables).
The following commands show all rows in pw1 coming from the three studies we already considered in Section 2.1. We see that the two-arm studies contribute a single row whereas the three-arm study contributes three rows.
The first five variables in object pw1 are the main input to netmeta(). R> net1 <-netmeta(TE, seTE, treat1, treat2, studlab, data = pw1, + common = FALSE, ref = "plac") Here, we specify that we want to only see results for the random effects model (common = FALSE) and use placebo as reference (reference.group); otherwise the first treatment would be used as reference.
We can also use object pw1 directly to conduct the NMA.

Network graphs
Before printing the results of the NMA, we have a look at the network graph, which does not use the 3-D look (argument plastic), marks the three-arm study ( The network graph is shown in Figure 2. The line width is proportional to the number of studies directly comparing the treatments and the three-arm study is marked by the blue area. We immediately see that the comparisons dopamine agonist versus placebo and COMTI versus placebo include the largest number of studies. Furthermore, the comparison COMTI versus MAOBI was only considered in the three-arm study. We may want to avoid the crossing lines in this small network which is due to the circle presentation. We can switch to a different layout using argument iterate = TRUE. A disadvantage of all layouts but the circle presentation is that treatment labels may overlap with edges. In the network graph generated with the following command (figure not shown) the treatment label for placebo is overlapping with an edge.

R> nodes1 <-netgraph(net1, iterate = TRUE)$nodes
Each call of R function netgraph() creates an invisible list containing data frames with information on nodes and edges. The above command extracts the information for nodes. The x and y adjustment of treatment labels is stored in the variables adj.x and adj.y.

Network meta-analysis results
The main NMA results are reported by printing net1.

R> net1
Number The printout starts with a concise summary of the network: the number of studies, k, the number of pairwise comparisons, m, the number of observations, o, the number of treatments, n, and the number of designs, d.
The network estimates for the random effects models are provided next, taking placebo as reference treatment (as we used reference.group = "plac" to generate net1). We get network estimates for all treatments compared to placebo, even for treatments without any direct comparisons with placebo (which is not the case in Stowe2010 as comparisons with placebo are available for all treatments, see Figure 3). Using argument all.treatments = TRUE, we would get three matrices providing the random effects network estimates, lower and upper confidence limits for all observed and unobserved pairwise comparisons.
Recalling that the outcome is the difference in means of the off-time reduction of the first treatment minus the second treatment, the random effects NMA shows strong evidence that all adjuvant therapies, on a background of levodopa, reduced the off-time compared to placebo. Dopamine agonists seem to perform somewhat better than COMTI and MAOBI, with a average reduction of about 1.5 hours per day instead of 1 hour per day.
Information on heterogeneity and inconsistency is provided next. According to the Cochrane Handbook , I 2 indicates a moderate level of heterogeneity/inconsistency. Furthermore, inconsistency between designs seems not to be an issue in this NMA.
The summary.netmeta() function generates a more detailed NMA printout including information on all pairwise comparisons from individual studies.
R> print(summary(net1), truncate = studlab %in% selstudy1, nchar.trts = 4, + nma = FALSE) Original data (with adjusted standard errors for multi-arm studies): We use the additional arguments to only show individual results for the previously selected three studies (argument truncate), abbreviate treatment labels to four characters (argument nchar.trts) and suppress the printing of the NMA results (argument NMA) which are identical to the printout of net1.
The first part of the detailed printout shows the original data in contrast-based format. The study labels and the first four columns (treat1 to seTE) essentially come from pw1. We only notice that the third comparison of the three-arm study has been switched as treatments are sorted alpha-numerically by netmeta(). Accordingly, the mean difference switches from 0.78 (placebo versus MAOBI) to −0.78 (MAOBI versus placebo).
Next we see the adjusted standard errors (seTE.adj) which are identical for two-arm studies, however, inflated for the three-arm study. We can easily spot multi-arm studies in the printout as they are marked by an asterisk in column multiarm. Column narms reports the number of the treatment arms in each study which is also provided in a more concise way in the next part of the printout (labelled "Number of treatment arms (by study)").
The next part gives the network estimates under the random effects model for each pairwise comparison in the three studies. We notice that the network estimates and confidence intervals are the same for the three comparisons of COMTI and placebo. All network estimates and confidence intervals are identical to the printout of net1.
The legend at the bottom is printed as treatment labels are abbreviated. We could suppress printing of the legend using argument legend = FALSE.

Forest plots
A standard forest plot with placebo as reference shown in Figure 4 can be generated with one of the following commands.

R> forest(net1) R> plot(net1)
A fuller picture is provided by the following command creating a forest plot showing all pairwise comparisons of all active treatments with other treatments ( Figure 5). Of note, the abbreviated treatment names provided to argument reference.group must be (and are) unambiguous.

Ranking of treatments
A treatment ranking using P-scores clearly shows that dopamine agonists appear to be the most effective treatment. The average proportion of treatments worse than dopamine agonists is about 97% under the random effects model. A more detailed picture of the treatment hierarchy is given by the rankograms for the four competing treatments. We use set.seed() to make resampling results reproducible. Dopamine agonists have 91.7% estimated probability of producing the best value compared to only 7.7% and 0.6% for the other active treatments. Placebo ranks last with 100% estimated probability. By default, 1000 simulations are conducted which could be changed in rankogram() using argument nsim.
The following command creates the rankogram shown in Figure 6.
R> plot(ran1) The printout shows that there are three pairwise comparisons contributing both direct and indirect evidence, for none of which there is evidence of inconsistency. Furthermore, COMTI Finally, the net heat plot also does not show any evidence of disagreement between direct and indirect estimates (Figure 7).

Example 2: Antithrombotics to prevent strokes
We change the appearance of confidence intervals for the next two examples, which both use the odds ratio as effect measure.
R> pw2 <-pairwise(treat = treatment, n = total, event = stroke, + studlab = study, data = Dogliotti2014, sm = "OR") This command is more concise compared to the first example using list() as the pivotal arguments relate to single variables in a data set in long arm-based format. For binary outcomes, we have to specify sample sizes (argument n) and the number of events (event) in addition to treatment and study labels. The odds ratio is used as effect measure (while the risk ratio is the default).
In R object pw2, variable TE contains the log odds ratio and seTE its standard error. Common and random effects NMAs using the inverse variance method can be conducted in the usual way.
R> net2 <-netmeta(pw2, ref = "plac") Comparison not considered in network meta-analysis: studlab treat1 treat2 TE seTE WASPO, 2007 VKAs Aspirin NA NA We get a warning that the WASPO study is excluded from the NMA. We can confirm that the study is excluded due to zero events in both treatment groups.
As the example concerns a rare binary outcome, we also perform NMA using the Mantel-Haenszel approach and we compare the results with the inverse variance (conventional) NMA.

Example 3: Treatments for chronic obstructive pulmonary disease
We start the analysis of the third example data set by pre-processing the data for NMA.
The network graph in Figure 10 was generated with the following command. We wanted the placebo node to be on the right so we use argument rotate to turn the network plot three positions counter-clockwise (−3 · 360 • /8 treatments = −3 · 45 • ).
The standard NMA shows that all treatments but budesonide and formoterol are more effective than placebo. Furthermore, the combination treatments are more effective than the individual treatments.

R> net3
Number The next step is based on the assumption that the addition of placebo in any of the other treatments in the network does not change the risk of COPD exacerbations (we specify this through the argument inactive). (We note in parentheses that it is not necessary in general to specify an inactive treatment. For example, this is not possible if there are only active interventions, or if an active effect must be assumed also for a reference intervention such as "treatment as usual"). Assuming placebo as inactive, we now conduct a CNMA. We do not have to specify the separator (argument sep.comps), as the plus sign used in treatment labels of Baker2009 is the default.
R> (nc3 <-netcomb(net3, inactive = "plac")) The number of active components is printed in addition to the other well known quantities, followed by results for treatment combinations and components. The estimated odds ratio for the combination fluticasone + salmeterol is equal to the product of the component odds ratios: 0.74 = 0.90 · 0.82. Equivalently, the log odds ratio of the combination is equal to the sum of the component log odds ratios. A comparison of the heterogeneity statistics shows that the additive model fits the data nearly as well as the standard NMA. The Q test under "difference" corresponds to a likelihood-ratio test (Rücker et al. 2021).
We use netbind() again to produce a forest plot with results from standard NMA and additive CNMA. Overall, the random effects estimates of standard NMA and additive CNMA do not differ much (Figure 11).

R> forest(nb3)
In a further step, we can rank treatments based on the treatment estimates from the additive CNMA.
R> netrank(nc3) Overall, the ranking order does not change when considering an additive CNMA instead of a standard NMA.

Conclusions and outlook
This article provides a general introduction and overview to R package netmeta for frequentist NMA. A central aim of netmeta is to provide methods for all essential steps in an NMA in a user-friendly way.
The main hurdle in netmeta is to transform the available data set into the comparisonbased format required by netmeta() for standard NMA, netmetabin() for NMA with rare binary outcomes, and discomb() for disconnected networks. Accordingly, use of R function pairwise() constitutes the main pre-processing step. Our examples demonstrate its use for the most common NMA data formats, i.e., wide arm-based and long arm-based format. After that, the typical (C)NMA workflow as outlined in this article is straightforward to follow, after conducting the main analysis which is the input to all subsequent analyses.
The simplicity of netmeta has led to its use by two web-applications, MetaInsight (Owen, Bradbury, Xin, Cooper, and Sutton 2019) and CINeMA (confidence in network meta-analysis, Nikolakopoulou et al. 2020).
While netmeta offers a comprehensive set of functions for NMA and an easy to use interface, it lacks on flexibility compared to the more general R packages metafor (Viechtbauer 2010) and mixmeta (Sera et al. 2019). A limitation of netmeta is that methods to conduct subgroup analysis and network meta-regression are not readily available. Moreover, further levels of hierarchical modelling cannot be added, in contrast with Bayesian models. Accordingly, possible future developments include providing subgroup analysis and meta-regression and linking one or more R packages for Bayesian NMA with netmeta in order to make more flexible NMA methods available to applied scientists in a user-friendly way.
In summary, netmeta offers a wide range of analysis and presentational tools for fitting frequentist NMA in R. This article presents the main capabilities of netmeta; however, the package is constantly updated with new features and methods in line with methodological research and new needs that arise when conducting an NMA.