kdecopula: An R Package for the Kernel Estimation of Bivariate Copula Densities

We describe the R package kdecopula (current version 0.9.0), which provides fast implementations of various kernel estimators for the copula density. Due to a variety of available plotting options it is particularly useful for the exploratory analysis of dependence structures. It can be further used for accurate nonparametric estimation of copula densities and resampling. The implementation features spline interpolation of the estimates to allow for fast evaluation of density estimates and integrals thereof. We utilize this for a fast renormalization scheme that ensures that estimates are bona fide copula densities and additionally improves the estimators' accuracy. The performance of the methods is illustrated by simulations.


Introduction
Dependence modeling with copulas has attracted a lot of attention in recent decades. By now, copulas are established tools in many fields of applied statistics, such as finance (Cherubini, Luciano, and Vecchiato 2004), hydrology (Salvadori and De Michele 2007), or machine learning (Elidan 2013).
At the very heart of copula theory is the famous theorem of Sklar (Sklar 1959). It states that any multivariate distribution function can be decomposed into the marginal distributions and a copula, which captures the dependence between variables. Let X and Y be two continuous random variables with joint distribution F and marginal distributions F X and F Y . Then, for all (x, y) in the support of the random vector (X, Y ), The copula C : [0, 1] 2 → [0, 1] is the bivariate distribution function of the random vector (U, V ) = F X (X), F Y (Y ) which has uniform marginal distribution. If F admits a density, we can also decompose the density f into where c, f X and f Y are the densities corresponding to C, F X and F Y , respectively.
One of the major benefits of copula-based modeling is that inference for marginal distributions can be separated from the modeling of the dependence structure, i.e., the copula. For the estimation of the copula density c, it is most common to take a two-step approach: First, obtain estimates F X , F Y of the marginal distributions. A convenient and flexible way to do this is to use the empirical distribution function as an estimator. Second, define pseudoobservations U , V = F X (X), F Y (Y ) . The copula density is then estimated as the joint density of U , V .
Often, one assumes a parametric model for the copula density c and estimates its parameters by maximum-likelihood. Although there is a large variety of parametric copula models, they notoriously lack flexibility and bear the risk of misspecification. Nonparametric density estimators remedy these issues. But since copulas live on a bounded support -the unit hypercube -estimators have to be carefully tailored to this problem.
A specific class of nonparametric density estimators are kernel estimators. They are a popular tool for exploratory data analysis and widely used in many disciplines (e.g., Aitken and Lucy 2004;Kie, Matthiopoulos, Fieberg, Powell, Cagnacci, Mitchell, Gaillard, and Moorcroft 2010). The package kdecopula implements several bivariate kernel copula density estimators that have been proposed in recent years. In a nutshell, the package provides methods for: • estimation, • bandwidth selection, • simulation, • visualization.
There exist two alternative methods for the kernel estimation of copula densities in R (R Core Team 2016): the function kcopula from the ks package (Duong 2014) implements an estimator that glues together two independent kernel estimates for the center and boundary of the unit square; the function npcopula from the np package (Hayfield and Racine 2008) derives the copula density from an estimate of the joint distribution (as proposed by Racine 2015). However, these implementation do not reflect the numerous specialized contributions on the topic. Our package closes this gap by implementing state-of-the-art methods for kernel copula density estimation. The implemented methods are substantially more accurate than existing implementations. Additionally, the package provides a normalization algorithm which ensures that estimates are a bona fide copula densities and further improves the accuracy.
Apart from kernel estimators, Schellhase (2014) implemented nonparametric copula density estimators based on penalized likelihood estimation in the pencopula package (using B-splines or Bernstein polynomials). The extension penDvine (Schellhase 2015) provides a convenient version with automatic bandwidth selection. A comparison with our implementation will show that these estimators are only competitive when the dependence is weak. The author is not aware of any software implementations for nonparametric copula density estimation outside of R.
In Section 2, we give a review of kernel copula density estimators and point to the relevant literature. Section 3 describes the functionality of the package and gives examples for its use. In Section 4, we give background on the implementation of the estimators using spline interpolation for fast evaluation and renormalization of the estimates. The statistical accuracy of the estimators in this package and other nonparametric copula density estimators (see previous paragraphs) is compared in Section 5. A summary is given in Section 6.

Kernel estimators of the copula density: a review
This section will review different approaches to kernel estimation of the copula density. As is common in the literature, we focus on the bivariate case.
Assume we have iid observations (U i , V i ), i = 1, . . . , n, from a bivariate copula C and are interested in the estimation of the corresponding density c(u, v). One could apply the usual kernel density estimator to this problem: where we used the notation K b (·) = K(·/b)/b. The kernel function K is typically assumed to be a symmetric, bounded probability density function on R 2 and b n > 0 is the smoothing or bandwidth parameter. There is a problem, however. The estimator will put a considerable amount of probability mass outside of the unit square. This implies that c n is not a density function on [0, 1] 2 , because it does not integrate to one. The estimator will additionally suffer from severe bias at the boundaries (see, e.g., Charpentier, Fermanian, and Scaillet 2006). Three different approaches to tackle this problem have emerged. All three techniques arose initially in the context of univariate kernel density estimation on the unit line. The following sections explain the ideas behind them (in the context of copulas) and give references for more detailed accounts.

The mirror-reflection method
An intuitive way of adapting c n to make sure that it is a density on [0, 1] 2 is the following: gather all probability mass that was put outside of the unit square, and redistribute it back to [0, 1] 2 . This is the idea behind the mirror-reflection technique, which was proposed for copula density estimation by Gijbels and Mielniczuk (1990). As indicated by the name, all data are reflected at the corners and edges of the boundary region. The augmented data set containing all reflections is given by A visualization of the augmented data set is given in Figure 1. The mirror-reflection estimator is then defined as the usual kernel density estimator on the augmented data: By reflecting all data points at the corners and edges also the probability mass outside of the unit square gets reflected back to the interior. As a result, the estimator now integrates to one. A detailed analysis of the asymptotic properties and a method for automatic bandwidth selection are given in Nagler (2014).

The beta kernel method
A second approach is to use kernels whose support matches the support of the density we want to estimate, and vary the shape of those kernels depending on the point where density shall be estimated. This is achieved by so-called boundary kernels, and beta kernels are one instance. An estimator of the copula density based on this idea was proposed by Charpentier et al. (2006): where β(·; p, q) is the density of a Beta(p, q)-distributed random variable. We refer to Nagler (2014) for details on asymptotics and bandwidth selection.

The transformation method
A third approach is inspired by the early work of Devroye and Györfi (1985) and was introduced to kernel copula density estimation by Charpentier et al. (2006). The simple idea is to transform the data so that it is supported on the full R 2 (instead of the unit cube). On this transformed domain, standard kernel techniques can be used to estimate the density. An adequate back-transformation then yields an estimate of the copula density. For the transformation, the inverse of standard normal cdf is most common since it is known that kernel estimators tend to do well for Gaussian random variables.
Denote Φ as the standard Gaussian cdf and φ its first order derivative. Then ( is a random vector with Gaussian margins and copula C. By Sklar's Theorem, the corresponding density f can be written as This density can be easily estimated by a standard kernel estimator. From such an estimator f n , we can derive an estimator for the copula density c by isolating c in (2): This procedure is illustrated in Figure 2. The left panel shows the original data from the copula density c; next to it we see the transformed data after the inverse Gaussian cdf has been applied. The third plot shows a kernel estimate of the density of the transformed data; and finally, the fourth plot shows the corresponding kernel estimate of the copula density.
The most natural choice for f n is the conventional kernel density estimator. More recently, Geenens, Charpentier, and Paindaveine (2017) proposed to use a local likelihood estimator with nearest-neighbor bandwidths instead. Another recent extension was introduced by Wen and Wu (2015) who suggested to taper the back-transformation in the tails by increasing the variance of the Gaussian densities in the denominator of (3). For more details, we refer to the original papers.

The package's functionality
In the following, we describe the most important functions provided by the package. All function either produce or take objects of the S3-class kdecopula for which several methods are available.

Estimation and bandwidth selection: kdecop
At the core of the kdecopula package is the function kdecop, which estimates the copula density from data. The only mandatory input is an n × 2 matrix of copula data, i.e., data with standard uniform margins. Such data is usually obtained in a first step by applying the empirical marginal cdf s to the data. This is equivalent to a rank transformation as shown below. The following lines of code load the package and an accompanying data set. The data is transformed to uniform margins in the third line, and the last line fits the kernel estimator.

R> summary(fit)
Kernel copula density estimate ( The function kdecop provides all estimation methods mentioned in Section 2. The estimation method can be specified via the method argument, e.g., kdecop(..., method = "MR"). For each method, we have implemented an automatic bandwidth selection procedure. Below we list all implemented methods including a reference to the bandwidth selection procedure used:

MR
The mirror-reflection estimator of Gijbels and Mielniczuk (1990). Smoothing parameters are selected by minimizing the AMISE using the Frank copula as the reference copula (see Nagler 2014, Section 3.2.4). beta The beta kernel estimator of Charpentier et al. (2006). Smoothing parameters are selected by minimizing the AMISE using the Frank copula as the reference copula (see, Nagler 2014, Section 3.3.3).

T
The transformation estimator of Charpentier et al. (2006), but allowing for a bandwidth matrix and not just one parameter. The bandwidth matrix is set by a rule of thumb which is the normal reference rule on the transformed domain (see, Nagler 2014, Section 3.4.4): where Σ Z is the empirical covariance matrix of Φ −1 (U i ) and Φ −1 (V i ) , i = 1, . . . , n.
TLL1, TLL2, TLL1nn, TLL2nn (default) The transformation local likelihood estimator of Geenens et al. (2017). TLL1 approximates the log-density linearly; TLL2 by quadratic polynomials. The -nn versions use nearest-neighbor bandwidths instead of fixed ones. For fixed-bandwidth versions, the bandwidth matrix is set by the rule of thumb where q is the degree of the polynomial, Σ Z is the empirical covariance matrix of Φ −1 (U i ) and Φ −1 (V i ) , i = 1, . . . , n. This rule of thumb is similar to the normal reference rule, but ensures that the bandwidth matrix vanishes at the mean-square optimal rate. It is possible to specify the bandwidths manually using the bw argument of kdecop, although we recommend against it. If it is necessary to manually make an estimate more or less smooth, we advise to use the bandwidth multiplier argument kdecop(..., mult = 1). Values larger than one will make the estimate smoother; values less than one make the estimate less smooth.

Working with the estimated density: (d/p/r)kdecop
In analogy to the usual (d/p/r)-prefixes for distribution families in R, we provide (d/p/r)versions for the kdecop-family. The functions dkdecop and pkdecop can be used to evaluate the density and cdf of a kdecopula object, respectively.  R> pkdecop(cbind(c(0.1, 0.9), c(0.1, 0.9)), fit) [1] 0.0327257 0.8505370 The rkdecop function simulates data from the estimated density. This can be done in two ways: a) using pseudo-random numbers based on runif, b) using quasi-random numbers based on ghalton from the qrng package (Hofert and Lemieux 2015).
R> pseudo <-rkdecop(500, fit) R> quasi <-rkdecop(500, fit, quasi = TRUE) 3.3. Visualization: the plot and contour generics For many people, the most interesting feature is probably to make exploratory plots. There are three common ways to visualize a copula density: (a) a surface (or perspective) plot of the copula density, (b) a contour plot of the copula density, (c) a contour plot of the copula density when combined with standard normal margins. The following three lines of code produce the plots shown in Figure 3. Optionally, further arguments can be passed to improve the aesthetics.

R> plot(fit) R> contour(fit, margins = "unif") R> contour(fit)
In the author's experience, the most useful plot is (c), the marginal normal contour plot. Copula densities usually explode at some corners of the unit square. As a result, it is not possible to reasonably visualize the raw density (a) on the whole [0, 1] 2 . The function plot.kdecopula() therefore restricts the displayed area to [0.01, 0.99] 2 . Obviously, this hides some information in the tails. This can be problematic because the tails are often of particular interest in copula models. But even on the restricted domain the copula density often attains values larger than 20 when there is strong dependence in the tails. On this scale, the shape of the copula  density close to the center of the unit square is difficult to assess. Similarly, the contours of the raw density (b) are inappropriate to reflect the tail behavior because contour lines near to the corners become too close to be visually distinguishable. The marginal normal contour plot overcomes these issues by transforming the margins such that the transformed density is bounded. It has the additional advantage that it allows for an intuitive interpretation that is relative to the Gaussian copula as explained in the following paragraph.
If the true copula is the independence copula, the contours are perfect circles (see, Figure 4a). This is obviously not the case for the estimated density in Figure 3. Figure 4b shows a Gaussian copula with Kendall's τ set to the estimated τ from the data. A Gaussian copula combined with Gaussian margins results in a bivariate Gaussian density and its contours are ellipses. Since most statisticians are familiar with this kind of distribution it seems natural to use this as a benchmark when interpreting the marginal normal contour plot for other copulas. The next plot, Figure 4c, shows the Student t copula (df = 3). Here the contours look like a diamond due to the higher density values in the tails (i.e., the corners of the square). This reflects that -in contrast to the Gaussian copula -the Student t copula exhibits tail dependence (e.g., Joe 2014), a concept that is very important in the modeling of risks. In general, a spiky shape in the corners of the contours is an indication of tail dependence in the respective corner. This can be observed again in Figure 4d and Figure 4e, where the Gumbel and Clayton copula are shown, respectively. The Gumbel copula is asymmetric and features upper tail dependence only. This is reflected by a spiky shape in the upper right corner and a flatter shape in the lower left corner. For the Clayton copula it is the other way around. Finally the Frank copula has no tail dependence and has lighter tails than the Gaussian, which corresponds to a more flat shape of the contours.
Going back to the estimated density in Figure 3, we see a rather flat shape in the lower left corner and a more spiky shape in the upper right corner. This would indicate that there is no lower, but upper tail dependence. Hence, the Gumbel copula is the most appropriate fit choosing from the parametric families in Figure 4. However, we also observe some asymmetry with respect to the main diagonal. This is not reflected by any of the parametric models under consideration.

Dependence measures
It is often useful to summarize the dependence in a single number, a dependence measure. Many of the popular dependence measures are functionals of the copula. In fact, this property is required by Rényi's axioms, see Schweizer and Wolff (1981). Such measures can be calculated for copula density estimates fitted with kdecop(). For example, Kendall's τ can be expressed as There are several ways to calculate this measure for a copula density estimate. A straightforward way is to solve the integral numerically with the cubature package (Narasimhan and Johnson 2016): R> library("cubature") R> f <-function(u) pkdecop(u, fit) * dkdecop(u, fit) R> int <-adaptIntegrate(f, lowerLimit = c(0, 0), upperLimit = c(1, 1)) R> 4 * int$integral -1 This method is quite slow because two numerical integrals (one is in pkdecop) are nested. A second way to compute τ utilizes a stochastic interpretation of (4). Note that τ = 4E[C(U, V )] − 1, where (U, V ) is a random vector with density c. The expectation can then be approximated by Monte Carlo integration: R> uv_qmc <-rkdecop(10^4, fit, quasi = TRUE) R> 4 * mean(pkdecop(uv_qmc, fit)) -1 The third way is to compute the sample version of Kendall's τ on simulated data. This method is asymptotically equivalent to the other methods, but simpler and faster: R> cor(uv_qmc[, 1], uv_qmc[, 2], method = "kendall") [1] 0.4718781 We see that the three methods lead to different results, but are close. The two Monte Carlo methods have the drawback that their result is random. A large number of samples need to be drawn to get an accuracy of multiple digits, but only the first one or two digits are useful for interpretation of the measure.

Implementation based on spline interpolation
Typically, the evaluation of a kernel density estimate requires going back to the original data. As a result, the computational effort increases with the sample size. We avoid that issue by evaluating the actual density estimate only once on a fixed number of grid points. For further evaluations we use cubic spline interpolation between the values on this grid. This way, the density can be evaluated efficiently -independently of the sample size. It has the additional advantage that analytical expressions for integrals of the (interpolated) density estimate become available. We make use of that fact to implement a fast renormalization scheme that ensures that the the density estimate is close to a bona fide copula density.

Evaluating the estimate by cubic spline interpolation
Recall that the support of a copula density is the unit cube [0, 1] 2 . Let m ∈ N and define a finite set of points p j ∈ [0, 1] such that 0 ≤ p 1 < · · · < p m ≤ 1. Then, the set defines a symmetric grid on the unit cube. Cubic splines are piecewise cubic polynomials that can be used to approximate or interpolate some function between points on a grid. We will show how cubic spline interpolation can be used to approximate a copula density estimate. We explain in detail how a one-dimensional cubic spline interpolation is constructed when one of the coordinates is fixed. The two-dimensional case is a straightforward extension and only sketched.

The one-dimensional case
Let us first fix v k and assume that the values of an estimate c(·, v k ) : [0, 1] → R + are available on the grid points u j , 1 ≤ j ≤ m. We want to interpolate the function c(·, v k ) at another point u * , where u j < u * < u j+1 for some j ∈ {2, . . . , m − 2}. We define the interpolated curve segmentc j,j+1 (·, v k ) : [u j , u j+1 ] → R as some cubic polynomial A cubic polynomial defined on a closed interval is fully determined by its function values and first derivatives at the boundary points. Definec j,j+1 1 as the partial derivative ofc j,j+1 w.r.t. its first argument. After some simple algebraic manipulations, we find that the coefficients of a cubic spline approximation can be written as Now we replacec j,j+1 (u j , v k ) andc j,j+1 (u j+1 , v k ) by the known values c(u j , v k ) and c(u j+1 , v k ). Similarly, we want to replace the derivativesc j,j+1 and c 1 (u j+1 , v k ). These are unknown, but can approximated by a finite difference scheme. We set Note that these can only be computed for 2 ≤ j ≤ m−2, since four distinct values c(u j+j * , v k ), j * = −1, 0, 1, 2, show up in the above formulas. For fixed v k and some u ∈ [u 2 , u k−1 ), the spline approximationc(u, v k ) of the function c(u, v k ) can then be written as We extended this to allow for the full range (u ∈ [0, 1]) by extrapolating the 'outer' two polynomials at the borders, i.e., The advantage of cubic spline interpolation is that it is easy to compute. In particular, the computational effort only depends on m, the number of knots. Additionally, the above approximation allows to write integrals as a sum of quartic polynomials, which can be computed equally fast. This will prove advantageous in Section 4.2, where we use such integrals to renormalize the copula density estimates.
The two-dimensional case q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q Bivariate functions can be approximated similarly by a sequence of two one-dimensional interpolations. We will illustrate this by a small example and omit any further details (for more, cf., Habermann and Kindermann 2007). Figure 5a shows the unit cube with grid points (u j , v k ) = (j/5, k/5), j, k = 0, . . . , 5, indicated as dots. Assume we know all function values c(u j , v k ) on this grid and want to approximate the function at the point (0.5, 0.5) indicated by a cross. We first do four one-dimensional (horizontal) interpolationsc(0.5, v k ), j = 1, 2, 3, 4 (triangles). Note that all values that are required to calculate the spline coefficients are known. Another one-dimensional (vertical) interpolation based on the four new valuesc(0.5, v k ) gives us the final interpolated valuec(0, 5, 0.5).

Choice of grid
In Figure 5a we showed a grid that has equal spacings between grid points. This seems natural, but in our context we found it more appropriate to use a grid that is equally spaced after a transformation by the inverse Gaussian cdf . Figure 5b depicts such a grid with 20 knots. It was constructed by placing 20 equidistant knots on the line segment [−3, 3] and then applying the Gaussian cdf to them. The two-dimensional set-product of these 20 points yields the final two-dimensional grid.
We see that the grid points are more sparse in the center of the unit square and concentrate towards the boundaries and corners. This choice takes into account that for copula densities the areas near the corners are most important. In those areas, copula densities often explode while being rather flat in the center. This allows us to keep the approximation errors in the important areas small. A nice side-effect is that the marginal normal contour plots described in the last section can be visualized more nicely. The number of knots can be controlled by the knots argument of kdecop and defaults to 30. A smaller number reduces computation time, but comes at the cost of a larger approximation error.

Renormalization of the density estimate
We now introduce the idea of iterative renormalization of kernel copula density estimators.
Recall that by the definition of a copula, the marginal densities have to be uniform, i.e., This property is of particular importance, when other functionals of the density are of interest.
For example, assume that we integrate the density estimate to obtain an estimate for the corresponding conditional cdf, C(v|u) = v 0 c(u, s)ds. This is a common task in vine copula models which gained a lot of popularity following the seminal paper of Aas, Czado, Frigessi, and Bakken (2009). If the estimated density does not satisfy the uniform margins property, the estimate of the conditional cdf may exceed unity, which makes it problematic. The lack of uniform margins was mostly ignored in the literature, although kernel estimates usually do not satisfy the uniform margins property (5). Now let c(u, v) be a consistent kernel estimator of c(u, v) for all (u, v) ∈ [0, 1] 2 . From Sklar's theorem for density functions (1), we know that dividing a bivariate density by the product of its marginal densities yields a copula density. Hence, a simple way to adjust the estimator is to divide the initial estimate by and is asymptotically equivalent to the original one under mild conditions. For sophisticated kernel estimators -such as the beta kernel or local likelihood estimators -the two integrals have to be computed numerically. Conveniently, the spline approximations introduced in the previous section allow for fast computation of the integrals in (5).
The proposed renormalization procedure can be split into two steps.
1. Find a spline approximation of the initial estimate that is defined by its values on a finite grid.
2. Renormalize the approximated density values on this grid by dividing by the (approximated) marginal densities (see Equation 6).
The resulting estimate will typically be closer to a bona fide copula density. However, the renormalization is only carried out on a finite number of grid points. Apart from that grid, the renormalized estimate typically still does not satisfy the uniform margins property. But we can simply repeat the two steps above until a satisfactory result is achieved. Our experience suggests that a very small number of iterations is sufficient. The number of iterations can be set by the renorm.iter argument of kdecop and defaults to three.
The renormalization will turn out to have two benefits: functionals of kernel estimates will show the desired behavior and, additionally, the estimates are more accurate (see Section 5).
For the latter there is an intuitive interpretation. By ensuring that margins are uniform, we incorporate additional information about the true density. This reduces the set of plausible estimates and increases the probability of being 'close' to the true one.

Comparison of methods
We compare all estimators implemented in this package in a simulation study. Additionally, we include results for other nonparameteric copula density estimators available in R.
• np: The kernel estimator provided by the npcopula function in the np package (Hayfield and Racine 2008).
• ks: The kernel estimator provided by the kcopula.de function in the ks package (Duong 2014).
• bern, bspl: The penalized Bernstein polynomial and B-spline estimators provided by the paircopula function in the penDvine package (Schellhase 2015).
Default settings are used for all implementations. In particular, smoothing parameters are selected automatically.
As a performance measure, we use the integrated absolute error (IAE) where we estimate the integrals as the mean over the grid (j/101, k/101), j, k = 1, . . . , 100.
We study several target copula densities and two sample sizes (n = 200, n = 1 000). We use three families from Figure 4 (Independence, Gaussian, Gumbel) and three copulas derived from multi-modal mixtures of bivariate Gaussian distributions (marginal normal contour plots are shown in Figure 6). For the Gaussian and Gumbel copulas we create scenarios with weak and strong dependence (Kendall's τ of 0.3 and 0.7). For mixture families we use the package GMCM (Bilgrau, Eriksen, Rasmussen, Johnsen, Dybkaer, and Bøgsted 2016). Figure 7 shows the mean of the IAE over 100 replication for various scenarios. The figure is divided into 8 panels, each representing a different target density. Each panel shows results for the nine estimators implemented in kdecopula (left of the dotted line) and four estimators implemented in other packages on CRAN. Each estimator is represented by a bar which is split in two parts. The solid part indicates the mean IAE for n = 1 000, the transparent part indicates the mean IAE for n = 200. As expected, all estimators improve when the sample size increases.

Error analysis
Two scenarios, the independence and Mixture III copulas, show somewhat different results from the others. We shall discuss them in more detail later. In all other scenarios, TLL2nn is the top performer and TLL2 is a close second. In general, the transformation methods seem to work better than MR and beta. The estimators from the np and ks packages are often among the least accurate. The methods from the penDvine package work rather well when there is weak dependence, but struggle when dependence is strong. The relative performance is largely consistent across the two sample sizes.
The density of the independence copula is constant. The two estimators from penDvine perform best for this target. The reason is that their basis function formulation can reproduce constants, so they behave similar to a parametric estimator in this case. The independence copula is an example where the transformation approach is suboptimal. A constant density can be estimated easily because there is no curvature. But after transformation, the target density is a bivariate Gaussian which is more difficult to estimate.
The second scenario standing out is where Mixture III is the target density. The B-spline estimator from penDvine is most accurate, followed by TTCV and TTPI. This scenario is the only one where the other transformation methods are consistently outperformed by nonkdecopula methods. The issue here is how the bandwidth matrix is selected. Methods T, TLL1, TLL2, TLL1nn, and TLL2nn all use a bandwidth matrix that is proportional to square root of the empirical covariance. This is often a good choice because it 'stretches' the kernels in a way that resembles the shape of the data. For Mixture III, the overall correlation is negative. The correlation in each mixture component, however, is strongly positive. In this case the shape of the kernels does not reflect to the local orientation of the data. TTCV, TTPI, and non-kdecopula methods do better in this scenario because they do not rely on such a rule of thumb, but use more sophisticated criteria. This comes at a cost in terms of speed, as we shall see in Section 5.4.

The effect of renormalization
In Section 4.2 we claimed that the renormalization algorithm implemented in this package improves the performance of the estimators. The results presented in Figure 7 are based on default settings, i.e., three iterations of the algorithm. Table 1 show the relative reduction of IAE compared to non-normalized estimators averaged over all scenarios.
We observe that the performance has improved for all estimators. The average gain ranges T TLL1 TLL2 TLL1nn TLL2nn TTCV TTPI  13% 15% 25%  39%  21%  28%  29% 24% 22% Table 1: The effect of renormalization: numbers indicate by how much the IAE could be improved after three iterations of the renormalization algorithm (rounded to the next integer).

MR beta
between 13% and 39%. This contributed significantly to the good performance observed in Figure 7. In fact, the estimator that could be improved the second most is TLL2nn, the top performer in our initial study.

Computation time
Some design choices in kdecopula prioritize speed over accuracy. One example is the use of spline interpolation, another is the choice of bandwidth selection methods.  Table 2: Computation time (in seconds) required for estimating the density and evaluating it on a 100 × 100 grid.
All kdecopula methods except TTCV and TTPI are much faster than estimators from other packages. The difference gets larger for larger sample sizes. Note that TTCV, TTPI, np, and ks are much slower when n = 1000. This is caused by them using sophisticated bandwidth selection techniques that have large complexity with respect to sample size.

Summary and extensions
We have described the R package kdecopula, which implements several cutting-edge kernel estimation techniques for copula densities. The package allows for automatic selection of the smoothing parameter and resampling. Several plotting options make it particularly useful for the exploratory analysis of copula data. Its abilities have been illustrated by small code examples.
The implementation utilizes spline interpolation for fast evaluation and renormalization of the density estimates. Simulations show that the implementations in this package perform best among available methods for nonparametric copula density estimation. A contributing factor for the good performance is the renormalization of the estimators, which was shown to notably improve the accuracy.
Compared with the functionality provided by the np package, kdecopula lacks two features: 1. It does not provide functionality for estimating copula densities when the data contain discrete variables. A problem is that the copula is not unique when margins are discrete, but there are infinitely many copulas that, when combined with the marginal distributions, lead to the same joint distribution (see, e.g., Genest and Nešlehová 2007).
It is unclear which of these copulas shall be estimated and how to justify the choice. Additionally, estimating a copula from discrete data necessarily involves modeling of the marginal distributions, which is deliberately avoided in kdecopula.
2. It does not allow for more than two variables. One major issue is that kdecopula uses interpolation to evaluate and renormalize the estimators. In more than two dimensions the number of grid points explodes rapidly and renders the interpolation approach infeasible.
A kdecopula-based solution for both points is the kdevine package (Nagler 2017b). It implements a kernel estimator of general multivariate densities based on vine copulas (Nagler and Czado 2016), which use marginal densities and bivariate copulas as building blocks. Continuous convolution (Nagler 2017a) is used to handle discrete variables, which induces copulas similar to the multilinear copula (see, Genest and Nešlehová 2007;Genest, Nešlehová, Rémillard et al. 2014).

Supplementary material
R code for the simulation study can be found at: https://gist.github.com/tnagler/bd194a711026c3d375ab6ae023a5bad5.