Recursive Numerical Evaluation of the Cumulative Bivariate Normal Distribution

We propose an algorithm for the evaluation of the cumulative bivariate normal distribution, building upon Marsaglia's ideas for the evaluation of the cumulative univariate normal distribution. The algorithm is mathematically transparent, delivers competitive performance, and can easily be extended to arbitrary precision.


Introduction
The cumulative normal distribution, be it univariate or multivariate, has to be evaluated numerically. There are numerous algorithms available, many of them heavily fine-tuned, which leads to faster evaluation and higher accuracy but also to a lack of mathematical transparency.
For the univariate case, Marsaglia (2004) has proposed a very simple and intuitive but powerful alternative that is based on Taylor expansion of Mills' ratio or similar functions. In this note we extend Marsaglia's approach to the bivariate case. This requires two steps: reduction of the evaluation of the cumulative bivariate normal distribution to the evaluation of a univariate function, namely the cumulative bivariate normal distribution on the diagonal, and Taylor expansion of that function. Note that a similar approach, but with reduction to the axes instead of the diagonal, has been proposed by Vogt (2008).
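For reference, Marsaglia's univariate method sums the series Φ(x) = 1/2 + φ(x)·(x + x³/3 + x⁵/(3·5) + …) until the partial sum stops changing. A minimal C++ sketch (function name ours; Marsaglia's published routine differs in details, and the cutoff at |x| = 8 is our simplification):

```cpp
#include <cmath>
#include <cassert>

// Marsaglia (2004): Phi(x) = 1/2 + phi(x) * (x + x^3/3 + x^5/(3*5) + ...),
// summed until adding a further term no longer changes the partial sum.
double phi_marsaglia(double x) {
    if (x < -8.0) return 0.0;   // our cutoff: the tail is below double resolution
    if (x >  8.0) return 1.0;
    double s = x, t = 0.0, b = x, q = x * x;
    for (double i = 1.0; s != t; i += 2.0) {
        t = s;
        b *= q / (i + 2.0);     // next term: multiply by x^2/(2k+1)
        s += b;
    }
    // phi(x) = exp(-x^2/2 - log(sqrt(2*pi))), log(sqrt(2*pi)) = 0.9189385332...
    return 0.5 + s * std::exp(-0.5 * q - 0.91893853320467274178);
}
```

The termination test `s != t` is the same idea used below for the bivariate recursion: iterate until a new term is absorbed without effect in double precision.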
The resulting algorithm has to be compared with existing approaches. For an overview and discussion of the latter, cf. (Bretz & Genz 2009), (Aǧca & Chance 2003), (Terza & Welland 1991), and (Wang & Kennedy 1990). Most implementations today rely on variants of the approaches of Divgi (1979) or of Drezner & Wesolowsky (1990). Improvements to the latter method have been provided by Genz (2004) and West (2005). The method of Drezner (1978), although less reliable, is also very common, mainly because it is featured in (Hull 2008) and other prevalent books.
It will turn out that the algorithm proposed in this paper is able to deliver near double precision (in terms of absolute error) using double arithmetic. Furthermore, implementation of the algorithm using high-precision libraries is straightforward; indeed, a quad-double implementation has been applied for testing purposes. Performance is competitive, and trade-offs between speed and accuracy may be implemented with little effort.

Theory
In this section we develop the algorithm. In order to keep the presentation lean we will often refer to the author's recent survey (Meyer 2009). For further background on normal distributions the reader is also referred to textbooks such as (Balakrishnan & Lai 2009), (Kotz, Balakrishnan & Johnson 2000) and (Patel & Read 1996).

Evaluation on the diagonal
Denote by φ and Φ the density and distribution function of the standard normal distribution. Mills' ratio is then defined as R(x) := (1 − Φ(x))/φ(x). Furthermore, denote by φ₂(·, ·; ρ) and Φ₂(·, ·; ρ) the density and distribution function of the bivariate standard normal distribution with correlation parameter ρ ∈ (−1, 1). We will also write Φ₂(x; ρ) := Φ₂(x, x; ρ) for the restriction to the diagonal, and we will use several elementary properties of these functions.
In the following we will assume that x ≤ 0 and ρ ≥ 0. In this case explicit upper and lower bounds for Φ₂(x; ρ) apply (cf. (Meyer 2009, Th. 5.2)). Furthermore, as is proven implicitly in (Meyer 2009, App. A.2), the diagonal function D admits a Taylor expansion around zero whose coefficients can be evaluated recursively. Dividing by 1 − ρ² for convenience, we define coefficients d_k and derive a recursion scheme for them, with initial values (2.10). We can now compute Φ₂(x; ρ) numerically from the resulting series. Note that it would also have been possible to work with one of several closely related functions instead. The resulting recursion schemes are in fact simpler (two summands instead of three) but run into numerical problems (cancellation, or lower accuracy as x → −∞).
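As an independent sanity check for any implementation of the diagonal recursion, Φ₂(x, x; ρ) can also be obtained (slowly) by one-dimensional quadrature, using the standard representation of Φ₂ as an integral of φ·Φ. The following sketch (all names ours, not the paper's code) uses Simpson's rule and can be checked against the exact identity Φ₂(0, 0; ρ) = 1/4 + arcsin(ρ)/(2π):

```cpp
#include <cmath>
#include <cassert>

const double kPi = 3.14159265358979323846;

// Standard normal pdf, and cdf via erfc (accurate enough for a reference check).
static double npdf(double s) { return std::exp(-0.5 * s * s) / std::sqrt(2.0 * kPi); }
static double ncdf(double s) { return 0.5 * std::erfc(-s / std::sqrt(2.0)); }

// Reference evaluation of Phi2(x, x; rho) by Simpson quadrature of
//   Phi2(x, y; rho) = int_{-inf}^{x} npdf(s) * ncdf((y - rho*s)/sqrt(1-rho^2)) ds,
// truncated at s = -8 where the integrand is negligible. n must be even.
double phi2_diag_quad(double x, double rho, int n = 4000) {
    double lo = -8.0, h = (x - lo) / n;
    double c = std::sqrt(1.0 - rho * rho);
    double sum = 0.0;
    for (int i = 0; i <= n; ++i) {
        double s = lo + i * h;
        double w = (i == 0 || i == n) ? 1.0 : (i % 2 ? 4.0 : 2.0);
        sum += w * npdf(s) * ncdf((x - rho * s) / c);
    }
    return sum * h / 3.0;
}
```

For ρ = 0 the result must also reduce to Φ(x)², which gives a second easy consistency test.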

Reduction to the diagonal
In order to apply the results from Section 2.1 to the numerical evaluation of Φ₂(x, y; ρ) for general x, y and ρ, we start with the symmetric reduction formula (cf. (Meyer 2009, Eq. (3.16))), in which δ_x = 1/2 if x < 0 and y ≥ 0 (and δ_x = 0 otherwise), and δ_y = 1/2 if y < 0 and x ≥ 0 (and δ_y = 0 otherwise). From the axis y = 0 we get to the diagonal x = y by applying the formula from (Meyer 2009, Eq. (3.18)). Specifically, we obtain two equivalent representations; in an implementation, (2.15) should be used for ρ → 1 and (2.16) for ρ → −1, in order to avoid catastrophic cancellation. In a last step, if necessary to ensure x ≤ 0 and ρ ≥ 0, we apply the reflection formulas (cf. (Meyer 2009, Eq. (2.15)) and (Meyer 2009, Eq. (3.27))); here, in an implementation, (2.20) should be used for ρ → 1 and (2.21) for ρ → −1, again in order to avoid catastrophic cancellation. It will be favorable to work with a_x := 1 − ρ_x. If (2.18) has to be applied (i.e., if a_x > 1), correspondingly we will work with (2.22).
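The reflection step that moves a general argument into the region x ≤ 0, ρ ≥ 0 rests on identities such as Φ₂(x, y; ρ) = Φ(x) − Φ₂(x, −y; −ρ), together with the symmetry Φ₂(x, y; ρ) = Φ₂(y, x; ρ). These identities (which are standard, though not necessarily in the exact form of the paper's numbered equations) can be verified numerically with a quadrature reference; all names below are ours:

```cpp
#include <cmath>
#include <cassert>

const double kPi = 3.14159265358979323846;
static double npdf(double s) { return std::exp(-0.5 * s * s) / std::sqrt(2.0 * kPi); }
static double ncdf(double s) { return 0.5 * std::erfc(-s / std::sqrt(2.0)); }

// Reference Phi2(x, y; rho) by Simpson quadrature (slow; for checking identities only).
double phi2_quad(double x, double y, double rho, int n = 4000) {
    double lo = -8.0, h = (x - lo) / n, c = std::sqrt(1.0 - rho * rho), sum = 0.0;
    for (int i = 0; i <= n; ++i) {
        double s = lo + i * h;
        double w = (i == 0 || i == n) ? 1.0 : (i % 2 ? 4.0 : 2.0);
        sum += w * npdf(s) * ncdf((y - rho * s) / c);
    }
    return sum * h / 3.0;
}

// Reflection in the second argument: Phi2(x, y; rho) = Phi(x) - Phi2(x, -y; -rho).
// Identities of this kind move any (x, y, rho) into the region x <= 0, rho >= 0
// before the diagonal routine is called.
double phi2_reflected(double x, double y, double rho) {
    return ncdf(x) - phi2_quad(x, -y, -rho);
}
```

Note that the reflected form subtracts two quantities of comparable size when ρ → 1, which is exactly the catastrophic cancellation the choice between the paired formulas is meant to avoid.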

Implementation
In the following we will discuss implementation of the algorithm derived in Section 2. The C++ language has been chosen because it is the market standard in quantitative finance, one of the fields frequently requiring evaluation of normal distributions.

Evaluation on the diagonal
Source code (in C++) for the evaluation of Φ₂(x; ρ) as in Section 2.1, for x ≤ 0 and ρ ≥ 0, is provided in Figure 1. In the following we comment on some details of the implementation. Equations (2.2)–(2.10) show that it is reasonable to provide a := 1 − ρ, instead of ρ, as input for the evaluation of Φ₂(x; ρ). Moreover, cf. (2.13) and (2.22), double inversion (i.e., computation of 1 − (1 − z) instead of using z directly) is to be avoided in the reduction algorithm.
Values for Φ(x) and for Φ(λ(ρ) · x) are also expected as input parameters. This makes sense because these values are needed by the reduction algorithm as well (and hence should not be computed twice).
Note that λ(ρ) has to be computed anyway. Constants (all involving π) have been pre-computed in double precision. The recursion stops as soon as adding a new term does not change the computed sum. If the a priori bound for the absolute error, given by (2.1), is less than 5 · 10⁻¹⁷, the upper bound is returned immediately. (Relative accuracy on the diagonal may be increased by dropping this condition, but overall relative accuracy will still be limited by the reduction to the diagonal, cf. Section 3.2, and by the accuracy of the implementation of Φ.) The final result is always checked against the upper and lower bounds.
Note that d_2k and d_2k+1 have opposite signs but comparable order of magnitude. Bracketing them before summation can therefore reduce cancellation error.

Reduction to the diagonal
Source code (in C++) for the evaluation of Φ₂(x, y; ρ) as in Equation (2.11) is provided in Figure 3, and source code for the evaluation of Φ₂(x, 0; ρ_x) − δ_x is provided in Figure 2. In the following we comment on some details of the implementation.
It is assumed that sqr(x) evaluates x*x. The cutoff points |ρ| = 0.99 have been set by visual inspection and might be optimized.

Discussion
Evaluation of Φ₂(x, y; ρ) as in Section 3 requires (at most) four calls to an implementation of the cumulative standard normal distribution Φ (Phi() in the code). The actual choice may well determine both accuracy and running time of the algorithm. For testing purposes I have been using a hybrid method, calling the algorithm from (West 2005, Fig. 2) for arguments with absolute value larger than 0.5, and Phi() from (Marsaglia 2004) otherwise. Besides Phi(), exp() will be called twice, arcsin() or arccos() twice, and sqrt() six times. Everything else is elementary arithmetic.
In general, assuming that all numerical pitfalls in the reduction algorithm have been taken care of, the diagonal is expected to provide a worst case, because the errors of the two calls to Phi2diag() will not cancel. With respect to the reduction algorithm, the case x ≈ ρy, y ≈ ρx, implying |ρ| ≈ 1, is most critical.
In order to give an impression of the algorithm's behaviour, we discuss the results of a simulation study. For each n ∈ {0, . . . , 200} and m ∈ {1, . . . , 10⁶}, the value of Φ₂(x_{n,m}, y_{n,m}; ρ_{n,m}) has been computed via the Phi2() function from Figure 3, where x_{n,m} has been drawn from a uniform distribution on [x_n − 0.05, x_n + 0.05] with x_n := n/10 − 10, y_{n,m} has been drawn from a uniform distribution on [−10, 10], and ρ_{n,m} := 2(Φ(r_{n,m}) − 0.5), where r_{n,m} has been drawn from a uniform distribution on [−10, 10] as well.
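The sampling scheme just described can be sketched as follows (a reconstruction with our own names; std::erfc stands in for the Φ implementation used in the study). Mapping r through Φ concentrates ρ near the difficult corners ±1:

```cpp
#include <cmath>
#include <random>
#include <cassert>

struct Sample { double x, y, rho; };

// One draw (x, y, rho) for cell n of the simulation study: x uniform around
// x_n = n/10 - 10, y uniform on [-10, 10], and rho = 2(Phi(r) - 1/2) with r
// uniform on [-10, 10]. Note that for |r| large the computed rho can round to
// exactly +/-1 in double precision.
Sample draw_sample(int n, unsigned seed) {
    std::mt19937 gen(seed);
    double xn = n / 10.0 - 10.0;
    std::uniform_real_distribution<double> ux(xn - 0.05, xn + 0.05);
    std::uniform_real_distribution<double> u10(-10.0, 10.0);
    Sample s;
    s.x = ux(gen);
    s.y = u10(gen);
    double r = u10(gen);
    s.rho = 2.0 * (0.5 * std::erfc(-r / std::sqrt(2.0)) - 0.5);  // 2(Phi(r) - 1/2)
    return s;
}
```

A production harness would of course reuse one generator across all 201 · 10⁶ draws rather than reseeding per sample; the per-call seed here just keeps the sketch deterministic.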
The C++ implementation from (West 2005) served as a competitor. Both functions have been evaluated against a quad-double precision version of Phi2(), implemented using the QD library (Bailey, Hida, Li & Thompson 2010) and quad-double precision constants.
The diagram in Figure 4 displays, for n ∈ {0, . . . , 200}, the 99% quantile and the maximum of the absolute difference between the double precision algorithms (Phi2 and West) and the quad-double precision algorithm.

Figure 4: Absolute error of implementations of Φ₂ in a simulation study

Apart from a shift due to subtractions for positive x, the errors of Phi2 are rather symmetric around zero. The peaks at |x_n| ≈ 7 are due to the Taylor expansion around zero; the peaks at |x_n| ≈ 2 are due to the Taylor expansion after transformation of the argument. The characteristics of the 99% quantile, in particular the small peaks at |x_n| ≈ 0.7, are already visible in the error of the Φ function used. The maximum error of West almost always stays below that of Phi2. Note that the maximum error of West is determined by the case ρ → −1 and might be reduced by careful consideration of that case.
In the simulation study, Phi2 was a little slower than West: the 201 · 10⁶ evaluations took approximately five minutes with Phi2 and four minutes with West on a fairly standard office PC (and the corresponding quad-double precision evaluations took two days). The number of recursion steps used by Phi2diag increases with |x|. Because of the mathematical transparency of the algorithm it should be easy to find an appropriate trade-off between speed and accuracy by replacing the condition that terminates the recursion.