R Functions to Symbolically Compute the Central Moments of the Multivariate Normal Distribution

The central moments of the multivariate normal distribution are functions of its n × n variance-covariance matrix Σ. These moments can be expressed symbolically as linear combinations of products of powers of the elements of Σ. A formula for these moments derived by diﬀerentiating the characteristic function is developed. The formula requires searching integer matrices for matrices whose n successive row and column sums equal the n exponents of the moment. This formula is implemented in R , with R functions to display moments in L A TEX and to evaluate moments at speciﬁed variance-covariance matrices included.


Introduction
The central moments of an n-dimensional random vector X are defined as where E[· · · ] denotes expected value. Suppose that X is distributed according to the multivariate normal distribution with mean µ and variance-covariance matrix where the variance terms are σ ii , i = 1, . . . , n, the covariance terms are σ ij , i = j, and by symmetry σ ij = σ ji . For the multivariate normal distribution the central moments are not functions of the mean vector µ, and depend only on the variance-covariance terms σ ij .
We wish to compute the symbolic expression of any moment m k 1 ,...,kn in terms of the n(n + 1) symbols σ ij . Note that if n i=1 k i is an odd integer, the moment is 0, so that only moments where this sum is even need be considered.
The multivariate normal distribution is fundamental to mathematical statistics, and its moments play a central role in statistical methodology. Various methods have been developed to numerically compute them (Muirhead 1982, p. 46) and (Anderson 1971, p. 49). Kan (2008) developed a formula (his Proposition 1) for the central moments as a repeated sum. He gives an excellent review of other formulas that have been developed, and cites Isserlis (1918) as deriving the first expression for the central moments. Muirhead (p. 49) used the matrix derivatives of the multivariate normal distribution's characteristic function to derive a formula for multivariate cumulants. Tracy and Sultan (1993) also used matrix derivatives to derive an expression for the distribution's moments (their Theorem 2) based on a recurrence relationship of the derivatives. This article develops a new explicit formula for the moments starting with the derivatives of the characteristic function. The expression for the moments is based on a search algorithm over certain integer matrices. The final goal of this paper is to translate this formula into R functions that produce symbolic representations of moments in terms of the variance-covariance terms σ ij .
The functions described here are available in the package symmoments implemented in the R system for statistical computing (R Development Core Team 2009). Both R itself and the symmoments package (as well as all other packages used in this paper) are available under the terms of the General Public License (GPL) from the Comprehensive R Archive Network (CRAN, http://CRAN.R-project.org/).

Development of the formula
The moments of any distribution can be represented by the derivatives of the distribution's characteristic function. The characteristic function of the multivariate normal distribution is (Muirhead 1982, p. 5, 49) E where t = (t 1 , t 2 , ..., t n ). Within a constant, the moment is the k 1 , ..., k n -order derivative of the characteristic function evaluated at t = 0: where i is the imaginary unit. Expanding the exponential into an infinite sum, this is Since we are to compute the central moment, we will set µ = 0, so that the term it µ will not appear: will ultimately cancel with the negative in the infinite sum, and will be omitted for convenience in notation.

The expression in t is
We need to find the coefficient of t 1 1 t 2 2 · · · t n n in All products in the elements of the sum will occur. Any product will be obtained by choosing σ ij t i t j a certain number of times, say ij . Since one term is chosen from each of terms, ij ij = . Further, for any such matrix ( ij ), there will be a term, since it can be constructed by choosing σ ij t i t j from the first ij terms, and so forth for each (ij) until is exhausted. For any ( ij ) there are 11 . . . nn (14) ways to choose the terms, where this is the multinomial coefficient. So, For the moment, we distinguish between σ ij and σ ji as symbols. As a result, each ij σ ij ij is unique as determined by unique ( ij ). However, t i t j = t j t i , so since each σ ij is combined with two t's, the total exponent in t is 2 . That is, a term t 1 1 t 2 2 · · · t n n must have n i=1 i = 2 . We need to determine the terms for which, for any ( 1 , . . . , n ), We will get t k in the product in the following mutually exclusive cases: So the exponent of t k will be This sum is obtained by adding the sum of ij across row k to the sum across column k, since the diagonal element k occurs in both sums. That is, we get We can now partition the set of ( ij ) in Equation 16 according to these sums, that is, { k , k = 1 . . . n}. As stated before, the sum of the exponents, k , must be 2 .
Since differentiation is distributive with respect to addition and multiplication by constants, the derivative of the product of ts can be determined from the derivatives of the individual terms: Thus, Incorporating the constants from Equation 11, noting again that the negative signs will cancel, the full sum is Setting t = 0, only terms with i = k i for all i will remain. Otherwise, the only in the infinite sum which occurs is for = n i=1 k i /2. So this reduces to This formula shows that evaluating m k 1 ,...,kn symbolically requires enumerating all n × ndimensional matrices of non-negative integers, ( ij ), which satisfy the condition Conceding now that the symbols σ ij and σ ji signify the same entity, we can search for ( ij ) by looking only at terms ij σ ij ij for which i ≤ j. In fact, any other matrix for the term can be obtained by decrementing ij and incrementing ji by the same integer for one or more subscripts for which i < j. For any σ ij ij in the term, this can be done in ij + 1 ways. So there are a total of i<j ( ij + 1) transpositions. The multinomial coefficients derived above must be applied separately to each of these ( ij ) matrices. Thus, the full coefficient for a matrix will include as a multiplier the sum of these coefficients over all of the i<j ( ij + 1) transposed matrices. Let Υ be the set of upper-triangular integer matrices, and, for any ( ij ) ∈ Υ, let Λ(( ij )) be the set of all integer matrices (h ij ) obtained by so transposing ( ij ). Then the sum above can be decomposed in terms of Υ and Λ(( ij )) for each ( ij ) ∈ Υ: But by symmetry, the products in (σ ij ) are the same for each member of Λ(( ij )), specifically ij σ ij ij . So the final formula is

Discussion
Formula 31 was implemented in R (R Development Core Team 2009) with a recursive function that determines the set of upper-triangular integer matrices that satisfy Criterion 29. A second function calculates their associated coefficients. Additional functions were written to create L A T E X (L A T E X3 Project Team 2009) code to display the moments symbolically, and to calculate the moments for specified variance-covariance matrices.
The potential for complexity in these computations is seen from the results in Table 1. In this table, n is the dimension of the multivariate vector and #(σ ij ) is the number of distinct elements in the variance-covariance matrix, N = n(n+1 2 . Size is measured by two values, M and r. M is the total of the exponents of the moment, M = n i=1 k i . The value r is the number of terms for a moment E[X 1 1 , ..., X 1 n ] with all exponents equal to 1, which is (2M − 1)!/(2 M −1 (M − 1)!) (Wikipedia 2009). Example is a moment of the given Size. Potential Terms is a maximum for the number of ( ij ) matrices to be checked for this example, determined as the product of 1 + max(k i , k j ) over i = j times the product of 1 + [k i /2] over i, where [ ] denotes truncation. The last column, # Terms, is the actual number of terms in the moment as determined by the functions.
It is clear that computation of high-order moments will be very intensive. Kan reports similar computational difficulties. More efficient or targeted algorithms for searching the matrices in Criterion 29 would allow higher order moments to be computed. However, it is likely that the problem is intractable as described by Garey and Johnson (1979). For example, symbolic computation of the central moments E[X k 1 1 · · · X kn n ] for (k 1 , ..., k n ) < (k, ..., k) for a fixed k may be NP-hard in n , or computation of E[X k 1 1 · · · X kn n ] may be NP-hard in max(k 1 , . . . , k n ) for fixed n.
The formula derived here could be expanded to incorporate mean terms, which would allow computation of non-central moments. These moments could also be used to approximate other integrals integrated against the multivariate normal distribution by using a Taylor expansion in several variables (Fulks 1961). Such an approximation would be a linear combination of non-central moments.
Finally, Criterion 29 might arise in other contexts, such as networks (Stergiou and Siganos 1996). For example, suppose that there are n airports and airport i can accommodate k i arrivals or departures on a day, where a plane may take off and land at the same airport. This network is illustrated in Figure 1. The problem is to determine the set of flights between airports that totally expend the capacities, k i , of all airports. For this problem, ij represents  E[X 2 1 X 2 2 X 2 3 X 2 4 X 2 5 X 2 6 X 2 7 X 2 8 X 2 9 ] 7.68484 × 10 19 6,763,895 Table 1: Examples of sizes of moment problem (*: The function aborted after 2 days due a space allocation problem.) Figure 1: Airport capacity network. the number of flights from airport i to airport j, and ii is the number of flights which start and end at airport i. The set of flights is the set satisfying Criterion 29.

R functions to compute normal multivariate moments
In the R functions to compute the central moments, upper-triangular matrices ( ij ) for n dimensions are represented as vectors of length n(n + 1), with row 1 followed by row 2, etc. For example, for n = 2, ( ij ) is represented as ( 11 , 12 , 22 ). Each such matrix represents the exponents for a single product of σ ij s. For example, (1, 2, 0) represents σ 1 11 σ 2 12 σ 0 22 = σ 11 σ 2 12 . The representations are accumulated and stored internally, raising the possibility of space allocation problems as encountered in the second to last example in Table 1. This problem could be alleviated by saving the representation matrix to a file instead.
The function multmoments searches the integer matrices for those satisfying Criterion 29. This is a recursive function which implements a branch-and-bound algorithm. The function multmoments is called by callmultmoments. This function initializes variables, determines the coefficients of the terms from the upper-triangular representations, and returns a list consisting of the original moment vector, the set of representations, and the corresponding set of coefficients. This list is set to class moment. The moment class has four methods: print, toLatex, evaluate, and simulate. The print method prints a moment object, usually the output of callmultmoments, showing a mathematical representation of the moment, followed by the rows of the representation with the corresponding coefficient attached.
The toLatex method uses a moment object, usually the output of callmultmoments, to determine the L A T E X code for the moment sorted lexicographically. Note that it inserts double backslashes where L A T E X requires a backslash; these can be reset to single backslashes by writing the output to a file using the R base function writeLines, as illustrated below.
The evaluate method determines the value of a moment object for a specified variancecovariance matrix Σ, which must be represented as an upper-triangular matrix in vector form.
The simulate method uses Monte Carlo integration Rizzo (2008) to numerically approximate a moment object for a specified mean and variance-covariance matrix Σ (represented as a square matrix in vector form), with a specified number of random samples. Note that simulate uses only the moment definition, not the representation, so can be used with any moment in vector notation by converting the vector to a moment object. The simulate method uses the rmvnorm function from the mvtnorm package (Genz et al. 2009). Computation of the moments was validated by comparison to specific published cases and to known types such as E[X k ] and E[X 1 1 , . . . X 1 n ], through consistency checks, and to comparison to numerically-computed moments. For consistency, the representations for moments which are permutations of each other must be the same within ordering across rows and columns; for example, E[X 2 1 X 4 2 ] and E[X 4 1 X 2 2 ] have the same representations within ordering. Also, a moment containing one or more odd powers will evaluate to 0 for a specified covariance matrix if the component of the random variable corresponding to one of the odd powers is independent of the other components, that is, if the off-diagonal covariance terms are zero for this component. The functions were confirmed to have this property for a number of cases.
Further validation was done using Monte Carlo integration. Various moments of dimension up to six were compared for two values of the covariance matrix, the identity matrix and a covariance matrix computed from ten random normal vectors obtained using the identity covariance matrix and then adding 1 to the diagonal elements to increase the variability. Forty estimates of the moment were then computed using the simulate method, each obtained from one million randomly generated multivariate normal vectors. From these estimates, 95% confidence intervals were computed and compared to the moment estimates using the estimate method. In the 37 experiments done using an identity matrix, the moments computed from the symbolic representations fell outside the confidence interval in three (8.1%) of the cases. However, in all three cases callmultmoments produced the value of zero, which is correct since the moments contained odd powers and the components of the vectors were independent. In the 37 experiments done using a randomly-generated matrix, the moments computed from the symbolic representations fell outside the confidence interval in two (5.4%) of the cases (E[X 2 1 X 9 2 X 11 3 ] and E[X 2 1 X 4 2 X 7 3 X 7 4 X 8 5 ]), where the symbolically computed values lay slightly outside the confidence intervals. These two cases were run again using five million random normal vectors for each of the 40 estimates, and the moments computed from the symbolic representations fell inside the confidence intervals. These results provide further evidence that the functions given here are correct, and in fact are superior to numerical integration in obtaining moments, since it only requires a simple evaluation of a polynomial with integer coefficients. Numerical integration using adaptive methods was also implemented but worked poorly for higher dimensions (Kuonen 2003).

Compute the representation of a central moment (callmultmoments)
The following code calculates a central moment and shows the three components, moment, representation, and coefficients.