Multivariate Locally Stationary Wavelet Process Analysis with the mvLSW R Package

This paper describes the R package mvLSW. The package contains a suite of tools for the analysis of multivariate locally stationary wavelet (LSW) time series. Key elements include: (i) the simulation of multivariate LSW time series for a given multivariate evolutionary wavelet spectrum (EWS); (ii) estimation of the time-dependent multivariate EWS for a given time series; (iii) estimation of the time-dependent coherence and partial coherence between time series channels; and, (iv) estimation of approximate confidence intervals for multivariate EWS estimates. A demonstration of the package is presented via both a simulated example and a case study with EuStockMarkets from the datasets package. This paper has been accepted by the Journal of Statistical Software. Presented code extracts demonstrating the mvLSW package is performed under version 1.2.1.


Introduction
Technological advances in sensors and other data recording mechanisms have led to an increased need to efficiently and accurately analyse multivariate time series. Areas of research where such methods are commonly required include economics (Rua and Nunes 2009), medicine (Cribben 2012), telecommunications (Bardwell, Eckley, Fearnhead, Smith, and Spott 2016) and environmental science (Shama 2007). Historically, one might approach such a multivariate time series challenge by making the assumption that the underlying process was second order stationary. However, there are an increasing number of cases where such global stationarity assumptions are not tenable due to the second order structure of the process changing over time. Such series are said to be non-stationary.
Over the years, a number of popular approaches for modelling non-stationary time series have been proposed, though activity has predominantly focussed on the univariate time series setting. Notable contributions in the univariate setting include the seminal work of Priestley on evolutionary processes (Priestley 1965), time varying moving average methods (Hallin 1986) and locally stationary processes (Dahlhaus 1997;Nason, von Sachs, and Kroisandt 2000). For a recent review of the literature in this area, we refer readers to the excellent review article by Dahlhaus (2012). Implementations of these models in R (R Core Team 2015) include the LSTS (Olea, Palma, and Rubio 2015) and wavethresh (Nason 2008(Nason , 2016. These respectively implement the locally stationary Fourier and wavelet approaches for univariate time series. A multidimensional implementation of a locally stationary framework is also available for regular lattice processes (see Eckley, Nason, and Treloar (2010) and Eckley and Nason (2011) for details).
We are, of course, by no means the first to consider the challenge of modelling non-stationary multivariate time series. Early contributions include those by Ombao, von Sachs, and Guo (2005) who extend the smoothed localised exponential (SLEX) model to the multivariate setting and Sanderson, Fryzlewicz, and Jones (2010) who apply the wavelet methodology to bivariate and multivariate time series respectively. The SLEX framework, originally proposed by Ombao, Raz, von Sachs, and Guo (2002), seeks to segment a non-stationary time series into dyadic stationary blocks that then permit the employment of standard methods to form a time-dependent Fourier analysis. The multivariate extension of SLEX, proposed by Ombao et al. (2005), enables the estimation of the cross-spectrum and investigations into the coherence structure. Conversely, the work of Sanderson et al. (2010) and Cho and Fryzlewicz (2015) decompose the dependence within a locally stationary time series into two multiscale components; the within-channel spectral structure and the cross-spectrum (between-channel) structure for the bivariate and p-variate time series cases respectively. However neither of these recent wavelet-based contributions directly address a key modelling challenge for truly multivariate non-stationary signals, namely the identification of whether whether the connection between two channels is either direct or indirect (i.e., driven by other observed channel(s)).
This article presents the R implementation of an alternative formulation for multivariate locally stationary wavelet (LSW) model, proposed by Park, Eckley, and Ombao (2014). The work extends the locally stationary wavelet framework of Nason et al. (2000) to a multivariate setting, also enabling the estimation of the within-channel spectral structure and cross-spectrum. Park et al. (2014) also introduce the concepts of local coherence and, crucially, local partial coherence within the multivariate locally stationary wavelet setting. The mvLSW package (Taylor, Park, Eckley, and Killick 2017) implements the work of Park et al. (2014), building upon the univariate locally stationary wavelet time series implementation in wavethresh (Nason 2008(Nason , 2016. Specifically mvLSW provides functionality for: the simulation of multivariate LSW time series for a given evolutionary wavelet spectrum (EWS); estimation of the multivariate EWS for a given time series with point-wise confidence intervals; and, estimation of the local coherence and local partial coherence between time series channels. The mvLSW package is available for download from the comprehensive R archive network (CRAN). Note that throughout this article all mvLSW package references relate to version 1.2.1 of the package, as available on CRAN.
The paper is structured as follows. The framework of multivariate LSW modelling is briefly introduced in Section 2 together with the introduction of the mvLSW package for defining a multivariate EWS and simulating a multivariate LSW time series. Section 3 describes estimation of the multivariate EWS and its implementation in R. The concept and estimation of localised coherence and partial coherence is discussed in Section 4. We then introduce the functions developed to implement this coherence estimation framework, prior to summarising the mechanism by which approximate confidence intervals can be constructed in Section 5. A complete demonstration of the mvLSW package is presented in Section 6 including a case study which makes use of the European financial index time series EuStockMarkets that is accessible from the datasets package.

Framework
This section briefly introduces the multivariate locally stationary wavelet (LSW) time series and introduces its implementation within R in the mvLSW package. Section 2.1 introduces the multivariate LSW framework of Park et al. (2014) and a description of the time-frequency power decomposition by the multivariate evolutionary wavelet spectrum (EWS). Section 2.2 presents a worked example for a trivariate LSW process and demonstrates how to simulate a time series with the specified spectral form.

The multivariate LSW process
Adopting the notation of Park et al. (2014), the doubly indexed P -variate stochastic process {X t;T } for t = 1, . . . , T with dyadic length, i.e., T = 2 J for some J ∈ N, is said to be a multivariate LSW process if it is represented by: (1) As in the earlier univariate LSW work of Nason et al. (2000), the {ψ j,k } denote the set of discrete non-decimated wavelets for each level j and location k pair such as the Daubechies compactly supported wavelet (Daubechies 1990). The random vectors {z j,k = (z j,k ) } are uncorrelated innovations with zero expectation and P ×P identity variance-covariance matrix. Finally, V j (u) denotes the lower-triangular P ×P transfer function matrix for the given level index at rescaled time u := t/T ∈ (0, 1) . Some smoothness conditions are assumed for the elements of the transfer function matrix, controlling its behaviour (see Park et al. 2014, for details). For notational ease, the explicit dependence on T shall henceforth be suppressed but its dependence on the process and derived estimates shall naturally be assumed.
Within the above modelling framework, arguably the key element of interest is the transfer function matrix. Since the innovations are orthogonal with unit variance and the wavelets are not channel specific, then all forms of dependency between and within channels must be encapsulated by the transfer function matrix. For instance, the diagonal element V j (u) (p,p) for channel p, for p = 1, . . . , P , defines the time-varying amplitudes of the respective wavelet at level j akin to the univariate scenario of Definition 1 in Nason et al. (2000). Furthermore, if the off-diagonal element V j (u) (p,q) , for p > q, is non-zero then there is a time-varying dependence between channels p and q, but if this element is zero for all time and levels then the two channels are uncorrelated.
The transfer function matrix is also useful in forming the spectral representation of a multivariate LSW time series. Specifically, the power contained within a multivariate LSW process is described by the multivariate evolutionary wavelet spectrum (EWS) taking the form: where V j (u) denotes the transpose of V j (u). The diagonal element S (p,p) j (u) denotes the auto-spectrum for channel p, whilst the off-diagonal element S (p,q) j (u), for p = q, denotes the cross-spectrum between the channel pair p and q, and it describes how the power in the process is shared between channels.

Trivariate worked example
The multivariate EWS defined in Equation 2 is a concise description of the time-varying distribution of power for a multivariate LSW time series. To demonstrate this, we introduce a trivariate LSW process which will be used as a worked example throughout the paper. We assume that the process is built using discrete Haar wavelets, and that the EWS for this process has non-zero power at only the second finest level, j = 2, given by: Note that the auto-spectrum for the second channel is constant and so this channel is marginally stationary. The auto-spectrum for the first channel increases over the unit time interval and so, with the spectrum at all other levels equal to zero, the variability of this channel steadily increases. Conversely, the auto-spectrum for the third channel decreases over time. Dependence between the channels is also time-varying within this example.
Due to the dimensions of a multivariate EWS, visualisations that illustrate the distribution of power over levels and locations can only be obtained via slices through this 4D object. For instance, the plot in Figure 1 displays the trivariate EWS at the second level defined in Equation 3 that is invoked by the following command: The essential input arguments used here are: • x: An object of mvLSW class.
• style: Numerical index defining the plotting style: 1. Single spectra plot between channel pair (p, q) and at level j is specified via info = c(p, q, j). 2. Panel plot across channels at level j is specified via info = j.
3. Panel plot across levels for channel pair (p, q) is specified via info = c(p, q), 4. As style = 3, but presented as an image plot.
Additional arguments such as ylim and lwd can be supplied to customise the generated plot. Point-wise approximate confidence intervals introduced in Section 5 can be drawn by supplying a list containing the upper and lower bounds via the optional Interval argument. Furthermore, the inclusion of the logical argument diag, when style = 2, suppresses the drawing of the diagonal plots on the panel. This is particularly useful for plotting the local coherence and partial coherence estimates discussed in Section 4.
The transfer function matrix is obtained by factorising the multivariate EWS matrix using Cholesky decomposition. A realisation of a multivariate LSW process with a pre-specified EWS is therefore simulated using Equation 1 with innovations sampled independently from some defined distribution that has zero mean and unit variance. This procedure is implemented by the rmvLSW command. For example, the following extract generates a realisation of a trivariate LSW time series, with EWS structure as defined by Equation 3 with Gaussian innovations: R> set.seed(1) R> X <-rmvLSW(Spectrum = True_mvEWS, noiseFN = rnorm) R> plot(x = X, main = "Gaussian mvLSW time series") The input arguments for rmvLSW are: • Spectrum: The multivariate EWS as a mvLSW object.
• noiseFN: Function for generating the orthogonal innovation process with zero expectation and unit variance. Additional arguments provided to rmvLSW are passes to the supplied function.
The S3 simulate method can alternatively be called to simulate a realisation of a multivariate LSW time series for a provided multivariate EWS. An example of simulated trivariate LSW time series is presented in Figure 2.

Multivariate EWS estimation
The multivariate LSW framework presented in the previous section enables the generation of a time series that possess spectral properties defined by a specified multivariate EWS.
Conversely, interest often lies in understanding the spectral structure for a given multivariate LSW time series. Such a scheme was proposed by Park et al. (2014), estimating the multivariate EWS, {Ŝ j,k } for levels j = 1, . . . , J and locations k = 0, . . . , T − 1. To begin, each channel of the time series is independently transformed based on a given wavelet to determine the empirical wavelet coefficient vector, j,k ] , with elements: The wavelet periodogram is then defined by the matrix set for each level and location pair as: Mirroring a result of Nason et al. (2000) for univariate time series, Park et al. (2014) established that Equation 5 is both biased and has non-vanishing estimator variance. These issues are overcome by smoothing and correcting the raw periodogram estimate in defining the multivariate EWS estimator: Here Perhaps unsurprisingly, the bias correction in Equation 6 means that the spectral matrix estimates {Ŝ j,k } are no longer guaranteed to be positive definite. This leads to difficulties in evaluating quantities that are derived from the multivariate EWS, such as the localised coherence and partial coherence which we introduce in Section 4. The matrices of the multivariate EWS estimate must therefore be regularized using, for example, the approach of Schnabel and Eskow (1999) to ensure that subsequently derived estimates are themselves valid.

Trivariate worked example continued
Estimation of the multivariate EWS is implemented using the mvEWS function. The following code extract estimates the trivariate EWS for the time series simulated in Section 2.2. The wavelet transform is performed using the Haar wavelet and smoothing of the periodogram is performed using the rectangular kernel w M (m) = (2M + 1) −1 for m ∈ [−M, M ] with parameter M = √ T = 32.
• filter.number, family: the wavelet, passed to wd from wavethresh.
• kernel.name, kernel.param: Defines the smoothing kernel that is evaluated by the base command kernel. The argument kernel.name is a character string that names any kernel type evaluated by kernel.
• bias.correct: A logical variable, indicating whether the biased or corrected estimator should be estimated.
Estimation of the trivariate EWS involves a number of steps including wavelet transformation, smoothing, bias correction and regularisation. The function returns an object of class mvLSW. Details about each step are stored within this object, under the information list. A summary of these details can be examined by invoking the print or summary command.

R> summary(EWS)
== The printed summary states the dimension of the object, the wavelet and smoothing method used in deriving the estimate. The item Type under the smoothing heading identifies that the same kernel function is applied to all levels. Smoothing can be performed on a by-level basis. For further details, the reader is referred to the package's manual. A measure of the smoothing performance is calculated using a generalized cross validation (GCV) criterion (Ombao et al. 2001). The sundries information states whether or not the bias correction has been applied and the minimum eigenvalue across all time and level pairs to confirm whether all estimated spectral matrices are positive definite.
We should note that the package requires a time series of dyadic length. In practice, not all series will be of length 2 J for some J ∈ N. As Nason and Silverman (1994) describe, various techniques can be used to overcome this challenge. These include the use of reflective or periodic boundary conditions; zero padding and truncating either the head or tail of the data. Naturally, there can be various advantages and disadvantages associated with each of these approaches, depending on the underlying data generation process (Percival and Walden 2010).

Localised coherence and partial coherence
Dependence within a multivariate time series can occur both within and between channels.
Coherence provides a measure of the linear dependence between any channel pair. However, if the multivariate time series is non-stationary then the coherence measure may vary over time. Park et al. (2014) introduced the concept of local coherence based on the locally stationary wavelet framework defined in Section 2.1. At a particular level, the local coherence matrix function is defined by: where D j (u) = diag{[S (p,p) j (u)] −1/2 : p = 1, . . . , P }. The off-diagonal elements of the localised coherence matrices may take any value within [−1, 1] where strong linear dependence is identified by values near ±1. Whilst this coherence measure provides an indication of strong dependence between a channel pair, it cannot distinguish whether this relationship is direct or indirect. In other words, it cannot identify whether the relationships occur because of direct dependencies with other (observed) channels. To quantify direct linear dependence, Park et al. (2014) introduced the local partial coherence matrix function. For any given level, this is defined as follows: where G j = S j (u) −1 and H j (u) = diag{[G (p,p) j (u)] −1/2 : p = 1, . . . , P }.

Trivariate worked example continued
Within the mvLSW package both forms of coherence can be estimated using the coherence command. This implements a 'plug-in' estimator proposed by Park et al. (2014). The input arguments for coherence are: • object: The multivariate EWS as a mvLSW object.
• partial: A logical variable, indicating whether to evaluate the partial coherence?
Here, the distinction between evaluating the localised coherence and localised partial coherence is simply achieved by toggling the logical argument as demonstrated below with the trivariate EWS estimate derived in Section 3.1: R> RHO <-coherence(object = EWS, partial = FALSE) R> GAMMA <-coherence(object = EWS, partial = TRUE) As with the multivariate EWS, the returned coherence measure is stored within a 4D array of class mvLSW. A visualisation of a slice through this object can be generated by invoking the plot command as described in Section 2.2 for appropriate style and info arguments. Note that the diagonal elements of both coherence matrices do not contain any useful information and so can be suppressed when generating the image by toggling the diag logical argument as demonstrated in the following extract for the trivariate example: R> plot(x = RHO, style = 2, info = 2, diag = FALSE, ylim = c(-1, 1), + lwd = 2, ylab = "Coherence") R> plot(x = GAMMA, style = 2, info = 2, diag = FALSE, ylim = c(-1, 1), + lwd = 2, ylab = "P. Coh.") The localised coherence and partial coherence estimates at the second level are presented in Figure 3. Recalling the form of the EWS in Equation 3, it is perhaps unsurprising that the estimates of the localised coherence at the second level are mostly positive and gradually increase over time. However, the localised partial coherence estimate between the second and third channels is closer to zero compared to its localised coherence estimate, suggesting that the measured linear dependence can mostly be explained by their dependence with the first channel.

Confidence intervals for the multivariate EWS estimate
Under reasonably mild assumptions, Park (2014) established that the approximate, point-wise 100(1 − α)% confidence interval for a given element of the multivariate EWS, S (p,q) j,k , is given by: where Φ(x) denotes the standard Gaussian cumulative distribution function. The asymptotic variance of an element of the multivariate EWS estimator is easily shown to be: where the covariance between any pair of wavelet periodogram elements is: Cov Here, B j,l,h (λ) = τ Ψ j,h (τ )Ψ l,h (τ − λ) is the inner product of cross-level autocorrelation wavelet inner product function where Ψ j,l (τ ) = k ψ j,k (0)ψ l,k+τ (0) (Fryzlewicz and Nason 2006). Note that the special case {A j,l = B j,j,l (0)} for levels j, l = 1, . . . , J denotes the inner product matrix that quantifies the leakage bias of the raw periodogram, see Section 3. The variance is dependent on knowing the true EWS but this is typically unavailable. Consequently, the variance is estimated by substituting the multivariate EWS estimateŜ j,k .

Trivariate worked example continued
Before calculating the asymptotic variance, the autocorrelation wavelet inner product function B j,l,h (λ) must first be calculated. The command AutoCorrIP evaluates the inner product functions for a given wavelet such as the Haar wavelet demonstrated below: R> HaarACWIP <-AutoCorrIP(J = J, filter.number = 1, family = "DaubExPhase") The AutoCorrIP command returns a (2T + 1) × J × J × J array that corresponds to the lag λ and level indices j, l and h respectively.
As with the localised coherence and partial coherence estimates, the variance is evaluated as a 'plug-in' estimator based on a given multivariate EWS estimate. This is implemented by varEWS as demonstrated below for the trivariate EWS estimate evaluated in Section 3.1:
Given the variance estimate, the approximate, point-wise 95% confidence interval for the trivariate EWS elements is derived using ApxCI.
• var: Variance estimate of the multivariate EWS as a mvLSW object.
• alpha: Type I error.
The returned object is a list with two items named "L" and "U". These are objects having class mvLSW, that contain the point-wise lower and upper interval bounds respectively. The point-wise confidence intervals can be included on a plot of the multivariate EWS estimate using the plot command demonstrated in Section 2.2 with the additional Interval argument. The 95% confidence intervals for the trivariate EWS estimate in Figure 4 is generated by the following command: R> plot(x = EWS, style = 2, info = 2, Interval = CI) To demonstrate the accuracy of the 95% confidence intervals, the estimates are compared against the 95% bootstrap intervals. The following extract evaluates the bootstrap intervals from 100 simulated time series with power spectrum defined by the estimate trivariate EWS in Section 3.1. The interval estimates for the second level are presented in Figure 5. It is clear from comparing these estimates that the analytically derived intervals are close to those obtained via bootstrapping.

European financial indices
We now illustrate the application of the package using EuStockMarkets, the European financial indices from the datasets package. The data set contains the daily index of the German (DAX), Swiss (SMI), French (CAC) and British (FTSE) markets. The log-returns are evaluated to remove any trend features and the series is truncated to produce a time series of length T = 1024. Figure 6 presents a trace plot of the time series. Each channel of the time series is independently deemed to be non-stationary according to the test implements by BootTOS from the costat package (Nason and Cardinali 2013).

Discussion
The mvLSW package contains a number of tools to assess a multivariate non-stationary time series under the locally stationary wavelet framework defined by Park et al. (2014). The main command mvEWS estimates the multivariate EWS for a LSW time series, Section 3.1, and is flexible for the user to specify the best wavelet transform and smoothing kernel that is most appropriate to explore the spectral structure of the time series. However, perhaps more usefully, the package also includes tools for analysing the dependence structure that occurs between pairs of channels, as described by the localised coherence and partial coherence measures presented in Section 4.1. Uncertainty in the multivariate EWS estimate can be quantified analytically by the routines demonstrated in Section 5.1 in evaluating approximate, point-wise, 95% confidence intervals for the spectral elements.
A complete case study for examining the spectral structure of the European financial indices EuStockMarkets was demonstrated in Section 6. This showed that a major non-stationary feature occurs due to a peak in power at the finest level in 1997. Furthermore, analysis of the coherence structure identifies that the indices are strongly and positively linearly dependent with a notable fraction arising from the indirect relationship between all four markets.