COGARCH(p, q): Simulation and Inference with the yuima Package

In this paper we show how to simulate and estimate a COGARCH(p, q) model in the R package yuima. Several routines for simulation and estimation are introduced. In particular, for the generation of a COGARCH(p, q) trajectory, the user can choose between two alternative schemes. The first is based on the Euler discretization of the stochastic differential equations that identify a COGARCH(p, q) model, while the second uses the explicit solution of the equations defining the variance process. Estimation is based on matching the empirical autocorrelation function with its theoretical counterpart. Three different approaches are implemented: minimization of the mean squared error, minimization of the mean absolute error, and the generalized method of moments with a continuously updated weighting matrix. Numerical examples are given in order to explain the methods and classes used in the yuima package.


Introduction
The continuous-time GARCH(1, 1) process has been introduced in Klüppelberg, Lindner, and Maller (2004) as a continuous counterpart of the discrete-time generalized autoregressive conditional heteroskedastic (GARCH hereafter) model proposed by Bollerslev (1986). The idea is to develop in continuous time a model that is able to capture some stylized facts observed in financial time series exploiting only one source of randomness for returns and for variance dynamics. Indeed, in the continuous-time GARCH process (COGARCH hereafter), the stochastic differential equation, that identifies the variance dynamics, is driven by the discrete part of the quadratic variation of the same Lévy process used for modeling returns.
The continuous nature of the COGARCH makes it particularly appealing for describing the behavior of high frequency data (see Haug, Klüppelberg, Lindner, and Zapp 2007, for an application of the method of moments using intraday returns). It is possible to estimate the model using daily data and then generate the sample paths of the process with intraday frequency. This feature makes these models more useful than their discrete counterparts especially for risk management purposes. The recent crises show the relevance of monitoring the level of risk using high frequency data (see Shao, Lian, and Yin 2009; Dionne, Duchesne, and Pacurar 2009, and references therein for some further explanation on the value-at-risk computed on intraday returns).
The generalization to higher order COGARCH(p, q) processes has been proposed in Brockwell, Chadraa, and Lindner (2006) and Chadraa (2009). Starting from the observation that the variance of a GARCH(p, q) is an autoregressive moving average model (ARMA hereafter) with order (q, p − 1) (see Brockwell et al. 2006, for more details), the variance is modeled with a continuous-time autoregressive moving average process (CARMA hereafter) with order (q, p − 1) (see Brockwell 2001; Tómasson 2015; Brockwell, Davis, and Yang 2007, and many others) driven by the discrete part of the quadratic variation of the Lévy process in the returns. Although this representation is different from that used by Klüppelberg et al. (2004) for the COGARCH(1, 1) process, the latter can again be retrieved as a special case.
Many authors have recently investigated the COGARCH(1, 1) model from a theoretical and an empirical point of view (see Maller, Müller, and Szimayer 2008; Kallsen and Vesenmayer 2009; Müller 2010; Bibbona and Negri 2015, and many others). Some R code for the estimation and the simulation of a COGARCH(1, 1) driven by a compound Poisson or by a variance gamma process is available in Granzer (2013). For the general COGARCH(p, q), the main results are given in Brockwell et al. (2006) and Chadraa (2009). The aim of this paper is to describe the simulation and the estimation schemes for a COGARCH(p, q) model in the yuima package version 1.0.2 developed by Brouste, Fukasawa, Hino, Iacus, Kamatani, Koike, Masuda, Nomura, Ogihara, Shimizu, Uchida, and Yoshida (2014). To the best of our knowledge, the yuima version 1.6.8 (YUIMA Project Team 2017) used in this paper is the only R package available on the Comprehensive R Archive Network (CRAN; https://CRAN.R-project.org/package=yuima) that allows the user to manage a higher order COGARCH(p, q) model driven by a general Lévy process. Moreover, the estimation algorithm gives the option to recover the increments of the underlying noise process and to estimate the Lévy measure parameters. We recall that a similar procedure is available in yuima also for the CARMA model (see Iacus and Mercuri 2015, for a complete discussion). The estimated increments can be used for forecasting the behavior of high frequency returns, as shown in the empirical analysis. We also provide a comparison with GARCH models. At the moment, there exist several choices for GARCH modeling in R, see for instance fGarch (Wuertz and Chalabi 2016), tseries (Trapletti and Hornik 2015) and rugarch (Ghalanos 2015). For our comparison the natural choice seems to be the rugarch package, which allows the user to forecast the time series using both parametric distributions and estimated increments.
The outline of the paper is as follows. In Section 2 we discuss the main properties of the COGARCH(p, q) process. In particular we review the condition for the existence of a strictly stationary variance process, its higher moments and the behavior of the autocorrelation of the squared increments of the COGARCH(p, q) model. In Section 3 we analyze two different simulation schemes. The first is based on the Euler discretization while the second uses the solution of the equations defining the variance process. Section 4 is devoted to the estimation algorithm. In Section 5 we describe the main classes and corresponding methods in the yuima package and in Section 6 we present some numerical examples about the simulation and the estimation of a COGARCH(p, q) model and we conduct a comparison with the GARCH(1, 1) model. Section 7 concludes the paper. In the Appendix we derive the higher moments of a COGARCH(p, q) process and its autocorrelation function by means of Teugels martingales.

COGARCH(p, q) models driven by a Lévy process
In this section, we review the mathematical definition of a COGARCH(p, q) process and its properties. In particular we focus on the conditions for the existence of a strictly stationary COGARCH(p, q) process and compute the first four unconditional moments. Their existence plays a central role in the computation of the autocorrelation function of the squared increments of the COGARCH(p, q) model and consequently in the estimation procedure implemented in the yuima package.
The COGARCH(p, q) process, introduced in Brockwell et al. (2006) as a generalization of the COGARCH(1, 1) model, is defined through the following system of stochastic differential equations:

  dG_t = √(V_t) dL_t,
  V_t = a_0 + a⊤ Y_t,
  dY_t = A Y_{t−} dt + e (a_0 + a⊤ Y_{t−}) d[L, L]^d_t,    (1)

where q and p are integers such that q ≥ p ≥ 1. The state process Y_t is a vector with q components:

  Y_t = [Y_{1,t}, . . . , Y_{q,t}]⊤.

Vector a ∈ R^q is defined as a = [a_1, . . . , a_p, a_{p+1}, . . . , a_q]⊤ with a_{p+1} = · · · = a_q = 0. The companion q × q matrix A has ones on the superdiagonal, last row equal to (−b_q, −b_{q−1}, . . . , −b_1), and zeros elsewhere. Vector e ∈ R^q contains zero entries except for the last component, which is equal to one.
[L, L]^d_t is the discrete part of the quadratic variation of the underlying Lévy process L_t and is defined as:

  [L, L]^d_t = Σ_{0<s≤t} (ΔL_s)²,    (2)

where ΔL_s := L_s − L_{s−} is the jump of L at time s. If the process L_t is a càdlàg pure-jump Lévy process, the quadratic variation coincides with its discrete part in (2).
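Since the driving noise enters the variance dynamics only through [L, L]^d, it may help to see the discrete part of the quadratic variation computed on a toy jump path. The sketch below is in Python purely for illustration (the yuima package itself is written in R; the function name is ours):

```python
def discrete_quadratic_variation(jump_sizes):
    """[L, L]^d_t for a pure-jump Levy path: running sum of squared jumps."""
    qv, path = 0.0, []
    for dl in jump_sizes:
        qv += dl * dl
        path.append(qv)
    return path

# Example: three jumps of a compound Poisson process
print(discrete_quadratic_variation([0.5, -1.0, 2.0]))  # [0.25, 1.25, 5.25]
```

Note that the sign of the jumps is irrelevant: only squared jump sizes feed the variance process.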
Remark 1 A COGARCH(p, q) model is constructed starting from the observation that in the GARCH(p, q) process, its discrete-time counterpart, the dynamics of the variance is a predictable ARMA(q, p − 1) process driven by the squares of the past increments. In the COGARCH(p, q) case, the ARMA process is replaced by a CARMA(q, p − 1) model (see Brockwell 2001, for details about CARMA models driven by a Lévy process) and the infinitesimal increments of the COGARCH(p, q) are defined through the differential of the Lévy process L_t as done in (1).
The COGARCH(p, q) process generalizes the COGARCH(1, 1) process, which was introduced following different arguments from those used in the (p, q) case. However, choosing q = 1 and p = 1 in (1), the COGARCH(1, 1) process developed in Klüppelberg et al. (2004) and Haug et al. (2007) can be retrieved through straightforward manipulations and, in order to obtain the same parametrization as in Proposition 3.2 of Klüppelberg et al. (2004), the following equalities are necessary:

  β = a_0 b_1,  η = b_1,  φ = a_1.

Before introducing the conditions for strict stationarity and the existence of unconditional higher moments, it is worth noting that the state process Y_t can be seen as a multivariate stochastic recurrence equation (see Brandt 1986; Kesten 1973), and the corresponding theory can be applied to the COGARCH(p, q) process (see Brockwell et al. 2006; Chadraa 2009, for more details) in order to derive its main features.
In the case of compound Poisson driving noise, the representation through stochastic recurrence equations is direct, in the sense that the random coefficients of the state process Y_t can be written explicitly, while in the general case it is always possible to identify a sequence of compound Poisson processes that converges to the chosen driving Lévy process.
In the following, we consider the case of distinct eigenvalues, which implies that matrix A is diagonalizable, that is:

  A = S diag(λ_1, λ_2, . . . , λ_q) S^{−1},    (3)

where λ_1, λ_2, . . . , λ_q are the eigenvalues of matrix A, ordered so that

  ℜ(λ_q) ≤ · · · ≤ ℜ(λ_2) ≤ ℜ(λ_1).

Through the theory of stochastic recurrence equations, Brockwell et al. (2006) provide a sufficient condition for the strict stationarity of a COGARCH(p, q) model. Under the assumption that the measure ν_L(l) is non-trivial, the state process Y_t converges in distribution to a random variable Y_∞ if there exists some r ∈ [1, +∞] such that:

  ∫_R ln(1 + ‖S^{−1} e a⊤ S‖_r l²) dν_L(l) < −ℜ(λ_1),    (6)

for some matrix S satisfying (3). The operator ‖·‖_r denotes the vector L^r norm if its argument is a vector and the associated natural matrix norm if the argument is a matrix (see Serre 2002, for details).
If we choose as starting condition Y_0 =_d Y_∞, then the process Y_t is strictly stationary and consequently the variance process V_t is strictly stationary. As shown in Klüppelberg et al. (2004) and remarked in Brockwell et al. (2006), the condition in (6) is also necessary in the COGARCH(1, 1) case, where it simplifies to:

  ∫_R ln(1 + a_1 l²) dν_L(l) < b_1.    (7)

Using the theory of stochastic recurrence equations, it is also possible to determine conditions for the existence of higher moments of the state process Y_t. The conditions r ≥ 1 and ℜ(λ_1) < 0 in (5), together with the moment condition in (8), where S is chosen such that matrix A is diagonalizable as in (3), are sufficient for the existence of the κth order moment of the process Y_t, as shown in Brockwell et al. (2006). Choosing κ = 1, the unconditional stationary mean of Y_t exists and can be computed as in (9), where e_1 = [1, 0, . . . , 0]⊤ is a q × 1 vector.
In this case, the condition in (8) can be simplified as follows:

  μ ‖S^{−1} e a⊤ S‖_r ≤ −ℜ(λ_1),    (10)

where μ := ∫_{−∞}^{+∞} l² dν_L(l) is the second order moment of the Lévy measure ν_L(l). It is worth noting that condition (10) also ensures strict stationarity of Y_t since, using ln(1 + x) ≤ x, we have the following sequence of inequalities:

  ∫_R ln(1 + ‖S^{−1} e a⊤ S‖_r l²) dν_L(l) ≤ ‖S^{−1} e a⊤ S‖_r ∫_R l² dν_L(l) = μ ‖S^{−1} e a⊤ S‖_r ≤ −ℜ(λ_1).

The exponential of matrix A,

  e^{At} = S diag(e^{λ_1 t}, . . . , e^{λ_q t}) S^{−1},

is used in the stationary covariance matrix (see Chadraa 2009, for more details). The condition for the existence of the second moment of Y_∞ in (8) can be written in terms of ρ := ∫_{−∞}^{+∞} l⁴ dν_L(l), the fourth moment of the Lévy measure ν_L(l), and of m := ∫_0^{+∞} a⊤ e^{Ãt} e e⊤ e^{Ã⊤t} a dt.
Before introducing the higher moments and the autocorrelations, we recall the conditions for the non-negativity of a strictly stationary variance process in (1). Indeed, under the assumption that all the eigenvalues of matrix A are distinct and the relation in (6) holds, the variance process satisfies V_t ≥ a_0 > 0 a.s. if:

  a⊤ e^{At} e ≥ 0, ∀t ≥ 0.    (12)
The condition in (12) is costly to verify since we need to check it at each time instant. Nevertheless, some useful criteria are available (see Tsai and Chan 2005, for details).
• A necessary and sufficient condition guaranteeing a⊤ e^{At} e ≥ 0 in the COGARCH(2, 2) case is that the eigenvalues of A are real, a_2 ≥ 0 and a_1 ≥ −a_2 λ_max(A), where λ_max(A) is the largest eigenvalue of A.
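This COGARCH(2, 2) criterion can be coded directly from the characteristic polynomial λ² + b_1 λ + b_2 of the companion matrix. The following Python sketch (function name and parameter values are ours, for illustration only) checks the set of inequalities:

```python
import math

def cogarch22_nonnegativity(a1, a2, b1, b2):
    """Check the COGARCH(2, 2) non-negativity criterion: real eigenvalues of
    the companion matrix A = [[0, 1], [-b2, -b1]], a2 >= 0 and
    a1 >= -a2 * lambda_max(A)."""
    disc = b1 * b1 - 4.0 * b2        # discriminant of lambda^2 + b1*lambda + b2
    if disc < 0:                     # complex eigenvalues: condition fails
        return False
    lam_max = (-b1 + math.sqrt(disc)) / 2.0
    return a2 >= 0 and a1 >= -a2 * lam_max

print(cogarch22_nonnegativity(a1=0.04, a2=0.001, b1=1.0, b2=0.2))  # True
```

When b_1² < 4 b_2 the eigenvalues are complex and the check fails immediately, regardless of a_1 and a_2.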
Combining the requirement in (6) with that in (12), it is possible to derive the higher moments and the autocorrelations of a COGARCH(p, q) model. As a first step, we define the returns of a COGARCH(p, q) process on the interval (t, t + r], ∀t ≥ 0, as:

  G_t^{(r)} := G_{t+r} − G_t.    (13)

Let L_t be a symmetric and centered Lévy process such that the fourth moment of the associated Lévy measure is finite. We define matrix Ã := A + μ e a⊤. It is worth noting that Ã has the same structure as A except for the last row, where Ã_{q,·} = (−b_q + μ a_1, . . . , −b_1 + μ a_q). For any t ≥ 0 and for any h ≥ r > 0, the first two moments of (13) are given in (15) and (16). The computation of the autocovariances of the squared returns (13) is based on the quantities P_h, P_0, Q_h and Q_0, and on the scalar R = 2r²μ² + ρ.
Observe that P_h and P_0 are q × q matrices, Q_h and Q_0 are q × 1 vectors and R is a scalar. These quantities involve the q × q covariance matrix COV(Y_∞) of the stationary state process. The autocovariances of the squared returns (13) are given in (20), while the variance of the squared returns is given in (21). Combining the autocovariances in (20) with (15) and (21), we obtain the autocorrelations in (22). As for the GARCH(p, q) model, the autocovariances of the returns G_t^{(r)} themselves vanish, i.e., the returns are uncorrelated.

Simulation of a COGARCH(p, q) model
In this section we illustrate the theory behind the simulation routines available in the yuima package for a COGARCH(p, q) model. The corresponding sample paths are generated according to two different schemes.
The first method is based on the Euler-Maruyama discretization (see Iacus 2008, for more details) of the system in (1). In this case the algorithm follows these steps:

• We determine an equally spaced grid from 0 to T composed of N intervals with the same length ∆t: 0, . . . , (n − 1) × ∆t, n × ∆t, . . . , N × ∆t.
• We generate the trajectory of the underlying Lévy process L t sampled on the grid defined in the previous step.
• Choosing a starting point for the state process, Y_0 = y_0, and setting G_0 = 0, we have:

  Y_n = Y_{n−1} + A Y_{n−1} ∆t + e (a_0 + a⊤ Y_{n−1}) ∆[L, L]_n.    (23)

The discrete part of the quadratic variation of the process L_t is approximated by ∆[L, L]_n ≈ (L_n − L_{n−1})².

• We generate the trajectories for the variance process and for the process G_n based on the following two equations:

  V_n = a_0 + a⊤ Y_{n−1},
  G_n = G_{n−1} + √(V_n) (L_n − L_{n−1}).

Although the discretized version of the state process Y_n in (23) can be seen as a stochastic recurrence equation, the stationarity and non-negativity conditions for the variance V_n are different from those described in Section 2. In particular, it is possible to construct an example where the discretized variance process V_n assumes negative values while the true process is non-negative almost surely.
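The Euler steps above can be sketched compactly. The listing below is a Python illustration of the recursion for the COGARCH(1, 1) case with compound Poisson noise (all names and parameter values are ours, chosen for the example; in yuima the scheme is invoked through the simulate method):

```python
import math
import random

def euler_cogarch11(a0, a1, b1, T, n, lam=1.0, seed=1):
    """Euler discretization of a COGARCH(1, 1): scalar state y, variance
    v = a0 + a1*y, observed process g, driven by a compound Poisson process
    with unit intensity and standard normal jumps (illustrative sketch)."""
    random.seed(seed)
    dt = T / n
    y, g = 0.0, 0.0
    g_path, v_path = [0.0], [a0]
    for _ in range(n):
        # compound Poisson increment over dt (sum of the jumps in the interval)
        n_jumps, t_acc = 0, random.expovariate(lam)
        while t_acc < dt:
            n_jumps += 1
            t_acc += random.expovariate(lam)
        dl = sum(random.gauss(0.0, 1.0) for _ in range(n_jumps))
        v = a0 + a1 * y
        # state recursion: dY = -b1*Y dt + (a0 + a1*Y) d[L,L]^d, with d[L,L] ~ dl^2
        y = y - b1 * y * dt + v * dl * dl
        g = g + math.sqrt(max(v, 0.0)) * dl
        g_path.append(g)
        v_path.append(a0 + a1 * y)
    return g_path, v_path

g_path, v_path = euler_cogarch11(a0=0.755, a1=0.038, b1=0.053, T=100, n=1500)
print(len(g_path), min(v_path) > 0)
```

With a_1 > 0 and b_1 ∆t < 1 this particular recursion keeps the variance positive; as noted above, for other noises or parameter values the discretized variance can become negative even when the continuous-time process is non-negative.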
In order to clarify this issue further, we consider a COGARCH(1, 1) model driven by a variance gamma Lévy process (see Madan and Seneta 1990; Loregian, Mercuri, and Rroji 2012, for more details about the variance gamma model). In this case, the condition for the non-negativity of the variance in (12) is satisfied if a_0 > 0 and a_1 > 0, while the strict stationarity condition in (7) for the COGARCH(1, 1) requires that E[L_1²] = 1 and a_1 − b_1 < 0. The last two requirements also guarantee the existence of the stationary unconditional mean of the process V_t.
The second condition depends on ∆t: for instance, as ∆t → 0 it is never satisfied, since the approximated quadratic variation ∆[L, L]_n is non-negative and hence ln(1 + a_1 ∆[L, L]_n) is also positive.
From a theoretical point of view, the values used for the parameters guarantee the nonnegativity and the stationarity of the variance of a COGARCH(1, 1) process.
To overcome this problem, we provide an alternative simulation scheme based on the solution of the process Y_n given the previous point Y_{n−1}. Applying Itô's lemma for semimartingales as in Protter (1990) to the transformation e^{−At} Y_t, substituting the definition of Y_t in (1) and using standard properties of the matrix exponential, we obtain the solution:

  Y_t = e^{A(t−s)} Y_s + ∫_s^t e^{A(t−u)} e (a_0 + a⊤ Y_{u−}) d[L, L]^d_u,  s ≤ t.    (24)

Except for the case where the noise is a compound Poisson process, the simulation scheme follows the same steps as the Euler-Maruyama discretization, but the state process Y_n on the sampling grid is generated through the approximation (25) of the relation in (24), which leads to the recursion:

  Y_n = e^{A∆t} [Y_{n−1} + e (a_0 + a⊤ Y_{n−1}) (∆L_n)²].    (26)

The sample path is simulated based on the recursion in (26) by choosing method = "mixed" in the simulate function, as done below:

R> set.seed(123)
R> sim2 <- simulate(model1, sampling = samp1, true.parameter = param1,
+    method = "mixed")
R> plot(sim2,
+    main = "Sample Path of a VG COGARCH(1,1) model with mixed scheme")

In the case of a COGARCH(p, q) driven by a compound Poisson Lévy process, a trajectory can be simulated without any approximation of the solution in (24). Indeed, it is possible to determine the times where the jumps occur and then evaluate the corresponding quadratic variation in an exact way. Once the trajectory on the random jump times is obtained, a piecewise constant interpolation is used on the fixed grid, in order to maintain the càdlàg property of the sample path.
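For q = 1 the matrix exponential reduces to a scalar, so the difference between the Euler update and the "mixed" update is easy to inspect. Below is an illustrative Python comparison of one step of each scheme (function names and numbers are ours; in yuima the choice is made via the method argument of simulate):

```python
import math

def mixed_step(y_prev, dl, a0, a1, b1, dt):
    """One step of the 'mixed' scheme for a scalar state (A = -b1, e = 1,
    a = a1): Y_n = e^{A dt} (Y_{n-1} + (a0 + a1 Y_{n-1}) dl^2).
    Illustrative sketch; yuima's method = "mixed" handles the general case."""
    v_prev = a0 + a1 * y_prev
    return math.exp(-b1 * dt) * (y_prev + v_prev * dl * dl)

def euler_step(y_prev, dl, a0, a1, b1, dt):
    """Corresponding Euler-Maruyama step on the same increment."""
    v_prev = a0 + a1 * y_prev
    return y_prev - b1 * y_prev * dt + v_prev * dl * dl

y_mixed = mixed_step(1.0, 0.1, a0=0.04, a1=0.038, b1=0.053, dt=1 / 15)
y_euler = euler_step(1.0, 0.1, a0=0.04, a1=0.038, b1=0.053, dt=1 / 15)
print(abs(y_mixed - y_euler) < 1e-2)  # the two schemes are close for small dt
```

The mixed step applies the exact deterministic decay e^{−b_1 ∆t} instead of its first-order approximation 1 − b_1 ∆t, which is what keeps the scheme well behaved when the Euler variance recursion misbehaves.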

Estimation of a COGARCH(p, q) model
In this section we explain the estimation procedure that we propose in the yuima package for the COGARCH(p, q) model. As done for the CARMA(p, q) model driven by a Lévy process in Iacus and Mercuri (2015), we consider a three-step estimation procedure that allows the user to obtain estimated values for the COGARCH(p, q) parameters and for the parameters of the Lévy measure.
This procedure is structured as follows:

• Using the moment-matching procedure explained below, we estimate the COGARCH(p, q) parameters a := [a_1, . . . , a_p], b := [b_1, . . . , b_q] and the constant term a_0 in the variance process V_t. In this phase, the estimates are obtained by minimizing a distance between the empirical and the theoretical autocorrelation function.
• Once the COGARCH(p, q) parameters are available, we recover the increments of the underlying Lévy process.
• In the third step, we use the increments obtained in the second step and get the Lévy measure parameters by means of the maximum likelihood estimation procedure.
The process G_t in (1) is assumed to be observed on the grid 0, 1 × ∆t, . . . , n × ∆t, . . . , N × ∆t with ∆t = T/N. The associated sampled process G_n is defined as:

  G_n := G_{n∆t}.

In the following we require the underlying Lévy process to be symmetric and centered in zero. The COGARCH(p, q) increments of lag one are defined as:

  G_n^{(1)} := G_n − G_{n−1},    (27)

and the increments of lag r as:

  G_n^{(r)} := G_n − G_{n−r},

where r ≥ 1 is an integer; for r = 1 this definition coincides with (27).
It is worth mentioning that the increments G_n^{(r)} can be obtained as a time aggregation of increments of lag one as follows:

  G_n^{(r)} = Σ_{j=0}^{r−1} G_{n−j}^{(1)}.    (28)

The time aggregation in (28) can be useful during the optimization routine when the increments G_n^{(1)} are very small in absolute value. Using the sample {G_n^{(r)}}_{n≥r}, we compute the empirical second order moment μ̂₂ and the empirical autocovariance function γ̂(h) of the squared increments for h = 1, . . . , d, where d is the maximum lag considered.
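The lag-r construction and the aggregation identity in (28) can be checked numerically; here is a small Python sketch (illustrative only; in yuima the aggregation is triggered by the equally.spaced argument of gmm):

```python
def lag_r_increments(g, r):
    """Lag-r increments G^(r)_n = G_n - G_{n-r} of an observed path g[0..N]."""
    return [g[n] - g[n - r] for n in range(r, len(g))]

def aggregate(incr1, r):
    """Time aggregation: a lag-r increment is the sum of r consecutive
    lag-one increments, as in (28)."""
    return [sum(incr1[n - r + 1:n + 1]) for n in range(r - 1, len(incr1))]

g = [0.0, 0.4, 0.1, 0.7, 0.6]        # toy observed path
incr1 = lag_r_increments(g, 1)
incr2 = lag_r_increments(g, 2)
agg2 = aggregate(incr1, 2)
# the two constructions agree up to floating-point rounding
print(all(abs(x - y) < 1e-12 for x, y in zip(incr2, agg2)))
```

Summing lag-one increments telescopes to G_n − G_{n−r}, which is exactly the lag-r increment.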
The empirical autocorrelations are:

  ρ̂(h) := γ̂(h) / γ̂(0),  h = 1, . . . , d.    (29)

We use the theoretical and the empirical autocorrelations to define the moment conditions of the estimation procedure. Introducing the (q + p) × 1 vector θ := (a, b), we define the vector function g(θ) : R^{q+p} → R^d, where f(G_n^{(r)}, θ) is a d-dimensional real function with components defined as in (30). In the estimation algorithm, we replace the expectation in (30) with its sample counterpart, obtaining the vector ĝ(θ) = [ĝ_1(θ), . . . , ĝ_d(θ)]. The vector θ is estimated by minimizing a distance between empirical and theoretical autocorrelations. The optimization problem is:

  θ̂ = argmin_θ d(ρ̂_r, ρ_r),    (33)

where the components of vectors ρ_r and ρ̂_r are the theoretical and empirical autocorrelations defined in (22) and (29) respectively, and the function d(x, y) measures the distance between vectors x and y. In the yuima package, three distances are available and listed below:

• the square of the L² norm, ‖ĝ(θ)‖²₂;

• the L¹ distance, i.e., the mean absolute error;

• the quadratic form ‖ĝ(θ)‖²_W := ĝ(θ)⊤ W ĝ(θ), where the positive definite weighting matrix W is chosen to improve the efficiency within the class of GMM-type estimators.
It is worth noting that the objective function ‖ĝ(θ)‖²₂ is a special case of ‖ĝ(θ)‖²_W, where the weighting matrix W coincides with the identity matrix. Both distances are related to the generalized method of moments (GMM) introduced by Hansen (1982). The efficiency of GMM estimators, based on a particular choice of the weighting matrix W, is discussed in Appendix A.
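To make the moment conditions concrete, the sketch below computes the empirical autocorrelations of squared increments and two of the distances (the Python code and the toy data are ours, for illustration; in yuima the distance is selected through the objFun argument of gmm):

```python
def acf_squared(incr, d):
    """Empirical autocorrelations rho_hat(1..d) of the squared increments,
    as used in the moment-matching step (illustrative sketch)."""
    x = [u * u for u in incr]
    n = len(x)
    mean = sum(x) / n
    gamma0 = sum((xi - mean) ** 2 for xi in x) / n
    return [sum((x[i] - mean) * (x[i + h] - mean) for i in range(n - h)) / n / gamma0
            for h in range(1, d + 1)]

def l2_distance(rho_emp, rho_theo):
    """Squared L2 norm of the moment conditions (identity weighting matrix)."""
    return sum((e - t) ** 2 for e, t in zip(rho_emp, rho_theo))

def l1_distance(rho_emp, rho_theo):
    """Mean absolute error, corresponding to objFun = "L1" in gmm."""
    return sum(abs(e - t) for e, t in zip(rho_emp, rho_theo)) / len(rho_emp)

rho = acf_squared([0.4, -0.3, 0.6, -0.1, 0.2, -0.5], d=2)
print(len(rho) == 2 and all(-1.0 <= r <= 1.0 for r in rho))  # True
```

An optimizer would evaluate one of these distances between ρ̂ and the theoretical autocorrelations ρ(θ) at each candidate θ; the biased (divide-by-n) autocovariance estimator used here keeps |ρ̂(h)| ≤ 1.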
By solving the minimization problem in (33) we obtain estimates of the vector θ; the parameter a_0 is then determined by inversion of the formula in (15). Once the estimates of vector θ are obtained, the user can retrieve the increments of the underlying Lévy process with the following procedure. This stage does not depend on the specification of the Lévy measure: it is based only on the solution of the state process Y_t and on the approximation of the quadratic variation by the squared increments of the driving Lévy process.
Let G_{t−} be the left limit of the COGARCH(p, q) process G_t at time t. The increments ∆G_t := G_t − G_{t−} can be approximated using the observations {G_n}_{n=0}^N as follows:

  ∆G_t ≈ G_n − G_{n−1},  t ∈ ((n − 1)∆t, n∆t].    (37)

Recalling that G_t = Σ_{0≤s≤t} √(V_s) ∆L_s, the approximation in (37) becomes:

  G_n − G_{n−1} ≈ √(V_n) ∆L_n,    (38)

where V_n is the value of the variance process at the observation time t = n∆t and ∆L_n = L_n − L_{n−1} is the discretized Lévy increment from t − ∆t to t.
Using the discretization scheme introduced in (25), the process Y_n satisfies the difference equation (39); using the result in (38), this equation can be approximated in terms of the squared increments of the COGARCH(p, q) process to obtain:

  Y_n = e^{A∆t} [Y_{n−1} + e (G_n − G_{n−1})²].    (40)

Choosing Y_0 equal to the unconditional mean of the process Y_t, we are able to retrieve the sample path of the state process through the recursive equation in (40). The only quantities needed are the squared increments of the COGARCH(p, q) process on the grid {0, 1 × ∆t, 2 × ∆t, . . . , n × ∆t, . . . , N × ∆t}. The estimated state process in (40) is also useful for obtaining the estimated trajectory of the variance process. Finally, the Lévy increment at a general time t = n × ∆t is obtained as:

  ∆L_n = (G_n − G_{n−1}) / √(V_n).

These increments of the underlying Lévy process can then be used for the estimation of the Lévy measure parameters. As observed in Kappus and Reiss (2010), identification of the Lévy measure from increments observed on an arbitrary discrete grid is not trivial. In yuima, it is possible to estimate the parameters of a compound Poisson and of a variance gamma process.
In the first case the quasi-likelihood estimation in Iacus (2011) is considered while, in the second case, the likelihood function is determined as reported in Seneta (2004).
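The recovery of the Lévy increments can be illustrated for the scalar case q = p = 1. The Python sketch below (names, toy data and the unit grid spacing are our own assumptions) mirrors the role played by cogarchNoise in yuima:

```python
import math

def recover_levy_increments(g, a0, a1, b1, mu=1.0):
    """Recover the driving Levy increments from an observed COGARCH(1, 1) path:
    run the state recursion driven by the squared observed increments, then
    dL_n = (G_n - G_{n-1}) / sqrt(V_n). Scalar sketch of the general scheme."""
    dt = 1.0                              # assumed unit grid spacing
    y = a0 * mu / (b1 - a1 * mu)          # start at the stationary mean of Y
    incr = []
    for n in range(1, len(g)):
        dg = g[n] - g[n - 1]
        v = a0 + a1 * y                   # current variance level
        incr.append(dg / math.sqrt(v))
        y = math.exp(-b1 * dt) * (y + dg * dg)   # state update via squared increments
    return incr

incr = recover_levy_increments([0.0, 0.2, 0.1, 0.3],
                               a0=0.04 / 0.053, a1=0.038, b1=0.053)
print(len(incr))  # 3
```

With a_0 > 0, a_1 > 0 and b_1 > a_1 μ the recursion keeps v strictly positive, so the division by √(V_n) is always well defined; the recovered increments could then be fed to a likelihood routine for the chosen Lévy measure.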

Usage
In this section we illustrate the main classes and methods in yuima that allow the user to deal with a COGARCH(p, q) model. The routines implemented are based on the results considered in Section 3 for the simulation and in Section 4 for the estimation procedures.
In particular we classify these classes and methods into three groups. The first group contains the classes and functions that allow the user to define a specific COGARCH(p, q) model in the yuima framework. The second group is used for the simulation of sample paths of the COGARCH(p, q) model, while the third is devoted to estimation, which can be performed on both simulated and real data. A summary of the main constructors, classes and methods with their usage is reported in Figure 3.

Classes and methods for the definition of a COGARCH(p, q) model
The main object for a COGARCH(p, q) process in the yuima package is an object of class 'yuima.cogarch' that contains all the information about a specific COGARCH(p, q) process. It is built through the constructor setCogarch:

setCogarch(p, q, ar.par = "b", ma.par = "a", loc.par = "a0",
  Cogarch.var = "g", V.var = "v", Latent.var = "y",
  jump.variable = "z", time.variable = "t", measure = NULL,
  measure.type = NULL, XinExpr = FALSE, startCogarch = 0, ...)

The arguments used in a call of the function setCogarch can be classified into four groups. The first group, composed of p, q, ar.par, ma.par and loc.par, contains information about the COGARCH parameters. In particular, the integers p and q are the orders of the moving average and autoregressive coefficients respectively, while ar.par, ma.par and loc.par indicate the labels of the COGARCH coefficients in the variance process.
The arguments Cogarch.var, V.var, Latent.var, jump.variable and time.variable compose the second group and they are strings chosen by the user in order to personalize the model. In particular, they refer to the label of the observed COGARCH process (Cogarch.var), the latent variance (V.var), the state space process (Latent.var), the underlying pure jump Lévy (jump.variable) and the time variable (time.variable).
The third group characterizes the Lévy measure of the underlying noise: measure.type indicates whether the noise is a compound Poisson process or another Lévy process without a diffusion component, while the input measure identifies a specific parametric Lévy measure. If the noise is a compound Poisson process, measure is a list containing two elements: the name of the intensity parameter and the density of the jump size. In the alternative case, measure contains the information on the parametric measure. We refer to the yuima documentation for more details.
The last group is related to the starting condition of the SDE in (1). In particular, XinExpr is a logical variable associated to the state process. The default value XinExpr = FALSE implies that the starting condition is the zero vector, while the alternative case, XinExpr = TRUE, allows the user to specify as parameters the starting values for each component of the state variable. The input startCogarch is used to specify the start value of the observed COGARCH process.
An object generated by setCogarch extends the 'yuima.model' class and has only one additional slot, called @info, that contains an object of class 'cogarch.info'. We refer to the yuima documentation for a complete description of the slots that constitute an object of class 'yuima.model'; an object of class 'cogarch.info' is characterized by slots containing the same information as the setCogarch inputs, except for startCogarch, time.variable and jump.variable, which are used to fill the slots inherited from 'yuima.model'.

Classes and methods for the simulation of a COGARCH(p, q) model
simulate is a method for an object of class 'yuima.model' and it is also available for an object of class 'yuima.cogarch'. The function requires the following inputs:

simulate(object, nsim = 1, seed = NULL, xinit, true.parameter,
  space.discretized = FALSE, increment.W = NULL, increment.L = NULL,
  method = "euler", hurst, methodfGn = "WoodChan",
  sampling = sampling, subsampling = subsampling, ...)

In this work we focus on the argument method, which identifies the type of discretization scheme when the object belongs to class 'yuima.cogarch'; for the remaining arguments we refer to Brouste et al. (2014) and yuima's documentation (YUIMA Project Team 2017). The default value "euler" means that the simulation of a sample path is based on the Euler-Maruyama discretization of the stochastic differential equations; this approach is available for all objects of class 'yuima.model'. For the COGARCH(p, q) model, an alternative simulation scheme is available by choosing method = "mixed". In this case the generation of trajectories is based on the solution (24) of the state process. In particular, if the underlying noise is a compound Poisson Lévy process, the trajectory is built using a two-step algorithm: first the jump times are simulated internally using the exponential distribution with intensity λ, and then the jump sizes are drawn from the random variable specified in the slot yuima.cogarch@model@measure. For the other Lévy processes, the simulation scheme is based on the discretization (25) of the state process solution.
The simulate method is also available for an object of class 'cogarch.est.incr' as shown in the next section.

Classes and methods for the estimation of a COGARCH(p, q) model
The 'cogarch.est' class is a class of the yuima package that contains estimated parameters obtained through the gmm function. The structure of an object of this class is the same as that of an object of class 'mle' in the package stats4 (see R Core Team 2017), with an additional slot, called @yuima, containing the mathematical description of the COGARCH model used in the estimation procedure.
The 'cogarch.est' class is extended by the 'cogarch.est.incr' class since the latter contains two additional slots: @Incr.Lev and @logL.Incr. Both slots are related to the estimated increments in (40). In particular @Incr.Lev is an object of class 'zoo' that contains the estimated increments and the corresponding log-likelihood is stored in the slot @logL.Incr.
If we apply the simulate method to an object of class 'cogarch.est.incr', we obtain an object of class 'yuima' where the slot @data contains the COGARCH(p, q) trajectory obtained from the estimated increments. Like an object of class 'cogarch.est', an object of class 'cogarch.est.incr' is built internally by the function gmm. This function returns the estimated parameters of a COGARCH(p, q) model using the approaches described in Section 4.
gmm(yuima, data = NULL, start, method = "BFGS", fixed = list(),
  lower, upper, lag.max = NULL, equally.spaced = TRUE,
  Est.Incr = "NoIncr", objFun = "L2")

In order to apply the gmm function, the user has to specify the COGARCH(p, q) model, the observed data and the starting values for the optimizer using the main arguments yuima, data and start respectively. It is worth noting that the input yuima can be either an object of class 'yuima.cogarch' or an object of class 'yuima', while data is an object of class 'yuima.data' containing the observed data. If data = NULL, the input yuima must be an object of class 'yuima' and contain the observed data.
Arguments method, fixed, upper and lower are used in the optimization routine. In particular, the last three arguments can be used to restrict the model parameter space while, through the input method, it is possible to choose among the algorithms available in the function optim.
The group composed of lag.max, equally.spaced and objFun controls the error function discussed in Section 4. In particular, lag.max is the maximum lag for which the theoretical and empirical autocorrelation functions are computed. equally.spaced is a logical variable: if TRUE, each unitary interval contains the same number of observations and the COGARCH increments used in the empirical autocorrelation function are aggregated on the unitary lag using the relation in (28); otherwise the autocorrelation function is computed directly on all increments. objFun is a string identifying the objective function used in the optimization step. For objFun = "L2", the default value, the objective function is a quadratic form where the weighting matrix is the identity matrix. For objFun = "L2CUE" the weighting matrix is estimated using continuously updated GMM. For objFun = "L1" the objective function is the mean absolute error; in this last case standard errors for the estimators are not available.
The remaining argument Est.Incr is a string variable. If Est.Incr = "NoIncr", the default value, gmm returns an object of class 'cogarch.est' that contains the COGARCH parameters. If Est.Incr = "Incr" or Est.Incr = "IncrPar", the output is an object of class 'cogarch.est.incr': in the first case the object contains the increments of the underlying noise, while in the second case it also contains the estimated parameters of the Lévy measure.
Function gmm uses the function cogarchNoise for the estimation of the increments of the underlying Lévy process in a COGARCH(p, q) model. This function assumes that the underlying Lévy process is symmetric and centered at zero.
cogarchNoise(yuima, data = NULL, param, mu = 1)

The arguments yuima and data have the same meaning as in the gmm function. The remaining inputs param and mu are the COGARCH(p, q) parameters and the second moment of the Lévy measure, respectively.
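The idea behind the noise recovery can be sketched in pure Python (this is an illustration of the principle, not the package's implementation; the discretization and function names are ours). Since returns satisfy dG_t = sqrt(V_t-) dL_t, the noise increments follow by dividing each observed return by the square root of the recursively reconstructed variance:

```python
import math


def cogarch11_returns(dL, a0, a1, b1, dt, y0):
    """Generate COGARCH(1, 1) returns from given noise increments, with a
    simple Euler update of the state (illustrative discretization):
    V = a0 + a1 * Y,  dG = sqrt(V) * dL,  dY = -b1 * Y dt + V * dL^2."""
    y, dG = y0, []
    for inc in dL:
        v = a0 + a1 * y
        dG.append(math.sqrt(v) * inc)
        y = y + (-b1 * y) * dt + v * inc * inc
    return dG


def recover_increments(dG, a0, a1, b1, dt, y0):
    """Invert the scheme above: dL = dG / sqrt(V), updating V along the way.
    This mimics the idea behind cogarchNoise (not its actual code)."""
    y, dL = y0, []
    for dg in dG:
        v = a0 + a1 * y
        inc = dg / math.sqrt(v)
        dL.append(inc)
        y = y + (-b1 * y) * dt + v * inc * inc
    return dL
```

By construction the two functions are inverse to each other, so simulating returns from known increments and then recovering them gives back the original noise.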

Simulation and estimation of COGARCH(p, q) model
In this section we show how to use the yuima package for the simulation and the estimation of a COGARCH(p, q) model driven by different symmetric Lévy processes. As a first step we focus on a COGARCH(1, 1) model driven by different Lévy processes available in the package. In particular we consider the cases in which the driving noise is a compound Poisson process with normally distributed jump size and a variance gamma process. In the last part of this section, we also show that the implemented estimation procedure seems adequate even for higher-order COGARCH models. In particular we simulate and then estimate a COGARCH(2, 1) model driven by a compound Poisson process with normally distributed jump size.

COGARCH(1, 1) model with compound Poisson
The first example is a COGARCH(1, 1) model driven by a compound Poisson process. As a first step, we choose the set of the model parameters:

R> numSeed <- 200
R> param.cp <- list(a1 = 0.038, b1 = 0.053, a0 = 0.04 / 0.053, x01 = 50.33)

where a1, b1 and a0 are the parameters of the state process Y_t, and x01 is the starting point of the process X_t. The chosen value is the stationary mean of the state process and it is used in the simulation algorithm.
We define the model using the setCogarch function.
We simulate a sample path of the model using the Euler discretization. We fix ∆t = 1/15 and use the following command lines to instruct yuima about the choice of the simulation scheme:

R> Term <- 1600
R> num <- 24000
R> set.seed(numSeed)
R> samp.cp <- setSampling(Terminal = Term, n = num)
R> sim.cp <- simulate(mod.cp, true.parameter = param.cp, sampling = samp.cp,
+    method = "euler")

In Figure 4 we show the behavior of the simulated trajectories of the COGARCH(1, 1) process G_t, the variance V_t and the state space X_t:

R> plot(sim.cp, main = paste("simulated COGARCH(1,1) model driven by a",
+    "Compound Poisson process"))

We use the two-step algorithm developed in Section 4 for the estimation of the COGARCH(p, q) parameters and the parameters of the Lévy measure. In the yuima function gmm, we fix objFun = "L2", meaning that the objective function used in the minimization is the mean squared error. Setting also Est.Incr = "Incr", the function gmm returns the estimated increments of the underlying noise.
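For readers who want to see the mechanics of the Euler scheme outside yuima, the following pure-Python sketch simulates a COGARCH(1, 1) path driven by a compound Poisson process with standard normal jumps. It is one plausible discretization consistent with the state-space form discussed in the paper, not the package's code, and the function names are ours:

```python
import math
import random


def rpois(rng, lam):
    """Sample from a Poisson distribution (Knuth's method)."""
    if lam <= 0.0:
        return 0
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1


def simulate_cogarch11(a0, a1, b1, y0, T, n, rate, seed=None):
    """Euler scheme for a COGARCH(1, 1) model driven by a compound Poisson
    process with standard normal jump sizes (illustrative discretization):
    V = a0 + a1 * Y,  dG = sqrt(V) * dL,  dY = -b1 * Y dt + V * dL^2."""
    rng = random.Random(seed)
    dt = T / n
    y, g = y0, 0.0
    G, V = [g], [a0 + a1 * y]
    for _ in range(n):
        # compound Poisson increment of the driving Levy process on [t, t+dt]
        dL = sum(rng.gauss(0.0, 1.0) for _ in range(rpois(rng, rate * dt)))
        v = a0 + a1 * y
        g += math.sqrt(v) * dL
        y += -b1 * y * dt + v * dL * dL
        G.append(g)
        V.append(a0 + a1 * y)
    return G, V
```

Note that, with positive parameters and b1 * dt < 1, the state stays positive, so the simulated variance path is positive as well.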
R> res.cp <- gmm(sim.cp, start = param.cp, objFun = "L2", Est.Incr = "Incr")

In this case, the summary reports the estimated parameters, their standard errors computed according to the matrix V in (43), the logarithm of the L2 distance between the empirical and the theoretical autocorrelation function and some general information about the increments, including the message "Cogarch(1,1) model: Variance process is positive."
We remark that the standard error for a0 is not provided in the summary since, as explained in Section 4, its value is obtained once the parameters a and b are estimated, and the variance-covariance matrix V in (43) therefore refers only to these parameters. The behavior of the increments is shown in Figure 5, obtained by applying the plot method to the object res.cp:

R> plot(res.cp, main = "CP Increments of a COGARCH(1,1) model")
We are also able to reconstruct the original process from the increments stored in the object res.cp using the simulate function.

COGARCH(2, 1) model
We conclude this section by considering a COGARCH(2, 1) model driven by a compound Poisson process with normally distributed jump size.

Real data
In this section, we show how to use the yuima package in the estimation of a COGARCH(p, q) model using real financial time series and we perform a comparison with the GARCH(p, q) model.
Our dataset is composed of intraday values of the SPX ranging from 09 July 2012 to 01 April 2015, freely downloadable from the http://thebonnotgang.com/ website. In our analysis, we decided to aggregate the data in order to have one observation every five minutes and the same number of observations each day. Aggregation is performed using function aggregatePrice of the R package highfrequency (Boudt, Cornelissen, and Payseur 2017). Figure 13 shows the behavior of the log prices in the whole period.
In the following, we study the time dependence of daily log returns from a qualitative point of view. In Figure 14 we report the autocorrelation function of the log returns and of the squared log returns. As expected, the autocorrelation of the squares is higher than that of the log returns, suggesting a possible ARCH effect. To be more precise we perform the ARCH LM test proposed in Engle (1982) and available in the R package FinTS (Graves 2014).
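The logic of the LM test can be sketched in a few lines. Below is a simplified one-lag pure-Python version (an illustration only; the FinTS implementation handles multiple lags and reports a p value): regress the squared series on its first lag and form n times the R-squared of that regression.

```python
def arch_lm_stat(x):
    """One-lag version of Engle's ARCH LM statistic: regress the squared
    series on its first lag; LM = n * R^2 is asymptotically chi-squared(1)
    under the null of no ARCH effects."""
    s = [v * v for v in x]
    y, z = s[1:], s[:-1]          # dependent variable and its first lag
    n = len(y)
    my, mz = sum(y) / n, sum(z) / n
    cov = sum((a - my) * (b - mz) for a, b in zip(y, z)) / n
    vy = sum((a - my) ** 2 for a in y) / n
    vz = sum((b - mz) ** 2 for b in z) / n
    r2 = cov * cov / (vy * vz)    # R^2 of a simple linear regression
    return n * r2
```

A large value of the statistic, compared with the chi-squared quantiles, leads to rejecting the null of no ARCH effects.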

Figure 14: Autocorrelation of log returns and of the squared log returns.

COGARCH(1, 1) model
The process is strictly stationary
The unconditional first moment of the Variance process exists
the Variance is a positive process

The Lévy measure parameters are unknown and they are estimated using the gmm function available in yuima with Est.Incr = "IncrPar". Looking at the summary, the mean and the standard error of the daily increments are almost −0.0003 and 0.11, respectively. Moreover, applying the function Diagnostic.Cogarch to the object resCog11 produces the output above: the estimates identify a stationary COGARCH(1, 1) model where the variance is a positive process with finite unconditional first moment.
Applying the plot method to the object resCog11, it is possible to generate Figure 16 that shows the behavior of the estimated increments of the underlying noise.

R> plot(resCog11, main = "Noise Increments")
Note that the simulate method is also available for objects of class 'cogarch.est.incr'. In our exercise, we are able to retrieve the sample paths of the COGARCH(1, 1) process, the variance and the state process by applying the simulate function directly to the object resCog11. The result, reported in Figure 17, is obtained using the following command lines.
R> simCog <- simulate(resCog11)
R> plot(simCog, main = "Real COGARCH(1,1) sample path")

The next step is to verify whether the ARCH effect has been removed. We apply the ARCH test directly to the daily noise increments. Our analysis can be replicated using the following command lines:

ARCH LM-test; Null hypothesis: no ARCH effects
data: dayIncr
Chi-squared = 6.5531, df = 5, p-value = 0.2561

The p value suggests that the null hypothesis cannot be rejected, i.e., the ARCH effect has been removed. From the in-sample analysis, we can conclude that the COGARCH(1, 1) seems to be a reasonable model for the considered dataset.
The out-of-sample analysis is based on the generation of sample paths through Monte Carlo simulation and the comparison with the observed out-of-sample trajectory. We consider two different simulation strategies. In the first, the underlying noise is generated from a variance gamma process whose parameters were estimated previously using the gmm function. In the second approach, we resample from a subsample of the estimated increments stored in the slot resCog11@Incr.Lev. We remark that in the second approach we decided to use only the increments of the last two weeks of the in-sample dataset. This choice was made since the last increments are less influenced by the initial value of the state process and are more representative of recent market conditions. In Figure 18, the trajectories on the left-hand side are generated assuming the underlying Lévy process is the estimated variance gamma, while the trajectories on the other side are obtained by resampling the estimated increments. In both panels, the bold path is the real data. We conclude this section with a comparison between the forecasting ability of a COGARCH(1, 1) model and that of a GARCH(1, 1) model, for which only a daily based analysis is possible: as customary for a discrete-time model, the GARCH can be simulated only on a time grid composed of natural numbers. For instance, if we use daily data for the estimation of the parameters of a GARCH(1, 1) model, we can simulate returns with daily (or lower) frequency, and the generation of high-frequency data is not possible.
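The resampling strategy can be sketched outside yuima as follows (a hypothetical pure-Python illustration; in the paper the pool of increments comes from the slot resCog11@Incr.Lev): draw increments with replacement from the historical pool and cumulate them into simulated return paths.

```python
import random


def bootstrap_paths(increments, horizon, n_paths, seed=None):
    """Simulate cumulative-return paths by resampling historical noise
    increments with replacement, as in the second out-of-sample strategy."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        path, total = [], 0.0
        for _ in range(horizon):
            total += rng.choice(increments)   # draw one historical increment
            path.append(total)
        paths.append(path)
    return paths
```

The empirical distribution of the simulated paths at a given horizon can then be compared with the observed out-of-sample trajectory.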
Here the GARCH(1, 1) model is estimated through function ugarchfit, available in the R package rugarch (Ghalanos 2015), which is based on a quasi-maximum likelihood estimation procedure (see Bollerslev 1986, for more details on the behavior of the corresponding estimators). As observed at the beginning of this section, the choice of this package is justified by the fact that, once the GARCH(1, 1) model is estimated, we can simulate trajectories by resampling from the estimated increments (as done in the COGARCH(1, 1) case) and we are able to compare graphically the one- and five-day distributions obtained with both models.

+    include.mean = FALSE))
R> resGarch <- ugarchfit(data = daydatain, spec = spec)
R> coef(resGarch)
       omega       alpha1        beta1
9.482534e-06 1.822243e-01 6.311268e-01

R> library("rugarch")
To generate the sample paths for the cumulative returns we use function ugarchboot; we refer to the rugarch documentation for the details of this function.

R> set.seed(2)
R> bootp <- ugarchboot(resGarch, method = c("Partial", "Full")[1],
+    n.ahead = 5, n.bootpred = 1000)
R> series <- apply(bootp@fseries, 1, cumsum)

The two generated distributions are reported in Figure 19. On a daily horizon the two distributions are quite similar, while increasing the horizon, for example to one week, we observe a small departure. This fact is confirmed by the qqplot function available in R (see Figure 20). The following table reports the quantiles and the mean of the absolute distance between the cumulative distribution functions for all out-of-sample days.

Conclusion
We proposed a computational scheme for the simulation and estimation of a COGARCH(p, q) model in the yuima package. Two simulation algorithms have been developed: the first is based on the Euler discretization of the state process, while the second uses the solution of its stochastic differential equation. The COGARCH(p, q) parameters are estimated by minimizing some distance between the empirical and theoretical autocorrelations, after which estimation of the Lévy increments is possible. The user is also allowed to obtain the Lévy measure parameters from the estimated increments.
We also showed the steps for estimation on real data and performed a comparison with a GARCH(p, q) model. In the empirical analysis the GARCH(p, q) and COGARCH(p, q) models produced similar returns over a one-day horizon. The advantage of the COGARCH(p, q) model lies in the possibility of generating intraday returns even if the model is estimated on daily data.

A. Weighting matrix for GMM estimation
Under some regularity conditions (see Newey and McFadden 1994, for a complete discussion), the GMM estimators are consistent and, for any general positive definite matrix W, their asymptotic variance-covariance matrix V is:

V = (D^\top W D)^{-1} D^\top W S W D (D^\top W D)^{-1}. (43)

Matrix D is defined as D = E[\partial g(\theta) / \partial \theta^\top], while S = E[g(\theta) g(\theta)^\top]. For the squared L2 norm in (35), where W is the identity matrix, matrix V becomes:

V = (D^\top D)^{-1} D^\top S D (D^\top D)^{-1}.

Here we use the continuously updated GMM estimator (see Hansen, Heaton, and Yaron 1996, for more details) and W is determined simultaneously with the estimates of the parameters in vector θ.
Introducing the function \|\hat{g}(\theta)\|^2_{\hat{W}} = \hat{g}(\theta)^\top \hat{W}(\theta) \hat{g}(\theta) as the sample counterpart of the quadratic form in (36), the minimization problem becomes:

\hat{\theta} = \arg\min_{\theta} \hat{g}(\theta)^\top \hat{W}(\theta) \hat{g}(\theta),

where function \hat{W}(\theta) maps from R^{p+q} to R^{d \times d} and is defined as:

\hat{W}(\theta) = \left[ \frac{1}{N} \sum_{i=1}^{N} g_i(\theta) g_i(\theta)^\top \right]^{-1}.

Observe that \hat{W}(\theta) is a consistent estimator of matrix S^{-1}, and the asymptotic variance-covariance matrix V in (43) becomes:

V = (D^\top S^{-1} D)^{-1}.
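A small numerical check of the sandwich formula can be done with 2 x 2 toy matrices (pure Python, illustrative only): with the optimal weighting W = S^{-1} the general expression collapses to (D^\top S^{-1} D)^{-1}.

```python
def matmul(A, B):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def inv2(A):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def sandwich(D, W, S):
    """Asymptotic variance V = (D'WD)^{-1} D'W S W D (D'WD)^{-1}."""
    Dt = transpose(D)
    bread = inv2(matmul(matmul(Dt, W), D))
    meat = matmul(matmul(matmul(matmul(Dt, W), S), W), D)
    return matmul(matmul(bread, meat), bread)
```

For D equal to the identity and W = S^{-1}, the sandwich reduces to S itself, which is exactly (D^\top S^{-1} D)^{-1} in this toy case.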

B. COGARCH(p, q) moments through Teugels martingales
In this appendix we show the calculations for the moments of a stationary COGARCH(p, q) model and the steps for the derivation of the autocorrelation function of squared returns as explained in Section 2.
Hereafter we assume the underlying Lévy process to be symmetric and centered at zero.
All the derivations in this appendix are based on the Teugels martingales (see Nualart and Schoutens 2000) and Ito's Lemma for semimartingales.

Teugels martingales
We assume that a càdlàg pure jump Lévy process L_t with finite variation satisfies the condition:

E[e^{c L_1}] < +∞, ∀ c ∈ (−a, a), with a > 0.
We define the processes L^{(1)}_t = L_t and, for k ≥ 2,

L^{(k)}_t = \sum_{0 < s \leq t} (\Delta L_s)^k.

The Teugels martingales are:

Y^{(k)}_t = L^{(k)}_t - m_k t, (49)

where m_1 = E[L_1] and, for k ≥ 2, m_k = E[L^{(k)}_1]. (50)

The quantities µ and ρ in Section 2 are respectively m_2 and m_4 defined in (50).
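As a toy numerical illustration (pure Python, with made-up jump values; the function names are ours), the power-jump process and the corresponding compensated Teugels martingale can be evaluated on a realized set of jumps:

```python
def power_jump(jumps, k):
    """L^(k)_t: the sum of k-th powers of the jumps observed up to time t."""
    return sum(j ** k for j in jumps)


def teugels(jumps, k, m_k, t):
    """Teugels martingale value Y^(k)_t = L^(k)_t - m_k * t."""
    return power_jump(jumps, k) - m_k * t
```

For k = 2 the power-jump process is just the sum of squared jumps, i.e., the discrete part of the quadratic variation used throughout the paper.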
In particular it is possible to show that:

[L, L]_t = \sum_{0 < s \leq t} (\Delta L_s)^2 = L^{(2)}_t,

or equivalently, using the Teugels martingales in (49),

d[L, L]_t = dY^{(2)}_t + m_2 dt. (51)

Both identities follow from the definition of the quadratic covariation for a finite variation process.
Using the Teugels martingales we obtain the following representation; assuming m_3 = 0, it simplifies further. Note that m_1 and m_3 are both equal to zero if we consider a symmetric Lévy process centered at zero and with finite moments.

Stochastic recurrence equations of the state process Y t
The state space process Y_t of a COGARCH(p, q) model with parameters A, a, a_0 and driving Lévy process L_t can be represented through a stochastic recurrence equation, as shown in Theorem 3.3 in Brockwell et al. (2006):

Y_t = J_{s,t} Y_s + K_{s,t}, (54)

where the sequences {J_{s,t}}_{0≤s≤t} and {K_{s,t}}_{0≤s≤t} are q × q random matrices and q × 1 random vectors respectively, and the distribution of the pair (J_{s,t}, K_{s,t}) depends only on t − s. Following the same steps as in Brockwell et al. (2006), the expected values in (55) are obtained, where e_1 = [1, 0, . . . , 0]^\top.
Under the stationarity condition in (6), for any h ≥ 0 there exists Y_∞, defined as the unique solution of the random fixed point equation Y_∞ =_d J_{0,h} Y_∞ + K_{0,h}, with Y_∞ independent of (J_{0,h}, K_{0,h}). Matrices J_{s,t} and vectors K_{s,t} can be constructed explicitly if L_t is a compound Poisson process. The coefficients in (54) for general Lévy processes are obtained as limits of the corresponding quantities for compound Poisson processes (see Theorem 3.5 in Brockwell et al. 2006, for explicit formulas).

B.2. First moment of the stationary state process Y t
We derive the first moment of the state process starting from its stationary solution. Under the assumption that the real part of all eigenvalues of matrix A is negative, we have: Applying the Teugels martingales we get: Recall that condition (10) ensures E(Y_u) = E(Y_∞) ∀u. Taking the expectation of both sides in (58) we have: We compute the integral in (59) based on the observation that

\int_{-\infty}^{u} e^{A(u-s)} ds = \int_{0}^{+\infty} e^{Ay} dy = -A^{-1}.

This yields an analytical formula for the first unconditional moment of the stationary state process Y_t.

B.3. Covariance of the stationary state process Y t
To determine the stationary covariance of the state process Y_t, we introduce the matrix Ã := A + m_2 e a^\top; the condition for the existence of the stationary covariance requires that all eigenvalues of Ã have strictly negative real part (this follows from an application of the Bauer-Fike perturbation result on eigenvalues).
The first step is to find the stationary solution of the process Y_t in terms of the matrix Ã. We consider the transformation e^{-Ãt} Y_t and apply Ito's Lemma: Given the initial condition Y_s we have: and, under the assumption of negative eigenvalues for Ã, we obtain: Consider the transformation Z_t = a^\top Y_t and compute Z_t^2 integrating by parts: From (62) we have: Use (53) in computing the expectation: Recalling that E(Z_u) = \bar{Z} and E(Z_u^2) = \overline{Z^2}, ∀u, we obtain (65). The stationary variance of Z_t is computed using the formulas in (65) and in (60), from which the stationary covariance of Y_t follows.

B.4. First two moments of a COGARCH(p, q)
From the definition in (13), we have that E(G_t) = 0, since L_t is a zero-mean martingale.
For the computation of E[(G_t^{(r)})^2], we apply integration by parts to G_r^2 = G_r \cdot G_r and observe that: where the quadratic variation [G, G]_r is defined as: Substituting (72) and (52) into (71) we have: Computing the expectation of both sides in (73) we get: In order to solve the integral in (74), we need to evaluate the quantity E[V_t]. Under condition (10), the unconditional stationary mean of the variance process V_t is given by: Substituting (75) into (74), we obtain (76).

B.5. Autocovariance function for the squared COGARCH(p, q)
We derive the covariance between the squared increments of a COGARCH(p, q) model. To compute E[(G_t^{(r)})^2 (G_{t+h}^{(r)})^2], we apply the law of iterated expectations. Consider (71) and use the stochastic recurrence equation for the representation of the variance V_t; for any s < t we have: Since the distribution of the pair (J_{s,t}, K_{s,t}) depends only on the time interval t − s, and using the properties in (55), we write (80) as follows: Recalling the unconditional mean of the state process in (10) and rearranging the terms in (81), we have: Substituting (82) into (79), we obtain:

E[(G_{t+h}^{(r)})^2 | F_{t+r}] = \frac{m_2 a_0 b_q r}{b_q - a_1 m_2} + m_2 a^\top \int_{t+h}^{t+h+r} e^{\tilde{A}(u-t-r)} du \, (Y_{t+r} - E(Y_\infty)). (83)

Using (94), we get: and by straightforward calculations we have: Combining (76) and (105) into (98), we obtain the autocovariance function of the squared increments.