State Space Modeling Using SsfPack in S+FinMetrics 3.0

This paper presents two illustrations of state space modeling in S-PLUS using the SsfPack 3.0 routines implemented in S+FinMetrics 3.0. The state space modeling functions in S+FinMetrics/SsfPack are extremely ﬂexible and powerful and can be used for a wide variety of linear Gaussian state space models and for some non-linear and non-Gaussian state space models.


Introduction
In S-PLUS (TIBCO Software Inc. 2010), state space modeling is implemented in the addon module S+FinMetrics as described in Zivot, Wang, and Koopman (2004), Tsay (2005), and Zivot and Wang (2006). The state space modeling tools in S+FinMetrics are based on the algorithms in SsfPack 3.0 developed by Siem Jan Koopman and described in Doornik (1999, 2008) 1 . SsfPack is a suite of C routines for carrying out computations involving the statistical analysis of univariate and multivariate models in state space form. The routines allow for a variety of state space forms from simple time invariant models to complicated time-varying models. Functions are available to put standard models like ARMA and spline models in state space form. General routines are available for filtering, smoothing, simulation smoothing, likelihood evaluation, forecasting and signal extraction. Full details of the statistical analysis is provided in Durbin and Koopman (2001).
In S+FinMetrics/SsfPack, the linear Gaussian state space model for a multivariate time series y t is the system of equations (1) where t = 1, . . . , n and and it is assumed that E[ε t η t ] = 0.
The state vector α t contains unobserved stochastic processes and unknown fixed effects and the transition equation (1) describes the evolution of the state vector over time using a first order Markov structure. The measurement equation (3) describes the vector of observations y t in terms of the state vector α t through the signal θ t and a vector of disturbances ε t . The deterministic matrices T t , Z t , H t , G t are called system matrices and are usually sparse selection matrices. The vectors d t and c t contain fixed components and may be used to incorporate known effects or known patterns into the model; otherwise they are equal to zero.
The state space model (1)-(6) may be compactly expressed as α 1 ∼ N (a, P), where The initial value parameters are summarized in the (m + 1) × m matrix For multivariate models, i.e., N > 1, it is assumed that the N × N matrix G t G t is diagonal. State space models in S+FinMetrics/SsfPack utilize the compact representation (7) with initial value information (10).
The variance matrix P of the initial state vector α 1 is assumed to be of the form where P ∞ and P * are symmetric m × m matrices with ranks r ∞ and r * , respectively, and κ is a large scalar value, e.g., κ = 10 6 . The matrix P * captures the covariance structure of the stationary components in the initial state vector, and the matrix P ∞ is used to specify the initial variance matrix for nonstationary components. When the ith diagonal element of P ∞ is negative, the corresponding ith column and row of P * are assumed to be zero, and the corresponding row and column of P ∞ will be taken into consideration. When some elements of state vector are nonstationary, the S+FinMetrics/SsfPack algorithms implement an "exact diffuse prior" approach as described in Durbin and Koopman (2001) and Koopman et al. (2008).
The remainder of this paper is organized as follows. Section 2 gives an overview of state space modeling using S+FinMetrics/SsfPack for the annual flow volume of the river Nile based on the local level model. Section 3 illustrates multivariate state space modeling of affine term structure models, and Section 4 gives some final comments.
2. The local level model for the Nile river data Consider a univariate time series y t representing the annual flow volume of the river Nile over the period 1871 to 1970. This data series is included in S+FinMetrics in the "timeSeries" object nile.dat, and is shown in Figure 1. The local level model for y t has the form where it is assumed that E[ε * t η * t ] = 0. In the local level model, y t is the sum of two unobserved components, α t and ε * t . The component α t is the state variable and represents the trend behavior of y t . The transition equation (12) shows that the trend follows a random walk. The component ε * t represents random deviations (noise) from the trend that are assumed to Figure 1: Nile river data from the S+FinMetrics "timeSeries" object nile.dat. be independent from the innovations to α t . The strength of the signal in the trend relative to the random deviation is measured by the signal-to-noise ratio of variances q = σ 2 η /σ 2 ε . The state space form (7) of the local level model has time invariant parameters

State space parameter List component name
with errors σ η η t = η * t and σ ε ε t = ε * t . Since the state variable α t is I(1), the unconditional distribution of the initial state α 1 doesn't have finite variance. In this case, it is customary to set a = E[α 1 ] = 0 and P =var(α 1 ) to some large positive number, e.g., P = 10 7 , in (14) to reflect that no prior information is available. Using (11), the initial variance is specified with P * = 0 and P ∞ = 1. Therefore, the initial state matrix (10) for the local level model has the form where −1 implies that P ∞ = 1.
In S+FinMetrics/SsfPack, a state space model is specified by creating either a list variable with components giving the minimum components necessary for describing a particular state space form or by creating an "ssf" object. To illustrate, consider creating a list variable containing the state space parameters in (15) In the list variable ssf.ll.list, the component names match the state space form parameters in (7) and (10). This naming convention, summarized in Table 1, must be used for the specification of any valid state space model.
To generate 100 observations on the state variable α t+1 and observations y t in the local level model (12)-(14) calibrated to the Nile river (with initial value a 1 = y 0 = 1120) data use > class(ll.sim) [1] "matrix" > colIds(ll.sim) [1] "state" "response" The function SsfSim returns a matrix containing the simulated state variables α t+1 and observations y t . These values are illustrated in Figure 2.

State space modeling algorithms: Filtering, smoothing and forecasting
The S+FinMetrics/SsfPack function KalmanFil implements the Kalman filter forward recursions in a computationally efficient way, see Koopman et al. (2008). The S+FinMetrics/ SsfPack function KalmanSmo computes the Kalman smoother backwards recursions. The functions KalmanFil and KalmanSmo are primarily used by other S+FinMetrics/SsfPack state space functions that require the output from the Kalman filter and Kalman smoother. Filtered and smoothed estimates of α t and y t , with estimated variances, as well as smoothed estimates of ε t and η t , with estimated variances, are computed using the S+FinMetrics/SsfPack function SsfMomentEst. The function SsfMomentEst may also be used to compute outof-sample forecasts and forecast variances of α t and y t . More conveniently, out-of-sample h−step ahead forecasts for α t and y t , along with forecasts variances, are computed using the S+FinMetrics/SsfPack function SsfForecast. The next sub-sections illustrate the use of the S+FinMetrics/SsfPack functions for implementing the state space algorithms for the local level model (12)-(14) calibrated to the Nile river data.

Kalman filter
The Kalman filter recursions for the local level model calibrated to the Nile river data are obtained using the S+FinMetrics/SsfPack function KalmanFil with the optional argument task = "STFIL" (which stands for state filtering) > KalmanFil.ll = KalmanFil(nile.dat, ssf.ll, task = "STFIL") > class(KalmanFil.ll) The function KalmanFil takes as input a vector of response data and either a list describing the minimal state space form or a valid "ssf" object. The result of KalmanFil is an object of class "KalmanFil" with components > names(KalmanFil.ll) [1] "mOut" "innov" "std.innov" "mGain" "loglike" [6] "loglike.conc" "dVar" "mEst" "mOffP" "n.predict" [11] "task" "err" "call" "positions" A complete explanation of the components of a "KalmanFil" object is given in the online help for KalmanFil. The first column of mOut contains the prediction errors v t , the second column contains the Kalman gains, K t , and the last column contains the inverses of the prediction error variances, F −1 t . Since task = "STFIL" the filtered estimates a t|t and y t|t = Z t a t|t are in the component mEst:  The standardized innovations v t /F t are illustrated in Figure 3.

Kalman smoother
The Kalman smoother backwards recursions for the simulated data from the local level model are obtained using the S+FinMetrics/SsfPack function KalmanSmo > KalmanSmo.ll = KalmanSmo(KalmanFil.ll, ssf.ll) > class(KalmanSmo.ll) [1] "KalmanSmo" The function KalmanSmo takes as input an object of class "KalmanFil" and an associated list variable containing the state space form used to produce the "KalmanFil" object. The result of KalmanSmo is an object of class "KalmanSmo" with components > names(KalmanSmo.ll) [1] "state.residuals" "response.residuals" "state.variance" [4] "response.variance" "aux.residuals" "scores" [7] "positions" "call" The component state.residuals contains the smoothing residuals from the state equation, response.residuals contains the smoothing residuals from the measurement equation. The corresponding variances of these residuals are in the components state.variance and response.variance. A multi-panel timeplot of the standardized residuals in the component aux.residuals, illustrated in Figure 4, is created with the plot method.

Filtered and smoothed moment estimates
Filtered and smoothed estimates of α t and y t with corresponding estimates of variances may be computed using the S+FinMetrics/SsfPack function SsfMomentEst. To compute filtered estimates, call SsfMomentEst with the argument task = "STFIL" (which stands for state filtering) > FilteredEst.ll = SsfMomentEst(nile.dat, ssf.ll, task = "STFIL") > class(FilteredEst.ll) Figure 5: Filtered estimates of α t and y t computed from the local level model, calibrated to the Nile river data.
[1] "SsfMomentEst" > names(FilteredEst.ll) [1] "state.moment" "state.variance" "response.moment" [4] "response.variance" "task" "positions" The function SsfMomentEst takes as input a vector of response data and either a list describing the minimal state space form or a valid "ssf" object. The result of SsfMomentEst is an object of class "SsfMomentEst" for which there is only a plot method. The filtered estimates a t|t and y t|t = c t + Z t a t|t are in the components state.moment and response.moment, respectively, and the corresponding filtered variance estimates are in the components state.variance and response.variance. From the measurement equation (13)  The plot method creates a multi-panel timeplot of the estimates of α t and y t without standard error bars.
A plot of the filtered state estimates with 90% confidence intervals is shown in Figure 5.
The smoothed estimatesα t andŷ t along with estimated variances may be computed using SsfMomentEst with task = "STSMO" (state smoothing) The smoothed state estimates with 90% confidence bands are illustrated in Figure 6. Compared to the filtered state estimates, the smoothed estimates are "smoother" and the confidence bands are slightly smaller.

Forecasting
For a state space model, the Kalman filter prediction equations produces one-step ahead predictions of the state vector, along with prediction variance matrices. You can compute out of sample predictions and associated mean square errors from the Kalman filter prediction equations by extending the sample data set {y 1 , . . . , y n } with a set of missing (NA) values. When y τ is missing, the Kalman filter reduces to the prediction step. As a result, a sequence of m missing values at the end of the sample produces a set of h-step ahead forecasts for h = 1, . . . , m.
To produce out-of-sample h-step ahead forecasts y t+h|t for h = 1, . . . , 10 a sequence of 10 missing values is appended to the end of the Nile river data The forecast values and mean squared errors are computed using the function SsfMomentEst with the argument task = "STPRED" > PredictedEst.ll = SsfMomentEst(nile.dat.new, ssf.ll, task = "STPRED") > nile.dat.fcst = PredictedEst.ll$response.moment > fcst.var = PredictedEst.ll$response.variance The actual values, forecasts and 50% confidence bands are illustrated in Figure 7.
The S+FinMetrics function SsfForecast automates the process of forecasting from a state space model. The function takes the following arguments: > args(SsfForecast) function(mY, ssf, ikf = NULL, newdata = NULL, n.predict = 1) where mY denotes the series to be forecast, ssf denotes the state space object with response variable mY, ikf denotes an object returned from a call to KalmanIni, newdata holds the out-of-sample values of any exogenous variables, and n.predict specifies the number of hstep-ahead predictions to compute. The returned value is an S+FinMetrics "forecast" object for which there are print, summary and plot methods.
To produce 30 h-step-ahead out-of-sample forecasts for the local level model calibrated to the Nile River data, use Figure 7: Actual values, h-step forecasts and 50% confidence intervals for y t from local level model calibrated to Nile river data.
> nile.dat.fcst2 = SsfForecast(nile.dat, ssf.ll, n.predict = 10) > class(nile.dat.fcst2) [1] "forecast" The returned object is of class "forecast". The summary method shows the forecasts with standard errors: Use the plot method to plot the forecasts with standard error bands along with a portion of the original data. The following command plots the forecasts with ± 0.6745 standard error bands (50% confidence bands) and the last 10 observations of the data: > plot(nile.data.fcst2, xold = nile.dat, width = 0.6745, n.old = 10) Figure 8 shows the resulting plot.

Maximum likelihood estimation of state space models
The prediction error decomposition of the log-likelihood function for the unknown parameters ϕ of a state space model may be conveniently computed using the output of the Kalman filter ln L(ϕ|Y n ) = n t=1 ln f (y t |Y t−1 ; ϕ) where f (y t |Y t−1 ; ϕ) is a conditional Gaussian density implied by the state space model (1)-(6). The vector of prediction errors v t and prediction error variance matrices F t are computed from the Kalman filter recursions.
A useful diagnostic is the estimated variance of the standardized prediction errors for a given value of ϕ:σ As mentioned by Koopman et al. (1999), it is helpful to choose starting values for ϕ such that σ 2 (ϕ start ) ≈ 1. For well specified models,σ 2 (φ mle ) should be very close to unity.
In some models, e.g., the local level model, it is possible to solve explicitly for one scale factor and concentrate it out of the log-likelihood function (17). The resulting log-likelihood function is called the concentrated log-likelihood or profile log-likelihood and is denoted ln L c (ϕ|Y n ).
Following Koopman et al. (1999), let σ denote such a scale factor, and let with ε c t ∼iid N (0, σ 2 I) denote the scaled version of the measurement equation (3). The state space form (1)-(3) applies but with G t = σG c t and H t = σH c t . This formulation implies that one non-zero element of σG c t or σH c t is kept fixed, usually at unity, which reduces the dimension of the parameter vector ϕ by one. The solution for σ 2 from (17) is given bỹ and the resulting concentrated log-likelihood function is The S+FinMetrics/SsfPack function SsfFit may be used to compute MLEs of the unknown parameters in the state space model (1) where parm is a vector containing the starting values of the unknown parameters ϕ, data is a rectangular object containing the response variables y t , and FUN is a character string giving the name of the function which takes parm together with the optional arguments in ... and produces an "ssf" object representing the state space form. The remaining arguments control aspects of the S-PLUS optimization algorithm nlminb. An advantage of using nlminb is that box constraints may be imposed on the parameters of the log-likelihood function using the optional arguments lower and upper. See the online help for nlminb for details. A disadvantage of using nlminb is that the value of the Hessian evaluated at the MLEs is returned only if an analytic formula is supplied to compute the Hessian. The use of SsfFit is illustrated with the following examples.
To estimate the unknown parameters ϕ = (σ 2 η , σ 2 ε ) of the local level model, the S+FinMetrics/ SsfPack function SsfFit requires as input a function which takes the parameters ϕ and produces the state space form for the local level model. One such function is: where we note that parm[1] = sig2.n (i.e., σ 2 η ) and parm[2] = sig2.e (i.e., σ 2 ε ) upon entry of the function.
In the local level model, the variance parameter σ 2 ε can be analytically concentrated out of the log-likelihood leaving the signal-to-noise ratio q = σ 2 η /σ 2 ε as the only parameter to be estimated. The advantages of concentrating the log-likelihood are to reduce the number of parameters to estimate, and to improve the numerical stability of the optimization. A function to compute the state space form for the local level model, as a function of q only, is > ll.modc = function(parm) { + parm = exp(2*parm) + ssf.llc = GetSsfStsm(irregular = 1, level = sqrt(parm)) + CheckSsf(ssf.llc) + } where we note that parm = 0.5*log(q) = log(sqrt(q)) (and therefore q = exp(2*parm)) upon entry of the function.

Affine term structure models
Traditionally the study of the term structure of interest rates focuses on either the cross sectional aspect of the yield curve, or the time series properties of the interest rate. Recently, researchers have utilized state space models and Kalman filtering techniques to estimate affine term structure models by combining both time series and cross sectional data. For simple models, the state space representation is often linear and Gaussian and analysis is straightforward. For more general models, the unobserved state variables generally influence the variance of the transition equation errors making the errors non-Gaussian. In these cases, non-standard state space methods are necessary. Duffie and Kan (1996) show that many of the theoretical term structure models, such as the Vasicek (1977) model, Cox, Ingersoll, and Ross (1985) square root diffusion model, Longstaff and Schwartz (1992) two-factor model, and Chen (1996) three factor model, are special cases of the class of affine term structure models. The class of affine term structure models is one in which the yields to maturity on default-free pure discount bonds and the instantaneous interest rate are affine (constant plus linear term) functions of m unobservable state variables X t , which are assumed to follow an affine diffusion process where W t is an m × 1 vector of independent Wiener processes; Ψ is a p × 1 vector of model specific parameters; U(·) and Σ(·) are affine functions in X t such that (20) has a unique solution. In general, the functions U(·) and Σ(·) can be obtained as the solution to some ordinary differential equations. Only in special cases are closed form solutions available. In this class of models, the price at time t of a default-free pure discount bond with time to maturity τ has the form where A(τ, Ψ) is a scalar function and B(τ, Ψ) is an m × 1 vector function. The time-t continuously compounded yield-to-maturity on a pure discount bond with time to maturity τ is defined as which, using (21), has the affine form

State space representation
Although (23) dictates an exact relationship between the yield Y t (τ ) and the state variables X t , in econometric estimation it is usually treated as an approximation giving rise to the measurement equation where t is a normally distributed measurement error with zero mean and variance σ 2 τ . For any time to maturity τ , the above equation can be naturally treated as the measurement equation of a state space model, with X t being the unobserved state variable. To complete the state space representation, the transition equation for X t over a discrete time interval h needs to be specified. Defining Φ(X t ; Ψ, h) = var(X t+h |X t ), Duan and Simonato (1999) show that the transition equation for X t has the form In this transition equation η t ∼ iid N (0, I m ) and Φ(X t ; Ψ, h) 1/2 represents the Cholesky factorization of Φ(X t ; Ψ, h).
In general, the state space model defined by (24) and (25) is non-Gaussian because the conditional variance of X t+h in (25) depends on X t . Only for the special case in which Σ(·) in (20) is not a function of X t , is the conditional variance term Φ(X t ; Ψ, h) also not a function of X t and the state space model is Gaussian 2 . See Lund (1997) for a detailed discussion of the econometric issues associated with estimating affine term structure models using the Kalman filter. Although the quasi-maximum likelihood estimator of the model parameters based on the modified Kalman filter is inconsistent, the Monte Carlo results in Duan andSimonato (1999) andde Jong (1985) show that the bias is very small even for the moderately small samples likely to be encountered in practice.

Illustration
The data used for the following example are in the S+FinMetrics "timeSeries" fama.bliss, and consist of four monthly yield series over the period April, 1964to December, 1997 for the U.S. Treasury debt securities with maturities of 3, 6, 12 and 60 months, respectively. This data was also used by Duan and Simonato (1999). All rates are continuously compounded rates expressed on an annual basis. These rates are displayed in Figure 9.

Estimation of Vasicek's model
In the Vasicek (1977) model, the state variable driving the term structure is the instantaneous (short) interest rate, r t , and is assumed to follow the mean-reverting diffusion process where W t is a scalar Wiener process, θ is the long-run average of the short rate, κ is a speed of adjustment parameter, and σ is the volatility of r t . Duan and Simonato (1999) show that the functions A(·), B(·), a(·), b(·) and Φ(·) have the form where λ is the risk premium parameter. The model parameters are Ψ = (κ, θ, σ, λ) . Notice that for the Vasicek model, Φ(X t ; Ψ, h) = Φ(Ψ, h) so that the state variable r t does not influence the conditional variance of transition equation errors, the state space model is Gaussian.
The exponential transformation is used for those parameters that should be positive, and, since the data in fama.bliss are monthly, the default length of the discrete sampling interval, h, is set to 1/12.
The smoothed estimates of the short-rate and the yields are computed using SsfCondDens with task = "STSMO"  Figure 10 gives the smoothed estimate of the instantaneous short rate r t from (26). The differences between the actual and smoothed yield estimates are displayed in Figure 11. The model fits well on the short end of the yield curve but poorly on the long end.
As another check on the fit of the model, the presence of serial correlation in the standardized innovations is tested using the Box-Ljung modified Q-statistic (computed using the S+FinMetrics function autocorTest). The null of no serial correlation is easily rejected for the standardized innovations of all yields.

Conclusion
This paper presented two examples to illustrate the basics of state space modeling using S+FinMetrics/SsfPack. Full details of the use of S+FinMetrics/SsfPack are given in Zivot and Wang (2006), and several examples in macroeconomics and finance are provided in Zivot et al. (2004) and Tsay (2005). The state space modeling functions in S+FinMetrics/SsfPack are extremely flexible and powerful and can be used for a wide variety of univariate and multivariate linear Gaussian state space models. Users can build custom state space representations from scratch or they can use S+FinMetrics/SsfPack functions for specifying common state space models including ARIMA, seasonal ARIMA, regression, spline, and structural time series models. The state space representation in S+FinMetrics/SsfPack also allows for Markov switching in the system matrices as described in Kim (1994). Model fitting via the prediction error decomposition can be done using maximum likelihood or quasi-maximum likelihood, the latter being useful for linear but non-Gaussian state space models such as the log-normal stochastic volatility model. Because the state space algorithms are implemented in C, they are very fast. However, the internals of the algorithms are hidden from the user and cannot be modified which is a drawback to the user.