%ERA: A SAS Macro for Extended Redundancy Analysis

A new approach to structural equation modeling based on so-called extended redundancy analysis has recently been proposed in the literature, with the added feature of generalizing redundancy analysis and reduced-rank regression models to more than two blocks. In this approach, the relationships between the observed exogenous variables and the observed endogenous variables are moderated by the presence of unobservable composites, estimated as linear combinations of the exogenous variables, which permits great flexibility in specifying and fitting a variety of structural relationships. In this paper, we propose the SAS macro %ERA to specify and fit structural relationships in the extended redundancy analysis (ERA) framework. Two examples (one based on simulated and one on real data) are provided in order to reproduce the results appearing in the original article where ERA was proposed.


Introduction
Within the component analysis (CA) framework (Meredith and Millsap 1985; Schonemann and Steiger 1976), redundancy analysis (RA; Van den Wollenberg 1977) is the simplest type of structural equation model between two sets of observed variables, in which the latent variables are intended as components. The aim of RA is to extract a series of linear components from a set of exogenous variables in such a way that they are mutually orthogonal and successively account for the maximum variance of a set of endogenous variables. In this framework, RA may be viewed as a special type of structural equation model where: (C1) a formative relationship is always assumed between the unobserved and the observed (exogenous) variables, and (C2) the endogenous variables are always observed.
A few attempts have been made to extend RA to more than two sets of variables. However, they are limited to relationships among three sets of variables (Davies and Tso 1982; Reinsel and Velu 1998) or to particular types of models (Bougeard, Qannari, Lupo, and Hanafi 2011). Recently, a new method based on so-called extended redundancy analysis (ERA; Takane and Hwang 2005), which generalizes RA to more than two blocks, has been proposed in the literature. In ERA, the relationships between the observed exogenous variables and the observed endogenous variables are moderated by the presence of linear composites (hereinafter LCs): the LCs are estimated as exact linear combinations of formative indicators, and both component weights and component loadings are estimated by consistently minimizing a single objective function.
Unfortunately, software to estimate ERA models has not been available; hence, in the present paper a SAS (SAS Institute Inc. 2011) macro to specify and fit ERA models is proposed. The paper is organized as follows: Section 2 describes the ERA framework; Section 3 presents and discusses the macro %ERA; Section 4 illustrates two studies (one empirical and one based on simulation) that assess whether the proposed macro is able to reproduce the results of ERA as given in the original paper where the method was proposed; Section 5 offers conclusions.

The ERA model: Specification and fitting
The ERA model can generally be stated as follows. Let Y denote an n × p matrix consisting of p observed endogenous variables on n subjects. Let X denote an n × q matrix consisting of q observed exogenous (formative) variables. Assume that the columns of the matrices are mean centered and scaled to unit variance. Hence, the model can be expressed as

Y = XWA + E = FA + E,   (1)

where W denotes a q × D matrix of composite weights, A denotes a D × p matrix of composite loadings, E denotes an n × p matrix of residuals, and F (= XW) denotes an n × D matrix of composite scores. For identification, F is restricted so that diag(F′F) = I_D. To illustrate further, suppose there are for example three sets of variables, X1 = [x1|x2], X2 = [x3|x4], and Y = [y1|y2], and suppose a certain relationship among the three sets of variables is considered, as displayed in Figure 1.
The associated ERA model is

Y = X1 w1 a1 + X2 w2 a2 + E = f1 a1 + f2 a2 + E,   (2)

where w1 = (w1, w2)′ and w2 = (w3, w4)′ are the composite weights for X1 and X2, respectively, and a1 and a2 are the loadings of the composite scores F = [f1|f2] on Y. Suppose now that f1 also has an impact on X2 = [x3|x4], the formative block of f2; this leads to the following structural equations (as depicted in Figure 2):

X2 = f1 a1 + E2,   (3)
Y = f1 a2 + f2 a3 + E1,   (4)

where a1 = (a1, a2), a2 = (a3, a4), and a3 = (a5, a6) are the composite loadings of f1 on X2, and of f1 and f2 on Y, respectively, whereas E2 = (e1|e2) and E1 = (e3|e4) are two error matrices.
Using the rule that, when an observed variable block is exogenous as well as endogenous (X2), it is included in both Y and X (Takane and Hwang 2005), the associated ERA model becomes

[X2|Y] = [f1|f2] A + Ẽ,   (5)

where F = [f1|f2] is formed from X = [X1|X2], the first row of A contains the loadings of f1 on X2 and on Y, the second row contains the loadings of f2 (zero on X2, free on Y), and Ẽ = [E2|E1].
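To make the free/fixed pattern of Equation 5 concrete, it can be encoded as a pair of indicator matrices. The sketch below is ours (Python/NumPy, not part of the macro); variable names are illustrative only.

```python
import numpy as np

# Free/fixed pattern for the model where X2 = [x3|x4] is both endogenous and
# exogenous: the endogenous block becomes [X2|Y] while X2 keeps forming f2.
# Rows of W follow [x1, x2, x3, x4]; columns follow the composites [f1, f2].
mask_W = np.array([[1, 0],   # x1 -> f1 (weight w1)
                   [1, 0],   # x2 -> f1 (weight w2)
                   [0, 1],   # x3 -> f2 (weight w3)
                   [0, 1]],  # x4 -> f2 (weight w4)
                  dtype=bool)

# Rows of A follow [f1, f2]; columns follow the endogenous block [x3, x4, y1, y2].
mask_A = np.array([[1, 1, 1, 1],    # f1 loads on X2 (a1, a2) and on Y (a3, a4)
                   [0, 0, 1, 1]],   # f2 loads on Y only (a5, a6)
                  dtype=bool)

n_free = int(mask_W.sum() + mask_A.sum())   # 4 free weights + 6 free loadings
```

The fixed zeros in these masks are exactly the elements that prevent a closed-form solution and motivate the ALS algorithm described in the next section.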

Estimation
To estimate the parameters, the loss function associated with the model in Equation 1 is

φ(W, A) = SS(Y − XWA),   (6)

to be minimized with respect to W and A, where SS[Z] = trace(Z′Z) and vec(Z) denotes the vectorization operator, by which the columns of Z are stacked one below the other. Notice that W and A may contain fixed (zero) elements, depending on the specified model. Unfortunately, unlike statistical methods based on the singular value decomposition (SVD) or the generalized SVD, the minimization of the loss function in Equation 6 cannot be achieved in closed form, due to the fixed zero values in W and A. Hence, the ERA authors employ an alternating least squares (ALS) algorithm, developed by Kiers and Ten Berge (1989), in which the matrices W and A are alternately updated in a two-step scheme until convergence is reached.
To this end, the loss function in Equation 6 may be expressed in two equivalent forms:

φ = SS(vec(Y) − (A′ ⊗ X) vec(W)),   (7)

φ = SS(vec(Y) − (I ⊗ F) vec(A)),   (8)

where "⊗" denotes the Kronecker product.
In the first step, W is updated for fixed A (using Equation 7), and in the second step A is updated for fixed W (using Equation 8). More explicitly, let w be the vector obtained by eliminating the zero elements of vec(W), and let Ω be the matrix obtained by eliminating the columns of A′ ⊗ X corresponding to the zero elements of vec(W). The LS estimate of w therefore is

ŵ = (Ω′Ω)⁻¹ Ω′ vec(Y).   (9)
The Moore-Penrose generalized inverse is used if (Ω′Ω) is singular. W is then reconstructed from ŵ, and F is normalized to respect the identification restriction.
In the second step, let a be the vector obtained by eliminating the zero elements of vec(A), and let Γ be the matrix obtained by eliminating the columns of I ⊗ F corresponding to the zero elements of vec(A). The LS estimate of a subsequently is

â = (Γ′Γ)⁻¹ Γ′ vec(Y).   (10)

The Moore-Penrose generalized inverse is used if (Γ′Γ) is singular. A is then reconstructed from â.
The above steps are iterated until convergence. The nonparametric bootstrap is employed to estimate the standard errors of the parameter estimates, whereas the critical ratio (CR), i.e., the parameter estimate divided by its standard error, can be used to test the significance of the estimates.
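The two alternating steps can be sketched outside SAS as follows. This is a minimal Python/NumPy illustration of the ALS scheme under our own naming (it is not the macro's code); it also returns the explained-variance fit discussed below. Free parameters are flagged by boolean masks, fixed zeros are preserved, and F is renormalized after each W-update.

```python
import numpy as np

def era_als(Y, X, mask_W, mask_A, max_iter=100, tol=1e-4, seed=0):
    """Sketch of the ALS estimation of the ERA model Y = XWA + E.

    mask_W (q x D) and mask_A (D x p) flag free parameters (True) versus
    fixed zeros (False). Names and defaults are ours, not the macro's.
    """
    n, p = Y.shape
    q, D = mask_W.shape
    rng = np.random.default_rng(seed)
    W = np.where(mask_W, rng.standard_normal(mask_W.shape), 0.0)
    A = np.where(mask_A, rng.standard_normal(mask_A.shape), 0.0)
    y = Y.reshape(-1, order="F")                     # vec(Y), columns stacked
    for _ in range(max_iter):
        W_old, A_old = W.copy(), A.copy()
        # Step 1: update W for fixed A via vec(XWA) = (A' kron X) vec(W) (Equation 7);
        # drop the columns matching the fixed zeros of vec(W).
        Omega = np.kron(A.T, X)[:, mask_W.reshape(-1, order="F")]
        w_hat = np.linalg.pinv(Omega.T @ Omega) @ Omega.T @ y
        W = np.zeros((q, D))
        W.T[mask_W.T] = w_hat                        # refill free cells in vec order
        F = X @ W
        W = W / np.sqrt((F ** 2).sum(axis=0))        # enforce diag(F'F) = I_D
        F = X @ W
        # Step 2: update A for fixed W via vec(FA) = (I kron F) vec(A) (Equation 8).
        Gamma = np.kron(np.eye(p), F)[:, mask_A.reshape(-1, order="F")]
        a_hat = np.linalg.pinv(Gamma.T @ Gamma) @ Gamma.T @ y   # Equation 10
        A = np.zeros((D, p))
        A.T[mask_A.T] = a_hat
        if max(np.abs(W - W_old).max(), np.abs(A - A_old).max()) < tol:
            break
    fit = 1.0 - ((Y - F @ A) ** 2).sum() / (Y ** 2).sum()       # fit index Psi
    return W, A, F, fit
```

Each step is an exact least-squares problem, so the loss of Equation 6 decreases monotonically across iterations.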
Finally, ERA allows the evaluation of the total fit of a hypothesized model, measured by the total variance of the observed endogenous variables explained by the exogenous variables. The fit index Ψ is

Ψ = 1 − SS(Ỹ − FA) / SS(Ỹ),

where Ỹ collects all endogenous variables. They may be represented by a single block (see Equation 2, where Ỹ = Y) or by multiple blocks (see Equation 5, where Ỹ = [X2|Y]). This fit index ranges from 0 to 1: the larger its value, the more variance of the endogenous variables is explained by the exogenous variables.
In accordance with the traditional use of CA, the ERA approach is viewed as purely descriptive, without any pretension of inferential aspects. However, besides the overall fit statistic (Ψ), some other diagnostics and tests on residuals can be used to evaluate the adequacy of the model. For each dependent variable stored in the Y matrix, the residuals can be plotted against the predicted values; there should be no pattern in these plots if the residuals are white noise. As a formal test, the White test can be used to assess residual homoscedasticity (we suggest this test since it does not require prior knowledge of the form of heteroscedasticity). Tests for residual randomness, such as a modified version of the Durbin-Watson test and the runs test (Pratschke 1971), can also be considered. However, unlike the case of time series data, both tests have limited applicability, since they are affected by subjective choices in ordering the residuals that may give misleading results. More studies are needed in this context.
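As an illustration of the White test mentioned above, its LM form (n·R² from the auxiliary regression of the squared residuals on the regressors, their squares, and pairwise products) can be sketched as follows. This is our own minimal implementation, not the macro's; the returned statistic is to be compared against a chi-square distribution with the returned degrees of freedom.

```python
import numpy as np

def white_lm(X, resid):
    """Minimal sketch of White's heteroscedasticity test (LM form).

    Regress squared residuals on a constant, the regressors, and all their
    squares and pairwise products; LM = n * R^2 is asymptotically chi-square
    with df = number of auxiliary regressors excluding the constant.
    """
    n, k = X.shape
    cols = [np.ones(n)] + [X[:, i] for i in range(k)]
    cols += [X[:, i] * X[:, j] for i in range(k) for j in range(i, k)]
    Z = np.column_stack(cols)
    u2 = resid ** 2
    beta = np.linalg.pinv(Z.T @ Z) @ Z.T @ u2
    r2 = 1.0 - ((u2 - Z @ beta) ** 2).sum() / ((u2 - u2.mean()) ** 2).sum()
    return n * r2, Z.shape[1] - 1   # LM statistic and its degrees of freedom
```

A large LM relative to the chi-square critical value leads to rejecting homoscedasticity.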

Imposing additional constraints on parameters
A variety of structural hypotheses on the model parameters (in W or A) may be incorporated in the form of linear constraints through the reparameterization method (Takane and Hwang 2005), which specifies the space spanned by the column vectors of a constraint matrix H on a vector of parameters α. Let H define the linear constraints on a, incorporated by specifying a = Hα for some vector α of unconstrained parameters. After estimating α (obtaining α*) by regressing vec(Y) on ΓH, as in Equation 10, the constrained estimate of a (a*) is computed as a* = Hα*. Another possibility is to specify the constraints (e.g., for a) in the null-space form P′a = 0, where P is a matrix incorporating a system of constraints among the parameters (a single constraint thus takes the vector form p′a = 0). Virtually identical considerations hold for w (see Takane and Hwang 2005, for details).
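A compact sketch of the two ways of imposing the constraints, under our own naming: the constrained LS estimate replaces Γ by ΓH in the regression of vec(Y), and a null-space specification P′a = 0 can be converted into the H form by taking the columns of H as a basis of the null space of P′.

```python
import numpy as np

def constrained_a(Gamma, y, H):
    """Constrained LS estimate a* = H alpha*, where alpha* comes from the
    unconstrained regression of y = vec(Y) on Gamma @ H (illustrative sketch)."""
    GH = Gamma @ H
    alpha = np.linalg.pinv(GH.T @ GH) @ GH.T @ y
    return H @ alpha

def nullspace_to_H(P):
    """Convert the null-space form P'a = 0 into the reparameterization form:
    the columns of H span the null space of P'."""
    _, s, vt = np.linalg.svd(P.T)
    rank = int((s > 1e-10).sum())
    return vt[rank:].T
```

For example, the equality constraint a1 = a2 corresponds to H = (1, 1)′, or equivalently to p = (1, −1)′ in the null-space form.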

The macro %ERA
Excerpts of SAS code will now be given to specify the model of Equation 2 (see Figure 1).

Path diagram declaration
The structural equations have to be declared in the cards statements of the first part of the macro. XF contains the first segment of the path diagram: x1 and x2 form the LC f1, whereas x3 and x4 form the LC f2. FY contains the second segment of the path diagram: f1 points to y1 and y2, and f2 does the same.

Estimation steps
After manipulating this input into the corresponding variable matrices X and Y and the parameter matrices W and A, the algorithm can start.

Bootstrap procedure
The algorithm estimates the standard errors via bootstrap, first replicating the original dataset B times (the replic parameter of the %ERA macro).
proc surveyselect data=data_nomiss method=urs rate=1
     seed=813616000 rep=&replic out=data_boot;
run;

/* expand observations with numberhits > 1 */
data data_boot;
  set data_boot;
  do i = 1 to numberhits;
    output;
  end;
  drop i numberhits;
run;

Then, for each of the replications, it proceeds to fit the model as specified above:

%do rep = 1 %to &replic;
  %put EXECUTING REPLICATION &rep OUT OF &replic;
  use zdata_boot where (replicate = &rep);

The macro uses the multiple datasets sampled with the SURVEYSELECT procedure. Each of them carries an index (replicate), which is matched to the &rep macro variable to obtain the estimates for the specified bootstrap sample.
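Outside SAS, the same resampling scheme (unrestricted sampling with replacement at rate 1, as PROC SURVEYSELECT does with method=urs) and the derived bootstrap quantities can be sketched as follows. The function and the stand-in OLS estimator are ours, purely illustrative; any estimator taking (X, Y) and returning a parameter vector could play the role of the ERA fit.

```python
import numpy as np

def bootstrap_cr(X, Y, fit_fn, B=100, seed=813616000):
    """Nonparametric bootstrap sketch: resample rows with replacement B times,
    refit, and return one-shot estimates, bootstrap s.e. and critical ratios."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    reps = np.array([fit_fn(X[idx], Y[idx])
                     for idx in (rng.integers(0, n, size=n) for _ in range(B))])
    est = fit_fn(X, Y)                     # one-shot estimate on the full data
    se = reps.std(axis=0, ddof=1)          # bootstrap standard errors
    return est, se, est / se               # critical ratios (CR)

# Stand-in estimator for the sketch: OLS coefficients of Y on X.
def ols(X, Y):
    return np.linalg.pinv(X.T @ X) @ X.T @ Y
```

As in the macro, the CR (estimate divided by its bootstrap standard error) can then be compared against standard normal quantiles.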

Bootstrap estimates
Using all the stored replications, the mean bootstrap estimates, the bootstrap standard errors, and the critical ratios (one-shot estimates divided by their bootstrap s.e.) can be calculated. At the end of the %ERA macro execution, all outputs are stored in the SAS work library, in the following data tables:

• a_est_oneshot, the one-shot estimates for W and A.
• b1_est_boot, the bootstrap average estimates for W and A.
• b2_bias_boot, the bootstrap estimates of bias for W and A.
• b3_corrected_boot, the bootstrap bias-corrected estimates for W and A.
• c_se_boot, the bootstrapped standard errors for the estimates.
• d_cr_boot, the bootstrapped critical ratios for the estimates.
• f_replic_w, all the estimates for each bootstrap replicate, for W.
• f_replic_a, all the estimates for each bootstrap replicate, for A.
• out_era, the output dataset, containing the standardized dataset used for the one-shot estimation with additional variables such as fitted values (endogenous variables) and latent composite scores, as well as residuals and leverages.
• zdata_boot, the data table containing all the bootstrap samples.
• The original data table and the data table with only non-missing observations.

Convergence and initial values
The ALS steps described in Section 2.1 are iterated in the macro until at least one of the following criteria is met:
• The maximum absolute difference between the elements of A (and W) at the (k − 1)-th and the k-th iteration is lower than the specified threshold c (the threshold parameter of the macro, e.g., c = 0.0001).
• The maximum number M of ALS iterations (the iter parameter) is reached.
To ensure faster convergence and to minimize the risk of local minima, as suggested by Takane and Hwang (2005), the starting values for W and A are chosen as follows: the PCA solution on the X matrix is computed and used as starting values for W, and the principal coordinates are used to obtain F.
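The PCA-based initialization of W can be sketched as follows; this is our own reading of the suggestion above (names are ours): the free entries of W are seeded from the leading right singular vectors of the centered X, then rescaled so that F = XW satisfies the identification restriction.

```python
import numpy as np

def pca_start_W(X, mask_W):
    """Starting values for W from the PCA of X (illustrative sketch)."""
    q, D = mask_W.shape
    Xc = X - X.mean(axis=0)
    # Right singular vectors of Xc give the principal component weights.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W0 = np.where(mask_W, Vt[:D].T, 0.0)       # seed free weights, keep fixed zeros
    F0 = Xc @ W0
    W0 = W0 / np.sqrt((F0 ** 2).sum(axis=0))   # enforce diag(F'F) = I_D
    return W0
```

Starting from a PCA solution rather than random values typically puts the first ALS sweep close to a good region of the loss surface.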

Imposition of linear constraints
Restrictions on the parameters can be applied, as shown in Section 2.2, by declaring the matrices of linear constraints B and H (for W and A, respectively), for example by imposing the constraint w4 = 0 (and no constraint on A) in the path diagram of Figure 1. The code below specifies B and H; the estimation procedure then incorporates these two matrices into the steps of the algorithm.

Macro invocation
The statement below declares the macro and its parameters.

%macro ERA(dataset, iter, replic, threshold, startw, starta);

The macro parameters are:
• dataset: The input SAS dataset to be used for the model, including the library path (e.g., lib.data, with lib as the SAS library).
• iter: The maximum number of iterations M of the ALS algorithm. It must be an integer greater than 1 (e.g., M = 100). This parameter is essential to stop the macro if convergence is not reached within M ALS iterations; an alert message is shown in the SAS log window if convergence does not occur.
• replic: The number of bootstrap replications B generated to obtain standard errors and critical ratios for the estimates. It must be an integer greater than 1 (e.g., B = 100).
• threshold: The convergence threshold c for the estimates. It must be between 0 and 1 (e.g., c = 0.0001). This parameter has higher priority than iter: if convergence is reached before the maximum number M of ALS iterations, the algorithm stops. However, both parameters must be properly declared in the macro.

User-defined functions
To successfully implement the ERA algorithm, several auxiliary functions are used in the script:
• delrow(x, i), which deletes the i-th row from a generic matrix X.
• delcol(x, i), which deletes the i-th column from a generic matrix X.
• vectorise(x), which applies the vectorization operator to a generic matrix X.
• normalize(x), which transforms a generic matrix X so that diag(X′X) = I.
• update(X, y, c), which reconstructs the original matrix X after row/column elimination has been applied.

Application of the macro %ERA
To assess the performance of the %ERA macro, it is applied to the two studies performed by the authors in the paper where the ERA model was proposed, attempting to replicate the same results: first an empirical application using World Health Organization (WHO) data, and second a simulation study in which the authors analyzed the small-sample behavior of ERA.

The WHO data application
As far as the empirical study is concerned, Takane and Hwang (2005) selected 51 nations from the 1999 World Health Report, published by the World Health Organization, for which the data were non-missing for the following variables: infant mortality rate (IMR) and maternal mortality ratio (MMR), both considered endogenous, and the following exogenous variables: real gross domestic product (GDP) per capita adjusted for purchasing power parity, expressed in 1985 US dollars; the average number of years of education for females aged 25 years and above (FEDU); the percentage of children immunized against measles in 1997 (Measles); and total health expenditures as a percentage of GDP in 1995 (HealthExp). The authors identify two components for the last four observed exogenous variables: one component, called "social and economic factor" (SE), defined as a linear combination of GDP and FEDU, and the other, called "health services factor" (HS), composed of Measles and HealthExp. The specified two-component model (Equation 2) is the same as that presented in Figure 1, with X1 = [GDP|FEDU] and X2 = [Measles|HealthExp]. The macro is invoked as:

%ERA(dataset = who, iter = 100, replic = 100, threshold = 0.0001,
     startw = 0, starta = 0);

Table 1 reports the original ERA results (labeled "H-T", from the initials of the authors) and those reproduced by the macro %ERA, for the unconstrained model specified in Figure 1 as well as for the constrained model, in which the component weight w4 is fixed at zero. In Table 1 the bootstrapped standard errors are obtained with 100 bootstrap samples, for both the unconstrained and the constrained model.
To fit the constrained model, the following additional code must be inserted before the macro invocation (and after the input of both path diagrams):

data con_w;
  input w01-w04;
  cards;
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 0
;
run;

%ERA(dataset = who, iter = 100, replic = 100, threshold = 0.0001,
     startw = 0, starta = 0);

Differences between the estimates are quite negligible and are probably due to technical details such as convergence thresholds or the different numerical approximation methods used by the authors. Moreover, as additional results, Table 2 reports the mean of the bootstrap estimates and the "bias-corrected" versions of the estimated parameters.
As far as diagnostics in the unconstrained model are concerned, Figure 3 reports, for each dependent variable, the plot of the residuals against the predicted values. Homoscedasticity is clearly rejected at the 0.05 significance level, since p = 0.02 for IMR and p = 0.03 for MMR; the result is also confirmed visually by Figure 3.
Finally, to complete the picture of the diagnostics, Figure 4, which reports studentized residuals versus leverages for each dependent variable, does not suggest any influential observation, but only a few outliers and unusual observations.

The simulation study
The same simulation study that appeared in the original paper of Takane and Hwang (2005) is conducted with the macro %ERA. Specifically, the model in Equation 2, detailed in Figure 1, is chosen, with the true parameter values reported in Table 3. For the generation of X and Y, a normality assumption was adopted, with X ∼ N(0, Θ). The sample sizes considered for this Monte Carlo simulation, consisting of N = 1000 samples, are n = 100, 200, 400, with X generated only once for each sample size. Furthermore, the initial values of the parameters used in the algorithm are the true values, to ensure a faster convergence to the global minimum.
To measure the degree of recovery of the population parameters, the mean congruence coefficient between the stacked vectors of true parameters (θ) and estimates (θ̂) is used.
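Assuming the standard (Tucker) congruence coefficient between the stacked parameter vectors — the formula is not spelled out in the text, so this is our reading — it can be computed as follows:

```python
import numpy as np

def congruence(theta, theta_hat):
    """Tucker congruence between the stacked true and estimated parameter
    vectors: their inner product divided by the product of their norms."""
    t = np.asarray(theta, dtype=float).ravel()
    e = np.asarray(theta_hat, dtype=float).ravel()
    return float(t @ e / np.sqrt((t @ t) * (e @ e)))
```

The coefficient equals 1 for perfect proportional recovery and is insensitive to a common rescaling of the estimates, which makes it suitable for comparing parameter patterns across sample sizes.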
For the simulation, the %ERA macro has been slightly modified in terms of input handling and output datasets. Details are not provided, since they are beyond the scope of this paper.
Table 3 reports the true parameter (Value), the estimate (Est.), the standard error (S.E.) and the congruence coefficient by sample size (n), for the macro %ERA and as reported in the original ERA paper (H-T results). A closer look at the results of Table 3 indicates two main considerations:
• When n ≥ 200, the %ERA loadings as a group appear less biased than those of H-T (with an unclear pattern when n < 200), whereas their standard errors are always higher than those of H-T.
• When n ≥ 200, the standard errors of the %ERA weights are generally lower than those of H-T, whereas there is no clear pattern concerning the bias.
In an attempt to explain this trade-off between bias and standard error for the loadings (at least for the %ERA macro), the simulation results for sufficiently large samples confirm a direct relation between the reduction of bias (at 2n with respect to n) and the reduction of standard errors: component loadings, exhibiting a lower bias reduction, also show a lower reduction of standard errors. Moreover, the fact that at different n the better approach (%ERA or H-T) in terms of bias does not outperform the other in terms of standard errors may reflect that ERA, and ALS algorithms in general, are prone to the problem of local minima, especially for the loading parameters.

Conclusion
Extended redundancy analysis is a data-analytical technique for complex structural models in a component analysis framework. ERA estimates component weights and component loadings by minimizing a single overall criterion, the sum of squares of the discrepancies between the observed endogenous variables and their counterparts predicted from the exogenous variables, without any explicit distributional assumptions. Although ERA has potential applications in diverse areas within and outside the behavioral sciences, its use has been hampered by a main issue: easily accessible software for estimating ERA models has been lacking so far. In this paper we address this issue by presenting the %ERA macro, built in SAS. Two possible extensions can be proposed for future work: the first is the inclusion of external covariates which do not belong to the formative LC indicators, but may have an impact on these LCs as well as a direct effect on the endogenous variables; the second is linked to the specification of structural links between an exogenous and an endogenous LC which, at present, must be mediated by the formative indicators of the endogenous LC. Potential candidates to accommodate both issues may be found in more general methods, such as generalized redundancy analysis, which may offer promising results (Lovaglio and Vittadini 2014).