%GI: A SAS Macro for Measuring and Testing Global Imbalance of Covariates within Subgroups

The global imbalance (GI) measure is a way of checking the balance of baseline covariates that confound efforts to draw valid conclusions about treatment effects on outcomes of interest. GI is also tested by means of a multivariate test. The GI measure and its test overcome some limitations of the common approach to assessing imbalance in observed covariates, as discussed in D'Attoma and Camillo (2011). This paper describes a user-written SAS macro, %GI, that simultaneously measures and tests the global imbalance of baseline covariates. %GI also assesses global imbalance within subgroups obtained through matching or classification methods (e.g., cluster analysis or propensity score subclassification; Rosenbaum and Rubin 1984), no matter how many groups are examined. %GI works with mixed categorical, ordinal and continuous covariates, provided that continuous baseline covariates are first split into categories. It also works in the multi-treatment case. The use of the %GI macro is illustrated with two artificial examples.


Introduction
Assessing the balance of non-equivalent groups is fundamental before estimating the effects of treatments on outcomes of interest, especially with observational data, where the rule that governs treatment assignment is generally unknown and units either self-select into treatments or are non-randomly selected to receive a treatment. Various methods are used to balance groups with unequal covariate distributions, e.g., matching, cluster analysis and propensity score (PS) adjustments. The most widely applied across fields is PS adjustment (Rosenbaum and Rubin 1983). The PS is the conditional probability that a unit is assigned to the treatment condition given a set of observed covariates; PS adjustments (e.g., PS subclassification) are then used to balance groups with unequal covariate distributions. Another method is cluster analysis, which balances unequal group distributions by stratifying on clusters based on several covariates (D'Attoma and Camillo 2011; Peck 2005). Unlike PS, cluster analysis does not create a single aggregate score, permitting each covariate to maintain its functional form.
Alongside the increasing use of methods for balancing non-equivalent groups, different criteria for checking balance have been proposed (Rosenbaum and Rubin 1984; Rubin 2001; Baser 2006), but most of them assess balance separately, variable by variable. This paper provides a macro that simultaneously measures and tests the imbalance of a set of baseline covariates. The macro computes and tests the global imbalance (GI) measure introduced in D'Attoma and Camillo (2011), which is based on the concept of between-groups inertia of a factorial predictor space. Perfect balance occurs when the between-groups inertia equals zero, whereas perfect imbalance occurs when the between-groups inertia equals the total inertia ($I_T$), which indicates that the observed total variability of the X-space is completely due to the selection mechanism. Thus, the GI measure varies in $[0, I_T]$. The SAS macro %GI measures and tests global imbalance on subgroups. It mainly uses the SAS/IML language, and the final balance summary is saved in a SAS dataset in the directory specified by the user.
After a review of several tools that address the balance checking problem, this paper briefly introduces the GI measure and its related test, then describes the macro (its arguments, implementation and the output dataset it produces), and finally presents two examples demonstrating the use of the macro for assessing the global imbalance of a set of baseline covariates by subgroups: one for the binary treatment case and another for the multi-treatment case.

Measuring balance: A review
The success of various methods in reducing the bias of estimated effects mainly depends on the balance criterion adopted. According to Rubin (2001), balance concerns the similarity of covariate distributions across treatment groups. As reported in Ho, Imai, King, and Stuart (2007), it holds when the treatment (T) and the covariates (X) are unrelated, such that $\tilde{p}(X \mid T = 1) = \tilde{p}(X \mid T = 0)$, where $\tilde{p}$ denotes the observed empirical density of the data. Balance is commonly evaluated by hypothesis testing. The standard practice involves a t-test for the difference in means for each continuous covariate or a $\chi^2$ test for each categorical covariate. However, this practice has begun to be criticized (to cite a few: Imai, King, and Stuart 2008; Iacus, King, and Porro 2011). The main critique is that researchers tend to ignore multivariate balance. Recent works have started to take this multivariate aspect into account (Hansen and Bowers 2008; Li, Maasoumi, and Racine 2009; Camillo and D'Attoma 2010; D'Attoma and Camillo 2011; Iacus et al. 2011).
Hansen and Bowers (2008) propose a simultaneous balance test on multiple X. The hypothesis of no association between a treatment variable and the X covariates is assessed by comparing the differences of means (or regression coefficients), without standardization, to their distribution under hypothetical shuffles of the treatment variable, i.e., a permutation or randomization distribution (Bowers, Fredrickson, and Hansen 2011). The test assesses balance not only on each X separately, but also on all linear combinations of them. Its null distribution is approximated by a $\chi^2$ distribution. The test is implemented in the RItools package through the xBalance function (Bowers et al. 2011). It should also work when the treatment variable is not binary, but it is not clear which kinds of covariates (categorical, continuous, ordinal) it supports.
Li et al. (2009) propose a nonparametric test for the equality of two multivariate densities with mixed categorical and continuous data. The test statistic, $I_n$ (Li et al. 2009), is constructed from the integrated squared density difference given by

$$I = \int [f(x) - g(x)]^2 \, dx = \int [f(x)\, dF(x) + g(x)\, dG(x) - f(x)\, dG(x) - g(x)\, dF(x)],$$

where $F(\cdot)$ and $G(\cdot)$ are the cumulative distribution functions of X and Y, with densities $f$ and $g$. X and Y are multivariate vectors of dimension q + r, where q denotes the number of continuous variables and r the number of discrete/categorical variables. Li et al. (2009) demonstrate that, under the null of equality of distributions, the $I_n$ statistic can be approximated by a standard normal. The test is implemented in R in the np package (Hayfield and Racine 2008) through the npdeneqtest function (Li et al. 2009; Racine 2012). The test has the advantage of working with mixed categorical and continuous data.
Iacus et al. (2011) propose a multivariate imbalance measure based on the $L_1$ difference between the multidimensional histogram of all pre-treatment covariates in the treatment group and that in the control group. To obtain the measure, they cross-tabulate the discretized variables and the categorical variables as $X_1 \times X_2 \times \cdots \times X_k$ for the treated and control groups separately, and record the k-dimensional relative frequencies for the treated ($f_{l_1 \cdots l_k}$) and control ($g_{l_1 \cdots l_k}$) units. Finally, they take the absolute difference over all the cell values:

$$L_1(f, g) = \frac{1}{2} \sum_{l_1 \cdots l_k} \left| f_{l_1 \cdots l_k} - g_{l_1 \cdots l_k} \right|.$$

The $L_1$ measure is implemented in R in the cem package through the imbalance function (Iacus, King, and Porro 2009). It is also implemented in Stata in the cem package through the imb function (Blackwell, Iacus, and King 2009). A clear advantage of this measure is its simplicity and intuitive interpretation. Furthermore, it should work with multicategory treatments and with any kind of variable.
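The $L_1$ computation described above can be sketched in a few lines. This is an illustrative Python sketch, not the cem implementation: the function name `l1_imbalance` and the toy data are hypothetical, and the covariates are assumed to be already discretized so that each unit is a tuple of category codes.

```python
from collections import Counter

def l1_imbalance(treated, control):
    """Half the sum of absolute differences between the relative
    frequencies of the treated and control multidimensional histograms.

    Each unit is a tuple of (already discretized) covariate values,
    so each distinct tuple is one cell of the cross-tabulation.
    """
    f = Counter(treated)          # cell counts, treated group
    g = Counter(control)          # cell counts, control group
    n_t, n_c = len(treated), len(control)
    cells = set(f) | set(g)       # cells observed in either group
    return 0.5 * sum(abs(f[c] / n_t - g[c] / n_c) for c in cells)

# Hypothetical toy data: two covariates, four units per group.
treated = [(1, 1), (1, 2), (2, 2), (2, 2)]
control = [(1, 1), (1, 1), (1, 2), (2, 2)]
print(l1_imbalance(treated, control))  # 0.25
```

Identical histograms give 0 and histograms with no shared cell give 1, which matches the intuitive reading of the measure noted above.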

Description of the GI measure and its related test
This section provides a brief description of the GI measure and its related test. For a more comprehensive treatment of the theoretical framework within which the GI measure and its related test are developed, see Camillo and D'Attoma (2010) and D'Attoma and Camillo (2011); for an application, see Peck, Camillo, and D'Attoma (2010). The %GI macro uses the SAS/IML language to compute the GI measure, expressed as

$$GI = \frac{1}{Q} \sum_{t=1}^{T} \sum_{j=1}^{J_Q} \frac{b_{tj}^2}{k_{.t}\, k_{.j}} - 1,$$

where Q denotes the number of baseline covariates, T denotes the number of treatment levels, $J_Q$ denotes the set of all categories of the Q baseline variables, $b_{tj}$ is the number of units with category $j \in J_Q$ in the treatment group $t \in T$, $k_{.t}$ is the size of group $t \in T$, and $k_{.j}$ is the number of units with category $j \in J_Q$. The GI measure is the result of using the conditional multiple correspondence analysis (MCA) framework (Escofier 1988) to quantify the between-groups inertia. When the dependence between the categorical baseline covariates (X) and the treatment assignment (T) is outside the control of researchers, displaying the relationship among them on a factorial space is a first step towards discovering the hidden relationship. In the presence of dependence, any descriptive factorial analysis may exhibit this link. Commonly, the factorial decomposition of the variance related to the juxtaposition of the X matrix and T is addressed within the MCA framework (Lebart, Morineau, and Warwick 1984). In MCA, the eigenvector and eigenvalue decomposition of the data matrix can be strongly influenced by the presence of an external conditioning variable (i.e., the treatment assignment T). Hence, a conditional analysis is used in order to isolate the part of the variability of the X-space due to T.
With reference to Huygens' decomposition of total inertia ($I_T$) into within-groups ($I_W$) and between-groups ($I_B$) inertia, conditional MCA (Escofier 1988) consists of the factorial decomposition of the within-groups inertia. In this sense, it can also be considered an intra-group analysis that detects and describes differences among units within each group, without considering the effect of the partition structure induced by the non-random selection process. The space generated by conditional MCA is continuous, so that distances between groups and between units can be computed with a standard metric based on the criterion of variance minimization. The key result of using conditional MCA is the quantified between-groups inertia, which represents the measure of global imbalance in the data (D'Attoma and Camillo 2011). Then, to determine the significance of the detected imbalance, the %GI macro performs a multivariate imbalance test. The null hypothesis of no dependence between X and T is specified as

$$H_0: I_W = I_T.$$

On the basis of the asymptotic distribution function of $I_B$ (Estadella, Aluja, and Thió-Henestrosa 2005), expressed as

$$I_B \sim \frac{\chi^2_{(T-1)(J-1)}}{nQ},$$

the interval of plausible values for GI is defined as

$$GI \in \left( 0, \; \frac{\chi^2_{(T-1)(J-1),\,\alpha}}{nQ} \right),$$

with n as the sample size, Q as the number of baseline covariates, J as the number of categories in $J_Q$, and $\chi^2_{(T-1)(J-1),\,\alpha}$ as the $\chi^2$ critical value at significance level $\alpha$ with $(T-1)(J-1)$ degrees of freedom. If the measured GI is outside the interval, the null hypothesis of no dependence between X and T is rejected and the data are deemed unbalanced. The main advantage of the GI measure is its simplicity of interpretation. The measure varies in $[0, I_T]$. Perfect balance occurs when $I_B = 0$, whereas perfect imbalance occurs when $I_W = 0$ and $I_B = I_T$, which indicates that the observed total variability of the X-space is completely due to the influence of the conditioning variable (T).
An index that ranges between 0 and 1 is the Multivariate Imbalance Coefficient (MIC), defined as one minus the ratio of the within-groups inertia to the total inertia:

$$MIC = 1 - \frac{I_W}{I_T}.$$

$MIC = 0$ denotes perfect balance, whereas $MIC = 1$ indicates perfect imbalance. The GI measure works with nominal or ordinal categorical variables; continuous variables must be discretized beforehand. It also works in a multi-treatment environment.
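To make the definitions concrete, the following Python sketch computes the between-groups inertia from a table crossing treatment levels with covariate categories, the upper limit of the interval of plausible values, and the MIC. It is not the %GI source code. Several points are assumptions: the function names are hypothetical; the GI expression used is the inertia of the treatment-by-category table, $\frac{1}{Q}\sum_t\sum_j b_{tj}^2/(k_{.t}k_{.j}) - 1$; the total inertia is taken as $I_T = J/Q - 1$, the standard MCA result, which this paper does not state explicitly; and the $\chi^2$ critical value is supplied by the caller from a table rather than computed.

```python
def global_imbalance(band, n_covariates):
    """Between-groups inertia from the table band[t][j]: number of units
    in treatment level t with category j, with the categories of all Q
    covariates placed side by side."""
    q = n_covariates
    group_sizes = [sum(row) / q for row in band]            # k_.t
    n_cats = len(band[0])
    cat_totals = [sum(row[j] for row in band) for j in range(n_cats)]  # k_.j
    total = 0.0
    for t, row in enumerate(band):
        for j, b in enumerate(row):
            if b:  # empty cells contribute nothing
                total += b * b / (group_sizes[t] * cat_totals[j])
    return total / q - 1.0

def upper_limit(chi2_critical, n_units, n_covariates):
    """Upper end of the interval of plausible GI values under balance."""
    return chi2_critical / (n_units * n_covariates)

def mic(gi, n_categories, n_covariates):
    """MIC = 1 - I_W / I_T = I_B / I_T, assuming I_T = J / Q - 1."""
    total_inertia = n_categories / n_covariates - 1.0
    return gi / total_inertia

# Hypothetical data: Q = 2 binary covariates, T = 2 groups of 50 units.
band = [[30, 20, 25, 25],
        [20, 30, 25, 25]]
gi = global_imbalance(band, n_covariates=2)
# 7.815 is the tabulated 0.95 quantile of chi-square with (2-1)(4-1) = 3 df.
limit = upper_limit(7.815, n_units=100, n_covariates=2)
print(round(gi, 4), round(limit, 4), gi < limit)
```

In this toy case GI is about 0.02, below the limit of about 0.039, so the group would be deemed balanced.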
In its simplicity, the GI measure is very similar to the measure proposed by Iacus et al. (2011). At the same time, it is more exhaustive than the $L_1$ measure, since it considers the variability of a global space, its decomposition into between and within variability, and the asymptotic distribution function of $I_B$, which makes it possible to define an interval of plausible values for the Global Imbalance measure. In addition, the use of the Burt table and the Burt Band (for more details see D'Attoma and Camillo 2011) in the computation of the between-groups inertia is more exhaustive than cross-tabulating the discretized and categorical variables for the treated and control groups separately. The Burt table is the symmetric matrix of all two-way cross-tabulations between the categorical (nominal or ordinal) variables, and is analogous to the covariance matrix of continuous variables. It simultaneously displays the frequencies of category combinations for all variables. The Burt Band crosses the categories of the X variables with the levels of T. Furthermore, the between-groups inertia is computed with a more appropriate distance measure, the $\chi^2$ metric. The $\chi^2$ metric includes a coefficient that up-weights elements with low frequency and down-weights those with high frequency, by weighting each element by the inverse of its share of the total frequencies. With such a metric, the equilibrium between categories no longer requires attention during data pre-processing: it is no longer necessary to avoid categories with low frequency or variables with many categories.
The %GI macro produces as output a SAS dataset that reports, for each group (e.g., a PS bin, a cluster, a stratum): the group size (n); the number of units in treatment group 1 (n_t1), in treatment group 2 (n_t2) and, in general, in treatment group n (n_tn); the group identifier (id_clu); the Global Imbalance measure (GI); the upper limit of the interval of plausible values (CHI); the significance level used in the balancing test (alpha); the number of treatment levels in the entire dataset (multitreat); the MIC coefficient (MIC); the number of treatment levels in the specific subgroup (LEVELT); and the balance summary (Balance).
Balance equals yes if the group is balanced, no if the group is unbalanced, and no common support if the group contains units observed under only one treatment state, with no units under the other.
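The three possible values of Balance amount to a simple decision rule, sketched below in Python. The function name `balance_flag` is hypothetical and this is not the macro's code. The sketch assumes that common support fails whenever at least one treatment level has no units in the group, and that a GI value exactly equal to the upper limit counts as unbalanced.

```python
def balance_flag(gi, upper_limit, units_per_treatment):
    """Return the Balance summary for one group.

    units_per_treatment: counts such as (n_t1, n_t2, ...), one per
    treatment level of the entire dataset.
    """
    if any(n == 0 for n in units_per_treatment):
        return "no common support"   # some treatment level is absent
    return "yes" if gi < upper_limit else "no"

print(balance_flag(0.020, 0.039, (50, 50)))   # yes
print(balance_flag(0.080, 0.039, (50, 50)))   # no
print(balance_flag(0.000, 0.039, (12, 0)))    # no common support
```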

List of parameters in the macro
Based on the GI measure and its related test presented in Section 3, a SAS/IML (SAS Institute Inc. 2008) macro program was written to measure and test Global Imbalance. A complete list of the parameters in %GI is as follows:
%GI(library=, dsn=, out=, firstclu=, lastclu=, id=, group_var=, balance_var=, Q=, treat=, alpha=, multitreat=);
where
library: Name of the directory in which information is located.
dsn: Name of the SAS data set to be read. It must contain Q categorical covariates, the treatment indicator variable, the ID variable and the group membership variable. A group could be the result of any classification analysis conducted separately before running %GI.
out: Name of the SAS output data set.
firstclu: Number of the first group to analyze. It is a numeric value.
lastclu: Number of the last group to analyze. It is a numeric value.
id: ID variable.
group_var: Name of the variable that denotes the group membership.
balance_var: Includes the name(s) of the baseline categorical variable(s) to be balance checked. The name(s) may be listed in any order and separated by blanks. The variable(s) must be numeric. No missing values are allowed.
Q: Number of categorical variables on which imbalance is simultaneously checked.
treat: Name of the treatment indicator variable. The variable must be numeric.
alpha: Significance level to be used in testing GI.
multitreat: Denotes the number of treatment levels.
The macro computes the GI measure for each group using the SAS/IML language. To this end, it first counts the treatment and control units in each group, and then creates a disjunctive table for each group. In particular, to compute the GI measure the following matrices are created and used within the IML procedure: Z, which contains the Q baseline categorical covariates in disjunctive form; L, which contains the indicators of the t treatment levels; B, the Burt table, which is the result of the inner product of the indicator matrix Z with itself; and the Band matrix, a contingency table that crosses the categories of the baseline categorical covariates with each treatment level. Before quitting, the %GI macro deletes the temporary datasets created during execution, to avoid clutter and errors in case the macro is invoked again.
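The matrices named above can be illustrated with a small Python sketch. This is not the macro's SAS/IML code: the helper names `indicator_matrix` and `cross_product` and the toy data are hypothetical, and the sketch assumes the Burt table is Z'Z and the Band matrix is L'Z.

```python
def indicator_matrix(columns):
    """Disjunctive (0/1) coding: one column per category of each variable.

    columns: list of variables, each a list of category codes per unit.
    Returns the indicator rows and a (variable index, level) label per column.
    """
    n = len(columns[0])
    rows = [[] for _ in range(n)]
    labels = []
    for v, col in enumerate(columns):
        for level in sorted(set(col)):
            labels.append((v, level))
            for i in range(n):
                rows[i].append(1 if col[i] == level else 0)
    return rows, labels

def cross_product(a, b):
    """Return a'b for two row-major matrices with the same number of rows."""
    return [[sum(ra[i] * rb[j] for ra, rb in zip(a, b))
             for j in range(len(b[0]))]
            for i in range(len(a[0]))]

# Hypothetical group of five units, two covariates, binary treatment.
x1 = [1, 1, 2, 2, 1]
x2 = [1, 2, 2, 2, 1]
t = [1, 1, 2, 2, 2]

Z, z_labels = indicator_matrix([x1, x2])   # covariates in disjunctive form
L, l_labels = indicator_matrix([t])        # treatment level indicators
burt = cross_product(Z, Z)                 # all two-way cross-tabulations
band = cross_product(L, Z)                 # treatment levels x categories

for row in band:
    print(row)
```

Each row of `band` is one treatment level, and each row sums to Q times the number of units at that level, which is the structure the GI computation relies on.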

Examples
This section works through the use of the %GI macro with two artificial examples: one considering a binary treatment case and another a multi-treatment environment. Raw data as well as code for performing both example analyses are provided with this article. In the binary treatment case, results are compared to the $L_1$ distance measure of Iacus et al. (2011), the $I_n$ test statistic of Li et al. (2009) and the Hansen and Bowers test. In the multi-treatment case, for the sake of brevity, only the GI index is considered.

The dataset
The present example measures the effect of a binary treatment T on an outcome of interest by subgroups. Assume that random assignment to treatment conditions is not feasible. The aim is to show how the macro checks balance. Consider a dataset containing 1775 instances and five baseline categorical variables with different numbers of levels: $X_1$ with two levels, $X_2$ with two levels, $X_3$ with three levels, $X_4$ with two levels and $X_5$ with two levels. All possible combinations of covariates (2 × 2 × 3 × 2 × 2 = 48 cells) are considered. Units within each of those 48 cells are allocated to a treatment level on the basis of different proportions (π), in order to create dependence between X and T (see Table 9 in Appendix A).
[Table 1 survives only in fragments: 19.18; $X_1 = 1$, $X_3 = 2$; $6X_1 + 3.3X_2 + 4.1X_3 + 0.31X_4 + 8X_5$; $= Y(1) + 29.24$; −29.24.]
Assume that all covariates involved in the treatment assignment process are perfectly known and that no confounding variables exist. Assume also that this dataset is available before seeing any outcome. Suppose that the 1775 instances are classified by a cluster analysis on multiple correspondence analysis coordinates, and that 29 groups are chosen on the basis of a visual inspection of the resulting dendrogram. Before estimating any treatment effect, the GI is measured and tested in each subgroup. Since in real applications it may be less appropriate to expect treatment effects to be the same for all units than to consider treatment effect heterogeneity within subgroups, heterogeneous treatment effects are simulated. To this end, different potential outcomes Y(0) and Y(1) are generated for observations with different combinations of covariates (Table 1).
Then, the observed outcomes ($Y_{i,obs}$) are obtained as in the following equation:

$$Y_{i,obs} = T_i\, Y_i(1) + (1 - T_i)\, Y_i(0).$$

By design, different average treatment effects (ATE) exist: 19.18; −29.24; 2.41; 25.6. The effect of treatment is estimated as the difference in means between treatment and comparison cases. The dataset called example_binary contains the following information: the ID, the treatment indicator variable (t), the baseline covariates (X1, X2, X3, X4, X5), the group membership indicator (Cluster) and the observed continuous outcome (Y_obs). Assume that the data are in SAS format and are stored in the directory specified by the user. Finally, invoke the %GI macro.
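The construction of the observed outcome and the difference-in-means estimate can be sketched as follows. This Python sketch is illustrative only: the function names and the toy values (a constant effect of 5) are hypothetical, and the treatment indicator is assumed to be coded 1 for treated and 0 for comparison units, which may differ from the coding used in example_binary.

```python
def observed_outcome(t, y0, y1):
    """Y_obs = T * Y(1) + (1 - T) * Y(0), with T coded 0/1."""
    return [ti * a1 + (1 - ti) * a0 for ti, a0, a1 in zip(t, y0, y1)]

def difference_in_means(t, y_obs):
    """Mean observed outcome of treated units minus that of comparison units."""
    treated = [y for ti, y in zip(t, y_obs) if ti == 1]
    comparison = [y for ti, y in zip(t, y_obs) if ti == 0]
    return sum(treated) / len(treated) - sum(comparison) / len(comparison)

# Hypothetical balanced subgroup: Y(1) = Y(0) + 5 for every unit.
t = [1, 0, 1, 0]
y0 = [10, 10, 12, 12]
y1 = [15, 15, 17, 17]

y_obs = observed_outcome(t, y0, y1)
print(y_obs)                          # [15, 10, 17, 12]
print(difference_in_means(t, y_obs))  # 5.0
```

Because treated and comparison units share the same covariate-driven baseline in this toy group, the difference in means recovers the simulated effect, which is the behavior the balanced clusters are expected to show.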

The implementation
Use the %INCLUDE statement to indicate the location of the macro file, supply values for the arguments described in the previous section, and invoke the macro. The macro creates a new SAS file and saves it as balance_binary in the specified directory. This is accomplished with the following macro call in SAS:
%GI(library = work, dsn = example_binary, out = balance_binary,
    firstclu = 1, lastclu = 29, id = id, group_var = cluster,
    balance_var = X1 X2 X3 X4 X5, Q = 5, treat = t, alpha = 0.05,
    multitreat = 2)

The output
The output includes a dataset called balance_binary (Figure 1) that contains information about balance for each subgroup.
Specifically, it displays the number of units in the treatment group (n_t1), the number of units in the control group (n_t2), the group membership indicator (Id_clu), the GI measure (GI), the upper limit of the interval of plausible values (CHI), the MIC coefficient (MIC), the significance level (alpha), the number of treatment levels present in the entire dataset (multitreat) and the balance result (Balance). Balance equals yes if the GI measure is lower than the upper limit of the interval; otherwise, it equals no. Finally, Balance equals no common support when common support is lacking. As reported in Figure 1, only one group out of 29 is deemed unbalanced. In the remaining 28 balanced clusters the treatment effect of interest is computed. Table 2 reports the estimated effects with standard errors and shows that in 25 out of 28 clusters the simulated heterogeneous effects are exactly reproduced. The results support the conclusion that the estimated effects are unbiased where, according to the GI measure, baseline characteristics are exogenous to the treatment. This is confirmed by the percent bias reduction (Rubin 1973), which reaches its maximum in almost all balanced subgroups.

Comparison of results
Results in terms of the $L_1$ distance are obtained by running the imbalance function of the cem package. Finally, results for the Hansen and Bowers simultaneous balance test are obtained using the RItools package and the xBalance function (Bowers et al. 2011). First, balance on the overall dataset is assessed using the GI measure and its related test, and compared to the $L_1$ distance, the $I_n$ statistic and the Hansen and Bowers (H&B) simultaneous test. As Table 3 shows, all compared measures lead to the conclusion that balance does not hold in the entire dataset. When subgroups are considered, the GI measure and the $L_1$ distance give the same results (Table 4). As confirmed by the treatment effect estimates and the percent bias reduction reported in Table 2, both measures correctly assess balance. In contrast, the $I_n$ statistic confirms the GI and $L_1$ results in only 5 clusters out of 29. An important difference between the $I_n$ statistic and the other measures concerns the definition of common support. According to all measures except the $I_n$ statistic, common support holds if at least one observation is present in every treatment option. This makes it possible to measure balance in any case and lets practitioners decide how restrictive the definition of common support must be. We suppose that the $I_n$ statistic fails in checking balance because the data are not a mix of continuous and discrete/categorical variables. The Hansen and Bowers test also fails in detecting balance: it gives results different from those of GI and $L_1$ in 22 clusters out of 29. We conclude that only the GI measure and the $L_1$ distance correctly detect balance, and this conclusion is supported by the estimated effects, which are unbiased in almost all cases, as shown by the percent bias reduction reported in Table 2.

Multitreatment case
The dataset
In the present example, a treatment T with 3 levels is considered. Consider a dataset containing 15645 instances and five baseline categorical variables with different numbers of levels: $X_1$ with two levels, $X_2$ with two levels, $X_3$ with three levels, $X_4$ with two levels and $X_5$ with two levels. As in the binary case, all possible combinations of covariates (2 × 2 × 3 × 2 × 2 = 48 cells) are considered. Then, units within each of those 48 cells are allocated to a treatment level on the basis of different proportions (π), in order to create dependence between X and T (Table 9, Appendix A). As in the binary treatment example, this dataset is available before seeing any outcome.
After balance checking, in order to verify whether treatment effects are unbiased in balanced clusters, heterogeneous treatment effects are simulated. There are 3 treatments ($T = \{1, 2, 3\}$), so each subject has 3 potential outcomes, Y(1), Y(2) and Y(3). For each unit, the potential outcomes $Y_i(t)$ are generated from a model with a zero error term and a zero intercept. In particular, four different sets of parameters are generated in order to create heterogeneous treatment effects (Table 5). Although three potential outcomes exist, only the outcome under the assigned treatment can be observed. Following Feng, Zhou, Zou, Fan, and Li (2012), the observed outcome for each subject i, $Y_{i,obs}$, is computed as

$$Y_{i,obs} = \sum_{t=1}^{3} I(T_i = t)\, Y_i(t),$$

where $I(T_i = t)$ is the indicator of receiving treatment t and $Y_i(t)$ denotes the potential outcome of subject i if the subject has been assigned treatment t. Finally, the true ATEs are computed. If all the potential outcomes are observed, the ATE of treatment j versus treatment k, with $j \neq k$, is estimated by

$$ATE_{jk} = \frac{1}{n} \sum_{i=1}^{n} \left[ Y_i(j) - Y_i(k) \right].$$

As displayed in Table 6, by design, 3 true treatment effects exist for each simulated parameter set, and they are estimated as the comparison of means of potential outcomes.
Table 7: Balance in the overall data in the multi-treatment case.
Once the simulated data are ready, balance is checked on the overall simulated data. Since the overall data are unbalanced (Table 7), a subgroup analysis is performed using a cluster analysis on multiple correspondence analysis coordinates. On the basis of an examination of the dendrogram, 29 groups are retained. Before estimating the treatment effect of interest, the GI is measured and tested in each subgroup. To this end, assume that the data are in SAS format and are stored in the directory specified by the user. The dataset called example_multi contains the following information: the ID, the multi-treatment indicator (t), the baseline covariates ($x_1$, $x_2$, $x_3$, $x_4$, $x_5$), the group membership indicator (cluster) and the observed continuous outcome ($Y_{obs}$). Finally, invoke the %GI macro.
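The observed-outcome rule and the pairwise true ATEs of the multi-treatment simulation can be sketched in Python as follows. The function names and the numeric values are hypothetical and are not taken from Table 5 or Table 6. The sketch assumes that each unit's potential outcomes are stored in a dictionary keyed by treatment level, and that the true ATE of j versus k is the mean of $Y_i(j) - Y_i(k)$ over all units.

```python
def observed_outcome_multi(assigned, potential):
    """Y_obs = sum over t of I(T_i = t) * Y_i(t).

    assigned: treatment level received by each unit.
    potential: one dict per unit mapping treatment level -> Y_i(t).
    """
    return [sum((1 if level == t else 0) * y for level, y in po.items())
            for t, po in zip(assigned, potential)]

def true_ate(potential, j, k):
    """Mean of Y_i(j) - Y_i(k), using all potential outcomes."""
    return sum(po[j] - po[k] for po in potential) / len(potential)

# Hypothetical potential outcomes for two units under T = {1, 2, 3}.
potential = [{1: 2.0, 2: 5.0, 3: 9.0},
             {1: 4.0, 2: 5.0, 3: 7.0}]
assigned = [2, 3]

print(observed_outcome_multi(assigned, potential))  # [5.0, 7.0]
print(true_ate(potential, 2, 1))                    # 2.0
print(true_ate(potential, 3, 1))                    # 5.0
print(true_ate(potential, 3, 2))                    # 3.0
```

With three treatment levels there are three pairwise contrasts, which corresponds to the three true effects per parameter set mentioned in the text.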

The implementation
Use the %INCLUDE statement to indicate the location of the macro file, supply values for the arguments described in the previous section, and invoke the macro. The macro creates a new SAS file and saves it as balance_multi in the specified directory. This is accomplished with the following macro call in SAS:
%GI(library = work, dsn = example_multi, out = balance_multi,
    id = id, firstclu = 1, lastclu = 29, group_var = cluster,
    balance_var = X1 X2 X3 X4 X5, Q = 5, treat = t, alpha = 0.05,
    multitreat = 3)

The output
The output includes a dataset called balance_multi (Figure 2). It displays the number of units in treatment group 1 (n_t1), the number of units in treatment group 2 (n_t2), the number of units in treatment group 3 (n_t3), the group size (n), the group membership indicator (Id_clu), the GI measure (GI), the upper limit of the interval of plausible values (CHI), the significance level (alpha), the MIC coefficient (MIC), the number of treatment levels present in the entire dataset (multitreat), the number of treatment levels present in the specific subgroup (LEVELT) and the balance result (Balance). Balance equals yes if the GI measure is lower than the upper limit of the interval; otherwise, it equals no. Finally, Balance equals no common support when common support is lacking [5]. As reported in Figure 2, only 3 groups out of 29 are deemed unbalanced. In the remaining clusters the treatment effects of interest are computed. Table 8 reports the estimated effects with simultaneous confidence limits in brackets and shows that, on average, the bias is reduced by around 60%.
We acknowledge that this result is not as good as that obtained in the binary case (Table 2). It might be due to the increased number of combinations of treatment levels and covariates. At the same time, we consider the result satisfactory when compared to the bias reduction obtained with a PS subclassification analysis [6] (Figure 3), where, on average, bias is reduced by around 30% when the propensity score is forced to be split into 29 bins. Figure 3 shows that in most splitting cases bias is increased rather than reduced; this might be because propensity score subclassification in the present multi-treatment example is not able to balance the pre-treatment covariates.
[5] For the multiple treatment case, the common support set is in general determined by the minima of the maxima and the maxima of the minima of the participation probabilities for the various treatment options (Frölich, Heshmati, and Lechner 2004).
[6] The propensity score is estimated using a generalized multinomial logit with PROC CATMOD in SAS, using all five simulated variables X1, X2, X3, X4, X5 as independent variables.

Concluding remarks
The %GI macro enables one to measure and test the balance of categorical, ordinal and (discretized) continuous covariates according to the GI measure and its related test introduced in D'Attoma and Camillo (2011). The macro is illustrated using two artificial examples, showing that it works in both binary and multi-treatment environments. The macro encourages analysts to check balance globally rather than performing variable-by-variable tests, which do not consider interactions among baseline covariates. Compared to other measures, in the binary case it detects balance as correctly as the $L_1$ distance, and this correctness is supported by the percent bias reduction, which reaches its maximum in most of the examined clusters. The $I_n$ test statistic and the Hansen and Bowers test fail to detect balance correctly, probably for two main reasons. First, the nature of the covariates used in the present examples might not be appropriate for these two tests. Second, the two tests might be influenced by the sample size of the groups being compared. Furthermore, the $I_n$ test statistic is designed to work with mixed categorical and continuous data. We think it might not work when data are not mixed but exclusively categorical or continuous, as in the examples presented here.
As for the multi-treatment case, only the GI measure is considered. In terms of percent bias reduction its performance is worse than in the binary case, which might be due to the increased number of combinations of treatment levels and covariates. It was not our intent here to prove the theoretical superiority of the GI measure over the other measures examined; instead, we provide a brief introduction to the concept of GI and a simple illustration of its computation using the proposed %GI SAS macro. Our main goal has been to show how the macro works in both the binary and the multi-treatment case. Comparing the GI measure and the %GI macro to the other measures and their related tools, we learn that not all examined tools work with all kinds of covariates and not all tools produce the same results. The limitation of our proposed measure is that it does not work directly with continuous covariates, which must first be discretized; we do not recommend a particular discretization method. In sum, the macro makes it easy to check balance by subgroups on which binary or multiple treatment effects of interest are to be estimated under non-experimental conditions, where a subgroup could be the result of any classification analysis or a bin of a PS subclassification (Dehejia and Wahba 2002). In doing so, the multivariate structure of the data is taken into account. The main strength of the %GI macro is that it can handle complex problems because, especially from a data mining perspective, it is not limited by the number of variables and observations. This paves the way for business-oriented applications (e.g., marketing or redemption campaigns) that might need continuous monitoring. The simplicity of the %GI macro makes it easy to monitor the effect of any kind of private or public policy by subgroups, even continuously, and as such might change the concept of evaluation, which can be considered not only a one-time action but a process.