SAS Macro BDM for Fitting the Dale Regression Model to Bivariate Ordinal Response Data

A SAS macro for ﬁtting an extension of the Dale (1986) regression model to bivariate ordinal data is provided. The macro is described in detail and examples from Dale (1986) and McMillan, Hanson, Bedrick, and Lapham (2005) are discussed.


Motivating example and objectives
The BDM SAS macro was developed to fit the Bivariate Dale Model (BDM) to bivariate ordinal level data. The motivating problem was modeling alcohol use frequency and quantity. One data collection instrument that alcohol researchers and clinicians generally rely on is called the Quantity-Frequency survey. Quantity-frequency surveys usually consist of two items that query the respondent on average frequency of drinking events and the average quantity consumed per such event. The Alcohol Use Disorders Identification Test (AUDIT), for example, asks, "On average, how often do you drink alcohol?", which corresponds to the frequency measure. The respondent is then asked, "On days that you drink, on average how much alcohol do you consume?", which corresponds to the quantity measure. Response levels for the frequency and quantity measures used in the AUDIT questionnaire are shown in Table 1.
An appropriate statistical methodology is necessary for quantifying risk factors of alcohol consumption pattern and intervention effect sizes using Quantity-Frequency survey data, which have certain particular features that must be addressed. First, the frequency item includes the response "I Never Drink" for individuals who are teetotalers. This frequency level makes the quantity measure (drinks per drinking occasion) meaningless since one cannot specify the number of drinks per drinking occasion if one never has any drinking occasions! If abstainers are in the sample considered, then a complete analysis requires a two-part analysis of (a) the event at which one drinks at all, and (b) alcohol consumption pattern given that one is not an

Frequency of alcohol use
Quantity of alcohol use On average, how often do you On days that you drink, on average drink alcohol? how much alcohol do you consume? (Never) 1-2 drinks Monthly or less 3-4 drinks 2-4 times per month 5-6 drinks 2-3 times per week 7-9 drinks 4 or more times per week 10 drinks or more abstainer. This is not a particularly complex problem and is analogous to two-part analyses commonly used in medical-cost analysis (Lachenbruch (2001)).
Second, quantity-frequency measures are usually expressed on an ordinal scale. There is no interval or ratio scale difference between "Monthly" and "2-3 times per week" on the AUDIT frequency of consumption items except to say that the former is less frequent than the latter. Alcohol quantity-frequency modeling therefore requires methods suitable for ordinal data. Finally, clinicians and alcohol epidemiologists agree that alcohol frequency and quantity of consumption are not independent behaviors, and the degree of association between quantity and frequency likely varies among sub-populations (Makela (1996)). For example, alcoholics drink frequently to mitigate the somatic and psychological effects of alcohol withdrawal, but physiological tolerance, which increases the rate of alcohol metabolism and reduces the intoxicating effect of individual doses, requires increased quantities of alcohol per consumption event to obtain the desired "high." Alcohol quantity-frequency modeling requires statistical methods for the bivariate ordinal nature of the quantity-frequency data, while also allowing for sub-population variability in the degree of association between quantity and frequency of alcohol consumption.
We suggest using the Bivariate Dale Model (Dale (1984), Dale (1985), Dale (1986)) to model the quantity and frequency of alcohol consumption and to estimate risk factors for alcohol consumption patterns. The BDM allows us to model the joint distribution of the quantity and frequency of alcohol consumption when recorded on an ordinal scale. BDM parameter estimates are expressed in log-odds ratios and are interpreted in exactly the same manner as ordinal logistic regression results. The BDM also allows one to infer the correlation between quantity and frequency of alcohol consumption, and to model variation in this association as a function of covariates. While the motivating example pertains to alcohol use, the BDM macro can be used to fit the Bivariate Dale Model to any data set containing bivariate ordinal response data.

Statistical development
The model is comprised of two components. Let d be the binary indicator that an individual drinks (d = 1 denotes drinking, and d = 0 denotes abstinence) and, conditional on the event that a person drinks (i.e., d = 1), let f and q denote the frequency and quantity that an individual drinks on the r and c level ordinal scales. For example, r = 4 and c = 5 in the AUDIT example shown in Table 1. We model the probability that an individual with covariate vector x drinks at all using logistic regression as β 1 is a vector of regression coefficients for the log odds that a person with covariate combination x drinks, and is interpreted in the usual way (Collett (1991)). Conditional on d = 1 we model the bivariate ordinal vector (f, q) using a BDM (Dale (1986), Molenberghs and Lesaffre (1994)). The marginal probabilities P (f ≤ i), i = 1, . . . , r − 1, and P (q ≤ j), j = 1, . . . , c − 1, are modeled using ordinal logistic regression: Note that x defines covariates applicable to each of these models, but does not necessarily overlap among models. For example, age might be important in predicting the probability that one drinks or frequency of drinking given that one drinks, but might not be included in the model of the quantity that one drinks per drinking occasion. Each β 2,j parameter expresses the log-odds of drinking at or below frequency level i for an individual with covariate value x j relative to one with covariate value x j − 1. A similar interpretation holds for the β 3 parameters, but with respect to quantity consumed. The {θ f,1 , . . . , θ f,r−1 ; θ q,1 , . . . , θ q,c−1 } terms are intercepts expressing the log odds of drinking at or below frequency level i or quantity level j. In this particular parameterization of the ordinal logit model, the intercept terms increase from the lowest to the highest levels on each ordinal scale. Note that the highest level category does not have an intercept term. In particular, P (f ≤ r) = P (q ≤ c) = 1, thus the highest levels need not be considered in the specification of the model. The interpretation of regression coefficients for the ordinal logistic model appears in most introductory texts on categorical data analysis (e.g. Everitt (1994), Agresti (2002)).
Possible dependence between f and q is modeled using a global cross-ratio (GCR) model (Dale (1986), Molenberghs and Lesaffre (1994)). The GCR is a useful measure of association for contingency tables in which the row and column responses are ordered variables with greater than two levels each (Dale, 1984). The GCR is defined for a pair of "cutpoints" (i, j) on the quantity and frequency scales (e.g. Figure 1, Lesaffre and Molenberghs (1991)). Cutpoints refer to particular levels on the quantity and frequency scales about which the level of association between the two is measured. The GCR is equal to the cross-product ratio for a table dichotomized at cutpoints (i, j). This is the odds ratio of cumulative quantity and frequency levels, and is interpreted as the ratio of the odds of f being at or under category i given q is at or under category j to the odds of f being at or under category i given q is greater than category j. Details of the GCR, and its relation to other measures of association are in Dale (1984). We assume the log-linear model The GCR is modeled as a function of the frequency and quantity cutpoints (i, j), as well as covariate vector x. Note that ψ r,c is undefined, and that a α r−1 = γ c−1 = 0 for modeling purposes so that log(ψ r−1,c−1 ) = ∆ + x β 4 . Thus, there are r − 2 {α i } terms and c − 2 {γ j } terms in the model. When ψ ij does not depend on cutpoints (i, j), then the constant GCR model ( = e ∆+x β 4 ) obtains for any given covariate vector x over the entire table (Dale, 1986).
The contribution of an individual who abstains from drinking to the overall likelihood is simply p(d|β 1 ). The contribution from an individual who drinks is given by the product of mass functions p(d|β 1 )p(f, q|d = 1, τ ), where τ is the vector of parameters associated with the BDM. For the motivating example τ is .
p(f, q|d = 1, τ ) has been defined (Dale (1986), Molenberghs and Lesaffre (1994)). The full likelihood function factors into two separate functions of β 1 and τ , the former based on all subjects and the latter based on only those subjects who drink. Because the likelihood factors, the two separate models (logistic for whether someone drinks and BDM to model the alcohol consumption pattern in those that drink) are fit to the data, but the resulting inferences can be interpreted simultaneously. We can perform a formal statistical test of the null hypothesis of no association between quantity and frequency of alcohol consumption. If the null hypothesis of no association is rejected, then results of simple univariate ordinal regression analysis that assume independence between the quantity and frequency of alcohol use should be regarded with skepticism. The log-likelihoods of independent ordinal logit models fit individually to the quantity and frequency measures are additive. The drop in deviance from the sum of these independent models to the GCR model has a chi-square distribution with k + 1 degrees of freedom, where k is the number of predictors in the GCR model described above. A statistically significant drop in the deviance after incorporating the GCR into the analysis indicates that the association between the quantity and frequency of alcohol consumption is important and should be considered during statistical analysis.

Code description
The BDM macro models P (d = 1) using PROC LOGISTIC, and bivariate ordinal responses are modeled using the PROC NLMIXED procedure in SAS Version 8.2. We chose to write the program using SAS/STAT procedures as they are widely used by epidemiologists and biostatisticians. SAS/IML or PROC NLP in the SAS/OR module could accomplish the same thing given a reasonable amount of programming experience. The BDM macro allows one to control the predictor set for each part of the model, including the cutpoint effects for the GCR model. Predictors can be on either quantitative or categorical scales, but categorical predictors must be coded as dummy variates. Procedure output includes parameter estimates, standard errors, confidence intervals, and p-values for the hypothesis test of no difference from zero. The drop in deviance from the marginal-only models to the model including the GCR portion of the model is also shown to test the improvement in fit when the association between ordinal outcomes is included in the model. The macro generates a table of observed and predicted counts for each covariate combination included in the model, and outputs raw and Pearson residuals for model criticism.
Output datasets include: • BDM_BIN_EST = Covariance matrix for the Bernoulli part (if applicable).
• BDM_BIN_PARMEST = Parameter estimates for the Bernoulli part (if applicable).
• BDM_SPECS = Model specification for the BDM.
• BDM_FITS = Fit statistics for the BDM.
• BDM_FINAL = Observed and predicted counts, with raw and Pearson residuals.

Data formatting
Each row of the input dataset must pertain to one level of the row response, one level of the column response, and a unique covariate combination within the row-column combinations. Levels of each dependent variable are ordered from 1 to #, with 1 being the lowest and # levels being the highest. The data may be in summarized or unsummarized from. It is advisable to limit the variable name lengths to 8 characters or less, and variable names should not end with digits. This causes instability the algorithm. For example, Dale (1986) provides the following data These data needs to be modified to appear as follows:

Inputs
The macro is called as: Inputs are defined as follows.
• dat = Name of input dataset. See below for data formatting.
• Cond = Name of outcome measure for Bernoulli portion of two-part model. This is left blank for standard BDM model fitting.
• Condpred = Predictors used for the Bernoulli piece, separated by blanks.
• Rowvar = Name of the ordinal response for the row variable.
• Colvar = Name of the ordinal response for the column variable.
• Ct = Name of the cell count variable.
• Rowpred = Predictors used for the row variable marginal model, separated by blanks.
Leaving this blank will only model the Row response with intercept terms.
• Colpred = Predictors used for the column variable marginal model, separated by blanks.
Leaving this blank will only model the Column response with intercept terms.
• Gcrpred = Predictors used for the GCR model.
• Gcrrow = 'row' indicates the GCR model will contain row level terms. Leaving this blank will omit the row level in the GCR model.
• Gcrcol = 'col' indicates the GCR model will contain column level terms. Leaving this blank will omit the column level in the GCR model.

Example 1
This example is a replication of an analysis in Dale (1986). In this example, Dale fits the BDM to self-reported pain level and medication requirements, each of which is measured on an ordinal scale. The data are shown in Table 1, above. The following SAS code reads the table, formats it for use with the BDM, and calls the BDM macro to fit the BDM. The fitted model includes no predictors on either ordinal response scales, but has operation type 'VH' in the GCR model, along with terms that vary the association between pain level and medication requirement over levels of the medication requirements. Data and code for this example are included with the BDM macro.
[2] Convergence status of the BDM model.
[3] Results of the test of the null hypothesis of no association between the ordinal responses after model adjustment. The drop in deviance that occurs after adding the GCR portion of the model is highly statistically significant, indicating a marked improvement in model fit once the association between pain level and medication requirements is built into the analysis.
[4] Parameter estimates. These are interpreted in the usual way for ordinal logit models. Parameters relevant for each response variable are given that variable name as a prefix. For example, the medication requirement intercepts are given the prefix MED. The int suffix identifies the intercept terms. The GCR prefix denotes parameter estimates for the GCR portion of the model. Detailed interpretation of the parameter estimates is given in Dale (1986).  Table 2: DWI offender alcohol quantity-frequency data by age, gender, and history of physical/sexual abuse.
[5] Table of observed and predicted cell counts for each covariate combination and response level. None of the Pearson's residuals exceed 2, indicating a reasonable model fit to the data.

Example 2
This example, from McMillan et al. (2005), concerns the relationship between the consumption of beer and physical or sexual abuse among DWI offenders. These data include non-beer drinkers, demonstrating the two-part model described in the prequel. See details therein for detailed description of the study sample. The original data is shown in Table 2.
Note that the data are relatively sparse for certain covariate combinations. This poses no problem to the modeling framework proposed here, although complex three-way interaction effects (e.g. age by gender by smoking status) might not be estimable. Such conditions are familiar to epidemiologists working with data in which the outcome measure is strongly separated by regions in the predictor space. This phenomenon is referred to as"quasi/complete separation" and results in infinite maximum likelihood estimates. Also note that the data appear to be concentrated on the diagonals or upper right areas of each table. This strongly suggests a high degree of association between the quantity and frequency variables. The quantity and frequency variables were re-ordered so that 4 corresponds to the lowest quantity (1 beer per occasion) and lowest frequency (up to 1-2 times per month) categories. This reordering is necessary so that the regression coefficients are expressed in log-odds of drinking at or above each quantity or frequency level. Positive coefficients therefore express greater risk of drinking more frequently or with greater quantity, which is more easily understood by alcohol researchers. The macro code is called up to fit the BDM. The output is identical to that of the first analysis, with the addition of the logistic regression results for the probability that one drinks. Covariate combinations with Pearson's residuals larger than about 3 are poorly fit and might indicate some degree of model inadequacy. In this result, old men who have suffered physical or sexual abuse as a child have substantially lower predicted probabilities of being in the highest quantity and frequency group (observed count = 14, expected = 4.4). Also, younger men who have suffered physical or sexual abuse as a child are more common that expected in the highest quantity / lowest frequency levels of the responses. These subjects are not well fit by the model.