%lrasch_mml : A SAS Macro for Marginal Maximum Likelihood Estimation in Longitudinal Polytomous Rasch Models

Item response theory models are often applied when a number items are used to measure a unidimensional latent variable. Originally proposed and used within educational research, they are also used when focus is on physical functioning or psychological well-being. Modern applications often need more general models, typically models for multidimensional latent variables or longitudinal models for repeated measurements. This paper describes a SAS macro that ﬁts two-dimensional polytomous Rasch models using a speciﬁ-cation of the model that is suﬃciently ﬂexible to accommodate longitudinal Rasch models. The macro estimates item parameters using marginal maximum likelihood estimation. A graphical presentation of item characteristic curves is included.


Introduction
Item response theory (IRT) models were developed to describe probabilistic relationships between correct responses to a set of test items and continuous latent traits (Linden and Hambleton 1997). IRT models were originally developed and used in educational testing, where the models describe how the probability of a correct answer to an item in a test depends on ability, but they are applicable whenever location of persons and items on an underlying latent scale is of interest. Traditional applications in education often use dichotomous (correct/incorrect) item scoring, but polytomous items are common in other applications. The use of IRT models in new research fields increases the need for implementation in standard statistical software like SAS (SAS Institute Inc. 2013) or SPSS (IBM Corporation 2015). Estimation in IRT models using SAS has been the topic of several research papers (Rijmen, The macro fits polytomous longitudinal Rasch models using marginal maximum likelihood (MML; Bock and Aitkin 1981;Thissen 1982;Zwinderman and Wollenberg 1990). It estimates item parameters and the parameters of a two-dimensional latent distribution and plots item characteristic curves. It is sufficiently flexible to model item parameter drift and response dependence across time points. Recently IRT models have been used increasingly in health status measurement and evaluation of patient reported outcomes (PROs) like physical functioning and psychological well-being (Reeve et al. 2007). E.g., the simplest IRT model, the Rasch (1960) model (Fischer and Molenaar 1995;Christensen, Kreiner, and Mesbah 2013), is increasingly used for validation of measurement instruments (Tennant and Conaghan 2007) and has been shown to be superior to classical approaches (Blanchin et al. 2011). The macro is illustrated using a data set from a longitudinal study of health related quality of life (HRQoL) in non-melanoma skin cancer patients.
It is of considerable importance that model assumptions are checked before the results of a statistical analysis are reported. This is especially important for IRT models where the model requirements are a mathematical formulation of measurement requirements. Thus, existing methodology should be used to make sure that the Rasch model fits at each time point. This can be done using the fit statistics implemented in proprietary software packages like RUMM (Andrich, Sheridan, and Luo 2010) and WINSTEPS (Linacre 2011) or in free stand-alone software like DIGRAM (Kreiner 2003). In SAS the macros %AnaQol (Hardouin and Mesbah 2007), GLIMMIX_Rasch (Chen, Li, and Kromrey 2013) and %rasch_mml ) that report infit and outfit test statistics can be used.
Beyond the requirement of fit of the Rasch model at each time point special requirements arise from the use of the Rasch model for longitudinal data. Two of these requirements, item parameter drift and local dependence across time points, can be tested in a likelihood frame work using the SAS macro %lrasch_mml.

Item parameter drift
To measure trends over time is important in many areas, also for latent variables (Goldstein 1983), but it is important that the item parameters do not change over time (Miller and Fitzpatrick 2009;Chan, Drasgow, and Sawin 1999;Chan et al. 1999;Wells, Subkoviak, and Serlin 2002). The macro %lrasch_mml can be used to test this assumption since models where one or more items do not stay constant over time can be fitted. Different methods for detection of item parameter drift exist (Donoghue and Isham 1998;DeMars 2004;Galdin and Laurencelle 2010). The macro %lrasch_mml makes it possible to test the assumption of item parameter invariance using likelihood ratio tests.

Local dependence across time points
Local dependence has been extensively studied in the psychometric literature, especially within the class of Rasch models (Kelderman 1984;Kreiner andChristensen 2004, 2007;Marais and Andrich 2008a,b). However in the context of longitudinal studies references are sparse. In a study of response dependence and measurement of change, Marais (2009) found that when the assumption of local independence across time is violated it can lead to incorrect conclusions. A few tests of this assumption have been discussed . Andrich and Kreiner (2010) proposed a way of quantifying local dependence for two dichotomous items. Their method is based on splitting an item into two new ones according to the responses to another item. This method has been generalized to polytomous items and to other IRT models and can be used to overcome local response dependence across time points (Olsbjerg and Christensen 2014). The macro %lrasch_mml makes it possible to include splitted items and to test the assumption of local independence across time points using likelihood ratio tests.

The unidimensional polytomous Rasch model
IRT models can be seen as a mathematical formalization of the following measurement requirements: (i) items measure only one latent variable, (ii) expected item scores increase with the underlying latent variable, (iii) items are sufficiently different to avoid redundancy, and (iv) items function in the same way across sub-populations. Let θ denote the latent variable and X = (X i ) i∈I a vector of item responses. The requirements can be written as The requirements (i)-(iv) are referred to as unidimensionality, monotonicity, local independence and absence of differential item functioning (DIF), respectively. Fit of observed data to an IRT model thus implies that these requirements are met and evaluation of model fit is crucial. Assume that item i ∈ I has m i + 1 response categories represented by the numbers 0, . . . , m i and let the stochastic variable X i with realization x i denote the response. For items i ∈ I the polytomous Rasch model is given by probabilities ..,m i and η i0 = 0. An alternative parametrization can be obtained by replacing the vector of item parameters η i with the thresholds defined by β ik = −(η ik − η ik−1 ). These can be interpreted as the locations on the latent continuum where the probabilities of choosing adjacent categories intersect. Under the assumption (iii) of local independence the distribution of the vector X = (X i ) i∈I with realization x = (x i ) i∈I is given by the probabilities where K(θ) = i∈I K(θ, η i ) and with r = i∈Iv x i and K Iv (θ v ) = i∈Iv K i (θ v ). Restrictions are needed to ensure that the model in Equation 4 is identified since for all (θ, η i ) for θ * = θ − κ and η * i = (η ih + κh) h=1,...,m i . We can make sure that the model is identified by imposing a linear restriction on the β's or on the θ's.
Jointly estimating all parameters yields inconsistent estimates, since the number of parameters increases with the sample size (Neyman and Scott 1948). This can be overcome by assuming that θ is drawn from a normal distribution and maximizing the marginal log likelihood (Bock and Aitkin 1981;Thissen 1982;Zwinderman and Wollenberg 1990). The contribution for person v with item responses (x i ) i∈Iv , I v ⊂ I to the marginal likelihood is This model can be fitted with PROC NLMIXED in SAS using an adaptive Gaussian quadrature. If the mean of the normal distribution is fixed at zero all item parameters can be identified, and if a linear restriction is imposed on the item parameters the mean and variance of the normal distribution can be identified. Thus depending on the linear restriction imposed to ensure identifiability two different ways of parameterizing the model exist: The polytomous Rasch model was proposed by Andersen (1977). Masters (1982) called this model the partial credit model deriving probabilities in Equation 1 from the requirement that the conditional probabilities P(X i = k|X i ∈ {k − 1, k}; θ) fit a dichotomous Rasch model.

The two-dimensional polytomous Rasch model
Assume that the set of items can be split into two new, i.e., with items in I 1 and I 2 measuring latent variables θ 1 and θ 2 respectively. Two situations are common: (i) θ 1 and θ 2 are distinct, but correlated latent variables, (ii) θ 1 and θ 2 represent repeated measurements of the same latent variable. The latter is the focus in this paper. If the distribution of the items is as specified by the polytomous Rasch model and the assumption (iii) of local independence holds the vector X with realization x is distributed as follows where Again Neyman's factorization theorem shows that R 1 = i∈I 1 X i and R 2 = i∈I 2 X i are sufficient for θ 1 and θ 2 , respectively.
The contribution to the joint log likelihood for a person v with item responses (x i ) i∈Iv , I v ⊂ I is given by with K 1 (θ 1 ) = K Iv∩I 1 (θ 1 ) and K 2 (θ 2 ) = K Iv∩I 2 (θ 2 ). Again, jointly estimating all parameters does not provide consistent estimates and restrictions are needed in order to ensure that the model is identified. This can be done by placing a restriction on either the item parameters or the person parameters.

Estimation of item parameters
Assuming that (θ 1 , θ 2 ) is drawn from a two-dimensional normal distribution yields the two-dimensional marginal log likelihood function to which the contribution of a person v with item responses ( Two-dimensional Rasch models of this kind were originally discussed by Andersen (1985) and Embretson (1991), who both considered longitudinal measurement. Both models fit within the general framework for multidimensional Rasch models described by Adams and colleagues (Adams et al. 1997a;Adams, Wilson, and Wu 1997b).

Parameter restrictions
Restrictions are needed in order to ensure that the model is identified. Typically this is done by assuming that µ 1 = µ 2 = 0. For the special case of longitudinal data further restrictions are often imposed and other parametrizations are of interest. Consider the situation where the same set of items is used at two time points and let the response to item i at time t be denoted by X it . In the longitudinal Rasch model it is a natural restriction that the item parameters do not change over time, i.e., that for all items i When this restriction is imposed there is no need for the assumption that µ 1 = µ 2 = 0. In fact, for longitudinal data a model where the mean changes over time, but the item parameters are invariant as in Equation 11, is often preferable. The original formulations of the longitudinal Rasch model for dichotomous items (Andersen 1985;Embretson 1991) used this restriction, but a more general model where only a subset of the items are restricted to be equal would also be identified. Depending on the linear restriction imposed to ensure identifiability different ways of parameterizing the model exist. The macro %lrasch_mml uses The assumption that item parameters do not change over time can be relaxed. This is outlined in the following Section 3.3.

Relaxing assumptions
The longitudinal Rasch model formulated for the situation where θ 1 and θ 2 are (correlated) values of the same latent variable at two distinct time points and the two sets of items I 1 and I 2 are identical. This can be relaxed in two ways: by allowing item parameter drift (as outlined in Section 1.1) and by allowing local dependence across time points (as outlined in Section 1.2).
Item parameter drift can be allowed by specifying a model where the item sets I 1 and I 2 are overlapping, but not identical. To illustrate how this could be done, note that allowing an item i ∈ I to change over time can be specified using I 1 = (I\{i})∪{i 1 } and I 2 = (I\{i})∪{i 2 } where the item is modeled as two separate items, each one only administered at one of the time points.
Two similar items are highly correlated, and the correlation could be even higher than what the underlying latent variable accounts for. For this reason the requirement (iii) of local independence is formulated as a requirement of non-redundancy. In the longitudinal Rasch model this requirement is imposed at each time point, but an additional requirement of local dependence across time points is also imposed: Responses given to the same item at the two points should be independent given (θ 1 , θ 2 ). In the unidimensional Rasch model local dependence can be modeled using log linear Rasch models (Kelderman 1984;Kreiner and Christensen 2004), but also by "splitting" one item with respect to the observed values of another (Andrich and Kreiner 2010). In our implementation of the longitudinal Rasch model the latter of these approaches is used. A detailed account of how splitting an item at the second time point based on observed responses at the first time point is provided by Olsbjerg and Christensen (2014). Briefly, a model where local dependence across time points is modeled for an item i ∈ I can be formulated by defining m i + 1 time 2 items where . denotes a missing value. The extended model uses I 1 = I and Since missing values are easily included in the implemented MML framework the parameters of this model are readily estimated.

Implementation in SAS
The SAS macro %lrasch_mml uses PROC NLMIXED to estimate item parameters, and the parameters of the two-dimensional normal distribution. PROC NLMIXED fits nonlinear mixed models (Rijmen et al. 2003;Smits et al. 2003) and is very flexible because the conditional distribution given the random effects can be specified to be a general distribution using SAS programming statements. The NLMIXED procedure maximizes an approximation to the likelihood integrated over the random effects. Different integral approximations are available, the principal one being adaptive Gaussian quadrature.
The macro has two required input statements: • DATA specifying the input data set; • ITEM_NAMES identifying the names and response formats of the items.
The macro also has four optional input statements: • ANCHOR specifying the items that should be restricted to have the same item parameters at the two time points (default value ALL, specifying that all items have the same parameters, cf. Equation 11); • SPLIT specifying the items that should be split for local dependence, cf. Olsbjerg and Christensen (2014) (default value NONE, indicating that no items should be split); • ICC specifying whether or not the macro should plot item characteristic curves (default value NO). Note that these curves should rightly be referred to as categories probability curves.
• OUT specifying the prefix of output data sets generated by the macro (default value LRASCH).
The assumption that item parameters do not change over time can thus be relaxed. This makes it possible to test for item parameter drift, and to specify models where item parameter drift is present. The SAS macro %lrasch_mml creates three output data sets: • The data set OUT_logl contains the maximum value of the conditional log likelihood function and the Akaike information criterion (AIC; Akaike 1974), the Bayesian information criterion (BIC; Schwarz 1978) and the sample size corrected version of the AICC (Burnham and Anderson 2002).
• The data set OUT_thresholds contains the item parameter estimates (using the PCM parametrization), the standard errors of the item parameter estimates, and 95% confidence intervals.
• The data set OUT_poppar contains population parameter estimates, i.e., the change in mean µ, the standard deviations σ 1 and σ 2 and the latent correlation ρ.

Example: Longitudinal HRQoL data
Patient-reported outcomes (PROs) are used to capture patients' perception of a disease and its impact on daily living. SCQoL is a HRQoL questionnaire developed for non-melanoma skin cancer (NMSC; Vinding, Christensen, Esmann, Olesen, and Jemec 2013) that consists of 9 items scored on a standard 4-point Likert scale. The data used in this example were collected from patients with NMSC undergoing surgery at the Department of Plastic Surgery at Roskilde Hospital.
The analyses presented here can be performed using the SAS macro %lrasch_mml.sas and the supplementary files available along with this manuscript (or from the homepage http: //biostat.ku.dk/~kach/index.html#lrasch_mml): • the example data scqol.sas7bdat containing the responses to the SCQoL questionnaire, • the sample code v67c02.sas described in the following.
Participants responded to the skin cancer quality of life (SCQoL) questionnaire before the operation and 3 months after. A total of 101 patients responded at baseline, 14 patients did not respond at follow-up. The marginal distribution of item responses is computed using the statements and is shown in Table 1.
We note that the maximum response '3' is not observed for the item SC01 at baseline and for the items SC03 and SC06 at follow-up. In order to fit the longitudinal Rasch model to these data we have to specify the structure of the input data. We specify variable names and response formats by defining the ITEM_NAMES data set, writing data inames; input item_no item_names1 $ item_names2 $ item_text $ max1 max2; datalines; 1 b_SC01 fu_SC01 SC01 2 3 2 b_SC02 fu_SC02 SC02 3 3 3 b_SC03 fu_SC03 SC03 3 2 4 b_SC04 fu_SC04 SC04 3 3 5 b_SC05 fu_SC05 SC05 3 3 6 b_SC06 fu_SC06 SC06 3 2 7 b_SC07 fu_SC07 SC07 3 3 8 b_SC08 fu_SC08 SC08 3 3 9 b_SC09 fu_SC09 SC09 3 3 ; run; the ITEM_NAMES data set contains information about the items, specifically the variable item_names1 contains item names of time 1 items, the variable item_names2 contains item names of time 2 items, the variable item_text contains text for plots, and the variables max1 and max2 contain the maximum score for time 1 and time 2 items, respectively. Thus the code specifies that the item SC01 at baseline and the items SC03 and SC06 at follow-up are to be considered to have only three response options. In this situation it does not make sense to require invariance of items 1, 3 and 6.
We include the SAS macro (provided in the supplements) using the statement %include lrasch_mml.sas ; or alternatively including it directly from the web via FILENAME lrasch URL http://biostat.ku.dk/~kach/macro/lrasch_mml.sas ; %include lrasch; We fit the longitudinal Rasch model using the statement %lrasch_mml(DATA=data.scqol, ITEM_NAMES=inames, ANCHOR=b_SC02 b_SC04 b_SC05 b_SC07 b_SC08 b_SC09, SPLIT=NONE, OUT=SCQOL, ICC=YES); This statement specifies that the parameters of the items SCO02, SC04, SC05, SC07, SC08, and SC09 do not change over time. The log likelihood value −2 log L = 2948.2 is saved in the data set SCQOL_logl and printed in the output window.

scqol: LRASCH MML estimation loglikelihood
Descr Value -2 Log Likelihood 2948.2 AIC (smaller is better) 3022.2 AICC (smaller is better) 3023.9 BIC (smaller is better) 3118.9 Estimates of the item parameters are saved in the data set SCQOL_thresholds and are also printed in the output window.   The ICCs for selected items are shown in Figures 1 and 2.

Local dependence across time points
The macro %lrasch_mml makes it possible to split items and to test the assumption local independence across time points we thus specify %lrasch_mml(DATA=data.scqol, ITEM_NAMES=inames, ANCHOR=b_SC02 b_SC04 b_SC05 b_SC07 b_SC08 b_SC09, SPLIT=b_SC02, OUT=SP2); where the item SC02 is split. The log likelihood value is −2 log L = 2933. The insignificant result means that for the item SC02 there is no evidence of local dependence across time points. The observed correlation is explained solely by the latent variables θ 1 and θ 2 . Similar tests should be carried out for the other items in the scale. This can be done in steps: by first testing local dependence across time points for each item against the simple longitudinal Rasch model, and then evaluating the combined evidence. The flexibility of the proposed macro then makes it possible to add local dependence across time for a single item yielding an extended model. Local dependence across time points for the remaining items can then be tested against this extended model.

Discussion
The proprietary software packages RUMM  and WINSTEPS (Linacre 2011) for fitting Rasch models are widely used. All of these fit unidimensional models only, even though many applications deal with multidimensional or longitudinal data. The twodimensional Rasch model as originally discussed by Andersen (1985) and Embretson (1991) was formulated for longitudinal data. To obtain consistent item parameter estimates MML estimation (Bock and Aitkin 1981;Thissen 1982;Zwinderman and Wollenberg 1990) is used. This approach to item parameter estimation assumes that the latent variables are sampled from a population and introduces an assumption about the distribution of the latent variable.
A weakness of the proposed macro is that fit indices, like INFIT or OUTFIT, are not implemented. However, since these fit test statistics are implemented in the SAS macros for unidimensional Rasch models AnaQol (Hardouin and Mesbah 2007), GLIMMIX_Rasch (Chen et al. 2013), and %rasch_mml  testing the fit of the Rasch model at each time point is feasible. While tests of item fit in the longitudinal model are not implemented, some of the features of the longitudinal model structure can be tested using SAS macros for unidimensional Rasch models. Indeed, because local dependence across time points can be modeled using item splitting (as implemented using the SPLIT option), a SAS macro for unidimensional Rasch models that can accommodate missing data can be used to test the fit of locally dependent items in a longitudinal Rasch model. Another weakness of the proposed macro is that the implemented graphical display only shows curves of the expected categories probabilities, but that no comparison with observed data is possible. Again SAS macros for unidimensional Rasch models like %AnaQol (Hardouin and Mesbah 2007), %rasch_mml , and %rasch_cml (Christensen 2013) can be used for this.
The general implementation allows the user to specify models where the item parameters do change over time. The macro can be used to test the assumption of item parameter invariance using likelihood ratio tests, thus adding to existing methods for detection of item parameter drift (Donoghue and Isham 1998;DeMars 2004;Galdin and Laurencelle 2010). The macro also makes it possible to study local dependence across time points, by splitting of the item at follow-up into new items according to the responses given at baseline (Olsbjerg and Christensen 2014). The macro %lrasch_mml makes it possible to include splitted items and to test the assumption of local independence across time points using likelihood ratio tests, thus adding to existing tests of this assumption .