The R Package CDM for Cognitive Diagnosis Models

This paper introduces the R package CDM for cognitive diagnosis models (CDMs). The package implements parameter estimation procedures for two general CDM frameworks, the generalized-deterministic input noisy-and-gate (G-DINA) and the general diagnostic model (GDM). It contains additional functions for analyzing data under these frameworks, like tools for simulating and plotting data, or for evaluating global model and item ﬁt. The paper describes the theoretical aspects of implemented CDM frameworks and it illustrates the usage of the package with empirical data of the common fraction subtraction test by Tatsuoka (1984).


Introduction
In recent years educational research was characterized by an increasing demand of complex information on students' achievement. This may be caused by a growing interest in explaining the results of international comparative studies like the trends in international science study (TIMSS; Mullis, Martin, Ruddock, O'Sullivan, Arora, and Erberer 2005), the progress in international reading study (PIRLS; Mullis, Martin, Kennedy, and Foy 2007) and the programme for international student assessment (PISA; OECD 2010). It may also be caused by a strong need to explain the social and ethnic disparities detected in these studies (e.g., Mullis et al. 2007).
At present, the preferred method to model students' responses in an achievement test are item response theory (IRT) models (e.g., van der Linden and Hambleton 1997). These models deliver a quantification of the items' difficulties and the respondents' abilities through real-valued parameters which are measured on one or more continuous factors. In order to obtain qualitative interpretations of the parameters, each of the continuous factors may be partitioned into discrete hierarchically ordered competence levels through cut points by means of a so-called standard setting procedure (Cizek, Bunch, and Konns 2004). A discrete classification of the students' abilities into these competence levels may then serve as a kind of feedback, for example, for evaluating if students fulfill educational standard norms (e.g., Martin, Mullis, and Kennedy 2007).
Another method allowing diagnostic conclusions about students' abilities is a cognitive diagnosis model (CDM; Rupp, Templin, and Henson 2010), which comprises two steps: In a first qualitative step of a CDM analysis, educational and didactic experts have to subdivide the tested ability into different basic abilities, which are termed skills. For the description of the skills and their connections to the tested ability the experts often rely on information given in so-called competence models. In these competence models the skills may be defined through components in the process of task solving, for example, finding a common denominator in fractional arithmetics (Tatsuoka 1984), or they represent coarsely defined abilities, for example, the six basic skills systems, classification, observation, measurement, prediction and data (Ackerman, Zhang, Henson, and Templin 2006;Rupp et al. 2010, p. 31f.). After defining the number and content of the skills, the experts expose in a so-called Q-matrix which skills are required to master each item (Tatsuoka 1983). Then, in a second quantitative step of the CDM analysis, the students are classified into dichotomous latent skill classes, which predict their presence or absence of the before defined skills. The main results are threefold: Firstly, the distribution of the skill classes allows for statements how many students in the test population possess certain combinations of skills. Secondly, the skill mastery probabilities include information about the percentage of students in the test population possessing the individual skills. Thirdly, for each individual student a skill class is deduced which is called the student's skill profile and which predicts the possession or nonpossession of the individual skills. Together all three issues provide a solid empirical base for targeted pedagogical interventions both on the level of the test population and on the individual student's level. In contrast to information obtained from the IRT framework, the CDM results may be interpreted directly in terms of competences the respondents possess. For further methodical differences between the CDM and the IRT models see for example Kunina-Habenicht, Rupp, and Wilhelm (2009).
CDMs have several connections to well-established areas in psychometrics: firstly, the theory of classification, where basic ideas can be found in the mastery model by Macready and Dayton (1977) and in restricted latent class models by Haertel (1989). Secondly, IRT with initial approaches in the multicomponent model by Whitely (1980) and in the linear logistic test model by Fischer (1973), and thirdly, in mathematical psychology, and here especially the field of knowledge space theory, see for example Doignon and Falmagne (1999). Based on the multitude of different approaches, CDMs have many names, as for example diagnostic classification models, cognitive psychometric models, or structured IRT models.
The CDM framework incorporates a huge variety of models which differ basically in three aspects: firstly, the combination in which students have to possess the skills for successfully mastering an item, that is, compensability: In some CDMs all assigned skills have to be possessed for mastering the items (noncompensatory models), in other CDMs just one of the assigned skills has to be possessed, and other CDMs require one of several special combinations of the assigned skills (compensatory models). Secondly, CDMs vary in their item dimensionality: CDMs in which exactly one skill is required for mastering each item are called CDMs with between item dimensionality, whereas CDMs requiring more than one skill per item have a within item dimensionality. Finally, CDMs differ in the way the stochastic component is introduced into the model, that is, students can slip or guess in items, in skills, or in both. In an achievement test each item may follow another CDM. The different CDM rules for the items are sometimes also called the items condensation rules or simply the item rules (DiBello, Roussos, and Stout 2006). Furthermore, CDM approaches may be partitioned into specific CDMs and generalized frameworks. The latter allow specifying many of the specific CDMs by constraining their parameters and defining their link functions. There are three common generalized CDM frameworks: the generalized-deterministic input noisy-and-gate (G-DINA; de la Torre 2011) model, the generalized diagnostic model (GDM; von Davier 2008) and the log-linear cognitive diagnosis model (LCDM; Henson, Templin, and Willse 2009).
The present work introduces and discusses the R (R Core Team 2016) package CDM (Robitzsch, Kiefer, George, and Ünlü 2016) for estimating and analyzing CDMs. The package contains routines for the estimation of the noncompensatory deterministic input noisy-andgate (DINA ;Haertel 1989;Junker and Sijtsma 2001) model, the compensatory deterministic input noisy-or-gate (DINO; Junker and Sijtsma 2001) model, and their generalized version, the G-DINA. These three model types and their parameter estimation process are reviewed in Sections 2.2, 2.3, and 2.4. Section 2.5 reviews the additional possibilities of the GDM and gives first prospects of its implementation. The LCDM framework is not exclusively implemented as it is very similar to one form of the G-DINA. Further implemented methods for the analysis of the before mentioned CDM frameworks are described in Section 2.6. In Section 3 the structure of the R package CDM is described, and Section 4 illustrates all models and methods with data of the fraction subtraction test by Tatsuoka (1984). Finally, Section 5 sheds light on parameter recovery and speed of the R package CDM.

Terminology and notation
Consider an achievement test in which I students respond to J items. A value of 1 indicates a correct item response and a value of 0 an incorrect one. The dichotomous manifest response of student i, i = 1, . . . , I, to item j, j = 1, . . . , J, is denoted by X ij . The empirical responses of all I students to all J items are given in a I × J binary data matrix X. The i-th row X i of X represents the responses of student i to all J items and is called the i-th student's response pattern.
Educational experts define K (K ≤ J) skills α k , k = 1, . . . , K, which students have to possess for mastering each of the J items under consideration. For each student i a latent dichotomous skill profile α i = [α i1 , . . . , α iK ] denotes his possession (α ik = 1) and nonpossession (α ik = 0) of the K predefined skills. Obviously, these individual skill profiles are unknown (it is one goal of the CDM analysis to estimate them). Educational experts also define which skills are required to master which item in a J × K matrix Q (Tatsuoka 1983): The (j, k)-th element q jk of Q equals 1 if skill k is relevant for the mastery of item j and equals 0 otherwise. Additionally, the experts have to specify the items' condensation rules (i.e., the specific CDM each item follows).
CDMs assume that the manifest response X ij of student i to item j arises as a result of his possessed skills i, the skills required for item j defined in the j-th row q j of the Q-matrix, and the j-th item's condensation rule. Because the skills α k are unknown, a CDM algorithm deduces from the manifest responses, the Q-matrix, and the condensation rules information on the K skills the student possesses.
This procedure comprises two steps: In a first step, all students are classified into skill classes α l , l = 1, . . . , L, satisfying a global optimization criterion. Note that the largest possible number of disjunctive skill classes is L = 2 K , arising from all combinations of the assumed K skills. From this first step the distribution of the skill class probabilities, that is, the relative frequencies P (α l ), l = 1, . . . , L, of students classified into the skill classes α l , is obtained. Further, we obtain the skill mastery probabilities P (α k ), k = 1, . . . , K, giving for each skill α k the relative frequency of students in possession of it. Both the distribution of the skill classes and the skill mastery probabilities allow for diagnostic statements about the test population, that is, which skills are possessed in which combination and frequency.
In a second step, the CDM algorithm deduces the most probable skill classes each individual i, i = 1, . . . , I, belongs to. The i-th student's vector of present and absent skills is also called the i-th student's skill profile and is denoted by α i = [α i1 , . . . , α iK ]. The skill profiles provide the basis for individual based diagnostic feedback, forming a solid empirical base for further instruction and learning.
Note that we have to distinguish between the skill classes α l , l = 1, . . . , L, in the population and the individual skill profiles α i , i = 1, . . . , I, both being K-length dichotomous vectors. The L possible skill classes cover all I skill profiles. For the ease of notation we use the same symbol "α". It will always become clear from the context whether α 1 refers to the first skill class or the first individual skill profile.

DINA and DINO
Because of its simplicity and its parsimony the DINA (Haertel 1989;Junker and Sijtsma 2001, function din in the R package) model is one of the most commonly used and most popular core CDMs. The DINA model's noncompensability asserts that students have to possess all skills assigned to an item for successfully mastering it. On the contrary, the compensatory DINO (Junker and Sijtsma 2001, function din) model requires students to possess at least one of the assigned skills for successfully mastering the respective item.
In both, DINA and DINO models, the i-th student's probability to master the j-th item involves two components: a deterministic and a probabilistic one. The former states whether the student is expected to master the j-th item given his possessed skills. In the DINA model the deterministic component is expressed through the dichotomous latent response The vector [q j1 , . . . , q jK ] again denotes the q-th row of the Q-matrix which indicates the skills required for the mastery of item j. A student who possesses all or even more than these required skills is expected to master the item and η ij = 1. Otherwise, for a student who is not expected to master the item η ij = 0. In the DINO model, the deterministic latent response of student i to item j equals 1 if the student possesses at least one of the required skills. The second, that is the probabilistic component, possible deviates from these expectations: On the one hand, if student i is expected to master the item, he nevertheless may slip and fail the item. On the other hand, even if the student is not expected to master the item, he may succeed by a lucky guess. The probabilities g j of guessing and s j of slipping are modeled as item specific parameters with The probability of student i to solve item j results from combining both components Hence, the probability P(X ij = 1|α i , g j , s j ) is modeled by only two values: All students who are not expected to master the item (i.e., η ij = 0) have the chance g j to solve the item by lucky guess, and all students who are expected to master the item (i.e., η ij = 1) have the chance 1 − s j to actually solve the item without slipping.

Parameter estimation in the DINA and DINO model
Parameter estimation of the DINA and DINO model is performed by means of marginal maximum likelihood (MML) estimation. A pertinent way to implement this method is the expectation-maximization (EM; Dempster, Laird, and Rubin 1977) algorithm (de la Torre 2009b). The EM algorithm iterates between an E-step and an M-step: In the E-step, expected counts for each item and each group are calculated. Then, the M-step updates the parameter estimates for the DINA or DINO model using maximization methods. Finally, the E-step and M- Step alternate until a previously set convergence criterion or a fixed maximum number of iterations is attained. Note that the process of parameter estimation is carried out in the same way if different items follow different condensation rules. Note also that for reasons of simplicity we present the parameter estimation process by the example of the DINA model (i.e., the DINO model, depending on the definition of the dichotomous latent response η ij ), but it may be extended to more complex CDMs in a straightforward manner (Xu and von Davier 2008a).
Up to this point, the probability P(X ij = 1|α i ) is interpreted as the probability of student i to master item j given his skills α i . This notion facilitates the interpretation and the understanding of the models. But the students' skill profiles are unknown and the goal is to estimate them. Thus, we should understand P(X ij = 1|α i ) as the probability of student i to master item j if he would be classified in skill class l = i, l = 1, . . . , L. This convention is used throughout the following steps.
It is assumed that the responses X i of student i to the items are independent conditional on α i (local independence assumption). Furthermore, it is assumed that examinees are mutually independent as well, because we expect them to represent a random sample of the population. Let δ = [g, s] be the vector of all item parameters with g = [g 1 , . . . , g J ] and s = [s 1 , . . . , s J ]. Furthermore let be the probability of response vector X i if student i possesses the skills of skill class α l , l = 1, . . . , L.
For estimating the DINA model, the marginal log-likelihood is maximized with respect to the item parameters δ and the parameters γ = [γ 1 , . . . , γ L ] describing the skill class distribution P(α l ), l = 1, . . . , L. In case of the estimation of an unrestricted skill space it holds L = L and γ l = P(α l ) for all l = 1, . . . , L. In case of an restricted skill space (i.e., through the application of a log-linear skill space or skill hierarchies) L ≤ L − 1. For the ease of presentation in the following we assume unrestricted skill class distributions.
Prior to the first iteration of the EM algorithm, initial item parameters δ and skill distribution parameters γ have to be chosen. Then, the EM algorithm alternates between the E-step and the M-step described in the following:
b. Two types of expected counts are derived from the posterior: The first count is the expected number of students which are classified into skill class α l for item j, j = 1, . . . , J. Note that in case of no missing data T jl = T j l for all j, j = 1, . . . , J. The second count describes the expected number of students classified in skill class α l while responding item j correctly.

M-Step
a. The item parameter vector δ = [g, s] is updated. The estimating equations are obtained by setting the first derivative of the expected complete log-likelihood with respect to the item parameters equal to zero. The derivative only involves the two counts obtained in the E-step. Let be the expected number of students lacking at least one of the skills required for the mastery of item j (i.e., η lj = 0) and be the expected number of students among T (0) j who correctly respond to item j. Furthermore, let T (1) j and R (1) j have the same interpretation except that they belong to students who possess all skills required for item j (i.e., η lj = 1). Based on this definitions, the items parameters of item j are updated according tô b. The skill class distribution P(α l ; γ) is updated. Therefore the expected number n l of students in skill class α l is calculated, namely Then the skill class distribution is updated by P(α l |γ) = n l I l = 1, . . . , L. and the skill mastery probabilities are defined as The E-and M- Step alternate until convergence. Convergence may be achieved if the maximal change between the parameter estimates or the relative change in the deviance is below a specific predefined value or after a maximum number of iterations. Finally, the covariance matrix for the estimatedδ andγ parameters (i.e., the empirical standard errors) is computed using the empirical cross-product approach (Paek and Cai 2014). As an other possibility, statistical inference based on resampling methods (jackknife, repeared replicate weights, bootstrap) can be conducted by applying IRT.jackknife to a fitted model.
Once the algorithm converged, the individual student classifications or individual skill profiles can be deduced from the probabilities P(α l |X i ), according to three methods: Firstly, following the approach of maximum a priori (MAP) classification, the largest value of P(α l |X i ) gives the skill class into which student i is classified: Secondly, an individual classification of student i based on maximum likelihood estimation (MLE) is obtained by maximizinĝ Finally, for a classification of student i based on expected a posteriori probabilities (EAP), the marginal skill probability P(α k | X i ) of student i for mastering skill k is computed as the sum of all P(α l | X i ) corresponding to mastery of skill k (i.e., having a 1 in the k-th element) Then, the i-th student's EAP skill profile is estimated bỹ For deducing the dichotomous skill classα i;EAP in which student i is classified according to EAP, each marginal skill mastery probability P(α k |X i ), smaller than 0.5 is set to 0, whereas each one larger or equal to 0.5 is set to 1. For a comparison of MAP, MLE, and EAP classification methods see Huebner and Wang (2011).
It remains noting three aspects concerning the estimation algorithm: Firstly, the algorithm may also handle sampling weights, which is not presented here. Secondly, in particular situations not all L = 2 K skill classes have to be estimated, for example assumptions of hierarchy between the K skills may be implemented in predefining specific skill class occurrence probabilities P(α l ) to be zero. Thirdly, in cases where models have almost as many parameters as observations, which, consequently, would lead to weakly or nonidentifiable skill classes, Xu and von Davier (2008b) proposed to change from the unreduced skill space P(α l ), l = 1, . . . , L, to a log-linear smoothed form of the skill space.

G-DINA
For relaxing this restrictive two-probability constraint, de la Torre (2011) introduced the G-DINA (function gdin in the R package) framework. In this model, students exhibiting different sets of required skills have different probabilities of mastering item j. The G-DINA model employs the item response function (2) Here α * j;i = [α * j;i1 , . . . , α * j;iK * j ] is the shortened skill profile of student i, which includes only the skills relevant for item j. Thus K * j = K k=1 q jk represents the number of skills which are necessary for the mastery of item j, that is, K * j is the sum of ones in the j-th row of the Q-matrix. In the following, for notational convenience and without loss of generality, let the first K * j of all K skills be the ones required for item j. Then the former skill profiles α i decompose into different reduced skill profiles depending on item j, which necessitates the notation of an additional item index in each skill profile. Furthermore δ j = [δ j;0 , δ j;1 , . . . , δ K * j ;1 , δ j;12 , . . . , δ j;12...K * j ] includes the item parameters with respect to item j, with δ j;0 being an intercept parameter, δ j;k representing first-order effects, δ j;kk second-order effects and so on.
If only first-order effects δ j;k are modeled (i.e., all other parameters are defined to be zero), the resulting models are called G-DINA 1way models. In the same manner, G-DINA models with first-order effects δ j;k and second-order interaction effects δ j;kk are called G-DINA 2way models. Note that other versions of the G-DINA model use the logit or log link instead of the identity link (cf. Equation 2) for modeling the response probabilities. Note also that many core CDMs, as for example the DINA and DINO model, may be derived from the G-DINA framework by restricting parameters.

Possibilities of the GDM
The GDM (von Davier 2008, function gdm in the R package) framework includes nearly all common CDMs for both dichotomous and polytomous response data. Furthermore, with GDMs polytomous and even (quasi-)continuous skills can be established. Hence, the class of GDMs also includes a partial credit model for polytomous response data as well as uniand multidimensional IRT models. Furthermore, in this framework Q-matrices with polytomous entries may be handled. The estimation of GDMs is based on MML methods and is implemented in the R package CDM based on an EM algorithm (Xu and von Davier 2008b).

Model fit
The R package CDM supports the evaluation of model fit via the Akaike information criterion (AIC; Akaike 1973, function AIC) and the Bayesian information criterion (BIC; Schwarz 1978, function BIC) or an χ 2 overall goodness-of-fit measure which is based on the idea of local dependence between the items (Chen and Thissen 1997, function IRT.modelfit).
The AIC and BIC indices follow the well known formulas AIC = −2 log L(X) + 2 · p and BIC = −2 log L(X) + log I · p, where log L(X) denotes the maximal log-likelihood of the model and p represents the total number of estimated parameters. For example for DINA or DINO models with K underlying skills it holds p = 2 · J + L − 1.
The χ 2 measure evaluates, roughly spoken, the difference between the observed and the modelpredicted response probabilities: While the predicted response probabilities are calculated assuming local independence between the items, the observed probabilities may deviate from this assumption. Consequently, if the analyzed CDM model approximates the (dependencies in the) data well, the χ 2 measure provides low values.
Let N (X j = n, X j = m) be the observed absolute number of students' responses fulfilling X j = n and X j = m, n, m = 0, 1. Let be the predicted absolute number of students in the same response category, where P(α l |X i ) is the individual posterior from the fitted model and P(X ij = n|α l ) = P(X ij = n|α l ; δ) denotes the DINA model's item response function. Then the difference between the observed and the predicted probabilities for two arbitrary items j and j , j, j = 1, . . . , J, j = j , is defined by This fit statistic is chi square distributed with one degree of freedom. As a criterion of absolute fit, the maximal χ 2 jj is considered (Chen, de la Torre, and Zhang 2013) and compared to the 1 − α * quantile χ 2 1−α * ,1 of the chi squared distribution with one degree of freedom. Here, α * is the Holm corrected (Holm 1979) significance level of the desired level α.

Item fit
The R package CDM provides as an item fit statistic the so called root mean square error of approximation (item-fit RMSEA; Kunina-Habenicht, Rupp, and Wilhelm 2009, function itemfit.rmsea), which indicates how good an item coincides with the model. The item-fit RMSEA of item j compares the model-predicted item response probabilities P(X j = 1|α l ) with the predicted absolute number of correct responsesN (X j = 1|α l ) in each skill class α l : Here p(α l ) is the frequency of students classified in skill class α l andN (X j |α l ) the predicted total number of responses (i.e., correct and incorrect ones) to item j given by students in skill class α l . Kunina-Habenicht et al. (2009)

Item discrimination index
In DINA and DINO models the additional constraint g j < 1−s j ensures that the probability of mastering an item in possession of all required skills without slipping is higher than the probability of guessing an item while lacking at least one required skill. Because this constraint is not considered in the estimation process it may be checked with the item discrimination index IDI j = 1 − s j − g j (Lee, de la Torre, and Park 2012, function summary.din), where negative IDI values indicate a violation of this constraint. The IDI may also be seen as a diagnostic index, reporting for each item how it discriminates between students possessing all skills (i.e., having a response probability of 1 − s j ) and students lacking in at least one skill (i.e., guessing with probability g j ). Thus, IDIs close to 1 signal a good discrimination or "diagnosticity" of the item, whereas IDI values close to 0 detect items with a low discrimination.

Classification criteria
In some studies it is not only of interest to assess the global model fit but also to evaluate how well the response behavior of individual students is described by the model. In these cases classification criteria as the model's classification accuracy and consistency might be utilized. Classification accuracy is a measure of how well individual students are correctly classified in their true skill classes, whereas classification consistency is a measure for the consistence of the classifications in two parallel test forms with the same items and parameters. In the R package CDM, the classification accuracy and consistency (function cdm.est.class.accuracy) are assessed via simulation methods (cf. DiBello et al. 2006) and analytically by the method of Cui, Gierl, and Chang (2012). Concerning the former, the simulation is conducted with the estimated model parametersδ andγ. Note that both, the accuracy and consistency measures, rely on the assumption that the specified model is the correct one.

Tetrachoric correlation between skills
It may also be of interest to analyze correlations between the estimated skills. The calculation is based on the assumptions that latent continuous variables are underlying the latent dichotomous skill variables and that, under bivariate normality assumptions, the correlation between two skill variables equals the correlation between the two underlying continuous variables. With 1 denoting the indicator function, this so-called tetrachoric correlation between two skills α k and α k (cf. Templin, Henson, Templin, and Roussos 2008) is inferred on the basis of a two times two frequency table with 1{α lk = n}1{α lk = m}P(α l |γ).

Structure of the package
The R package CDM provides the three main functions din, gdina, and gdm which allow the estimation of DINA and DINO, G-DINA and GDM models, respectively. For objects returned by the model estimation functions dina, gdin, and gdm, S3 methods for generics like print and summary as well as some additional functions for further analysis of the respective CDMs are available. The package includes the simulated example data sets sim.dina and sim.dino for DINA and DINO models while simultaneously it allows to construct response data from DINA, DINO, and G-DINA models by using the simulation tools sim.din and sim.gdina. Table 1 yields a scheme of the package structure, whereas the basic output values of the main functions din, gdin, and gdm and their descriptions are given in Table 2. The functions attribute.patt, skill.patt, and pattern are not defined for the gdm function because they are not generalizable for the case of nondichotomous data and skills.

Examples
The following examples are conducted using data of the well-known fraction-subtraction data (Tatsuoka 1984), which comprises dichotomous responses of I = 536 junior high school students to J = 20 fraction subtraction items. The items, the definition of the K = 8 underlying skills, and their assignment to the items are given in Table 3. In the R package CDM, the response data is included in the object fraction.subtraction.data and the respective Q-matrix in the fraction.subtraction.qmatrix object.

DINA
The estimation of a DINA model, in which it is required that the students possess all assigned skills for mastering the items, is conducted with the command R> colnames(fraction.subtraction.qmatrix) <-paste("Skill", 1:8, sep = "_")   Tatsuoka (1984). In the table the skills are denoted as follows: α 1 describes the skill of converting a whole number to a fraction, α 2 is separating a whole number from a fraction, α 3 denotes simplifying before subtracting, α 4 stands for finding a common denominator, α 5 indicates the skill of borrowing from whole part, α 6 expresses column borrowing to subtract the second numerator from the first, α 7 represents subtracting numerators, and α 8 stands for reducing answers to simplest form.

R> summary(fractions.dina)
important information about the model is obtained, as for example the values and the number of the estimated model parameters, goodness-of-fit measures and the tetrachoric correlations between the skills. All model parameters can be accessed through

R> coef(fractions.dina)
Additionaly a comprehensive item parameter summary, including item parameters and the item-fit RMSEA measure, is stored in the object

R> fractions.dina$item
The following output is only given for the first 5 out of the altogether 20 items: According to this result only 58% of the junior high school students are able to convert a whole number to a fraction (α 1 ), but 81% of the students master subtract numerators (α 7 ) and to reduce answers to simplest form (α 8 specifying the MLE classification of the first student. Accordingly, the MAP and EAP classifications can be obtained by specifying the respective type. Thus, student 1 is classified into class "11101111" via MLE meaning that he probably lacks skill α 4 . The classifications via MAP and EAP correspond. The main results of a DINA model, including the EAP classification of an arbitrary student (here: Student 1), may be illustrated by R> par(mfrow = c(2,2)) R> plot(fractions.dina, pattern = "00010011011101110111", ask = FALSE) All plots are shown in Figure 1. The first plot on the top left hand side of Figure 1 depicts the DINA model's item parameters, that is, the guessing and nonslipping probabilities for each item. The plot signalizes low discriminations for Items 5 and 8. The second plot on the top right hand side illustrates the skill mastery probabilities: Students most probably master skills α 8 reduce answers, α 7 substract numerators and α 6 column borrow. The third plot on the bottom left hand side shows the skill class distribution. Skill classes in which students are most frequently classified are labeled. The fourth plot analyzes the possessed skills of an individual student, in this case Student 1. According to the model Student 1 possesses all skills except of α 4 find a common denominator.

G-DINA
The command R> fractions.gdina1 <-gdina(fraction.subtraction.data, + fraction.subtraction.qmatrix, rule = "GDINA1") fits a G-DINA 1way model for the fraction subtraction data and the respective Q-matrix. The distribution of the skill classes, the skill mastery probabilities, and the individual skill profile for Student 1 may again be accessed via
The model's summary shows, among others, the common item difficulty parameters b.Cat1.
The following output is again only given for the first 5 out of the altogether 20 items.

Parameter recovery and speed
This section focuses on parameter recovery issues of the EM algorithm for estimating CDMs which is implemented in the R package CDM. For illustrational purposes, DINA models including J = 30 items, K = 5 skills, and sample sizes of I = 100, 300, and 1000 students are simulated, with all guessing and slipping parameters varying between 0 and 0.29. The Qmatrix used for the simulation can be found in Table 1 of de la Torre (2009b). 5000 data sets are simulated for each of the three sample sizes. Each of the simulated data sets is fitted with a DINA model by setting the convergence criterion conv.crit of the din function to 0.0001. Table 4 gives the mean bias and the minimal and maximal absolute bias between the true and the estimated guessing, slipping, and skill class parameters. Table 4 also gives the mean, minimal, and maximal root mean square error (RMSE) between the true and the estimated guessing, slipping, and skill class parameters. The results indicate that the algorithm can provide accurate parameter estimates. All parameter estimates are unbiased and the RMSE decreases with increasing sample size. Parameter recovery for the above mentioned data sets is also tested with algorithms in other noncommercial software packages as for example the Ox (Doornik 2009) script by de la Torre (2009b), an lem (Vermunt 1997) procedure, and the mdltm software by von Davier (2008). The estimated item parameters and skill class probabilities of all the three before mentioned software packages and the R package CDM coincided up to the third or fourth position after the decimal point, which shows a reasonable agreement between the results. We may note that at the time of writing, no R function allowed the estimation of restricted latent class models as required for fitting CDMs.
Furthermore, we compared the speed of the CDM R algorithm with the Ox code for the DINA model provided by de la Torre (2009b). In both programs a DINA model is fitted to a rules (gdina.wald; de la Torre and Lee 2013) or for assessing differential item functioning (gdina.dif; Hou, de la Torre, and Nandakumar 2014).
Alternatively to conducting polytomous GDMs, it is further possible to estimate cognitive diagnosis models for polytomous attributes (pgdina; Chen and de la Torre 2013). Polytomous items can be handled by using the gdm function or by transforming the polytomous items in a set of dichotomous pseudo items (sequential.items; Tutz 1997), which are then deployed subsequently in DINA or GDINA models. Furthermore, multiple choice CDMs (de la Torre 2009a) may also be analyzed with the R package CDM as well as the generalized distance discrimination method (Sun, Xin, Zhang, and de la Torre 2013).
In principle, the implementation of the structured latent class model (slca Formann 1992;Formann and Kohlmann 1998) in the R package CDM covers most, if not all, of the implemented cognitive diagnostic models in the package. This function can also, and more generally, be used for the specification of constrained latent class analysis or many multidimensional item response models. For example, recent developments in the field of cognitive diagnostic models (e.g., Hong, Wang, Lim, and Douglas 2015; Huo and de la Torre 2014) may be implemented as a particular structured latent class model using the slca function (c.f. von Davier 2009).