randomLCA : An R Package for Latent Class with Random Eﬀects Analysis

Latent class is a method for classifying subjects, originally based on binary outcome data but now extended to other data types. A major diﬃculty with the use of latent class models is the presence of heterogeneity of the outcome probabilities within the true classes, which violates the assumption of conditional independence, and will require a large number of classes to model the association in the data resulting in diﬃculties in interpretation. A solution is to include a normally distributed subject level random eﬀect in the model so that the outcomes are now conditionally independent given both the class and random eﬀect. A further extension is to incorporate an additional period level random eﬀect when subjects are observed over time. The use of the randomLCA R package is demonstrated on three latent class examples: classiﬁcation of subjects based on myocardial infarction symptoms, a diagnostic testing approach to comparing dentists in the diagnosis of dental caries and classiﬁcation of infants based on respiratory and allergy symptoms over time.


Introduction
Latent class models (Lazarsfeld and Henry 1968) are a method originally developed for sociology where they are used to identify clusters or sub-groups of subjects, based on multivariate binary observations, and as such are a form of finite mixture model. Their application has been further expanded into many areas, such as psychology, market research and medicine. Diagnostic classification using latent class methods has been applied in a number of areas, with early applications by Golden (1982) to dementia, Young (1982) to develop diagnostic criteria for schizophrenia and Rindskopf and Rindskopf (1986) for myocardial infarction. An advantage of using latent class analysis over other classification methods is that the classification is model based, allowing use of model selection techniques to determine which classifi-cation scheme is most appropriate. This compares with classifications developed from simple observation, that may give undue weight to one or more symptoms or outcomes. An example of the problems with this type of analysis is demonstrated by Nyholt, Gillespie, Heath, Merikangas, Duffy, and Martin (2004) who used latent class analysis of headache symptoms to show that classification of migraine with and without aura as separate diagnoses is not supported. While latent class methods have been extended to any outcomes with a variety of distributions, binary is the most commonly used.
The assumption of latent class models is conditional or local independence, where the outcomes are independent conditional on the latent class. This assumes that subjects within a class are homogeneous. Where this does not apply, that is the true classes are heterogeneous, a consequence of this will be to increase the number of latent classes required in attempting to explain the heterogeneity, with possible consequent difficulty in interpretation. A solution is to incorporate random effects so that the outcomes are independent conditional on the latent class and random effect or effects.
There are a number of packages capable of fitting latent class models in R (R Core Team 2017). Two of these solely for fitting of latent class models are poLCA (Linzer and Lewis 2011) and BayesLCA (White and Murphy 2014). BayesLCA is particularly designed to perform Bayesian analyses, but also offers the choice of the EM algorithm and variational Bayes, but has limited facilities for producing plots and summaries. poLCA is a more fully featured package which allows for polytomous outcomes and latent class regression, which are not available in randomLCA. The advantage of randomLCA over the other packages is that it will fit both standard latent class models and those incorporating random effects. This is important for use with diagnostic tests, as it allows for the variation of the test response between subjects, but may be also used to model heterogeneity in other applications, for example see Muthén (2006). Commercial software packages that also allow latent class with random effects are Mplus (Muthén and Muthén 2015) and Latent GOLD Syntax Module (Vermunt and Magidson 2013), both of which require the model to be defined using a symbolic language.
The purpose of this paper is to describe the randomLCA package (Beath 2017). Package randomLCA is available from the Comprehensive R Archive Network (CRAN) at https: //CRAN.R-project.org/package=randomLCA. The remainder of the paper is organized as follows. Section 2 describes the models, starting with standard latent class and then continuing with the random effect extensions, including references allowing for further investigation. Section 3 describes three examples with explanation of how the features of the package may be used. Section 4 summarizes the capabilities of the package and describes some areas in which the package could be extended.

Latent class model
The basis of latent class analysis is that each subject is assumed to belong to one of a finite number of classes, with each class described by a set of parameters that define the distribution of outcomes or manifest variables for a subject, and is a form of finite mixture model (McLachlan and Peel 2000). Generally, the number of classes is unknown and must be determined from the data. For binary outcomes, the model is where y ij is the jth binary outcome for subject i, π cj is the probability of the jth outcome being equal to 1 for a subject in class c, k is the number of outcomes and c i is the class corresponding to the ith subject. The marginal probability, obtained by summing over the classes, for each subject is where η c is the probability of a subject being in class c with C c=1 η c = 1 and C is the number of classes. From this can be obtained the marginal likelihood.
A requirement for the estimates of the probabilities π cj is that they be restricted to the interval zero to one, and that the η c sum to one, something that did not always occur with the original methods used for analysis. A solution developed by Formann (1978Formann ( , 1982 is to use a logistic (or alternatively probit) transformation, allowing unconstrained estimation of parameters but correctly restricting the probabilities to between zero and one. This can be obtained for the logistic using the following relations π cj = e a cj / (1 + e a cj ) and η c = e θc / C =1 e θ . Hence we estimate the a cj and θ c , rather than π cj and η c . Similar equations apply for the probit transformation.
When used as a classification algorithm the model does not simply return the most likely class for each subject but returns a probability of class membership, based on the observed outcomes. The posterior probability for a set of given observed outcomes can be obtained from Bayes theorem. Given the observed outcomes y i1 , y i2 , . . . , y ik then the probability that the subject is in class d is:

Latent class with random effect model
A major difficulty with latent class models is the requirement for local independence or equivalently homogeneity of the outcome probabilities within each class. When the classes are heterogeneous the assumption that the manifest outcomes are independent, conditional on the latent class, does not apply. Uebersax (1999) describes the problems associated with conditional dependence as "to add spurious latent classes that are not truly present at the taxonic level" and Vacek (1985) showed that ignoring the conditional dependence produced biased estimates in the context of diagnostic testing. Pickles and Angold (2003, p. 530) discuss the classification of diseases in psychology as categories or by severity, arguing that "most forms of psychopathology (indeed, most forms of pathology of any sort) manifest both continuous and discontinuous relationships with other phenomena", and provide examples of where this may occur.
A solution to the problem of heterogeneity was developed by Qu, Tan, and Kutner (1996) combining a latent class model with a random effect to explain the heterogeneity. The probabilities are transformed to the probit scale and a normally distributed random effect added for each subject, before transforming back to probabilities. An alternative to the probit scale is the logit scale. A model for latent class incorporating a random effect λ ∼ N (0, 1) is: where, either, for a probit scaling of the random effect or, for a logistic scaling and a cj determines the conditional class probability for a value of zero for the random effect, and b cj scales the random effect, and is usually known as the loading or discriminant. The loadings will generally be constrained to be equal between classes, and either the same loading for each outcome (b cj = b) or independent loading for each outcome (b cj = b j ) are used. The marginal likelihood is obtained by integrating over the random effect and summing over the latent classes, which is then maximized to obtain the parameter estimates. Posterior class probabilities can be obtained as for the standard latent class using Bayes theorem.

Two level latent class with random effect model
The previously described models for latent class can be considered to be for a single time point. Where the outcomes are observed at multiple time points then consideration must be made for the correlation between time points. A method described for this is the mixed latent Markov model (Langeheine and Van de Pol 1990). This assumes that the population is a mixture of latent Markov models which consists of a Markov chain which is a latent class at each time point. An alternative model is that of Beath and Heller (2009) which extends the model to two levels with an additional random effect modeling the correlation between outcomes at each time point. This has some similarities to the model by Muthén and Shedden (1999) for longitudinal normally distributed data.
We now define y ijt as the jth binary outcome for subject i at time t, π cjt is the probability of the jth outcome to be equal to 1 for a subject in class c at time t, k is the number of outcomes, T is the number of time points and c is the class corresponding to the ith subject. An additional random effect τ t ∼ N (0, 1) is incorporated to model the additional correlation between outcomes at a time point, and c scales this to the appropriate variance. .
Again, we usually constrain b cj = b j and c = .

Model selection
An important aspect of using latent class and latent class with random effect models is the choice of the number of classes, and whether inclusion of a random effect is required. This presents two difficulties. The first involves a test for a parameter on the boundary of the parameter space. For the number of classes, this is the proportion in a class that is zero, and for the random effect, that the random effect variance is zero. As a consequence asymptotic likelihood theory does not hold so other methods must be used. One method is the bootstrapped likelihood ratio test as described by McLachlan (1987). However, a second difficulty is that when models may include a random effect then comparisons must be made between non-nested models. As an example we may wish to compare a latent class model with two or more classes to a latent class with random effect model with a single class. This is equivalent to an item response theory (IRT) model, either the Rasch model when the loadings are constant and a two parameter logistic otherwise (Bartholomew, Steele, Moustaki, and Galbraith 2002). This is an important choice as it determines if the underlying latent variable is categorical or continuous (Muthén 2006).
The usual method used is an information criterion (Lin and Dayton 1997) with the two main ones that are used being the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). Nylund and Muthén (2007) showed using simulation that BIC is superior to AIC for selection in latent class models, and this is the method most often used in applied publications. With BIC the penalty is greater than for AIC and dependent on the number of observations, so the BIC will select models with a smaller number of classes. An alternative version of AIC, AIC3 with a penalty of 3 was shown to have better performance with latent class models by Dias (2006). This will select models with the number of classes between that chosen by BIC and AIC inclusive. No evaluation of the criteria has been performed for latent class with random effect models, and it has seen little use in applied publications. An important point is that with any model selection it is desirable to make use of existing information. So for example, if it is already known that there are at least 2 classes then the model choice should be restricted to these. In the paper I have used BIC for model selection, but as no research has been performed on model selection for random effect latent class models the other information criteria are also provided.

Identifiability
A difficulty that is sometimes encountered in fitting latent class models is lack of identifiability, which occurs when the value of the maximum likelihood occurs for more than one unique set of parameter estimates. This implies that is there is not a single global maximum. A minimum condition is that the number of parameters is less than the number of patterns, but this is not always sufficient (McHugh 1956). This prevents the fitting of even a 2 class latent class model with only two outcomes, as this requires 5 parameters with only 4 patterns. A 2 class model is just identifiable when fitted to data on 3 outcomes, as there are 7 parameters and 8 possible patterns, however if a random effect is included then 4 outcomes are required for identifiability. A check is made by randomLCA to determine if the model is identifiable based on the number of parameters. As this is not always sufficient, special cases, for example the requirement of at least 5 outcomes to fit a 3 class latent class model, are also flagged as nonidentified. A further check is provided by determining that the rank of the Hessian is not less than the number of estimated parameters (Skrondal and Rabe-Hesketh 2004, pp. 150-151), however this may also occur if a parameter is on the boundary of the parameter space.

Computational methods
The standard latent class models are fitted using an EM algorithm and then switching to a quasi-Newton method when near convergence. To increase the probability that the algorithm converges to the global maximum rather than a local maximum, the EM algorithm is performed a number of times with randomly generated starting values. The estimated parameters from the best fitting EM algorithm are then used as the starting values for the quasi-Newton method. For the latent class with random effect models it is necessary to integrate over the random effects for each subject, for which it is necessary to use an approximation. One of the methods that has been used for this is Gauss-Hermite quadrature, where the integral is approximated by a weighted sum of the function evaluated at defined points, which works well when the function is approximately standard normal. However, for random effect models the likelihood is not, but the standardized likelihood is. Therefore, Gauss-Hermite quadrature is applied to the standardized likelihood, known as adaptive Gauss-Hermite quadrature (Liu and Pierce 1994). This has the advantage over standard Gauss-Hermite quadrature that integration is only performed in the region of the mode, reducing the number of quadrature points required. For the two level random effect model, adaptive Gauss-Hermite quadrature is again used. An orthogonal transform is applied to reduce the integration to two one-dimensional integrations and the method of moments is used to determine the location of the modes, as described in Rabe-Hesketh, Skrondal, and Pickles (2005).
The random effect latent class models are fitted using a generalized EM algorithm (GEM) (Little and Rubin 2002, p. 173) with maximization using a quasi-Newton method. At each expectation step the location of the modes for the adaptive quadrature are recalculated, and the maximization step performed based on these locations. The maximization is not run to convergence at each step, but terminated after a number of quasi-Newton steps. For all models, the EM or GEM algorithm switches to a quasi-Newton for all parameters when close to convergence. For the random effect models starting values are obtained from the model without random effects, and a search performed over possible starting values for the random effect variance.
A difficulty with latent class models is the calculation of standard errors when the parameter estimates are near the boundary of the parameter space. There are several options, one of which is to use Bayesian maximum a posteriori (MAP) estimation (Galindo Garre and Vermunt 2006). This is a form of Bayesian estimation in that a prior probability is placed on the parameters, however the posterior distribution is then maximized similar to maximum likelihood estimation. This is equivalent to the penalized likelihood described by Firth (1993) in which a penalty is placed on extreme values of the logit or probit scale outcome probabilities.
The posterior distribution will consequently have a mode, whereas the likelihood may not.
In randomLCA the prior distribution used is the Dirichlet as in Galindo Garre and Vermunt (2006) but with a much smaller default penalty, where the penalty argument is equal to the number of extra observations that are effectively added to each cell. A penalty of 0.01 is used, which reduces the risk of numerical problems, without greatly affecting the estimated probabilities. A sensible upper limit for the penalty is 0.5, which has been found to perform well by a number of authors, for example Rubin and Schenker (1987) in application to binary proportions, and may be used when necessary. This will also produce results similar to the Latent GOLD software default settings (Vermunt and Magidson 2013). Setting the penalty to zero will produce results identical to maximum likelihood.

Myocardial infarction example
This example demonstrates the fitting of data from Rindskopf and Rindskopf (1986), where latent class analysis is used to determine diagnostic classifications based on medical tests. Although this example is for medical data, the model is simply standard latent class so the methods can be applied to data from other areas, for example psychology and sociology. The maximum number of classes that can be fitted is limited to 2 due to identifiability, so we fit models for 1 and 2 classes, assuming that the outcome probabilities are homogeneous in each class.
The fitting function for randomLCA is the randomLCA function which fits both the standard and random effect models. The command, where only the patterns parameter is required, is: randomLCA(patterns, freq = NULL, nclass = 2, calcSE = TRUE, notrials = 20, random = FALSE, byclass = FALSE, quadpoints = 21, constload = TRUE, blocksize = dim(patterns)[2], level2 = FALSE, probit = FALSE, level2size = blocksize, qniterations = 5, penalty = 0.01, EMtol = 1.0e-9, verbose = FALSE, seed = as.integer(runif(1, 0, .Machine$integer.max)) For a standard latent class model the parameters of interest are: patterns: Data frame or matrix of 0s and 1s defining the outcome patterns. This may be either raw data or summarized data with a frequency vector giving the corresponding frequency for each pattern. The patterns may also include missing values, with randomLCA using maximum likelihood to fit the models using all available data.
freq: Vector containing frequencies for each outcome pattern. Where this is missing it is created within the randomLCA function and used in the fitting algorithm, substantially reducing execution time.
nclass: Number of classes to be fitted.
notrials: The number of random starting values used. It is necessary to use different starting values to reduce the risk that the global maximum of the likelihood is not found. This should be increased in increments of ten until there is no decrease in the maximum log-likelihood. While this does not guarantee a maximum likelihood, the parameter estimates found will usually be close to those at the true global maximum.
penalty: Penalty applied to the likelihood for the outcome probabilities. See Section 2.6 for details.
The remaining arguments will be considered when discussing the relevant models or details may be obtained from the package documentation. Fitting the standard latent class models for one and two classes (note that a three class model is not identifiable) requires only the basic arguments: R> myocardial.lca1 <-randomLCA(myocardial[, 1:4], freq = myocardial$freq, + nclass = 1) R> myocardial.lca2 <-randomLCA(myocardial[, 1:4], freq = myocardial$freq, + nclass = 2) The BIC values may be extracted from the fitted objects, and combined into a data frame: R> myocardial.bic <-data.frame(classes = 1:2, + bic = c(BIC(myocardial.lca1), BIC(myocardial.lca2))) R> print(myocardial.bic, row.names = FALSE) classes bic 1 524.7441 2 402.2951 Using BIC as a selection method, this selects the 2 class model, indicating a breakdown into diseased and non-diseased, which is assumed to represent those with and without myocardial infarction, although the true nature of classes is always debatable. A characteristic of this data is that a single class random effect model has a lower BIC than the 2 class standard latent class, so we need to assume that there are at least two classes, or that the underlying latent variable is categorical rather than continuous.
An alternative is to use the parametric bootstrap (McLachlan 1987) to determine the number of classes, and this is easily performed using the simulate function to generate the samples under the null hypothesis and then refit to refit both models. The simulate function returns a list of data frames simulated from the specified model, and each of these can then be refitted using the specified null and alternative model, as shown in the following code.
One aspect of latent class is that no subject is uniquely allocated to a given class, although in some cases a subject may have an extremely high probability of being in a given class. The posterior class probabilities can be obtained as This shows subjects with 3 or 4 positive tests to be strongly classified as having myocardial infarction, and even some with 2 positive tests are well classified. Having only one positive test makes it unlikely that it is myocardial infarction.

Dentistry example
An important area of application of latent class and random effect latent class is the development or comparison of diagnostic testing methods where there is no gold standard test. A gold standard test is one that is the best available and can usually be assumed to be close to perfect, but usually being more expensive or difficult to perform (Kraemer 1992). Given a gold standard it is easy to construct new tests or compare existing tests, as we know the true disease status of each subject. Latent class methods allow the construction of tests based on the assumption that subjects fall in either two or more classes, with diseased or non-diseased as a minimum, except that the classes can only be inferred from the observed test results. This has the consequence that the status of the subjects is not known exactly, which reduces the accuracy and relies upon the assumptions made about the test result distribution.
There are other methods that have been proposed, the major ones being discrepant resolution and composite reference (Pepe 2003, Chapter 7), both of which have disadvantages and advantages compared to latent class methods. The advantage of the latent class method is its statistical basis, however it has the disadvantage to depend on assumptions about the diagnostic tests, especially the assumption of either conditional independence or normally distributed heterogeneity.
The further arguments to the randomLCA function required for a random effect model are: random: Specifies whether a random effect should be included.
byclass: Allow loadings for the random effect to vary by class.
quadpoints: Number of quadrature points for the adaptive quadrature. These specify how accurate the numerical approximation to the marginal likelihood is, and should be increased until there is negligible improvement in model fit.
constload: The same loading is used for all outcomes when using a random effect model.
blocksize: Blocks allow for the case where loadings are not common to all outcomes but common to subsets of outcomes, allowing for the number of estimated b cj parameters to be reduced compared to individual loadings for each outcome, with blocksize specifying the size of each block and the b cj constrained to be equal between blocks. For example with a blocksize of four and four blocks then every fourth outcome will have the same loading, that is b c1 = b c5 = b c9 = b c13 and similarly for b c2 , b c3 and b c4 .
probit: Fit a probit model rather than a logistic for the relationship between parameters and outcome probabilities. This is the relationship typically used in some disciplines.
This example shows the fitting of the dentistry data from Qu et al. (1996). The data consists of the results of five dentists evaluating X-rays for presence or absence of caries. For consistency with the original paper I have also set probit = TRUE to give the probit link. Fitting first the three possible models for one class: R> dentistry.lca1 <-randomLCA(dentistry[, 1:5], + freq = dentistry$freq, nclass = 1) R> dentistry.lca1random <-randomLCA(dentistry[, 1:5], + freq = dentistry$freq, nclass = 1, random = TRUE, probit = TRUE) R> dentistry.lca1random2 <-randomLCA(dentistry[, 1:5], + freq = dentistry$freq, nclass = 1, random = TRUE, probit = TRUE, + constload = FALSE) This can then be repeated for 2 to 4 classes, and using BIC the BIC can be extracted for each model, and then combined into a data frame to summarize. Note that we cannot use a parametric bootstrap based likelihood ratio test to compare the standard to random effect latent class model as the models are not nested. Here the number of quadrature points will need to be increased for some models to allow convergence.
We can display the BIC values in a table, with the first column for standard latent class, the second for random effects with a constant loading for each dentist, and the third with loading varying by dentist. Note that it is not possible to fit a 4 class random effect model with individual loadings for each dentist due to non-identifiability. For the standard latent class models the minimum BIC of 14962.9 is obtained for the 3 class model. With addition of the random effect with constant loading, minimum BIC is obtained with a 2 class model with a decrease from the latent class model to 14944.7. Allowing the loadings to vary by dentist (2LCR model obtained by Qu et al. 1996) the minimum BIC of 14938.3 was obtained using a single class model, equivalent to a single factor item response theory (IRT) model (Bartholomew et al. 2002, Chapter 7). This assumes that rather than subjects being grouped into classes they simply have different levels of an underlying latent variable, possibly severity. In the absence of any assumptions about the appropriate model this would be the model to be used, and we could conclude that severity of caries was on a continuous scale with each dentist having different thresholds for determining the presence or absence. In the paper by Qu et al. (1996) the assumption is made that the underlying latent variable is categorical, that is that there are two distinct types of subjects, and so the 2 class with random effect with constant loading will be used. Clearly, based on the lower outcome probabilities, Class 1 is the non-diseased and Class 2 the diseased.
For latent class models with random effects there are two additional arguments to plot: graphtype: Type of graph, either "marginal" or "conditional". For marginal the outcome probabilities integrated over the random effect are plotted, and for conditional they are plotted conditional on the random effect, with zero being the default.
conditionalp: For a conditional graph the percentile corresponding to the random effect at which the outcome probability is to be calculated. Fifty percent is the default, corresponding to a random effect value of zero. The marginal outcome probabilities, obtained by integrating over the random effect can be plotted, as in Figure 2. The marginal outcome probabilities reflect the average probability for any subject in the class having a positive outcome. This differs from the conditional outcome probabilities which are for a subject with zero random effect, and thus represent a typical subject.
We can demonstrate the effect of the random effect by plotting the outcome probabilities by varying percentiles of the random effect, using the following code, which will place each percentile in a different panel.
R> dentistry.lca2random <-randomLCA(dentistry[, 1:5], freq = dentistry$freq, + nclass = 2, random = TRUE, quadpoints = 71, probit = TRUE) R> probs <-outcomeProbs(dentistry.lca2random, boot = TRUE) It is necessary to determine which is the class with higher outcome probabilities, as it is the diseased class. The variable diseased gives the number of the diseased class, and notdiseased for the non-diseased, and is based on the outcome probabilities being lower for the nondiseased class. The true and false positive rates can be calculated from the outcome probabilities, similar to sensitivity and sensitivity, and plotted for each dentist, and are shown in This gives a better explanation. It appears that the difference between dentists is mainly related to the threshold for what they classify as diseased. Dentist 5 is more likely to correctly identify teeth as diseased but at the expense of being more likely to identify non-diseased teeth as diseased. Note that this is different from an ROC curve where the same data is used but the test threshold is adjusted. Here the dentists may choose different thresholds but may also have different levels of performance.

R>
Posterior class probabilities may again be obtained with postClassProbs. Clearly, as the number of dentists identifying the subject as diseased increases, the posterior probability of being diseased increases, until it is almost one when all dentists identify the subject as diseased. The observed and fitted values may be obtained using the fitted method which returns a data frame containing them. Again, differences from the Qu et al. (1996) paper result from a model with different loading for each dentist. We can obtain the fitted values for the two models as follows: R> dentistry.lca2.fitted <-fitted(dentistry.lca2) R> dentistry.lca2random.fitted <-fitted(dentistry.lca2random) R> dentistry.fitted <-merge(dentistry.lca2.fitted, + dentistry.lca2random.fitted, by = names(dentistry.lca2.fitted)[1:6]) R> names(dentistry.fitted)[6:8] <-c("Obs", "Exp 2LC", "Exp 2LCR") R> print(dentistry.fitted, row.names = FALSE)  It can be seen how the random effect model more accurately models the data, with fitted values closer to the observed data.

Symptoms example
This comprises data on the presence or absence of respiratory and allergy symptoms in the Childhood Asthma Prevention Study (CAPS; Mihrshahi, Peat, Webb, Tovey, Marks, Mellis, and Leeder 2001) and was used as the example in Beath and Heller (2009). The symptoms of night cough, wheeze, itchy rash and flexural dermatitis since the previous visit were recorded at one month, then quarterly for the first year and then twice yearly until the age of two years. For analysis these are aggregated for each six month period to avoid numerical problems associated with very small probabilities. The aim of the analysis is to identify the number of classes of subjects based on their respiratory and allergy symptoms combined. As well as allowing for the classes to be defined by different levels of the symptoms it will also allow for changes over time.
The first model fitted is a standard latent class, to allow for no subject or period effect.
R> symptoms.lca2 <-randomLCA(symptoms[, 1:16], freq = symptoms$Freq, + nclass = 2) A variation of the random effect latent class model can be fitted allowing the loadings (b cj parameters) for the outcomes to be repeated, that is wheeze at the different time points, for example, will always have the same loading, using the blocksize argument. This is equivalent to the 2 level model with the time-dependent random variable having a variance of zero, that is no time-dependent effect. The outcomes have been set to have non-constant loading, although in practice models for constant loading would also be fitted.
R> symptoms.lca2random <-randomLCA(symptoms[, 1:16], freq = symptoms$Freq, + random = TRUE, nclass = 2, blocksize = 4, constload = FALSE) The two level models are specified through the level2 argument and the number of outcomes at each time through the level2size argument. For these models the penalty is increased to 0.1 to reduce the execution time, but for a two class model this will be about an hour and for a three class model about two hours.
R> symptoms.lca2random2 <-randomLCA(symptoms[, 1:16], freq = symptoms$Freq, + random = TRUE, level2 = TRUE, nclass = 2, level2size = 4, + constload = FALSE, penalty = 0.1) Repeating for up to five classes or when the BIC increases gives the following results. It should be noted that the two level models can take considerable time to fit. This is due to the relatively large number of quadrature points required for this data, as a consequence of a large number of patterns consisting entirely of zeroes. The marginal outcome probabilities are plotted in Figure 5 as follows.

Summary
It has been shown how the randomLCA package may be used to determine classes of subjects based on observed binary data, and how this may be used to determine sensitivity and specificity for diagnostic tests. In the dentistry and symptoms examples an assumption of heterogeneous classes, where the outcome probabilities are allowed to vary within a class was shown to be an improvement over traditional latent class analysis. The randomLCA package also produces a range of plots for describing the classes, allows for bootstrapped standard errors, calculation of marginal outcome probabilities when using random effect models and the use of penalized likelihoods. A further extension to randomLCA would be the extension to latent class regression models (Dayton and Macready 1988), where the class probabilities are determined by the covariates. Another extension is to allow for ordinal data using a graded response model (Samejima 1969). A further possible extension to the package is to allow for other data types. However for the random effect models it is difficult to extend the models except for ordinal data.