Least-Squares Means: The R Package lsmeans

Least-squares means are predictions from a linear model, or averages thereof. They are useful in the analysis of experimental data for summarizing the effects of factors, and for testing linear contrasts among predictions. The lsmeans package (Lenth 2016) provides a simple way of obtaining least-squares means and contrasts thereof. It supports many models fitted by R (R Core Team 2015) core packages (as well as a few key contributed ones) that fit linear or mixed models, and provides a simple way of extending it to cover more model classes.


Introduction
Least-squares means (LS means for short) for a linear model are simply predictions, or averages thereof, over a regular grid of predictor settings which I call the reference grid. They date back at least to Harvey (1960) and his associated computer program LSML (Harvey 1977) and the contributed SAS procedure named HARVEY (Harvey 1976). Later, they were incorporated via LSMEANS statements for various linear model procedures such as GLM in the regular SAS releases. See also Goodnight and Harvey (1997) and SAS Institute Inc. (2012) for more information about the SAS implementation.
In simple analysis-of-covariance models, LS means are the same as covariate-adjusted means (Oehlert 2000, p. 456, for example). In unbalanced factorial experiments, LS means for each factor mimic the main-effects means but are adjusted for imbalance. The latter interpretation is quite similar to the "unweighted means" method for unbalanced data (Milliken and Johnson 1992, Section 11.2).
LS means are not always well understood, in part because the term itself is confusing. The most important things to remember are:
• LS means are computed relative to a reference grid.
• Once the reference grid is established, LS means are simply predictions on this grid, or marginal averages of a table of these predictions.
A user who understands these points will know what is being computed, and thus can judge whether or not LS means are appropriate for the analysis.

The reference grid
Since the reference grid is fundamental, it is our starting point. For each predictor in the model, we define a set of one or more reference levels. The reference grid is then the set of all combinations of reference levels. If not specified explicitly, the default reference levels are obtained as follows:
• For each predictor that is a factor, its reference levels are the unique levels of that factor.
• Each numeric predictor has just one reference level: its mean over the dataset.
So the reference grid depends on both the model and the dataset.
Example: Orange sales. To illustrate, consider the oranges data provided with lsmeans. This dataset has sales of two varieties of oranges (response variables sales1 and sales2) at 6 stores (factor store), over a period of 6 days (factor day). The prices of the oranges (covariates price1 and price2) fluctuate in the different stores and the different days. There is just one observation on each store on each day.
For starters, let's consider an additive covariance model for sales of the first variety, with the two factors and both price1 and price2 as covariates (since the price of the other variety could also affect sales).
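Such a fit might look like the following (a sketch; the object name oranges.lm1 follows its later use in this article):

R> library("lsmeans")
R> oranges.lm1 <- lm(sales1 ~ price1 + price2 + day + store, data = oranges)
R> ref.grid(oranges.lm1)

The ref.grid call displays the reference grid implied by this model and dataset.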

As outlined above, the two covariates price1 and price2 have their means as their sole reference level; and the two factors have their levels as reference levels. The reference grid thus consists of the 1 × 1 × 6 × 6 = 36 combinations of these reference levels. LS means are based on predictions on this reference grid, which we can obtain using predict or summary. These results, as indicated in the annotation in the output, are in fact the averages of the predictions shown earlier, for each day, over the 6 stores. The above LS means are not the same as the overall means for each day:

R> with(oranges, tapply(sales1, day, mean))
      1       2        3       4        5        6
7.87275 7.10060 13.75860 8.04247 12.92460 11.60365

These unadjusted means are not comparable with one another because they are affected by the differing price1 and price2 values on each day, whereas the LS means are comparable because they use predictions at uniform price1 and price2 values.
Note that one may call lsmeans with either the reference grid or the model. If the model is given, then the first thing it does is create the reference grid; so if the reference grid is already available, as in this example, it's more efficient to make use of it.
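For instance, continuing the example (a sketch, assuming the model fit oranges.lm1 and its reference grid):

R> oranges.rg1 <- ref.grid(oranges.lm1)
R> lsmeans(oranges.rg1, "day")   # reuses the existing grid
R> lsmeans(oranges.lm1, "day")   # recreates the grid first, then computes the same LS means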

Altering the reference grid
The at argument may be used to override defaults in defining the reference grid. The user may specify this argument either in a ref.grid call or an lsmeans call; and should specify a list with named sets of reference levels. Here is a silly example:

R> lsmeans(oranges.lm1, "day", at = list(price1 = 50, price2 = c(40, 60),
+     day = c("2", "3", "4")))

Here, we restricted the results to three of the days, and used different prices. One possible surprise is that the predictions are averaged over the two price2 values. That is because price2 is no longer a single reference level, and we average over the levels of all factors not used to split out the LS means. This is probably not what we want. To get separate sets of predictions for each price2, one must specify it as another factor or as a by factor in the lsmeans call (we will save the result for later discussion):

R> org.lsm <- lsmeans(oranges.lm1, "day", by = "price2",
+     at = list(price1 = 50, price2 = c(40, 60), day = c("2", "3", "4")))
R> org.lsm

Working with the results
The ref.grid function produces an object of class "ref.grid", and the lsmeans function produces an object of class "lsmobj", which is a subclass of "ref.grid". There is really no practical difference between these two classes except for their show methods (what is displayed by default) and the fact that an "lsmobj" is not (necessarily) a true reference grid as defined earlier in this article. Let's use the str function to examine the "lsmobj" object just produced:

R> str(org.lsm)
lsmobj object with variables:
    day = 2, 3, 4
    price2 = 40, 60

We no longer see the reference levels for all predictors in the model, only the levels of day and price2. These act like reference levels, but they do not define the reference grid upon which the predictions are based.
There are several methods for "ref.grid" (and hence also for "lsmobj") objects. One already seen is summary. It has a number of arguments; see its help page. In the following call, we summarize org.lsm differently than before. We will also save the object produced by summary for further discussion.
R> ( org.sum <- summary(org.lsm, infer = c(TRUE, TRUE), null = 7.50,
+     level = 0.90, adjust = "bon", by = "day") )

The infer argument causes both confidence intervals and tests to be produced; the default confidence level of 0.95 was overridden; and the default null hypothesis that a mean is zero is overridden. A Bonferroni adjustment was applied to both the intervals and the p values (which causes some of the latter to hit their maximum of 1.0). Because of the by specification, the tables are organized the opposite way from what we saw before.
What kind of object was produced by summary? Let's see:

R> class(org.sum)
[1] "summary.ref.grid" "data.frame"

The "summary.ref.grid" class is an extension of "data.frame". It includes some attributes that, among other things, cause additional messages to appear when the object is displayed. But it can also be used as a "data.frame" if the user just wants to use the results computationally. Observe also that the summary is just one data frame with six rows, rather than a collection of three data frames; and it contains a column for all reference variables, including any by variables.
Besides str and summary, there is also a confint method, which is the same as summary with infer = c(TRUE, FALSE), and a test method (same as summary with infer = c(FALSE, TRUE)). There is also an update method which may be used for changing the object's display settings.
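As a sketch of these equivalences (reusing the org.lsm object from above):

R> confint(org.lsm)             # same as summary(org.lsm, infer = c(TRUE, FALSE))
R> test(org.lsm, null = 7.50)   # same as summary(org.lsm, infer = c(FALSE, TRUE), null = 7.50)
R> update(org.lsm, level = 0.99)  # change the default confidence level used in displays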

Contrasts in general
Often, people want to do pairwise comparisons of LS means, or compute other contrasts among them. This is the purpose of the contrast function, which uses a "ref.grid" or "lsmobj" object as input. There are several standard contrast families such as "pairwise", "trt.vs.ctrl", and "poly". In the following command, we request "eff" contrasts, which are differences between each mean and the overall mean:

R> contrast(org.lsm, method = "eff")
Results are averaged over the levels of: store
P value adjustment: fdr method for 3 tests

Note that this preserves the by specification from before, and obtains the effects for each group. In this example, since it is an additive model, we obtain exactly the same results in each group. This isn't wrong; it's just redundant.
Note that by default, lsmeans results are displayed with confidence intervals while contrast results are displayed with t tests. One can easily override this; for example,

R> confint(contrast(days.lsm, "trt.vs.ctrlk"))

(Results not shown.) In the above examples, a default multiplicity adjustment is determined from the contrast method. This may be overridden by adding an adjust argument.

Pairwise comparisons
Often, users want pairwise comparisons among the LS means. These may be obtained by specifying "pairwise" or "revpairwise" as the method argument in the call to contrast. For group labels A, B, C, "pairwise" generates the comparisons A - B, A - C, B - C, while "revpairwise" generates B - A, C - A, C - B. As a convenience, a pairs method is provided that calls contrast with method = "pairwise" (or optionally, "revpairwise"):

R> pairs(org.lsm, reverse = TRUE)
Results are averaged over the levels of: store
Confidence level used: 0.95
P value adjustment: tukey method for comparing a family of 6 estimates
significance level used: alpha = 0.1

Two LS means that share one or more of the same grouping symbols are not significantly different at the stated value of alpha, after applying the multiplicity adjustment (in this case Tukey's HSD). By default, the LS means are ordered in this display, but this may be overridden with the argument sort = FALSE. cld returns a "summary.ref.grid" object, not an "lsmobj".

Two-sided formulas
In its original design, the only way to obtain contrasts and comparisons in lsmeans was to specify a two-sided formula, e.g., pairwise ~ treatment, in the lsmeans call. The result is then a list of "lsmobj" objects. In its newer versions, lsmeans offers a richer family of objects that can be re-used, and dealing with a list of objects can be awkward or confusing, so continued use of two-sided formulas is not encouraged. Nonetheless, they are still available for backward compatibility.
Here is an example where, with one command, we obtain both the LS means and polynomial contrasts for day, averaged over price2, for the object org.lsm:
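Such a call might look like the following (a sketch; the at settings are assumed to match those used for org.lsm earlier):

R> lsmeans(oranges.lm1, poly ~ day | price2,
+     at = list(price1 = 50, price2 = c(40, 60), day = c("2", "3", "4")))

The left-hand side of the formula names the contrast family, and the right-hand side plays the role of the specs and by arguments.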

Additional examples
While estimates and pairwise comparisons are the fundamental things we need with LS means, other descriptions and analyses are helpful. This section introduces some more data examples and discusses lsmeans's capabilities for displaying results graphically, dealing with transformed responses (or link functions in generalized linear models), estimating trends when covariates interact with factors, and dealing with messy data.
Example: Oat yields. The Oats dataset in the nlme package (Pinheiro, Bates, and R Core Team 2015) has the results of a split-plot experiment discussed in Yates (1935). The experiment was conducted on six blocks (factor Block). Each block was divided into three plots, which were randomized to three varieties (factor Variety) of oats. Each plot was divided into subplots and randomized to four levels of nitrogen (variable nitro). The response, yield, was measured once on each subplot after a suitable growing period.
We will fit a model using the lmer function in the lme4 package (Bates, Mächler, Bolker, and Walker 2015). This will be a mixed model with random intercepts for Block and Block:Variety (which identifies the plots). A logarithmic transformation is applied to the response variable (mostly for illustration purposes, though it does produce a good fit to the data). Note that nitro is stored as a numeric variable, but we want to consider it as a factor in this initial model.

R> data("Oats", package = "nlme")
R> library("lme4")
R> Oats.lmer <- lmer(log(yield) ~ Variety * factor(nitro)
+     + (1 | Block / Variety), data = Oats)
R> anova(Oats.lmer)

Apparently, the interaction is not needed. But perhaps we can further simplify the model by using only a linear or quadratic trend in nitro. We can find out by looking at polynomial contrasts:

R> contrast(lsmeans(Oats.lmer, "nitro"), "poly")

These LS means follow the same quadratic trend for each variety, but with different intercepts.
Fractional degrees of freedom are displayed in these results. These are obtained from the pbkrtest package, and they use the Kenward-Roger method. (The degrees of freedom for the polynomial contrasts were also obtained from pbkrtest, but the results turn out to be integers.) Incidentally, the polynomial model above could have been specified using nitro + I(nitro^2) instead of poly(nitro, 2). What is not recommended is to use nitro + nitrosq, where nitrosq is a previously calculated variable equal to nitro^2. The reason is that with cov.reduce = TRUE, lsmeans will use separate covariate means for nitro and nitrosq, and the latter will probably not be equal to mean(nitro)^2 as would be required for consistency.
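The reduced model and its LS means might be obtained as follows (a sketch; the object names Oats.lmer2 and Oats.lsm2, used later in this article, are taken from context, and cov.reduce = FALSE keeps all four nitro values in the reference grid rather than reducing nitro to its mean):

R> Oats.lmer2 <- lmer(log(yield) ~ Variety + poly(nitro, 2)
+     + (1 | Block / Variety), data = Oats)
R> Oats.lsm2 <- lsmeans(Oats.lmer2, ~ nitro | Variety, cov.reduce = FALSE)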

Displaying LS means graphically
There is a plot method for "ref.grid" and "lsmobj" objects that displays side-by-side confidence intervals, and/or "comparison arrows" that may be used to graphically judge which LS means differ significantly from one another based on whether or not they overlap. In the command below, we request both intervals and comparison arrows. By default, the comparisons will be done using a 0.05 Tukey-adjusted significance level. However, we use int.adjust to request that the confidence intervals (defaulting to the 95% level) be unadjusted.
R> plot(Oats.lsm2, intervals = TRUE, int.adjust = "none", comparisons = TRUE)

The results are shown in Figure 1. Owing to the fact that the model is additive, the comparisons are the same within each Variety. The overlapping arrows for nitro of 0.4 and 0.6 indicate a nonsignificant difference. The remaining differences are quite significant. Note that it is a mistake to try to use confidence intervals to judge comparisons. In this example, the standard errors of comparisons are much smaller than those of the LS means, because the between-block and between-plot variations cancel out in the comparisons.
In certain other situations where the standard errors of differences vary widely, the comparison-arrow display may not work. In that case, a warning message is displayed.
The lsmeans package also includes a function lsmip that displays predictions in an interaction-plot-like manner. It uses a formula of the form

curve.factors ~ x.factors | by.factors

The function requires the lattice package (Sarkar 2008) to be installed. In this formula, curve.factors specifies factor(s) used to delineate one displayed curve from another (i.e., groups in lattice's parlance). x.factors are those whose levels are plotted on the horizontal axis. And by.factors, if present, break the plots into panels.
To illustrate, let's do a graphical comparison of the two models we have fitted to the Oats data.
R> lsmip(Oats.lmer, Variety ~ nitro, ylab = "Observed log(yield)")
R> lsmip(Oats.lsm2, Variety ~ nitro, ylab = "Predicted log(yield)")

The plots are shown in Figure 2. Note that the first model fits the cell means perfectly, so its plot is truly an interaction plot of the data. The other displays the parabolic trends we fitted in the revised model.

Transformations
When a transformation or link function is used in fitting a model, ref.grid (also called by lsmeans) stores that information in the returned object. This allows us to conveniently unravel the transformation, via the type argument in summary or related functions such as lsmip and predict. Here are the predicted yields (as opposed to predicted log yields) for the polynomial model:

R> summary(Oats.lsm2, type = "response")

It is important to realize that the statistical inferences are all done before reversing the transformation. Thus, t ratios are based on the linear predictors and will differ from those computed using the printed estimates and standard errors. Likewise, confidence intervals are computed on the linear-predictor scale, then the endpoints are back-transformed.
This kind of automatic support for transformations is available only for certain standard transformations, namely those supported by the make.link function in the stats package. Others require more work -see the documentation for update for details.

Trends
The lstrends function is provided for estimating and comparing the slopes of fitted lines (or curves). To illustrate, consider the built-in R dataset ChickWeight which has data on the growths of newly hatched chicks under four different diets. The following code produces the display in Figure 3.
R> library(lattice)
R> xyplot(weight ~ Time | Diet, groups = ~ Chick, data = ChickWeight,
+     type = "o", layout = c(4, 1))

Let us fit a model to these data using random slopes for each chick and allowing for a different average slope for each diet. We can then call lstrends to estimate and compare the average slopes for each diet.
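A sketch of that model and the lstrends call (the exact random-effects specification here is an assumption consistent with the description above):

R> Chick.lmer <- lmer(weight ~ Diet * Time + (0 + Time | Chick),
+     data = ChickWeight)
R> ( Chick.lst <- lstrends(Chick.lmer, ~ Diet, var = "Time") )
R> pairs(Chick.lst)

The var argument names the covariate whose slope is estimated for each Diet.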

Messy data
To illustrate some more lsmeans capabilities, consider the dataset named nutrition that is provided with the lsmeans package. These data come from Milliken and Johnson (1992), and contain the results of an observational study on nutrition education. Low-income mothers are classified by race, age category, and whether or not they received food stamps (the group factor); and the response variable is a gain score (post minus pre scores) after completing a nutrition training program.
Consider the model that includes all main effects and two-way interactions. A Type-II (hierarchical) analysis-of-variance table is also shown using the car package (Fox and Weisberg 2011).
R> nutr.lm <- lm(gain ~ (age + group + race)^2, data = nutrition)
R> library("car")
R> Anova(nutr.lm)

Figure 4: Predictions for the nutrition data.

Figure 4 shows the predictions from this model. One thing the output illustrates is that lsmeans incorporates an estimability check, and returns a missing value when a prediction cannot be made uniquely. In this example, we have very few Hispanic mothers in the dataset, resulting in empty cells. This creates a rank deficiency in the fitted model, and some predictors are thrown out.
We can avoid non-estimable cases by using at to restrict the reference levels to a smaller set. A useful summary of the results might be obtained by narrowing the scope of the reference levels to two races and the two middle age groups, where most of the data lie. However, always keep in mind that whenever we change the reference grid, we also change the definition of the LS means. Moreover, it may be more appropriate to average the two ages using weights proportional to their frequencies in the data set. With those ideas in mind, here are the LS means and comparisons within rows and columns:

R> nutr.lsm <- lsmeans(nutr.lm, ~ group * race, weights = "proportional",
+     at = list(age = c("2", "3"), race = c("Black", "White")))

Here are the results:

Results are averaged over the levels of: age

The general conclusion from these analyses is that for age groups 2 and 3, the expected gains from the training are higher among families receiving food stamps. Note that this analysis is somewhat different than the results we would obtain by subsetting the data before analysis, as we are borrowing information from the other observations in estimating and testing these LS means.
The weights argument is a convenience for certain standard weighting schemes. More general weighting schemes may be implemented by writing a custom function and using fac.reduce.

Interfacing with multcomp
The multcomp package (Hothorn, Bretz, and Westfall 2008) supports a greater variety of corrections for simultaneous testing than are available in lsmeans. Its glht (general linear hypothesis testing) function and associated "glht" class are similar in some ways to the lsmeans function and "lsmobj" objects, respectively. So we provide methods such as as.glht for working with glht. To illustrate, consider testing comparisons of consecutive means in days.lsm using the two packages:

R> ( days.consec <- contrast(days.lsm, "consec", adjust = "mvt") )

If we had specified test = adjusted("single-step"), we would have obtained exactly the same results as we did above with the "mvt" adjustment. But with the Westfall adjustment, the smallest adjusted P value is the same, but the others are smaller because of the stepwise nature of the adjustment method.
If you are already working primarily with glht, the lsmeans package provides an lsm function that can be used as an alternative to multcomp's mcp function, but with the flexibility and context of an lsmeans call:

R> Oats.glht <- glht(Oats.lmer, lsm(~ nitro | Variety))
If there is a by variable in effect, as in this example, glht or as.glht returns a list of glht objects -one for each by level.
R> names(Oats.glht)
[1] "Variety = Golden Rain" "Variety = Marvellous"  "Variety = Victory"

There is a courtesy summary method for this "glht.list" class to make things a bit more user-friendly. Recall the earlier example result org.lsm, which contains information for LS means for three days at each of two values of price2. Suppose we are interested in pairwise comparisons of these LS means, by price2. If we call

R> summary(as.glht(pairs(org.lsm)))

(results not displayed) we will obtain two glht objects with three contrasts each, so that the results shown will incorporate multiplicity adjustments for each family of three contrasts. If, on the other hand, we want to consider those six contrasts as one family, use

R> summary(as.glht(pairs(org.lsm), by = NULL))

. . . and note (look carefully at the parentheses) that this is not the same as

R> summary(as.glht(pairs(org.lsm, by = NULL)))

which removes the by grouping before the pairwise comparisons are generated, thus yielding (6 choose 2) = 15 contrasts instead of just six. Finally, we note that as.glht may be used as a back-door way to get glht to work with a model supported by lsmeans, but not multcomp.
Extended capabilities for some models are available by way of optional arguments passed via ref.grid or lsmeans. Details may be found using help("models", package = "lsmeans"). In this section, we illustrate some examples of these extensions.

Multivariate models
The oranges data has two response variables. Let's try a multivariate model for predicting the sales of the two varieties of oranges, and see what we get if we call ref.grid:

R> oranges.mlm <- lm(cbind(sales1, sales2) ~ price1 + price2 + day + store,
+     data = oranges)
R> ref.grid(oranges.mlm)
ref.grid object with variables:
    price1 = 51.222
    price2 = 48.556
    day = 1, 2, 3, 4, 5, 6
    store = 1, 2, 3, 4, 5, 6
    rep.meas = multivariate response levels: sales1, sales2

What happens is that the multivariate response is treated like an additional factor, by default named rep.meas. In turn, it can be used to specify levels for LS means. Here we rename the multivariate response to "variety" and obtain day means (and a compact letter display for comparisons thereof) for each variety:

R> org.mlsm <- lsmeans(oranges.mlm, ~ day | variety, mult.name = "variety")
R> cld(org.mlsm, sort = FALSE)
Results are averaged over the levels of: store
Confidence level used: 0.95
P value adjustment: tukey method for comparing a family of 6 estimates
significance level used: alpha = 0.05

Suppose that we want to compare the two varieties on each day using this model; and in turn, compare these resulting differences with one another. This is a contrast of contrasts.
To start, let's obtain the daily differences:

R> org.vardiff <- update(pairs(org.mlsm, by = "day"), by = NULL)

The results (not yet shown) will comprise the six sales1 - sales2 differences, one for each day. It seems odd to have two by specifications, but the one in pairs specifies doing a separate comparison for each day, and the one in update asks that we convert it to one table with six rows, rather than six tables with one row each. Now, let's compare these differences to see if they vary from day to day.

Results are averaged over the levels of: store
P value adjustment: tukey method for comparing a family of 6 estimates
significance level used: alpha = 0.05

There is little evidence of variety differences, nor that these differences vary from day to day. This is all made possible by the fact that lsmeans, contrast, and relatives all produce the same class of object, so we can compound the results as needed to obtain all types of interaction contrasts.
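One way to carry out that day-to-day comparison (a sketch) is a compact letter display of the six differences, which applies a Tukey adjustment across the family:

R> cld(org.vardiff)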

Proportional-odds example
The housing data in the MASS package (Venables and Ripley 2002) is an example where the response variable is an ordinal variable Sat (satisfaction) with a three-point scale of low, medium, high. The predictors include Type (type of rental unit, four levels), Infl (influence on management of the unit, three levels), and Cont (contact with other residents, two levels).
Ordinal-data models may be fitted using the polr function in MASS, or, with more modeling options, clm or clmm in the ordinal package. Here, we fit a second-order model, using the default logit link function.
R> library("MASS")
R> housing.plr <- polr(Sat ~ (Infl + Type + Cont)^2, data = housing,
+     weights = Freq)

One way to view an ordinal-response model is in terms of a latent variable S having a particular type of probability distribution (in this case, logistic, as implied by the logit link) that depends linearly on the predictors. The model posits that the observed ordinal responses comprise a categorization of S into classes, based on fixed thresholds that define the "low," "medium," and "high" levels. By default, lsmeans produces estimates of the mean of S for each factor combination. (The location and scale of S are not identifiable from the data, so by default it is scaled to have a marginal mean of zero and scale parameter of 1.) The command below produces the display in Figure 5 of the latent-variable predictions from this model.
R> lsmip(housing.plr, Infl ~ Cont | Type, ylab = "Latent mean",
+     layout = c(4, 1))

Some interesting interactions are in evidence, e.g., between Type and Cont, and this warrants looking at various combinations of the factors. But in the interest of compactness, the remaining illustrations ignore these interactions and focus only on the effects of Infl on satisfaction. From these results, it seems clear that the latent mean satisfaction increases with influence.
Support for ordinal models in lsmeans also allows for an optional mode argument depending on what you want to summarize. The default, as demonstrated above, is mode = "latent". Using mode = "linear.predictor" produces results comparable to those for logistic regression, but for each boundary between ordinal levels. Accordingly, an extra factor named "cut" is created to identify which boundary is being referenced. In the code below, we summarize the linear-predictor means and pairwise differences thereof, transformed back to the response scale.
R> housing.lsm <- lsmeans(housing.plr, ~ Infl | cut, mode = "lin")
R> summary(housing.lsm, type = "response")

With type = "response", the logits are transformed to cumulative probabilities. Noting that a low cumulative probability at the low cut means a low number of people are dissatisfied, we again find that those having more influence tend to be more satisfied. Note also that the pairwise comparisons transform to odds ratios on the "response" scale. Only the first three rows of the comparisons table are shown because the comparisons for cut = Medium|High will be identical.
Other modes available for ordinal-response models include "cum.prob", "prob", and "mean.class". Results from "cum.prob" mode will be similar to those above for the response scale -but not identical because the averaging and statistical tests are performed after applying the reverse-logit transformation. Also, the results of pairs will be differences rather than odds ratios.
In "prob" mode, the grid will include the levels of the response variable itself, and estimation is done of the probabilities of each ordinal level:

R> lsmeans(housing.plr, ~ Sat | Infl, mode = "prob")

These results are comparable to those for mode = "latent" (not shown). Both suggest satisfaction scores below center, slightly above center, and considerably above center, respectively.

MCMC samplers
The lsmeans package provides special support for Bayesian models that are fitted via Markov chain Monte Carlo (MCMC) methods. A slot exists in the fitted object which holds the sample from the posterior distribution of the fixed-effects coefficients. Since LS means depend on these regression coefficients, we can calculate a posterior sample of LS means (or contrasts thereof) simply by performing the needed calculations with each posterior realization of the coefficients. The as.mcmc function returns this posterior sample with the R class "mcmc" (defined in the coda package), which then may be examined using any of the methods available for that class.
The ordinary summary of such results is based on frequentist estimates of the mean and covariance matrix of the sample. For example, here is a summary of consecutive differences of LS means:

R> contrast(Oats.mclsm, "consec")

The lsm.options function may be used to set such things as default confidence and significance levels, disable using pbkrtest for obtaining degrees of freedom, and default display options for objects produced by lsmeans and contrast.
In the summary or test method, one may provide null and side specifications to specify null hypotheses or obtain one-sided tests; or delta to set a threshold value for a test of equivalence (if two-sided) or noninferiority/nonsuperiority (if one-sided). Also, the test method has an optional joint argument that, when TRUE, will output joint F statistics (or χ 2 statistics when degrees of freedom are unavailable) for the collection of linear functions.
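For example (a sketch, using the consecutive-difference contrasts computed earlier):

R> test(days.consec, joint = TRUE)   # joint F test of the three contrasts
R> test(days.consec, side = ">", delta = 0.5)   # one-sided tests against a threshold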
In constructing a reference grid, the user may specify a formula in cov.reduce. For example, suppose that conc is a covariate and treat is a factor, but you believe that conc is affected by treat. Then ordinary adjusted means are misleading because it is unreasonable to hold conc constant while varying treatment. Instead, one may specify cov.reduce = conc ~ treat, which causes a linear model to be fitted to this formula within the original dataset, then to use it to predict conc at each point in the reference grid. An example is given in the package vignette "using-lsmeans".
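As a sketch (the model object my.model is hypothetical):

R> ref.grid(my.model, cov.reduce = conc ~ treat)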
A function lsmobj is provided that will construct an "lsmobj" object given the factor levels, linear functions, coefficients, covariance matrix, and degrees of freedom. This can be handy for accessing lsmeans's capabilities for a model that is not supported, provided these statistics are available, or even for using summary statistics from a textbook or article.
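A minimal sketch (all numbers and names here are invented for illustration; see help("lsmobj") for the exact argument requirements):

R> my.lsm <- lsmobj(bhat = c(3.2, 4.1, 3.8), V = 0.05 * diag(3),
+     levels = list(treat = c("A", "B", "C")), linfct = diag(3), df = 27)
R> pairs(my.lsm)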
There are a few other lsmeans features too new to be included in any depth in this article:
• An rbind method for "ref.grid" objects that makes it easy to combine two or more (conforming) objects; for example,

R> rbind(pairs(lsmeans(Oats.lmer, "Variety")),
+     pairs(lsmeans(Oats.lmer, "nitro")))

combines all 3 + 6 = 9 main-effects comparisons into one family. By default, the summary will apply the "mvt" multiplicity adjustment to the family of tests.
There is also a new [.ref.grid method for selecting a subset of the estimates.
• A configurable limit whereby the degrees-of-freedom and adjusted-covariance computations from pbkrtest are bypassed when the dataset is too large, in order to avoid undue computation time.
• For those who prefer the term "predicted marginal means," wrappers such as pmmeans, pmtrends, and pmmip are provided. They call the corresponding ls... functions and relabel any results that start with "ls."
• The code for checking estimability has been moved to a separate estimability package (Lenth 2015), which is available from CRAN. This was done to make it easy for other package developers to incorporate estimability checks in their own code, e.g., in predict methods for new data.
• The contrast function has a new interaction option for constructing interaction contrasts.

Extending to more models
The functions ref.grid and lsmeans work by first reconstructing the dataset (so that the reference grid can be identified) and extracting needed information about the model, such as the regression coefficients, covariance matrix, and the linear functions associated with each point in the reference grid. For a fitted model of class, say, "modelobj", these tasks are accomplished by defining S3 methods recover.data.modelobj and lsm.basis.modelobj. The functions nonest.basis and is.estble (recently moved to a separate estimability package) are helpful in providing for estimability checking, and regrid may be used when it is necessary to reparameterize a reference grid. The help page "extending-lsmeans" and the vignette by the same name provide details and examples.
Developers of packages that fit models are encouraged to include support for lsmeans by incorporating (and exporting) recover.data and lsm.basis methods for their model classes.
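As a schematic sketch of what such support looks like (the class name "mymod" and its internal structure are hypothetical; the actual requirements are spelled out in the "extending-lsmeans" vignette):

```r
## Schematic sketch of lsmeans support for a hypothetical "mymod" class.
## See the "extending-lsmeans" vignette for the actual requirements.

## Recover the dataset from the model's call
recover.data.mymod <- function(object, ...) {
    fcall <- object$call
    recover.data(fcall, delete.response(terms(object)),
                 object$na.action, ...)
}

## Supply the linear functions and statistics for the reference grid
lsm.basis.mymod <- function(object, trms, xlev, grid, ...) {
    m <- model.frame(trms, grid, na.action = na.pass, xlev = xlev)
    X <- model.matrix(trms, m, contrasts.arg = object$contrasts)
    nbasis <- estimability::all.estble   # assume everything is estimable
    dffun <- function(k, dfargs) dfargs$df  # same d.f. for every estimate
    list(X = X, bhat = coef(object), nbasis = nbasis, V = vcov(object),
         dffun = dffun, dfargs = list(df = object$df.residual))
}
```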

Comparisons with other software
A unique feature of lsmeans is its explicit reliance on the concept of a reference grid, which I feel is a useful approach for understanding what is being computed. The use of a reference grid has the effect of slightly extending the concepts as discussed in Searle, Speed, and Milliken (1980), in that covariates can have more than one value in the grid, and predictions may then be averaged over those different covariate values. Often, this would be inappropriate; but it does make sense in our Oats example, where we compared LS means from the model with nitro as a factor with those from the model with nitro as a covariate.
The design goal of lsmeans is primarily to provide the functionality of the LSMEANS statement in various SAS procedures. Thus its emphasis is on tabular results which, of course, may also be used as data for further analysis or graphics. By design, it can be extended with relative ease to additional model classes.
Some lsmeans capabilities exceed those of SAS, including the lstrends capability, more flexibility in organizing the output, and more built-in contrast families. In addition, SAS does not allow LS means for factor combinations when the model does not include the interaction of those factors, nor does it provide a way to create a grid of covariate values as at does.
There are a few other R packages that provide capabilities that overlap with those of lsmeans.
We have already mentioned effects in Section 5.4. The emphasis of effects is on graphical rather than tabular displays. For a model with continuous predictors, effects automatically generates a grid of values of those predictors for use in its graphs, whereas one would have to use an at specification to do the same in lsmeans. Thus, effects has special strengths for curve-fitting models such as splines. In contrast, lsmeans's strengths are more in the area of factorial models where one wants traditional summaries in the form of estimates, contrasts, and interaction plots.
The doBy package (Højsgaard, Halekoh, Robison-Cox, Wright, and Leidi 2014) provides an LSmeans function that has some of the capabilities of lsmeans, but it produces a data frame rather than a reusable object. In earlier versions of the package, this function was named popMeans. The package also has an LSmatrix function for obtaining the linear functions needed to compute LS means. The lmerTest package (Kuznetsova, Brockhoff, and Christensen 2015) offers a competing lsmeans function that claims to be based on doBy's popMatrix function, but its results do not seem to be consistent with it (as this is written). One of these lsmeans functions may mask the other, so users must be careful if they use both packages.