Research and Statistical Support

MODULE 9

Linear Mixed Effects Modeling

1. Mixed Effects Models

Mixed effects models refer to a variety of models which have as a key feature both fixed and random effects.

The distinction between fixed and random effects is a murky one. As pointed out by Gelman (2005), there are several, often conflicting, definitions of fixed effects as well as definitions of random effects. Gelman offers a fairly intuitive solution in the form of renaming fixed effects and random effects and providing his own clear definitions of each. “We define effects (or coefficients) in a multilevel model as constant if they are identical for all groups in a population and varying if they are allowed to differ from group to group” (Gelman, p. 21). Other ways of thinking about fixed and random effects, which may be useful but are not always consistent with one another or those given by Gelman above, are discussed in the next paragraph.

Fixed effects are ones in which the possible values of the variable are fixed. Random effects refer to variables in which the set of potential outcomes can change. Stated in terms of populations, fixed effects can be thought of as effects for which the population elements are fixed. Cases or individuals do not move into or out of the population. Random effects can be thought of as effects for which the population elements are changing or can change (i.e. random variable). Cases or individuals can and do move into and out of the population. Another way of thinking about the distinction between fixed and random effects is at the observation level. Fixed effects assume scores or observations are independent while random effects assume some type of relationship exists between some scores or observations. For instance, it can be said that gender is a fixed effect variable because we know all the values of that variable (male & female) and those values are independent of one another (mutually exclusive); and they (typically) do not change. A variable such as high school class has random effects because we can only sample some of the classes which exist; not to mention, students move into and out of those classes each year.

There are many types of random effects, such as repeated measures of the same individuals; where the scores at each time of measure constitute samples from the same participants among a virtually infinite (and possibly random) number of times of measure from those participants. Another example of a random effect can be seen in nested designs, where for example; achievement scores of students are nested within classes and those classes are nested within schools. That would be an example of a hierarchical design structure with a random effect for scores nested within classes and a second random effect for classes nested within schools. The nested data structure assumes a relationship among groups such that members of a class are thought to be similar to others in their class in such a way as to distinguish them from members of other classes and members of a school are thought to be similar to others in their school in such a way as to distinguish them from members of other schools. The example used below deals with a similar design which focuses on multiple fixed effects and a single nested random effect.

2. Linear Mixed Effects Models

Linear mixed effects models simply model the fixed and random effects as having a linear form. Similar to the General Linear Model, an outcome variable is contributed to by additive fixed and random effects (as well as an error term). Using the familiar notation, the linear mixed effect model takes the form:

yij = β1x1ij + β2x2ij … βnxnij + bi1z1ij + bi2z2ijbinznij + εij

where yij is the value of the outcome variable for a particular ij case, β1 through βn are the fixed effect coefficients (like regression coefficients), x1ij through xnij are the fixed effect variables (predictors) for observation j in group i (usually the first is reserved for the intercept/constant; x1ij = 1), bi1 through bin are the random effect coefficients which are assumed to be multivariate normally distributed, z1ij through znij are the random effect variables (predictors), and  εij is the error for case j in group i where each group’s error is assumed to be multivariate normally distributed.

3. Example Data

The example used for this tutorial is fictional data where the interval scaled outcome variable Extroversion (extro) is predicted by fixed effects for the interval scaled predictor Openness to new experiences (open), the interval scaled predictor Agreeableness (agree), the interval scaled predictor Social engagement (social), and the nominal scaled predictor Class (classRC); as well as the random (nested) effect of Class (classRC) within School (schoolRC) as well as the random effect of School (schoolRC). The data contains 1200 cases evenly distributed among 24 nested groups (4 classes within 6 schools). The data set is available here.

4. Running the Analysis

Begin by clicking on Analyze, Mixed Models, Linear...

The initial dialogue box is self-explanatory; but will not be used in this example so click the Continue button.

Next, we have the main Linear Mixed Models dialogue box. Here we specify the variables we want included in the model. Using the arrows; move extro to the Dependent Variable box, move classRC and schoolRC to the Factor(s) box, and move open, agree, and social to the Covariat(s) box. Then click on the Fixed... button to specify the fixed effects.

The fixed effects in a LINEAR mixed effects model are essentially the same as a traditional ordinary least squares linear regression. To specify the fixed effects, use the Add button to move open, agree, social, and classRC into the Model box. Notice we are not specifying any interaction terms for this model. Then click the Continue button.

Next, click on the Random... button to specify the random effects.

The first thing we need to do is click on the Build nested terms circle (marked with the top, centered red ellipse). Then, highlight / select the classRC factor and use the down arrow button (marked with the lower, left red ellipse) to move classRC into the Build Term box. Then click the (Within) button (marked with the lower, middle ellipse). Next, highlight / select the schoolRC factor and use the down arrow button again to move it inside the parentheses created by the (Within) button. Next, click the Add button (marked with a red ellipse inside a green ellipse) to move our nested term into the Model box. Next, click on the Build terms circle (marked with the green ellipse in the upper left). Then, highlight / select schoolRC factor and use the Add button (marked with the green ellipse around the red ellipse) to move schoolRC to the Model box. Next, click the Continue button at the bottom of the dialogue box.

Next, click on the Estimation... button.

Next, change the Maximum iterations from the default (100) to 150 (marked with the red soft rectangle). This step is not technically necessary, but it insures the estimated values match those produced in R using the lme4 package. Then, click the Continue button.

Next, click on the Statistics... button. While some of the options are not necessary (Case Processing Summary), I generally click all of them.

Next, click on the EM Means... button (Estimated Marginal Means). When the (OVERALL) factor is moved to the Display Means for box, the grand mean will be produced. The classRC factor is present (and moved to the Display Means for box) because it is the only factor (categorical variable) included in the model as a fixed effect. The other fixed effects are not categorical and thus do not appear here. Next, click the Continue button.

Next, click on the Save... button. It is generally a good idea to save the Predicted values. The Fixed Predicted Values will be predicted values based solely on the Fixed Effects part of the model; while the lower Predicted Values & Residuals Predicted values will be the whole model's predicted values. Next, click the Continue button.

Then click the Paste button. Your syntax should match what is below. The reason I recommend pasting the syntax is that it takes quite a few clicks to create one of these types of models and it is often the case that multiple models are run during a session and changing variables or options is simply easier in the syntax than pointing and clicking back through all the above steps.

Next, highlight / select all the text in the syntax and then click the green 'run' arrow (marked with the red ellipse).

Your output should be the same as what is below.

5. Interpreting the Output.

The Case Processing Summary (above) simply shows that the cases are balanced among the categories of the categorical variables and no cases were excluded.

The next, rather large table contains all the descriptive statistics (only the very top of the table is shown here; below).

The Model Dimension table (below) simply shows the model in terms of which variables (and their number of levels) are fixed and / or random effects and the number of parameters being estimated.

The next table displays fit indices. For each index; the lower the number, the better the model fits the data. Generally I use and recommend the Bayesian Information Criterion (BIC).

The next table contains the results of the Fixed Effects tests; here we see the intercept and the classRC variables appear to be the main contributors.

The next 5 tables do not offer much information and simply show each parameter function (only the first and part of the second tables of the five are shown below).

The next table "Estimates of Fixed Effects" (below) is very important and shows the parameter estimates for the Fixed Effects specified in the model. It should be clear, this table and its interpretation are exactly like one would expect from a traditional ordinary least squares linear regression. One thing to note is the way SPSS chooses the reference category for categorical variables. You may have noticed we have been using the classRC and schoolRC variables instead of the original class and school variables in the data set. The RC variables contain the same information as the original variables, they simply have been ReCoded or Reverse Coded so that the output here will match the output produced using the lme4 package in the R programming language. It is important to know that SPSS (and SAS) automatically choose the category with the highest numerical value (or the lowest alphabetical letter) as the reference category for categorical variables. All packages I have used in the R programming language choose the reference category in the more intuitive but opposite way. In the lme4 package (and others I've used) in R, the software automatically picks the lowest numerical value (or the earliest alphabetically letter) as the reference category for categorical variables. This has drastic implications for the intercept estimate and more troubling, the predicted values produced by a model. For example, if this same model is specified with the original variables (not reverse coded) then the Fixed Effects intercept term is 63.049612; so you can imagine how much different the predicted values would be in that model compared to this model where the intercept is 57.383879. Recall from multiple regression, the intercept is interpreted as the mean of the outcome (extro) when all the predictors have a value of zero. The predictor estimates (coefficients or slopes) are interpreted the same way as the coefficients from a traditional regression. For instance, a one unit increase in the predictor Openness to new experiences (open) corresponds to a 0.006130 increase in the outcome Extroversion (extro). Likewise, a one unit increase in the predictor Agreeableness (agree) corresponds to a 0.007736 decrease in the outcome Extroversion (extro). Furthermore, the categorical predictor classRC = 3 has a coefficient of 2.054798; which means, the mean Extroversion score of the third group of classRC (3) is 2.0547978 higher than the mean Extroversion score of the last group of classRC (4). ClassRC (4) was automatically coded as the reference category.

The next 2 tables simply show the correlation matrix and covariance matrix for the fixed effects estimates. We can see that multicollinearity is not an issue among the predictors because, their correlations (and covariances) are quite low (except of course, the categories of the classRC variable which as expected, are related).

Next, we have the Estimates of Covariance Parameters table (below); which are the parameter estimates for the Random Effects. These are variance estimates (with standard errors, Wald Z test statistics, significance values, and confidence intervals for the variance estimates). Recall the ubiquitous ANOVA summary table where we generally have a total variance estimate (sums of squares) at the bottom, then just above it we have a residual or within groups variance estimate (sums of squares) and then we have each treatment or between groups variance estimate (sums of squares). This table is very much like that, but the total is not displayed and the residual variance estimate is on top. So, we can quickly calculate the total variance estimate: 95.171929 + 2.883600 + .968368 = 99.0239 then we can create an R² type of effect size to gauge the importance of each random effect by dividing the effect's variance estimate by the total variance estimate to arrive at a proportion of variance explained or accounted for by each random effect. This is analogous to an Eta-squared (η²) in standard ANOVA or an R² in regression; it is sometimes referred to (in the linear mixed effects situation) as an Intraclass Correlation Coefficient (ICC, Bartko, 1976; Bliese, 2009). For example, we find that the nested effect of classRC within schoolRC is 2.883600 / 99.0239 = 0.02912024 or simply stated, that random nested effect only accounts for 2.9% of the variance of the random effects. However, the random effect for schoolRC alone accounts for 95.171929 / 99.0239 = 0.9611006 or 96% of the variance of the random effects. If none of the random effects account for a meaningful amount of variance in the random effects (i.e. if the residual variance is larger than the random effect variance estimates), then the random effects should be eliminated from the model and a standard General Linear Model (or Generalized Linear Model) should be fitted (i.e., a model with only the fixed effects). Notice, SPSS does not calculate the standard errors correctly and therefore, the confidence interval estimates and the results of the Wald Z test are NOT valid. The Wald Z test simply divides the estimate by its standard error to arrive at a Z-score to test for significance with the standard normal distribution of Z-scores. However, the standard errors do not match with the standard errors produced when using the lme4 package in the R programming language. The good news is that the variance estimates are correct (do match) and the proportion of variance estimates can be correctly computed and used as effect size measures.

The next two tables simply show the correlation and covariances for the random effect parameter estimates.

The next three tables in the output are the Random Effects Covariance Structure matrices. They are omitted here because they are particularly useless and redundant; because each table simply lists the parameter estimate for each random effect.

The last part of the output contains tables with the Estimated Marginal means (EM means) for the Grand Mean and ClassRC.

The Grand Mean contrast coefficients table and actual grand mean table (the overall mean of the outcome variable: extro).

The ClassRC variable's contrast coefficients table and mean extroversion (extro) for each group table.

As with most of the tutorials / pages within this site, this page should not be considered an exhaustive review of the topic covered and it should not be considered a substitute for a good textbook.

References / Resources

Akaike, H. (1974). A new look at the statistical model identification. I.E.E.E. Transactions on Automatic Control, AC 19, 716 – 723. Available at:

Bartko, J. J. (1976). On various intraclass correlation reliability coefficients. Psychological Bulletin, 83, 762-765.

Bates, D., & Maechler, M. (2010). Package ‘lme4’. Reference manual for the package, available at:

Bates, D. (2010). Linear mixed model implementation in lme4. Package lme4 vignette, available at:

Bates, D. (2010). Computational methods for mixed models. Package lme4 vignette, available at:

Bates, D. (2010). Penalized least squares versus generalized least squares representations of linear mixed models. Package lme4 vignette, available at:

Bliese, P. (2009). Multilevel modeling in R: A brief introduction to R, the multilevel package and the nlme package. Available at:

Draper, D. (1995). Inference and hierarchical modeling in the social sciences. Journal of Educational and Behavioral Statistics, 20(2), 115 - 147. Available at:

Fox, J. (2002). Linear mixed models: An appendix to “An R and S-PLUS companion to applied regression”. Available at:

Gelman, A. (2005). Analysis of variance -- why it is more important than ever. The Annals of Statistics, 33(1), 1 -- 53. Available at:

Hofmann, D. A., Griffin, M. A., & Gavin, M. B. (2000). The application of hierarchical linear modeling to organizational research. In K. J. Klein (Ed.), Multilevel theory, research, and methods in organizations: Foundations, extensions, and new directions (p. 467 - 511). San Francisco, CA: Jossey-Bass. Available at:

Raudenbush, S. W. (1995). Reexamining, reaffirming, and improving application of hierarchical models. Journal of Educational and Behavioral Statistics, 20(2), 210 - 220. Available at:

Raudenbush, S. W. (1993). Hierarchical linear models and experimental design. In L. Edwards (Ed.), Applied analysis of variance in behavioral science (p. 459 - 496). New York: Marcel Dekker. Available at:

Rogosa, D., & Saner, H. (1995). Longitudinal data analysis examples with random coefficient models. Journal of Educational and Behavioral Statistics, 20(2), 149 - 170. Available at:

Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461 – 464. Available at: