1. Mixed
Effects Models
Mixed effects models refer to a variety of models which
have as a key feature both fixed and random effects.
The distinction between fixed and random effects is a murky
one. As pointed out by
Gelman (2005), there are several, often conflicting, definitions of fixed
effects as well as definitions of random effects. Gelman offers a fairly
intuitive solution in the form of renaming fixed effects and random effects and
providing his own clear definitions of each. “We define effects (or
coefficients) in a multilevel model as constant if they are identical for
all groups in a population and varying if they are allowed to differ from
group to group” (Gelman, p. 21). Other ways of thinking about fixed and random
effects, which may be useful but are not always consistent with one another or
those given by Gelman above, are discussed in the next paragraph.
Fixed
effects are ones in which the possible values of the variable are fixed. Random
effects refer to variables in which the set of potential outcomes can change.
Stated in terms of populations, fixed
effects can be thought of as effects for which the population elements are
fixed. Cases or individuals do not move into or out of the population. Random
effects can be thought of as effects for which the population elements are
changing or can change (i.e. random variable). Cases or individuals can and do
move into and out of the population. Another way of thinking about the
distinction between fixed and random effects is at the observation level. Fixed
effects assume scores or observations are independent while random effects
assume some type of relationship exists between some scores or
observations. For instance, it can be said that gender is a fixed effect
variable because we know all the values of that variable (male & female) and
those values are independent of one another (mutually exclusive); and they
(typically) do not change. A variable such as high school class has random
effects because we can only sample some of the classes which exist; not to
mention, students move into and out of those classes each year.
There are many types of random effects, such as repeated
measures of the same individuals; where the scores at each time of measure
constitute samples from the same participants among a virtually infinite (and
possibly random) number of times of measure from those participants.
Another example of a random effect can be seen in nested designs, where for
example; achievement scores of students are nested within classes and those
classes are nested within schools. That would be an example of a hierarchical
design structure with a random effect for scores nested within classes and a
second random effect for classes nested within schools. The nested data
structure assumes a relationship among groups such that members of a class are
thought to be similar to others in their class in such a way as to distinguish
them from members of other classes and members of a school are thought to be
similar to others in their school in such a way as to distinguish them from
members of other schools. The example used below deals with a similar design
which focuses on multiple fixed effects and a single nested random effect.
2. Linear
Mixed Effects Models
Linear mixed effects models simply model the fixed and
random effects as having a linear form. Similar to the General Linear Model, an
outcome variable is contributed to by additive fixed and random effects (as well
as an error term). Using the familiar notation, the linear mixed effect model
takes the form:
y_{ij} = β_{1}x_{1ij}
+ β_{2}x_{2ij} … β_{n}x_{nij} +
b_{i1}z_{1ij} + b_{i2}z_{2ij} … b_{in}z_{nij}
+ ε_{ij}
where y_{ij} is the value of the outcome
variable for a particular ij case, β_{1} through β_{n} are the
fixed effect coefficients (like regression coefficients), x_{1ij}
through x_{nij} are the fixed effect variables (predictors) for
observation j in group i (usually the first is reserved for the
intercept/constant; x_{1ij} = 1), b_{i1} through
b_{in} are the random effect coefficients which are assumed to be
multivariate normally distributed, z_{1ij} through z_{nij} are
the random effect variables (predictors), and ε_{ij} is the error for
case j in group i where each group’s error is assumed to be multivariate
normally distributed.
3. Example
Data
The example used for this tutorial is fictional data where
the interval scaled outcome variable Extroversion (extro) is predicted by fixed
effects for the interval scaled predictor Openness to new experiences (open),
the interval scaled predictor Agreeableness (agree), the interval scaled
predictor Social engagement (social), and the nominal scaled predictor Class (classRC);
as well as the random (nested) effect of Class (classRC) within School (schoolRC)
as well as the random effect of School (schoolRC). The data contains 1200 cases
evenly distributed among 24 nested groups (4 classes within 6 schools). The data
set is available
here.
4. Running the Analysis
First, import the data (linked above) with the member name LMM_dataRC Then,
take a look at the data running a few common summary PROCs; starting with a PROC
MEANS for the whole data.
PROC MEANS
DATA=LMM_dataRC;
RUN;
Next, take a look
at the categorical variable frequencies.
PROC FREQ
DATA=LMM_dataRC;
TABLES classRC schoolRC;
RUN;
Now we can proceed
to fit the model. Pay particular attention to how the model is specified. The
first line of syntax simply tells SAS what procedure we are doing (mixed) and
specifies the data. The CLASS statement lists the classification variables
(categorical variables or factors). The MODEL statement specifies the dependent
variable and then the fixed effects variables. The RANDOM statement lists the
random effects variables; classRC nested within schoolRC as well as schoolRC
alone. Note, traditional interactions can be specified in the model as either
fixed or random effects, using the *. For example, if we wanted to include a
fixed effect interaction in our model for the interaction between open and
agree, we would add the following term to the MODEL statement: open*agree. Also
notice the /solution option is specified for both the fixed effects and random
effects. The solution option for the fixed effects, provides the beta
coefficient for each predictor (and associated standard errors, degrees of
freedom, tvalue, and pvalues). The solution option for the random effects,
provides the estimated mean values for each category of each random predictor
(and associated standard errors, degrees of freedom, tvalue, and pvalues). The
outpred option creates a new data file with the predicted values (based on the
model); which can often be useful or meaningful. PROC MIXED
DATA = LMM_dataRC;
CLASS class school;
MODEL extro = open agree social class/SOLUTION
OUTPRED=predicted;
RANDOM classRC(schoolRC) schoolRC/SOLUTION;
RUN; The output should be similar to
what is displayed below. The first page of which displays the Model Information,
the levels of each classification variable (categorical variables or factors),
the number of dimensions of the model, the number of observations, and the
iteration history.
The second page of the output starts with the Random Effects estimates. These
are variance estimates (with standard errors, Wald Z test statistics,
significance values, and confidence intervals for the variance estimates).
Recall the ubiquitous ANOVA summary table where we generally have a total
variance estimate (sums of squares) at the bottom, then just above it we have a
residual or within groups variance estimate (sums of squares) and then we have
each treatment or between groups variance estimate (sums of squares). This table
is very much like that, but the total is not displayed and the residual variance
estimate is on top. So, we can quickly calculate the total variance estimate:
95.1431 + 2.8831 + .9684 = 98.9946 then we can create an R²
type of effect size to gauge the importance of each random effect by dividing
the effect's variance estimate by the total variance estimate to arrive at a
proportion of variance explained or accounted for by each random effect. This is
analogous to an Etasquared (η²) in standard ANOVA or an R² in regression; it is
sometimes referred to (in the linear mixed effects situation) as an Intraclass
Correlation Coefficient (ICC, Bartko, 1976; Bliese, 2009). For
example, we find that the nested effect of classRC within schoolRC is 2.8831 /
98.9946 = 0.0291238 or simply stated, that random nested effect only accounts
for 2.9% of the variance of the random effects. However,
the random effect for schoolRC alone accounts for 95.1431 / 98.9946 = 0.9610938
or 96% of the variance of the random effects. If none of
the random effects account for a meaningful amount of variance of the random
effects (i.e. if residual is larger than the other variance estimates),
then the random effects should be eliminated from the model and a standard
General Linear Model (or Generalized Linear Model) should be fitted (i.e., a
model with only the fixed effects).
The next part of the second page contains the fit
indices; generally I use and recommend the Bayesian Information Criterion (BIC).
The next part of the output displays the Fixed Effects estimates.
It should be clear, this table and its interpretation are exactly like one would
expect from a traditional ordinary least squares linear regression. One thing to
note is the way SPSS chooses the reference category for categorical variables.
You may have noticed we have been using the classRC and schoolRC variables
instead of the original class and school variables in the data set. The RC
variables contain the same information as the original variables, they simply
have been ReCoded or Reverse Coded so that the output here will match the output
produced using the lme4 package in the R programming language. It is important
to know that SAS (and SPSS) automatically choose the category with the highest
numerical value (or the lowest alphabetical letter) as the reference category
for categorical variables. All packages I have used in the R programming
language choose the reference category in the more intuitive but opposite way.
In the lme4 package (and others I've used) in R, the software automatically
picks the lowest numerical value (or the earliest alphabetically letter) as the
reference category for categorical variables. This has drastic implications for
the intercept estimate and more troubling, the predicted values produced by a
model. For example, if this same model is specified with the original variables
(not reverse coded) then the Fixed Effects intercept term is 63.049612; so you
can imagine how much different the predicted values would be in that model
compared to this model where the intercept is 57.3839. Recall from multiple
regression, the intercept
is
interpreted as the mean of the outcome (extro) when all the predictors have a
value of zero. The predictor estimates (coefficients or slopes) are interpreted
the same way as the coefficients from a traditional regression. For instance, a
one unit increase in the predictor Openness to new experiences (open)
corresponds to a 0.006130 increase in the outcome Extroversion (extro).
Likewise, a one unit increase in the predictor Agreeableness (agree) corresponds
to a 0.00774 decrease in the outcome Extroversion (extro). Furthermore,
the categorical predictor classRC = 3 has a coefficient of 2.0548; which means,
the mean Extroversion score of the third group of classRC (3) is 2.0548 higher
than the mean Extroversion score of the last group of classRC (4). ClassRC (4)
was automatically coded as the reference category. The last part of the output
(which continues on to the next page) contains the Random Effects Estimates.
The next page of output continues displaying the Random Effects estimates and
also shows the tests of the Fixed Effects.
As with most of the tutorials / pages within
this site, this page should not be considered an exhaustive review of the topic
covered and it should not be considered a substitute for a good textbook.
References /
Resources
Akaike, H.
(1974). A new look at the statistical model identification.
I.E.E.E. Transactions on Automatic
Control, AC 19, 716 – 723. Available at:
http://www.unt.edu/rss/class/Jon/MiscDocs/Akaike_1974.pdf
Bartko, J.
J. (1976). On various intraclass correlation reliability coefficients.
Psychological Bulletin, 83, 762765.
http://www.unt.edu/rss/class/Jon/MiscDocs/Bartko_1976.pdf
Bates, D.,
& Maechler, M. (2010). Package ‘lme4’. Reference manual for the package,
available at:
http://cran.rproject.org/web/packages/lme4/lme4.pdf
Bates, D. (2010). Linear mixed model implementation in
lme4. Package lme4 vignette, available at:
http://cran.rproject.org/web/packages/lme4/vignettes/Implementation.pdf
Bates, D. (2010). Computational methods for mixed models.
Package lme4 vignette, available at:
http://cran.rproject.org/web/packages/lme4/vignettes/Theory.pdf
Bates, D. (2010). Penalized least squares versus
generalized least squares representations of linear mixed models. Package lme4
vignette, available at:
http://cran.rproject.org/web/packages/lme4/vignettes/PLSvGLS.pdf
Bliese, P. (2009). Multilevel modeling in R: A brief
introduction to R, the multilevel package and the nlme package. Available at:
http://cran.rproject.org/doc/contrib/Bliese_Multilevel.pdf
Draper, D. (1995). Inference and hierarchical modeling in
the social sciences. Journal of Educational and Behavioral Statistics, 20(2),
115  147. Available at:
http://www.unt.edu/rss/class/Jon/MiscDocs/Draper_1995.pdf
Fox, J.
(2002). Linear mixed models: An appendix to “An R and SPLUS companion to
applied regression”. Available at:
http://cran.rproject.org/doc/contrib/FoxCompanion/appendixmixedmodels.pdf
Gelman, A.
(2005). Analysis of variance  why it is more important than ever. The
Annals of Statistics, 33(1), 1  53. Available at:
http://www.unt.edu/rss/class/Jon/MiscDocs/Gelman_2005.pdf
Hofmann,
D. A., Griffin, M. A., & Gavin, M. B. (2000). The application of hierarchical
linear modeling to organizational research. In K. J. Klein (Ed.), Multilevel
theory, research, and methods in organizations: Foundations, extensions, and new
directions (p. 467  511). San Francisco, CA: JosseyBass. Available at:
http://www.unt.edu/rss/class/Jon/MiscDocs/Hofmann_2000.pdf
Littell,
R. C., Milliken, G. A., Stroup, W. W., & Wolfinger, R. D. (1996). SAS system for
mixed models. Cary, NC: SAS Institute Inc.
Raudenbush,
S. W. (1995). Reexamining, reaffirming, and improving application of
hierarchical models. Journal of Educational and Behavioral Statistics, 20(2),
210  220. Available at:
http://www.unt.edu/rss/class/Jon/MiscDocs/Raudenbush_1995.pdf
Raudenbush,
S. W. (1993). Hierarchical linear models and experimental design. In L. Edwards
(Ed.), Applied analysis of variance in behavioral science (p. 459  496).
New York: Marcel Dekker. Available at:
http://www.unt.edu/rss/class/Jon/MiscDocs/Raudenbush_1993.pdf
Rogosa,
D., & Saner, H. (1995). Longitudinal data analysis examples with random
coefficient models. Journal of Educational and Behavioral Statistics, 20(2),
149  170. Available at:
http://www.unt.edu/rss/class/Jon/MiscDocs/Rogosa_1995.pdf
Schwarz,
G. (1978). Estimating the dimension of a model.
Annals of Statistics, 6, 461 –
464. Available at:
http://www.unt.edu/rss/class/Jon/MiscDocs/Schwarz_1978.pdf
