An Introduction to Multilevel Models (Part I): Exploratory Growth Curve Modeling
By Dr. Rich Herrington, Research and Statistical Support Consultant
This month we discuss exploratory growth curve modeling. This is Part I in a series on multilevel modeling. The GNU S language, "R", is used to implement this procedure. R is a statistical programming environment that implements the S language developed at Lucent Technologies (the language underlying S-Plus). In the following document we illustrate the use of a GNU Web interface to the R engine on the "rss" server (http://rss.acs.unt.edu/cgi-bin/R/Rprog). This GNU Web interface is a derivative of the "Rcgi" Perl scripts available for download from the CRAN Website (http://www.cran.r-project.org), the main "R" Website. Scripts can be submitted interactively, edited, and then re-submitted with changed parameters by selecting the hypertext link buttons that appear below the figures.
For example, clicking the "Run Program" button below
creates a vector of 100 random normal deviates; sorts and displays the
results; then creates a histogram of the random numbers. To view any
text output, scroll to the bottom of the browser window. To view any
graphical output, select the "Display Graphic" link. The
script can be edited and resubmitted by changing the script in the form
window and then selecting "Run the R Program".
Selecting the browser "back page" button will return the reader
to this document.
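In essence, the script behind that button amounts to the following sketch (the actual script on the server may differ in detail):

    x <- rnorm(100)    # create a vector of 100 random normal deviates
    print(sort(x))     # sort and display the results
    hist(x)            # draw a histogram of the random numbers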
Introduction to Repeated Measures Multilevel Models
In this article we will discuss an exploratory approach to fitting a repeated measures multilevel model. This will serve as a warm-up to the more formal estimation methods used in multilevel modeling, which we will cover in coming issues of RSS Matters. We will motivate the model with a simulated example that uses Iteratively Re-weighted Least Squares (IRLS) regression to estimate parameters. Our goal here is to keep the example simple to elucidate the main insights that growth curve modeling (multilevel modeling) offers. Additionally, we will learn S language commands that allow us to graphically explore and statistically estimate parameters of a simple multilevel model.
The following graph represents the time course of what we will call the level-one observed change, within an individual. Examining such a graph allows the data modeler to attempt to characterize the change process as a simple mathematical function (i.e. linear, quadratic, or cubic) for the individual entity. An assumption of growth curve modeling approaches, and multilevel approaches in general, is that a family of functions (e.g. linear) can characterize the change process in different individuals by allowing the parameters of the linear function to vary randomly across individuals. Individuals express their unique variability
through the parameters of the mathematical function. These
randomly varying estimates of the change process (e.g. intercept and
slope) can be related to background conditions of the
individuals. It is here that we wish to get efficient and
unbiased estimates of the predictors of the change process. Do
background covariates predict rates of change (slope) or initial
status of the outcome variable?
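In notation (a sketch in standard multilevel notation, not a reproduction of the article's original figures), the level-one linear change model for individual i at occasion t can be written as

    Y_{ti} = \pi_{0i} + \pi_{1i} T_{ti} + e_{ti}

where \pi_{0i} is the individual's intercept (initial status), \pi_{1i} is the individual's rate of change, and e_{ti} is a level-one residual.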
Summary Measures of the Level-One Model Change Process
Since the observed growth record for an individual is subject to
measurement error and sampling variability, it is useful to characterize
the growth process as a mathematical function. We will call this the
level-one model
of change. This mathematical
function (e.g. linear), if representative, should capture the essential
features of the growth profile. The parameters of the function
should summarize the characteristics of the change process for the
individual. In the following graph, a linear function is considered
representative of the change process. An intercept is calculated; the intercept gives a best estimate of the initial status of the outcome variable at the first observation period. Additionally, a slope is calculated; the slope gives an estimate of the rate of change in the outcome variable per unit change in the time indicator variable. In our
simulated example we have electrophysiological assessments from the scalp
of an individual who is engaged in EEG biofeedback (also referred to as
neurofeedback - NF), gathered over four observation periods.
Individuals use a process of trial and error and an electrical
feedback signal from the scalp to change their own scalp electrical
potentials. An initial question is posed for our example: Does
the rate of change in cortical electrical activity (amplitude in millivolts) relate to background conditions of depression in the
individual?
Individual Variation (Level-One) and Group Level Variation
(Level-Two)
The following graph depicts four linear growth curves fit for four
different individuals. Additionally, an average curve is estimated
which represents the change process of the group as a whole, collapsing
over individual variation in the growth process. Each person has an intercept estimate and a slope estimate; these individual estimates can vary from the group intercept and group slope (level-two). The source of this variation may be related to other predictor variables, such as background characteristics, or may represent variation that is not related to a substantive variable of interest. We will call this the level-two model. In our example, it might be that the rate of change in electrical activity on the scalp (as measured in millivolts) is related to the initial status of depression, as measured behaviorally by a paper and pencil inventory, the Beck Depression Inventory (BDI).
Predicting Outcome as a Function of Time - Allowing Individual
Variation in Slopes and Intercepts
We can represent both level-one and level-two models of change as a set
of regression equations. The outcome variable (Y) is predicted by
the estimates of individuals' slopes and intercepts. Each individual's slope and intercept is composed of a constant (fixed) portion and a residual (random) portion. The fixed portion can be thought of
as the mean of the group estimate (i.e. group slope and group intercept),
and the residual variation can be thought of as unique variation that is
not accounted for at the group level. We can substitute level
two regression equations into level one regression equations to obtain a
single regression equation.
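In the sketch notation used earlier, the level-one and unconditional level-two equations are

    Y_{ti} = \pi_{0i} + \pi_{1i} T_{ti} + e_{ti}
    \pi_{0i} = \beta_{00} + u_{0i}
    \pi_{1i} = \beta_{10} + u_{1i}

and substituting the second and third equations into the first yields the single equation

    Y_{ti} = [\beta_{00} + \beta_{10} T_{ti}] + [u_{0i} + u_{1i} T_{ti} + e_{ti}]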
The first term represents the fixed component of growth; this represents the group level characteristics of growth. The second term represents the random component of growth; this represents the individual level characteristics of growth.
This model is referred to as an unconditional growth model because we
are NOT attempting to predict growth using additional measures. To
predict growth we must use a conditional growth model. Both
time variant and time invariant predictors of growth can be used in a
conditional growth model.
It is important to note that this single equation has "heteroscedastic" components. The second term in this regression equation is heteroscedastic because the time variable (T), which is increasing, is multiplied by a random component (u). Hence, the variance increases as a function of observation period, thus violating the assumption of homoscedasticity in classical OLS estimates of the regression equation. It is for this reason that a non-OLS estimate is needed for the parameters in the model; heteroscedasticity can lead to biased and/or inefficient estimates of slope and intercept coefficients.
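To see this in the sketch notation, assume the u's and e are mutually independent, with variances \sigma^2_{u0}, \sigma^2_{u1}, and \sigma^2_{e}, and covariance \sigma_{u01} between u_{0i} and u_{1i}. The variance of the random term is then

    Var(u_{0i} + u_{1i} T_{ti} + e_{ti}) = \sigma^2_{u0} + 2 T_{ti} \sigma_{u01} + T_{ti}^2 \sigma^2_{u1} + \sigma^2_{e}

which grows with the observation period T.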
Predicting the Outcome Variable as a Function of Level-One Covariates
and Level-Two Covariates
In the system of regression equations below, both time-varying level-one covariates and time-invariant level-two covariates are combined. X is an individual level, time-varying covariate, and Z is a time-invariant, individual level characteristic or group level predictor; the six regression parameters (beta coefficients) relate the two predictors to variability in initial status and rate of change over time.
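One common way of writing such a system (again a sketch in standard notation) is

    Y_{ti} = \pi_{0i} + \pi_{1i} T_{ti} + \pi_{2i} X_{ti} + e_{ti}

    \pi_{0i} = \beta_{00} + \beta_{01} Z_i + u_{0i}
    \pi_{1i} = \beta_{10} + \beta_{11} Z_i + u_{1i}
    \pi_{2i} = \beta_{20} + \beta_{21} Z_i + u_{2i}

giving the six beta coefficients referred to above.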
This model is a conditional
growth model because the individual variability in initial status
and change over time are conditioned upon exogenous predictor
variables. The conditional random effects model is sometimes referred to as a random coefficient model, mixed-effects model, hierarchical linear model (Bryk & Raudenbush, 1992), empirical Bayes model, or "slopes as outcomes" model - or, more generally, a multilevel model.
In longitudinal research, we sometimes have repeated measures of
individuals who are all measured together on a small number of fixed
occasions. This is typically the case with experimental
designs involving repeated measures and panel designs. For our
example, our group level predictor can be an indicator variable
indicating either control or experimental group membership.
For our simulation example, we will use a much simpler model
(displayed below), and
will only focus on one coefficient of interest. In the
following example Y will be electrical voltages from the scalp
(amplitude - measured in millivolts) measured over observation
period (T). Individuals change their own scalp electrical
potential through trial and error, relying on a biofeedback signal
coming from their scalp (neurofeedback - NF). Additionally, we
have measured levels of depression using a behavioral (self-report)
assessment device (Beck Depression Inventory - BDI). Several
questions can be posed: Is
there evidence for systematic change and individual variability in
NF amplitudes over time? Is the post BDI assessment
related to the initial levels or rates of change in NF amplitudes? What is the relationship between
the initial levels of NF amplitudes and the rates of change in NF
amplitudes over time? Is a linear relationship a good
description of within-person change?
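In the sketch notation, this simpler model can be written as

    Y_{ti} = \pi_{0i} + \pi_{1i} T_{ti} + e_{ti}
    \pi_{0i} = \beta_{00} + \beta_{01} Z_i + u_{0i}
    \pi_{1i} = \beta_{10} + \beta_{11} Z_i + u_{1i}

with Z the post BDI score; the single coefficient of interest is the one relating Z to the individual rates of change (\beta_{11} here).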
Parameter Estimation Using Iteratively Re-weighted Least Squares (IRLS)
We are interested in unbiased and efficient estimates of the beta
coefficients which relate the background covariate Z to the individual
level estimates of rates of change and initial status (pi
coefficients). Willet (1988) outlines a fairly straightforward approach to estimating a growth curve model for a single, level-two coefficient (relating individual slope coefficients to individual background covariates). This method relies on weighted least squares (WLS) and can be improved by iterating until the weights or residuals converge (IRLS). This method is appealing because it
provides insight into a central point of multilevel modeling: level-two coefficient estimates are weighted by the precision of the level-one coefficients. Individuals whose growth coefficients are more reliably estimated provide more input (information) in the estimation of level-two coefficients. Individuals with large residual variability in their growth record, and hence more unreliable estimates of level-one growth coefficients, are down-weighted in the second-stage analysis of
the level-two coefficients. Below, a modified form of Willet's algorithm is outlined.
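The following R function is a minimal sketch of one plausible reading of this two-stage scheme; the function name, the weight update, and the fixed number of iterations are ours, for illustration only:

    # Two-stage estimation: OLS growth regressions at level one, then an
    # iteratively re-weighted regression of the slopes on a covariate.
    # y: n x 4 matrix of growth records; covar: length-n background covariate
    two.stage.irls <- function(y, covar, occ = 1:4, n.iter = 10) {
      n <- nrow(y)
      slope <- se <- numeric(n)
      for (i in 1:n) {                       # stage 1: OLS per individual
        fit      <- lm(y[i, ] ~ occ)
        slope[i] <- coef(fit)[2]
        se[i]    <- summary(fit)$coefficients[2, 2]
      }
      w <- 1 / se^2                          # precision weights
      for (k in 1:n.iter) {                  # stage 2: iterated WLS
        fit2 <- lm(slope ~ covar, weights = w)
        s2   <- sum(w * resid(fit2)^2) / (sum(w) - 2)  # level-two residual variance
        w    <- 1 / (se^2 + s2)              # fold it into the weights, re-fit
      }
      lm(slope ~ covar, weights = w)
    }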
An Example Using GNU-S ("R")
We will create a simulated
data set that is modeled after real neurofeedback data. The observations represent average neurofeedback session
amplitudes (10 averaged sessions per observation period). The covariate represents either a pre or post BDI
assessment (we will use a post BDI assessment). We will go
through most sections of the S code to illustrate how S can be used to both simulate data according to an assumed model, and then estimate the model, to try to recapture the population parameters. It is recommended that the reader change the population parameters and experiment with the model. In this way, intuition can be gained about how the observed data vary according to changes in the model. A "live" script is presented below (this script can be run, examined, and re-edited for re-submission and re-examination); below this live script is annotated program output.
Population parameters are set according to our previous model with one
background covariate (Z, the BDI). A small number of growth
curves are generated to examine the effect of small sample size in getting
unbiased estimates using OLS versus IRLS parameter estimation. Both
the fixed portion and the residual portion (beta and u) for the level-two
regressions are set in the population. These will be used with the
background covariate to generate the slopes and intercepts for individuals
(pi's).
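A sketch of this setup in R (the parameter names and the particular values below are ours, chosen only for illustration):

    set.seed(123)                    # reproducible simulation
    n.subj <- 20                     # small number of growth curves

    # Level-two fixed effects (betas) for the intercept and slope equations
    b00 <- 12.0;  b01 <- 0.10        # intercept: group value and BDI effect
    b10 <- -1.0;  b11 <- 0.05        # slope: group value and BDI effect

    # Level-two residual standard deviations (the u components)
    sd.u0 <- 1.0;  sd.u1 <- 0.3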
A matrix of observation periods must be generated for each individual.
Slopes and intercepts of individuals can be correlated across
individuals (p0 - intercept, p1 - slope). We create slopes and
intercepts whose correlation is -.30: high intercepts lead to rapid
rates of change, and low intercepts lead to small rates of change.
Residuals at level-one (within an individual) are likely to be
correlated. We assume a correlation structure that is equal across
time periods (compound symmetry).
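This step might look as follows, using mvrnorm from the MASS package (the residual correlation and standard deviations are again ours):

    library(MASS)                    # for mvrnorm

    # Correlated random effects for intercept (u0) and slope (u1), r = -.30
    Sigma.u <- matrix(c(sd.u0^2,               -0.30 * sd.u0 * sd.u1,
                        -0.30 * sd.u0 * sd.u1,  sd.u1^2), 2, 2)
    u  <- mvrnorm(n.subj, mu = c(0, 0), Sigma = Sigma.u)
    u0 <- u[, 1];  u1 <- u[, 2]

    # Level-one residuals with compound symmetry across the 4 occasions
    rho <- 0.3;  sd.e <- 0.8
    Sigma.e <- sd.e^2 * (diag(1 - rho, 4) + rho)
    e <- mvrnorm(n.subj, mu = rep(0, 4), Sigma = Sigma.e)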
Next, a covariate is generated for each individual:
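For instance (the distribution and its values are ours):

    # Simulated post BDI scores, one per individual
    covar <- round(rnorm(n.subj, mean = 20, sd = 8))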
With the group betas (b0, b1), individual background covariate (covar),
and level-two residual terms (u0, u1), we can combine the level-two
parameters together to obtain the level-one parameters for each individual
(p0, p1). We also compute the average population slope and intercept, and the correlation between the two, once the betas are combined with the covariate and the residual terms. Later, in estimating a statistical model for this simulated data, we might want to use "centered" estimates of the covariate (covar) to aid in interpretation and estimation (Hox, 2002).
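A sketch of this step:

    # Level-two model: individual intercepts (p0) and slopes (p1)
    p0 <- b00 + b01 * covar + u0
    p1 <- b10 + b11 * covar + u1

    mean(p0);  mean(p1)              # average intercept and slope
    cor(p0, p1)                      # correlation between the two

    covar.c <- covar - mean(covar)   # grand-mean centered covariate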
Once the pi coefficients (p0, p1) are calculated for each individual,
we can combine them with the time index and add the level-one individual
residual term to obtain the "observed score" for each
individual, indexed by T - each individual gets four observations since T
ranges from 1 to 4.
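In R this amounts to:

    # Level-one model: observed scores at occasions T = 1, ..., 4
    occ <- 1:4
    y <- p0 + outer(p1, occ) + e     # rows: individuals; columns: occasions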
Some attention needs to be paid to the fact that some observations can
fall out of range of reasonable limits (dictated by physical logistics and
limitations). We can replace the min and max observation with the
mean observation, for each observation period. This process
"draws" the tails of the distribution of scores, for an
observation period, toward the mean of the distribution. This can be
iterated to shape the growth curves if so desired. Here, we only iterate one time, taking the one smallest and one largest observation and replacing them with the mean for that observation period.
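One such pass might look like:

    # Replace the single smallest and largest value in each observation
    # period with that period's mean, pulling the tails toward the center
    for (t in occ) {
      y[which.min(y[, t]), t] <- mean(y[, t])
      y[which.max(y[, t]), t] <- mean(y[, t])
    }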
Finally, we list out the observed scores as simulated from a
"known" model.
A "not-so-useful" depiction of all the observed growth
profiles is a "Profile" plot. A profile plot allows one to
discern some random effects structures in the data - we can see that the
slopes and intercepts are potentially correlated; we see that there is
heteroscedasticity prevalent in the data; and we can examine potential
"outlying" growth records.
Parametric Estimation of Growth Curve Models Using OLS and IRLS
Ordinary Least Squares (OLS) is used to estimate the level-one linear
regressions for all individuals. Standard errors for each individual
regression are used to create weights for the level-two regression of the
level-one slopes and background covariates. Additionally, at the
level-two regression, iteratively re-weighted least squares (IRLS) is used
to converge on the best estimates given the initial level-one
weights.
All 20 regressions are performed; the beta coefficients, standard
errors (used for weights later), covariates, and other interesting
information are combined into a single data frame; informative names
are then assigned to the columns.
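A sketch of this step, together with the level-two OLS and IRLS fits that are compared below (the variable names are ours; the IRLS fit reuses the function sketched earlier):

    # Level-one: one OLS regression per individual
    fits <- lapply(1:n.subj, function(i) lm(y[i, ] ~ occ))
    level1 <- data.frame(
      intercept = sapply(fits, function(f) coef(f)[1]),
      slope     = sapply(fits, function(f) coef(f)[2]),
      se.slope  = sapply(fits, function(f) summary(f)$coefficients[2, 2]),
      covar     = covar
    )

    # Level-two: unweighted OLS versus iteratively re-weighted regression
    ols  <- lm(slope ~ covar, data = level1)
    irls <- two.stage.irls(y, covar)
    summary(ols);  summary(irls)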
A nice alternative to a profile plot of the observed data is a profile plot of the level-one predicted values for each observation period. This plot better summarizes the correlation structure of the parameters (slopes and intercepts), and indicates more clearly the nature of the heteroscedasticity in the data.
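For example:

    # Profile plot of the level-one predicted (fitted) values
    matplot(occ, sapply(fits, fitted), type = "l", lty = 1,
            xlab = "Observation period", ylab = "Predicted NF amplitude")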
A comparison of the OLS and IRLS solution indicates that the slope
estimates are close for both the IRLS and OLS procedures (.25 for OLS
versus .36 for IRLS). However, the IRLS procedure gives a substantially smaller residual standard error (1.001 for OLS versus .5483 for IRLS), indicating that the IRLS estimate is more efficient. Furthermore, the residuals for the OLS solution are fairly large (-1.7009, 1.5025), whereas the IRLS residuals are relatively smaller (-.84154, 1.00915) - using 2.0 as a cutoff.
OLS gives a good fit to the observed data even though the growth curve parameters are not weighted by their level-one precision. This indicates a relatively efficient solution; however, IRLS gives a much better fit to the data. Consequently, the estimates of R-squared are significantly larger for IRLS, indicating that OLS is not as efficient in the presence of heterogeneous slopes having differing measurement precisions.
Descriptive statistics for the observed data and the level-one regressions are
generated.
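For instance:

    summary(as.vector(y))    # observed amplitudes
    summary(level1)          # level-one intercepts, slopes, and standard errors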
A Graphical Depiction of the Relationship Between the Observed Background Covariate and the Estimated Growth Rates
Another useful graphical depiction is a scatter plot (with an OLS best
fit line) of the estimated growth rates (slopes) with the background
covariate. One can graphically characterize the impact of the
heteroscedasticity present in the data on the level-two regression of
slopes and covariates. The heteroscedasticity in the plot appears to
be quite large. While significant "outliers" are present,
these outliers seem to "counterbalance" one another across the
mean of the Y-axis, and counterbalance one another across the mean of the
X-axis. This could account for the closeness of the IRLS and OLS
parameter estimates. If the residuals are: 1) symmetrical and counterbalanced across both the Y and X axes (as in this case), and 2) the outlying values on the X-axis are close to the mean of the X-axis, there will be less bias in the parameter estimates. An interesting experiment would be to generate
different data sets with differing patterns of heteroscedasticity, and
observe the difference in the parameter estimates of the OLS and IRLS
procedures.
The graph below (panel 1) depicts the relationship between the
covariate (Y-axis) and the slopes for individuals (X-axis). The
second graph (panel 2) plots the observed data as a function of observation period, and the third graph (panel 3) plots the predicted values from the level-one regressions as a function of observation period.
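These three panels can be sketched as follows:

    par(mfrow = c(1, 3))                      # three panels side by side
    plot(level1$slope, level1$covar,          # panel 1: covariate vs. slopes
         xlab = "Estimated growth rate", ylab = "Post BDI")
    abline(lm(covar ~ slope, data = level1))  # OLS best-fit line
    matplot(occ, t(y), type = "b", pch = 1, lty = 1,
            xlab = "Observation period", ylab = "Observed amplitude")   # panel 2
    matplot(occ, sapply(fits, fitted), type = "l", lty = 1,
            xlab = "Observation period", ylab = "Predicted amplitude")  # panel 3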
Conclusions
A two-stage IRLS model of the post BDI covariate and the individual growth rates indicates that there are reliable differences in growth rates across individuals, and that these differences are correlated with the post BDI assessment (for the simulated data set). In general, larger negative slopes are correlated with smaller values on the post BDI assessment, and smaller negative slopes are correlated with larger values on the post BDI assessment. Likewise, larger negative slopes are correlated with larger NF amplitudes at initial status, and smaller negative slopes are correlated with smaller NF amplitudes at initial status. The IRLS and OLS beta estimates (slopes) were fairly close in value - this may be due to the symmetrical pattern of heteroscedasticity in the data. However, IRLS estimation gave a significantly smaller level-two residual standard error than OLS estimation for the level-two regression - IRLS is more efficient than OLS in this example. Consequently, the percentage of variance accounted for in the outcome variable (NF amplitudes) by knowledge of the background covariate (BDI) was substantially larger for the IRLS solution.
Next Time
Next time we will explore the use of the S-Plus and R
library NLME (linear and nonlinear mixed effects) with the
simulated data set used in this article. Additionally, we will
look at other parameter estimation algorithms (e.g. Restricted Maximum
Likelihood - REML), and other model diagnostic approaches (e.g. AIC).
References
Bryk, A., & Raudenbush, S. W. (1992). Hierarchical Linear Models. Newbury Park, CA: Sage.

Hox, J. (2002). Multilevel Analysis: Techniques and Applications. Mahwah, NJ: Lawrence Erlbaum.

Willet, J. (1988). Questions and answers in the measurement of change. Review of Research in Education, Vol. 15. Washington, DC: American Educational Research Association.