
This article is a reprint from the March 2000 issue
of Benchmarks Online. The title has been changed slightly. The previous
RSS article is A New Face in RSS. The previous article in this series is
Interactive Graphics in R (Part II - cont.): Kernel Density Estimation in One and Two Dimensions.
- Ed.
Resampling Based Statistics in S-Plus for Windows: An Example Using the MANOVA Procedure
By Dr. Rich Herrington, Research and Statistical Support Services Manager
This month we take a look at the bootstrap resampling
capabilities of S-Plus. S-Plus has general bootstrapping functionality
available so that nearly all statistical functions and expressions can
be bootstrapped. S-Plus provides both parametric and nonparametric
bootstrap confidence intervals.
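
To make the idea of bootstrapping an arbitrary statistic concrete, here is a minimal sketch in Python rather than S-Plus. It is not the S-Plus implementation. The function name bootstrap, its arguments, and the sample values are assumptions made up for illustration.

```python
import random
import statistics


def bootstrap(data, statistic, n_boot=1000, seed=0):
    """Return n_boot replicates of `statistic` computed on resampled data."""
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Draw n observations with replacement from the original sample.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(resample))
    return replicates


# Made-up sample; any function of a sample can be passed as the statistic.
sample = [4.1, 5.3, 2.8, 6.0, 5.5, 3.9, 4.7, 5.1]
reps = bootstrap(sample, statistics.median, n_boot=500)
print(len(reps), min(reps), max(reps))
```

The spread of the replicates approximates the sampling variability of the statistic, which is what the confidence intervals are built from.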
From the main menu bar, we reach the resampling menu through Statistics
- Resample - Bootstrap.
The menu for the Bootstrap facilities has five tabs for
setting up the Bootstrap analysis: Model, Options, Results,
Plot, and Jack After Boot. Each of these
tabs is initialized with default values. However, the one critical
entry field that does not have a default is the Expression
entry field. Entering an expression to bootstrap can be tricky, because it
assumes that the user has some knowledge of the syntax of the S-Plus
language.

One way to avoid needing detailed knowledge of the syntax for
a particular analysis is to run the analysis beforehand
from the drop-down menu system. Once this analysis has been run,
the syntax used to generate it is displayed. Essentially,
the drop-down menu system generates the syntax as entry fields are
filled in. After an analysis is run from the menu system, this syntax
can be saved, or cut and pasted into the Expression
entry field. In the following example we will perform a four-group
MANOVA with four dependent measures.
Example
The data set we will use for our analysis will have four groups: a
control group and three experimental groups (c1, e1, e2, e3). We see a
screen capture of the object browser and the data worksheet:

From the main menu bar select: Statistics -
Multivariate - MANOVA. Select the Create Formula
tab. Fill out the create formula tab with the following specifics.
First select q1 through q4 and click Add Response.
Then select group and click Add Main Effect:
Select OK to return to the previous menu. Select OK
once more to actually run the analysis. In the report window we see
the following:

The calling function is listed under Call. Copy
the manova(formula.....) call and paste it into your Commands
window. Wrap the call in the summary function to
summarize the fitted MANOVA, and assign this summary
to an object, man.out, for example:

Typing man.out by itself displays the contents of
this object. The names function displays the components of this
list. The list has six components. To extract the fifth
element, "Stats", we have to index the list
in the following fashion:

We see that Wilks' lambda (.9240) is the second entry of the fifth
element of the list, man.out. So the complete expression
to pass to the bootstrap function will be:

This expression returns a value of .9240 for Wilks' lambda for
this particular data set. We need to copy this expression, summary(manova.......))[[5]][2],
into the Expression entry field on the Bootstrap menu.
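
For readers who want to see what the bootstrapped expression computes, Wilks' lambda is the ratio of the determinant of the within-group sums of squares and cross-products matrix to the determinant of the total one. Below is a minimal Python sketch of that ratio, not the S-Plus manova code. The helper names and the small two-group, two-response data set are made up for illustration.

```python
def mean_vector(rows):
    """Column means of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sscp(rows, center):
    """Sums of squares and cross-products of rows about `center`."""
    p = len(center)
    m = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [x - c for x, c in zip(row, center)]
        for i in range(p):
            for j in range(p):
                m[i][j] += d[i] * d[j]
    return m


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(a[r][k]))
        if a[pivot][k] == 0.0:
            return 0.0
        if pivot != k:
            a[k], a[pivot] = a[pivot], a[k]
            result = -result  # a row swap flips the sign
        result *= a[k][k]
        for r in range(k + 1, n):
            f = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= f * a[k][c]
    return result


def wilks_lambda(groups):
    """groups maps a group label to a list of response rows."""
    all_rows = [row for rows in groups.values() for row in rows]
    p = len(all_rows[0])
    total = sscp(all_rows, mean_vector(all_rows))
    within = [[0.0] * p for _ in range(p)]
    for rows in groups.values():
        g = sscp(rows, mean_vector(rows))
        for i in range(p):
            for j in range(p):
                within[i][j] += g[i][j]
    return det(within) / det(total)


# Made-up data: two groups, two responses (the article uses four of each).
groups = {
    "c1": [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0]],
    "e1": [[2.0, 3.0], [4.0, 3.0], [3.0, 6.0]],
}
lam = wilks_lambda(groups)
print(round(lam, 4))
```

Values near 1 indicate little separation between the group mean vectors, and values near 0 indicate strong separation.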


For the Options tab we select the grouping
variable and the number of bootstrap iterations.
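
Supplying a grouping variable is assumed here to mean that observations are resampled separately within each group, so every bootstrap sample keeps the original group sizes. A minimal Python sketch of that idea follows. The function name and the data are hypothetical, and this is not the S-Plus code.

```python
import random


def resample_within_groups(groups, rng):
    """Resample rows with replacement separately inside each group."""
    return {
        label: [rows[rng.randrange(len(rows))] for _ in rows]
        for label, rows in groups.items()
    }


# Made-up data: a control group and one experimental group.
groups = {
    "c1": [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0]],
    "e1": [[2.0, 3.0], [4.0, 3.0], [3.0, 6.0], [5.0, 5.0]],
}
rng = random.Random(0)
boot = resample_within_groups(groups, rng)
print({label: len(rows) for label, rows in boot.items()})
```

Resampling within groups keeps the design of the study fixed, so each replicate of the statistic is computed on a data set with the same group structure as the original.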

For the Results tab we select empirical percentiles.
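
Empirical percentiles are simply quantiles of the sorted bootstrap replicates. The Python sketch below uses linear interpolation between order statistics, which is an assumption. S-Plus may use a slightly different interpolation rule, and the function name and replicate values are made up.

```python
def empirical_limits(replicates, probs=(0.025, 0.05, 0.95, 0.975)):
    """Quantiles of the bootstrap replicates by linear interpolation."""
    ordered = sorted(replicates)
    n = len(ordered)
    limits = {}
    for p in probs:
        pos = p * (n - 1)          # fractional index into the sorted values
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        limits[p] = ordered[lo] + frac * (ordered[hi] - ordered[lo])
    return limits


# Made-up replicates of a statistic that lies between 0 and 1.
replicates = [0.91, 0.88, 0.93, 0.90, 0.95, 0.87, 0.92, 0.89, 0.94, 0.86]
print(empirical_limits(replicates))
```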

For the Plot tab we select Normal
Quantile-Quantile to see how well the sampling distribution
matches with "normal distribution" theory.
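
The points on a normal quantile-quantile plot can be computed by pairing the sorted replicates with standard normal quantiles. This Python sketch assumes plotting positions of (i - 0.5) / n. The positions used by S-Plus may differ, and the function name and replicate values are made up.

```python
from statistics import NormalDist


def normal_qq_points(replicates):
    """Pair each sorted replicate with a standard normal quantile."""
    ordered = sorted(replicates)
    n = len(ordered)
    std_normal = NormalDist()
    # Index i runs from 0, so (i + 0.5) / n equals (rank - 0.5) / n.
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


# Made-up replicates; a real analysis would use the bootstrap output.
replicates = [0.91, 0.88, 0.93, 0.90, 0.95, 0.87, 0.92, 0.89, 0.94]
for q, value in normal_qq_points(replicates):
    print(round(q, 3), value)
```

If the points fall close to a straight line, the bootstrap distribution is close to normal. Curvature at either end signals a heavier or lighter tail.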

Selecting OK generates the following report
window:

And the following plots:


We see that the bootstrap sampling distribution of
Wilks' lambda follows normal theory fairly closely except in the right
tail region. We see that the intervals formed by the
2.5/97.5th and the 5/95th percentiles both contain the observed value of
Wilks' lambda. We take this as a failure to reject the null hypothesis
for Wilks' lambda. In general, the BCa percentiles will be more accurate
than the empirical percentiles.
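
As a rough illustration of how BCa (bias-corrected and accelerated) limits differ from empirical percentiles, the Python sketch below adjusts the percentile levels with a bias correction taken from the replicates and an acceleration estimated by the jackknife. It is a single-sample simplification, not the grouped MANOVA case and not the S-Plus implementation. The function name and data are made up.

```python
import random
from statistics import NormalDist, mean


def bca_limits(data, statistic, probs=(0.025, 0.975), n_boot=2000, seed=0):
    """BCa limits for a one-sample statistic (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(data)
    observed = statistic(data)
    reps = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    norm = NormalDist()
    # Bias correction: share of replicates below the observed value,
    # kept strictly inside (0, 1) so inv_cdf is defined.
    below = sum(r < observed for r in reps)
    frac = min(max(below / n_boot, 1.0 / (n_boot + 1)), n_boot / (n_boot + 1.0))
    z0 = norm.inv_cdf(frac)
    # Acceleration from leave-one-out (jackknife) values.
    jack = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    jbar = mean(jack)
    num = sum((jbar - j) ** 3 for j in jack)
    den = 6.0 * sum((jbar - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den > 0 else 0.0
    limits = []
    for p in probs:
        z = norm.inv_cdf(p)
        adjusted = norm.cdf(z0 + (z0 + z) / (1.0 - accel * (z0 + z)))
        index = min(max(int(adjusted * n_boot), 0), n_boot - 1)
        limits.append(reps[index])
    return tuple(limits)


sample = [4.1, 5.3, 2.8, 6.0, 5.5, 3.9, 4.7, 5.1]  # made-up values
lower, upper = bca_limits(sample, mean)
print(lower, upper)
```

When the bias correction and the acceleration are both zero, the adjusted levels reduce to the nominal ones and the BCa limits coincide with the empirical percentiles.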
Further Reading
Davison, A. C. and Hinkley, D. V. (1997). Bootstrap Methods and
Their Application. Cambridge: Cambridge University Press.
Efron, B. and Tibshirani, R. J. (1993). An Introduction to the
Bootstrap. New York: Chapman & Hall.