Resampling-Based Statistics in S-Plus
By Rich Herrington, Research and Statistical Support Services
This month we take a look at the bootstrap resampling capabilities of S-Plus. S-Plus has general bootstrapping functionality available so that nearly all statistical functions and expressions can be bootstrapped. S-Plus provides both parametric and nonparametric bootstrap confidence intervals.
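Before turning to the menus, it may help to see what a nonparametric bootstrap does in principle. The following is a minimal sketch in Python rather than S-Plus; it is not what S-Plus executes internally, and the function name bootstrap_replicates and the sample values are hypothetical, chosen only for illustration. It resamples the data with replacement and recomputes the statistic on each resample.

```python
import random
import statistics

def bootstrap_replicates(data, statistic, n_boot=1000, seed=0):
    """Return n_boot values of `statistic`, each computed on a resample of `data`.

    Hypothetical helper for illustration; not an S-Plus function.
    """
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Draw n observations with replacement from the original sample.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(resample))
    return replicates

# Made-up sample values, used only to exercise the function.
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]
reps = bootstrap_replicates(sample, statistics.mean, n_boot=500)

# The spread of the replicates estimates the standard error of the statistic.
boot_se = statistics.stdev(reps)
```

Any function of the data can be passed as the statistic, which is the sense in which nearly any expression can be bootstrapped.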
From the main menu bar, we access the resampling menu from: Statistics - Resample - Bootstrap.
The menu for the Bootstrap facilities has five entry areas for initializing the Bootstrap analysis: Model, Options, Results, Plot, and Jack After Boot. Each of these tabs is initialized with default values. The one critical field that has no default is the Expression entry field. Entering an expression to bootstrap can be tricky, as it assumes that the user has some knowledge of the syntax of the S-Plus language.
One way to avoid needing detailed knowledge of the syntax behind a particular analysis is to generate the analysis beforehand from the drop-down menu system. As the entry fields are filled in, the menu system builds the corresponding syntax, and once the analysis has been run, that syntax is displayed. It can then be saved, or cut and pasted back into the Expression entry field. In the following example we will perform a four-group MANOVA with four dependent measures.
The data set we will use for our analysis will have four groups: a control group and three experimental groups (c1, e1, e2, e3). We see a screen capture of the object browser and the data worksheet:
From the main menu bar select: Statistics - Multivariate - MANOVA. Select the Create Formula tab. Fill out the create formula tab with the following specifics. First select q1 through q4 and click Add Response. Then select group and click Add Main Effect:
Select OK to return to the previous menu. Select OK once more to actually run the analysis. In the report window we see the following:
The calling function is listed under Call. Copy the manova(formula.....) call and paste it into your Commands window. Wrap this call in the summary function and assign the result to an object, man.out, for example:
Typing man.out by itself displays the contents of this object. The names function displays the components of this list, of which there are six. To extract the fifth component, "Stats", we have to index the list in the following fashion:
We see that Wilks' lambda (0.9240) is at the second index of the fifth component of the list man.out. So the complete expression to pass to the bootstrap function will be:
This expression returns a value of 0.9240 for Wilks' lambda for this particular data set. We need to copy this function call, summary(manova.......))[], into the Expression window on the bootstrap menu.
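For readers who want to see what the statistic being bootstrapped measures, Wilks' lambda is the ratio of the determinant of the within-group sums of squares and cross-products matrix to the determinant of the total one, so values near 1 indicate little separation among the groups. The sketch below computes it directly in Python; it is an illustration under that textbook definition, not the S-Plus manova code, and the helper names (det, sscp, col_means, wilks_lambda) and the small two-response data set are hypothetical.

```python
def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    d = 1.0
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(a[r][i]))
        if a[p][i] == 0.0:
            return 0.0  # singular matrix
        if p != i:
            a[i], a[p] = a[p], a[i]
            d = -d  # a row swap flips the sign
        d *= a[i][i]
        for r in range(i + 1, n):
            f = a[r][i] / a[i][i]
            for c in range(i, n):
                a[r][c] -= f * a[i][c]
    return d

def col_means(rows):
    """Mean of each response column."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sscp(rows, center):
    """Sums of squares and cross-products of `rows` about `center`."""
    k = len(center)
    s = [[0.0] * k for _ in range(k)]
    for row in rows:
        dev = [x - c for x, c in zip(row, center)]
        for i in range(k):
            for j in range(k):
                s[i][j] += dev[i] * dev[j]
    return s

def wilks_lambda(groups):
    """groups maps a group label to a list of response rows.

    Returns |E| / |T|, where E is the pooled within-group SSCP matrix
    and T is the total SSCP matrix about the grand mean.
    Assumes T is nonsingular.
    """
    all_rows = [row for rows in groups.values() for row in rows]
    k = len(all_rows[0])
    total = sscp(all_rows, col_means(all_rows))
    within = [[0.0] * k for _ in range(k)]
    for rows in groups.values():
        s = sscp(rows, col_means(rows))
        for i in range(k):
            for j in range(k):
                within[i][j] += s[i][j]
    return det(within) / det(total)

# Made-up data: two groups, two responses each (not the article's data set).
toy = {
    "c1": [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0]],
    "e1": [[4.0, 3.0], [5.0, 6.0], [6.0, 5.0]],
}
lam = wilks_lambda(toy)
```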
For the Options tab we need to select the grouping variable and how many bootstrap iterations we need:
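Specifying a grouping variable matters because, in a multi-group design, we usually want each bootstrap sample to keep the original group sizes. The Python sketch below illustrates that idea by resampling with replacement separately inside each group. It assumes this is how the grouping option behaves; the function name resample_within_groups and the rows shown are hypothetical.

```python
import random

def resample_within_groups(rows, group_of, rng):
    """Draw one bootstrap sample, resampling with replacement inside each group.

    `group_of` maps a row to its group label. Group sizes are preserved.
    Hypothetical helper for illustration; not an S-Plus function.
    """
    by_group = {}
    for row in rows:
        by_group.setdefault(group_of(row), []).append(row)
    resample = []
    for members in by_group.values():
        # Draw as many rows as the group originally had.
        resample.extend(rng.choice(members) for _ in members)
    return resample

# Made-up rows of (group label, response); not the article's data set.
rows = [
    ("c1", 3.1), ("c1", 2.7), ("c1", 3.5),
    ("e1", 4.0), ("e1", 4.4),
    ("e2", 5.2), ("e2", 4.9), ("e2", 5.5),
]
rng = random.Random(1)
boot_rows = resample_within_groups(rows, lambda r: r[0], rng)
```

Repeating this draw once per bootstrap iteration, and recomputing the statistic each time, gives the replicates that the report summarizes.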
For the Results tab we select empirical percentiles:
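Empirical percentiles are simply quantiles of the bootstrap replicates themselves. As a rough Python sketch, the function below reads off the 2.5th, 5th, 95th, and 97.5th percentiles by linear interpolation between order statistics. S-Plus may use a different interpolation rule, and the name empirical_percentiles and the simulated replicates are hypothetical stand-ins.

```python
import random

def empirical_percentiles(replicates, probs=(0.025, 0.05, 0.95, 0.975)):
    """Return {p: quantile} using linear interpolation between order statistics.

    Hypothetical helper for illustration; not an S-Plus function.
    """
    ordered = sorted(replicates)
    n = len(ordered)
    result = {}
    for p in probs:
        pos = p * (n - 1)          # fractional position in the sorted list
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        result[p] = ordered[lo] + frac * (ordered[hi] - ordered[lo])
    return result

# Simulated stand-in replicates (standard normal draws), not real bootstrap output.
rng = random.Random(42)
fake_replicates = [rng.gauss(0.0, 1.0) for _ in range(1000)]
limits = empirical_percentiles(fake_replicates)
interval_95 = (limits[0.025], limits[0.975])
```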
For the Plot tab we select Normal Quantile-Quantile to see how well the bootstrap sampling distribution agrees with normal theory.
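A normal quantile-quantile plot pairs each sorted replicate with the standard normal quantile at the same rank; points falling near a straight line suggest an approximately normal distribution. The Python sketch below computes those pairs. The plotting position (i + 0.5) / n is one common convention and is an assumption here, as are the function name normal_qq_points and the simulated input.

```python
import random
from statistics import NormalDist

def normal_qq_points(values):
    """Return (theoretical normal quantile, observed value) pairs for a Q-Q plot.

    Hypothetical helper for illustration; not an S-Plus function.
    """
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        (std_normal.inv_cdf((i + 0.5) / n), v)
        for i, v in enumerate(ordered)
    ]

# Simulated stand-in values, not real bootstrap output.
rng = random.Random(7)
values = [rng.gauss(0.0, 1.0) for _ in range(200)]
qq = normal_qq_points(values)
```

Departures from the line at either end of the plot point to tails that are heavier or lighter than the normal distribution predicts.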
Selecting OK generates the following report window:
And the following plots:
We see that the bootstrap sampling distribution for Wilks' lambda follows normal theory fairly closely except in the right tail. Both the 2.5th/97.5th and the 5th/95th percentile intervals contain the observed value of Wilks' lambda. We take this as a failure to reject the null hypothesis for Wilks' lambda. In general, the BCa percentiles will be more accurate than the empirical percentiles.
Davison, A.C. and Hinkley, D.V. (1997). Bootstrap Methods and Their Application. Cambridge University Press.
Efron, B. and Tibshirani, R. J. (1993). An Introduction to the Bootstrap. San Francisco: Chapman & Hall.