Bootstrapped confidence interval for the Independent t-test
The following covers how to conduct a bootstrapped resampling procedure to get confidence intervals for a t-test. Use File, Import Data... to import the Example Data 1 file using the Import Wizard, with SPSS File (*.sav) as the source and example1 as the member name, as was done previously.
Let's start by getting a look at the data and variables of interest.
PROC PRINT DATA = example1;
RUN;
PROC MEANS DATA = example1;
CLASS candy;
VAR recall1;
RUN;
1. Next, we can conduct the t-test. We can use PROC TTEST to examine differences between two independent groups. Notice that the output gives t-values both for variances assumed equal and for variances not assumed equal. Run the independent groups t-test.
PROC TTEST DATA=example1;
CLASS candy;
VAR recall1;
RUN;
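For reference, the two t-values in the PROC TTEST output correspond to the standard pooled and Satterthwaite (unequal variance) statistics. The sketch below uses the usual textbook notation, which is introduced here for illustration and does not appear in the SAS output: x-bar, s-squared, and n with subscript 1 or 2 are the mean, variance, and size of recall1 in each candy group. It assumes the amsmath package for \text.

```latex
% Equal variances assumed (the "Pooled" row of the output)
\[
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
t_{\text{pooled}} = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},
\qquad
df = n_1 + n_2 - 2
\]

% Equal variances not assumed (the "Satterthwaite" row of the output)
\[
t_{\text{unequal}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}},
\qquad
df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}
          {\frac{\left(s_1^2 / n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2 / n_2\right)^2}{n_2 - 1}}
\]
```

When the two group variances and sizes are similar, the two rows give nearly the same t-value and degrees of freedom.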
2. Build the macro that runs the bootstrapped resampling with 1000 resamples (yes, this takes some time to type!).
%MACRO bootse (b);
DATA orig1 (WHERE = (candy = 1))
orig2 (WHERE = (candy = 2));
SET example1;
RUN;
DATA boot;
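%DO t = 1 %to 2;
/* Resample each group with replacement. CEIL keeps pt between 1 and the group size; */
/* ROUND could return 0, which is not a valid POINT= value. */
DO sample = 1 to &b;
DO i = 1 to nobs&t;
pt = CEIL(RANUNI(&t) * nobs&t);
SET orig&t NOBS = nobs&t POINT = pt;
OUTPUT;
END;
END;
%END;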
STOP;
RUN;
PROC MEANS
DATA = boot
NOPRINT
NWAY;
CLASS sample candy;
VAR recall1;
OUTPUT out = x
MEAN = mean;
RUN;
DATA diffmean;
MERGE x (WHERE = (candy = 1) RENAME = (mean = mean1))
x (WHERE = (candy = 2) RENAME = (mean = mean2));
BY sample;
diffmean = mean1 - mean2;
RUN;
PROC MEANS
DATA = diffmean
STD;
VAR diffmean;
OUTPUT out = bootse
STD = bootse;
RUN;
%MEND;
%bootse (1000);
DATA bootorig;
SET example1 (in = a)
boot;
if a THEN sample = 0;
RUN;
PROC MEANS
DATA = bootorig
NOPRINT
NWAY;
CLASS sample candy;
VAR recall1;
OUTPUT out = x
mean = mean
var = var
n = n;
RUN;
DATA diff_z;
MERGE x (WHERE = (candy = 1) RENAME = (mean = mean1 var = var1 n = n1))
x (WHERE = (candy = 2) RENAME = (mean = mean2 var = var2 n = n2));
BY sample;
diffmean = mean1 - mean2;
diffse = sqrt ((var1 + var2) / (n1 + n2));
RETAIN origdiff;
IF sample = 0 THEN origdiff = diffmean;
diff_z = (diffmean - origdiff) / diffse;
RUN;
PROC SORT
DATA = diff_z;
BY diff_z;
RUN;
DATA t_vals;
SET diff_z END = eof;
RETAIN t_lo t_hi;
IF _n_ = 975 THEN t_lo = diff_z;
IF _n_ = 25 THEN t_hi = diff_z;
IF eof THEN OUTPUT;
RUN;
DATA ci_t;
MERGE diff_z (WHERE = (sample = 0))
bootse (KEEP = bootse)
t_vals (KEEP = t_:);
conf_lo = origdiff - (t_lo * bootse);
conf_hi = origdiff - (t_hi * bootse);
KEEP origdiff bootse t_lo t_hi conf_lo conf_hi;
RUN;
3. Finally, we can pull out the confidence interval limits.
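In formula form, the limits stored in ci_t are a bootstrap-t interval. The sketch below only restates what the SAS code above computes; the symbols are introduced here for illustration, with the matching SAS variable names given in the comments, and the amsmath package is assumed for \text and \operatorname.

```latex
% \hat{d} = origdiff: mean difference in the original data (sample = 0)
% d^*_b   = diffmean for bootstrap sample b, with b = 1, ..., B (B = 1000 here)
\[
\hat{d} = \bar{x}_1 - \bar{x}_2,
\qquad
\widehat{SE}_{\text{boot}} = \operatorname{sd}\left(d^*_1, \dots, d^*_B\right) % bootse
\]

% Studentized difference for each sample, with diffse exactly as the DATA step defines it
\[
z^*_b = \frac{d^*_b - \hat{d}}
             {\sqrt{\left(s^{*2}_{1,b} + s^{*2}_{2,b}\right) / \left(n_1 + n_2\right)}} % diff_z
\]

% z^*_{(k)} is the k-th smallest diff_z; t_lo holds z^*_{(975)} and t_hi holds z^*_{(25)}
\[
\text{conf\_lo} = \hat{d} - z^*_{(975)}\,\widehat{SE}_{\text{boot}},
\qquad
\text{conf\_hi} = \hat{d} - z^*_{(25)}\,\widehat{SE}_{\text{boot}}
\]
```

The upper quantile is subtracted to get the lower limit and the lower quantile to get the upper limit, which is why t_lo is read from row 975 and t_hi from row 25 of the sorted data.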
PROC PRINT DATA = ci_t;
RUN;
4. With all due respect to the SAS Institute, that's a ridiculous amount of code compared to what is necessary to do essentially the same thing in R. See the Do It Yourself Introduction to R course, specifically Module 5, which covers t and F tests. The comments and code below were adapted from that module.
### Robust t-test.
# First create an object (called 'x1' here) to show each group of Candy on Recall1.
x1 <- split(Recall1, Candy)
# Load required library(WRS).
library(WRS)
# Robust t-test (Yuen bootstrapped t-test) with 20% trimming and 1000 bootstrapped resamples; side=T requests the symmetric two-sided interval rather than the equal-tailed one.
yuenbt(x1$Skittles, x1$None, tr=.20, alpha=.05, nboot=1000, side=T)
