

This is a reprint (with a few small changes) of an article that originally appeared in Benchmarks Online in March 2005. You can link to the last RSS article here: Ade4TkGUI - A GUI for Multivariate Analysis and Graphical Display in R. - Ed.

Using Statistical Software in Classroom Instruction: S-Plus/R, An Accessible, Low Cost Alternative

By Dr. Rich Herrington, Research and Statistical Support Services Manager

The choice of which statistical package to use in an introductory or advanced statistics course can be determined by a number of considerations:
Which statistics package is the instructor most comfortable with?
Popularity of the statistics package
Goals of the intended student user: will the student be doing more involved research and development, or will they be engaging in intermittent, cursory usage?
Ease of use: are there drop-down menus? How easy is the syntax/language to learn?
Flexibility
Cost for the student
Will the student be using modern, advanced statistical technologies, or will they be relying mostly on well-known classical methods?
How important are high-quality, publication-ready graphics (both exploratory and classical)?
Availability of the software during course work, and after the student leaves the academic institution
Is there an active, supportive community of users?
How available are documentation, tutorials, and books?
Are there statistics textbooks that cover software usage along with theory?

These are only a few of the considerations involved in selecting a statistics package for a statistics course. In this article, we bring two data analysis/statistical systems to the attention of educators: "S-Plus" (the commercial version of the "S" language) and "R" (a free, open-source implementation of the "S" language). We discuss the cost and availability of S-Plus and R to the community of UNT researchers, instructors, and students.

S-Plus

S-Plus incorporates the object-oriented language S, developed by the statistics research group at AT&T Bell Labs (now Lucent Technologies). Marketed by Insightful Corp., S-Plus fits statistical models as "objects", making data analysis much more flexible than the older, procedural language approach (e.g. SPSS, SAS). S-Plus incorporates a highly usable graphical user interface (see this online tutorial for examples), along with the capability of script-based processing. Additionally, S-Plus allows the user to "interact" with data and graphics through a command line interface. The figure below provides an example of the S-Plus GUI:

S-Plus has an active worldwide user community (S-News). Additionally, Insightful Corp. provides online versions of all S-Plus documentation (this documentation is also installed locally with the software). Students, instructors, and researchers will be glad to know that many books and tutorials have been published on the S-Plus system.

Advanced researchers should be excited about the continuing expansion of the S-Plus system with the newest statistical technologies available. Insightful Corp. provides numerous "experimental" research libraries for download at no charge. Currently, these libraries include: S+CorrelatedData (mixed effects generalized linear models), S+Best (B-spline methods), S+Resample (bootstrap library), S+Bayes (Bayesian analysis), and S+FDA (functional data analysis). Many of the libraries utilize both a "drop-down" GUI menu system and a command line interface
approach.

One library that could be particularly useful to introductory statistics instructors is the S+Resample library. A current trend in statistics education is to use resampling methods (e.g. bootstrap and permutation methods) to illustrate empirical sampling distributions and the nonparametric confidence intervals based on them. One notable example: Tim Hesterberg and co-authors have teamed up with David Moore and George McCabe, the authors of the highly acclaimed "Introduction to the Practice of Statistics, Fifth Edition", to produce a book chapter that integrates the bootstrap into the statistics curriculum at an elementary level. This book chapter utilizes the S+Resample library to make resampling methods easily accessible at an introductory statistics level. Tim Hesterberg has also written about using resampling and simulation methods in teaching statistics.

Researchers
who are interested in "data mining" methodologies can use S-Plus in conjunction with Insightful Corp.'s "Insightful Miner" product to explore undetected patterns in massive datasets. A quick search on the Google search engine suggests that S-Plus is a popular system for research and instruction (e.g. a search on "S-Plus" returned 482,000
hits).

Pricing and Availability of S-Plus at the University of North Texas

Students can purchase an "Academic" version of S-Plus at the UNT University Bookstore for $25. This is a specially licensed copy of S-Plus (for the UNT campus, Microsoft Windows version) that has all the features of S-Plus "Professional", except that it expires one year after installation. Insightful Corp. also provides a "Student" version of S-Plus that is freely available at http://elms03.eacademy.com/splus/ This version of S-Plus is free and has the full statistical functionality of the academic version, but it: 1) has a 20,000 cell or 1,000 row limitation; 2) is only for educational use; 3) expires after one year; 4) requires a large download (more than 100 MB). Students register at the website, download the software, and are given a license code that enables the software. The "Student" version of S-Plus is an attractive alternative to the "Academic" version for instructors teaching a distance learning course whose students are unable to purchase S-Plus from the bookstore.

For full-time faculty, S-Plus can be obtained at no cost from the Research and Statistical Support Office (RSS) at UNT. S-Plus is gaining in popularity (it is already a favorite amongst professional statisticians). It excels in incorporating modern statistical methodology while maintaining a large inventory of classical statistical methodologies, and there are many tutorials, advanced methodology books, and introductory statistics textbooks that incorporate S-Plus.
S-Plus compares favorably on all of the software-choice considerations enumerated above. That is, S-Plus can accommodate both novice users and heavily research-oriented practitioners of statistics.

R

R is an open-source initiative whose aim is to create and distribute the same high quality, "cutting-edge" statistical technology that S-Plus is known for (see the R homepage). Quoting from the R homepage:

"R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R. R provides a wide variety of statistical (linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity. One of R's strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control. R is available as Free Software under the terms of the Free Software Foundation's GNU General Public License in source code form. It compiles and runs on a wide variety of UNIX platforms and similar systems (including FreeBSD and Linux), Windows and MacOS."

As
a free alternative to S-Plus, R cannot be beat. Available to the R system are hundreds of user-contributed libraries that cover large areas of both classical and modern statistics (see UNT's R server help page on installed packages). While S-Plus excels at providing advanced functionality through a menu system, R excels in providing breadth in statistical functionality (e.g. our own RSS R Server has 587 libraries installed). Much of this statistical functionality is not duplicated for the S-Plus environment. Partly, this is a result of the R system being an open-source project. Since the R source code is available to developers of statistical technology, much integration of R with existing statistical tools, databases, and operating systems has occurred. The "Omegahat" project is the prime example of such efforts. From the
Omegahat website:

"Omega is a joint project with the goal of providing a variety of open-source software for statistical applications. The Omega project began in July, 1998, with discussions among designers responsible for three current statistical languages (S, R, and Lisp-Stat), with the idea of working together on new directions with special emphasis on web-based software, Java, the Java virtual machine, and distributed computing. We encourage participation by anyone wanting to extend computing capabilities in one of the existing languages, to those interested in distributed or web-based statistical software, and to those interested in the design of new statistical languages."

R's
integration with web servers should be of particular interest to instructors who are interested in web-based statistics courses. For a number of years now, I have been using a modified version of Rcgi to create online, interactive tutorials for Benchmarks articles and introductory statistics courses. Our RSS Matters column has a number of examples of using R to create interactive tutorials: robust statistics, kernel density estimation, false detection rate, robust correlation, and the bootstrap, to name a few.

If, as an
instructor, you are concerned about the lack of a default drop-down menu system for R, some efforts have gone toward developing a GUI for the R system. The most notable of these efforts is John Fox's R Commander (see our past Benchmarks articles on this GUI: Article 1; Article 2; Article 3; these articles are somewhat dated). See the main R Commander website for the most recent updates. R Commander uses both a drop-down menu system and a script window.
Similar to other statistical packages, R Commander pastes syntax into a syntax editor whenever the contents of a menu window have been submitted. This gives easy access to default syntax (via the GUI) while letting the user see, change, and save the syntax for later submission. This facilitates learning to program in the "S" language. A couple of examples of R Commander's interface are presented below:

Like the S-Plus user community, the R user community is highly active as well (R-help). In addition, the R developers publish a high-quality, edited newsletter that covers software development news, R package development and usage, as well as the usual tips and hints about using R.
The user community is also quite generous in providing free tutorials, books, and documents on R. R's documentation is of very high quality as well. The basic R language is well documented with examples that can be executed as is, then modified as the user needs. For example, a regression, ANOVA, or ANCOVA model can be fit with the "lm" function. The help page for lm gives the user an example that can be executed by pasting the text into the R console, then altered as needed. The "foreign" package gives users the ability to import other file formats: SAS, SPSS, Stata, Minitab, and SYSTAT, to mention some of the more common formats available. R's base language is mostly compatible with the S-Plus base language (perhaps greater than 95%). That is, most code written with the base R language will run unaltered in S-Plus and vice versa. It is not inconceivable that a student or researcher would use both R and S-Plus in conjunction with one another. A "task" view of the organization of R packages can be found at task view.
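
To make the "lm" and "foreign" remarks above concrete, here is a minimal sketch that can be pasted into the R console. It is an illustration under our own assumptions, not code from the original article: the built-in data sets cars and mtcars, the model formulas, and the temporary file are chosen only for demonstration, and the last part assumes the "foreign" package is installed (it normally ships with R).

```r
# Simple linear regression on the built-in 'cars' data:
# stopping distance modeled as a function of speed.
fit <- lm(dist ~ speed, data = cars)
print(coef(fit))      # intercept and slope
print(summary(fit))   # standard errors, t-tests, R-squared

# The same function fits ANOVA/ANCOVA-style models when a predictor
# is a factor. Treating 'cyl' as a factor is our choice for illustration.
fit_ancova <- lm(mpg ~ factor(cyl) + wt, data = mtcars)
print(anova(fit_ancova))

# The example from the lm help page can be run with:
# example(lm)

# Round trip through the Stata file format with the 'foreign' package.
# A temporary file is used so nothing is left behind on disk.
library(foreign)
tmp <- tempfile(fileext = ".dta")
write.dta(mtcars, tmp)          # export a data frame to Stata format
mtcars_back <- read.dta(tmp)    # import it again
str(mtcars_back)
```

Files from other packages are read in the same spirit, for instance with read.spss for SPSS files; the help page of each function lists the options that apply to that format.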

Conclusion

In summary, R compares favorably with S-Plus (and is arguably superior in some ways). With regard to some of the statistical-software choice considerations enumerated at the beginning of this article:
1) Both S-Plus and R are readily available and inexpensive to the student and instructor;
2) Both S-Plus and R are readily available to instructor and student;
3) Both S-Plus and R are inexpensive alternatives to more popular statistical packages (e.g. SAS, SPSS, Stata);
4) Both S-Plus and R excel at providing a broad range of classical and modern statistical methodologies;
5) S-Plus utilizes an advanced menu system that is more accessible to students; however, R is gaining some ground on that issue;
6) Both S-Plus and R can accommodate a range of users from novice to advanced, that is, both cursory users and researchers;
7) Both S-Plus and R have high-quality documentation and textbook usage;
8) The user communities of both S-Plus and R are highly active and accessible to both student and researcher;
9) S-Plus and R are already favorites amongst theoretical and applied statisticians, and both of these systems are becoming increasingly important in the environmental, biological, medical, and social sciences, as evidenced by the increase in classes being taught utilizing these environments and the increase in statistical texts being published (for example, Bayesian methods have become increasingly important, and R has many supporting packages for teaching Bayesian methods);
10) And most importantly: THE PRICE IS
RIGHT!

Resources

Faraway, Julian (2006). Extending the Linear Model with R, CRC Press.
Jureckova, J. & Picek, J. Robust Statistical Methods With R, CRC Press.
Wood, Simon (2006). Generalized Additive Models: An Introduction With R, CRC Press.
Everitt, Brian S. (2005). An R and S-Plus Companion to Multivariate Analysis, Springer.
Faraway, Julian (2005). Linear Models with R, CRC Press.
Good, Phillip (2005). Introduction to Statistics Through Resampling Methods and R/S-Plus, Wiley.
Heiberger, R.M. & Holland, Burt (2004). Statistical Analysis and Data Display: An Intermediate Course with Examples in S-Plus, R and SAS, Springer.
Verzani, John (2005). Using R for Introductory Statistics, CRC Press.
Crawley, Michael (2002). Statistical Computing: An Introduction to Data Analysis Using S-Plus, Springer.
Dalgaard, Peter (2002). Introductory Statistics with R, Springer.
Krause, A. & Olson, M. (2002). The Basics of S-Plus, Third Edition, Springer.
Venables, W.N. & Ripley, B.D. (2002). Modern Applied Statistics with S, Fourth Edition, Springer.
Pinheiro, J.C. & Bates, D.M. (2000). Mixed-Effects Models in S and S-Plus, Springer.

Special Announcements:
RSS will be maintaining a blog devoted to research and statistics related news (RSSBlogs). Additionally, RSS will be maintaining a Zope/Plone website devoted to organizing communities and resources involved in survey research (RSSSurveys).

