This is a reprint (with a few small changes) of an article that originally
appeared in Benchmarks in
March 2005. You can link to the last RSS article here:
Ade4TkGUI - A GUI for Multivariate Analysis and Graphical
in R - Ed.
Software in Classroom
S-Plus/R, An Accessible, Low Cost Alternative
Dr. Rich Herrington,
and Statistical Support Services Manager
The choice of which statistical package to use in an introductory
statistics or advanced statistics course can involve
a number of considerations:
Which statistics package is the instructor most comfortable with?
Popularity of the statistics package
Goals of the intended student user - will the student be doing more involved research and development, or will they be engaging in
Ease of use - are there drop-down menus? How easy is the syntax/language to learn?
Cost for the student
Will the student be using modern, advanced statistical technologies, or will they be relying mostly on well-known classical methods?
How important are high quality, publication-ready graphics (both exploratory and classical)?
Availability of the software during course work, and after the student leaves the academic setting
Is there an active, supportive community?
How available are tutorials and books?
Are there statistics textbooks that cover software usage along with theory?
These are only a few of the considerations involved in selecting a statistics
package for a statistics course.
In this article, I bring
two data analysis/statistical systems to the attention of educators:
"S-Plus" (the commercial
version of the "S" language) and the
open-source "R" (a free implementation of the "S" language). I also
discuss the cost and availability of S-Plus and R to the community of
UNT researchers, instructors, and students.
S-Plus is based on the
object-oriented language S, developed at AT&T Bell Labs.
Marketed by Insightful
Corp., S-Plus fits
statistical models as "objects", making data analysis more
flexible than the older, procedural language approach (e.g. SPSS, SAS).
S-Plus incorporates a highly usable graphical user interface (see
this online tutorial for examples), along with the flexibility
of script-based processing. Additionally, S-Plus allows the user
to "interact" with data and graphics
through a command line
interface. The figure below provides an example of the S-Plus interface.
S-Plus has an active world-wide user community.
Additionally, Insightful Corp. provides
online versions of its
documentation (this documentation is also installed locally with the
installation). Students, instructors, and researchers
will be glad to know that many books
have been published on the S-Plus system.
Advanced researchers should be excited about the
expansion of the S-Plus system with the newest statistical technologies, offered as
"experimental" research libraries at no charge for download.
Currently, these libraries include: a library for mixed
effects generalized linear models, S+Best,
S+Resample (bootstrap library), S+Bayes (Bayesian analysis), and S+FDA (functional
data analysis). Many of the libraries provide
both a "drop-down" GUI menu system and a command line
approach. One library that could be particularly
useful to introductory statistics
instructors is the S+Resample library. A
current trend in statistics education is to use resampling methods (e.g.
bootstrap & permutation methods) to illustrate empirical sampling
distributions and non-parametric
confidence intervals based on the
empirical sampling distribution. One notable example: Tim Hesterberg and
co-authors have teamed up with the authors of the highly acclaimed
"Introduction to the Practice of Statistics, Fifth Edition" by
David Moore and George McCabe,
to produce a book chapter that
integrates the bootstrap into the statistics curriculum at an introductory level.
This book chapter utilizes the S+Resample
library to provide easy accessibility to resampling methods at the
introductory statistics level. Tim Hesterberg has also
written on resampling and simulation methods in teaching statistics. Researchers
who are interested in "data-mining"
methodologies can use S-Plus in
conjunction with Insightful Corp.'s
"Insightful Miner" product to explore
massive datasets. A quick Google search suggests that S-Plus
is a popular system for research and instruction:
a search on "S-Plus" returned 482,000 results.
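As a rough illustration of the resampling idea described above (an empirical sampling distribution and a non-parametric confidence interval built from it), here is a minimal sketch that uses only base functions of the S language as implemented in R. It does not use the S+Resample library, and I have not run it in S-Plus. The data vector x is simulated and purely hypothetical, and the number of resamples B is an arbitrary choice.

```r
# Percentile bootstrap confidence interval for a mean, base R only.
set.seed(1)                          # make the simulation reproducible
x <- rexp(50, rate = 1/10)           # hypothetical skewed sample, n = 50
B <- 2000                            # number of bootstrap resamples (arbitrary)

# Resample the data with replacement B times and record each resample's mean.
boot_means <- replicate(B, mean(sample(x, replace = TRUE)))

# The middle 95% of the bootstrap means gives a percentile interval.
ci <- quantile(boot_means, probs = c(0.025, 0.975))

# Plot the empirical (bootstrap) sampling distribution of the mean.
hist(boot_means, main = "Bootstrap distribution of the mean")
print(ci)
```

In a classroom setting, students can change the statistic (for example, median in place of mean) and watch how the shape of the bootstrap distribution and the width of the interval respond.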
Availability of S-Plus at the
University of North Texas
Students can purchase
an "Academic" version of
S-Plus at the UNT University Bookstore for $25. This is a
specially licensed copy of S-Plus (for UNT campus) that expires one
year after installation (Microsoft Windows). The
academic version has all the features of S-Plus "Professional", except
that it expires one year
after installation. Insightful
Corp. also provides a "Student" version of S-Plus that is freely available at
http://elms03.e-academy.com/splus/ This version of S-Plus is
free, and has the full statistical functionality of the Professional
version, but: 1) Has a 20,000 cell or 1,000 row limit;
2) Is only for educational use;
3) Expires after one year; 4) Is
a large download (more than 100 MB). Students register at the site,
download the software, and are given a license code that
enables the software. The "Student" version of S-Plus is an
attractive alternative to the "Academic" version of S-Plus for those
instructors teaching a distance
learning course where
students are unable to purchase S-Plus from the UNT University Bookstore.
For full-time faculty,
S-Plus can be obtained at no cost from
the Research and
Statistical Support Office
(RSS) at UNT.
S-Plus is gaining in
popularity (it is
already a favorite amongst professional statisticians); S-Plus excels at modern
statistical methodology while maintaining a large inventory of
classical statistical methodologies; there are
tutorials, advanced methodology books, and introductory
statistics textbooks that incorporate S-Plus; and S-Plus
compares favorably on all the software-choice considerations enumerated
above. That is, S-Plus
can accommodate both novice users and
heavily research-oriented practitioners of statistics.
R is an open-source initiative whose aim
is to create and distribute the same high quality, "cutting-edge"
technology that S-Plus is known for (see the R homepage).
From the R homepage:
R is a language and
environment for statistical computing and graphics. It is a GNU project which is
similar to the S language and environment which was developed at Bell Laboratories
(formerly AT&T, now Lucent Technologies) by John
Chambers and colleagues. R can be
considered as a different
implementation of S. There are some important differences, but much code
written for S runs unaltered under R.
R provides a
wide variety of statistical (linear and nonlinear modeling, classical statistical tests,
time-series analysis, classification,
clustering, ...) and graphical techniques, and is highly extensible. The
S language is often the vehicle of choice for research in statistical
methodology, and R provides
an Open Source route to participation
in that activity.
One of R's
strengths is the ease
with which well-designed publication-quality plots can be produced,
including mathematical symbols and formulae where
needed. Great care has been taken over the
defaults for the minor
design choices in graphics, but the user retains full control.
R is available as Free Software under the terms of the Free Software Foundation's GNU General Public
License in source code form. It compiles and runs on a wide variety of UNIX
platforms and similar systems (including FreeBSD and
Linux), Windows and MacOS.
As an alternative to S-Plus, R cannot be beat. Available to the R
system are hundreds of user-contributed
libraries that cover
areas of both classical and modern statistics (see our R
server help page on installed packages). While S-Plus excels
at providing advanced functionality
through a menu system, R excels
in providing breadth in statistical functionality (e.g. our own RSS R Server has
587 libraries installed). Much of this statistical functionality
is not duplicated for the S-Plus environment. This is a
result of the R system being an open-source project. Since the
source code is available
to developers of statistical software,
much integration of R with existing statistical tools, databases, and
systems has occurred. The "Omegahat"
project is the prime example of such efforts. From the Omegahat website: Omega
is a joint project
with the goal of providing a variety of open-source software for
statistical applications. The Omega project began in July,
1998, with discussions among
designers responsible for three
current statistical languages (S, R, and Lisp-Stat), with the idea
of working together on new directions with special emphasis on web-based software,
Java, the Java virtual machine, and
distributed computing. We encourage participation by anyone
wanting to extend computing capabilities in one of the existing languages,
to those interested in distributed or web-based
statistical software, and to those interested
in the design of new statistical languages.
R's integration with web servers should be of particular interest to
instructors who are interested in online
courses. For a number of years now, I have been using
Rcgi to create online, interactive tutorials for Benchmarks and for
courses. Our RSS
column has a number of examples of using R to create interactive tutorials:
kernel density estimation,
false detection rate,
to name a few. If, as an
instructor, you are
concerned about the lack of a default drop-down menu system,
some efforts have gone toward
developing a GUI system for the R system. The most
notable of these efforts is John Fox's R Commander
(see our past Benchmarks
articles on this GUI, e.g.
Article 3; these articles are somewhat dated). See the
R Commander website for the most recent updates. R Commander provides
both a drop-down
menu system and a script window.
As in some other statistical packages, R Commander pastes
syntax into a
syntax editor whenever the contents of a menu system window have been submitted.
This allows easy access to default
syntax (via a
GUI), but allows the user to see the syntax,
and save the syntax, for later submission. This facilitates
learning to program
in the "S" language. A couple of
examples of R
Commander's interface are presented below:
Like the S-Plus user
community, the R user community is highly active as well - R-HELP.
In addition, the R developers publish a high quality, edited newsletter with
development news, R package development, and
the usual tips and hints
about using R. The R
community is also quite generous in providing
books and documents on R. R's documentation
is very high quality as well.
The basic R
language is well documented with examples that can be executed as is, or
modified as the user needs. For example, a
regression, ANOVA, or ANCOVA
model can be fit with the "lm"
function. The help function for lm gives
the user an example
that can be executed by pasting the text into the R
console, then altered as needed. The
"foreign" package gives users the ability to import other data
formats: SAS, SPSS, Stata,
Minitab, SYSTAT, to name some
of the more common formats available. R's base language is
mostly compatible with the S-Plus base language; that
is, most code written with the base R language
will run unaltered in S-Plus and vice versa. It is not
inconceivable that a student or researcher would use both R and S-Plus
in conjunction with one another. A "task" view of the
organization of R packages can be found at task view.
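To make the point about the "lm" function concrete, here is a minimal sketch of fitting a regression, an ANOVA, and an ANCOVA model with it. The data sets (cars, PlantGrowth, and mtcars) ship with R; the choice of data sets and model formulas is mine, for illustration only, and is not taken from the lm help page.

```r
# Simple linear regression: stopping distance as a function of speed.
fit_reg <- lm(dist ~ speed, data = cars)
summary(fit_reg)

# One-way ANOVA: plant weight by treatment group (group is a factor).
fit_aov <- lm(weight ~ group, data = PlantGrowth)
anova(fit_aov)

# ANCOVA: fuel economy by number of cylinders, adjusting for car weight.
fit_ancova <- lm(mpg ~ wt + factor(cyl), data = mtcars)
anova(fit_ancova)

# The worked examples from the lm help page can be run directly with:
# example(lm)
```

Typing ?lm at the console opens the help page itself, whose example section can be pasted into the console and then altered as needed.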
In summary, R compares favorably with S-Plus (and is arguably
superior in some ways). Revisiting
some of the
statistical-software considerations enumerated at the beginning of this article:
1) Both S-Plus and R are inexpensive to the student and instructor;
2) Both S-Plus and R are readily available to instructor and student;
3) Both S-Plus and R are inexpensive alternatives to more popular packages (e.g. SAS, SPSS, Stata);
4) Both S-Plus and R excel at providing a broad range of classical and modern statistical methods;
5) S-Plus utilizes an advanced menu system that is more accessible to novices; however, R is gaining some ground on that issue;
6) Both S-Plus and R can accommodate a range of users from novice to expert, that is, both cursory users and researchers;
7) Both S-Plus and R have high quality documentation and textbook support;
8) The user communities of both S-Plus and R are highly active and helpful to both student and researcher;
9) Both S-Plus and R are already favorites amongst theoretical and applied statisticians, and both of these systems are becoming increasingly popular in the environmental, biological, medical, and social sciences, as evidenced by the increase in classes being taught utilizing these environments and the increase in statistical texts being published (for example, Bayesian methods have become increasingly popular, and R has many supporting packages for teaching Bayesian methods);
10) And most importantly - THE PRICE IS RIGHT!
Faraway, Julian (2006). Extending the Linear Model with R, CRC Press.
Jureckova, J. & Picek, J. (2006). Robust Statistical Methods With R, CRC Press.
Wood, Simon (2006). Generalized Additive Models: An Introduction With R, CRC Press.
Everitt, Brian S. (2005). An R and S-Plus Companion to Multivariate Analysis, Springer.
Faraway, Julian (2005). Linear Models with R, CRC Press.
Good, Phillip (2005). Introduction to Statistics Through Resampling Methods and R/S-Plus, Wiley.
Heiberger, R.M. & Holland, Burt (2004). Statistical Analysis and Data Display: An Intermediate Course with Examples in S-Plus, R and SAS, Springer.
Verzani, John (2005). Using R for Introductory Statistics, CRC Press.
Crawley, Michael (2002). Statistical Computing: An Introduction to Data Analysis Using S-Plus, Wiley.
Dalgaard, Peter (2002). Introductory Statistics with R, Springer.
Krause, A. & Olson, M. (2002). The Basics of S-Plus, Third Edition, Springer.
Venables, W.N. & Ripley, B.D. (2002). Modern Applied Statistics with S, Fourth Edition, Springer.
Pinheiro, J.C. & Bates, D.M. (2000). Mixed-Effects Models in S and S-Plus, Springer.
RSS will be maintaining a blog devoted to research and statistics
related news - RSS-Blogs.
Additionally, RSS will be maintaining a Zope/Plone website devoted to
organizing communities and resources involved in survey research - RSS-Surveys.