Statistical Computing Tips: S-Plus

By Rich Herrington, Research and Statistical Support Services Consultant

The Research and Statistical Support Group in Academic Computing Services will be supporting the S-Plus statistical package this fall. Additionally, short courses on S-Plus will be provided. In this column, we give a brief introduction to this product.

An object oriented programming environment for statistics and data analysis

S-Plus is a value-added commercial version (sold by MathSoft, Inc.) of the programming environment "S" developed by Richard A. Becker, John M. Chambers, and Allan R. Wilks of AT&T Bell Laboratories Statistics Research Department (now Lucent Technologies). S-Plus has an integrated suite of software facilities for data manipulation, computation, and graphical display. Among the facilities included (as of version 4.5) are:
S-Plus (as of version 4.0) includes a graphical interface to most of the data manipulation and statistical algorithms available in S-Plus (for a demonstration of the graphical capabilities of S-Plus, refer to http://www.mathsoft.com/splus/splsprod/splsdes.html). However, the real power of S-Plus lies in its programmability through the command interface. The S-Plus language is interactive: each command is executed as it is entered. The language is based on the use of functions to perform calculations, set system options, manipulate graphical objects, fit statistical models, and so on. Variables can refer to scalar values, vectors, matrices, or other forms such as lists (arbitrary collections of scalars, vectors, matrices, or other objects). Most importantly, the S-Plus language is an object oriented programming language. Read "Object-Oriented Programming in S-PLUS" for a discussion of object oriented programming in S-Plus.

The S-Plus User Community

Perhaps the most valuable feature of S-Plus is the active user community, which provides S-Plus code and programming advice on the S-news mailing list (to subscribe, send an electronic message to s-news-request@utstat.toronto.edu with the message "subscribe"). A fairly large collection of freeware S-Plus code can be found on the StatLib Web server. Moreover, S-Plus seems to be a popular choice among applied statisticians for the development of new statistical algorithms, and authors usually provide their S-Plus functions free of charge. For example, texts that come with complete S-Plus implementations of their statistical algorithms include:
General references on the S language are:
A more complete listing of third party texts on S-Plus is available at http://www.mathsoft.com/splsprod/Biblio.htm.

An Advantage for Educators

An advantage of the S or S-Plus language for educators is the existence of a freely available S clone: "R". R is a collaborative project whose purpose is to develop a freeware system for statistical computation and graphics. R has been heavily influenced by S, and the resulting language is very similar in appearance to S. Its underlying implementation and semantics, however, are derived from the programming language "Scheme". Even so, the majority of S code (as described in the general references above) will run in R unchanged. There is a WIN95/NT version of R available for download at http://www.stat.math.ethz.ch/R-CRAN/bin/ms-windows/win-32/rw0613b.zip. (Note: Some memory resident programs, such as virus checking software or previously loaded DLL files, will interfere with this WIN95/NT version of R. Pressing CTRL-ALT-DELETE brings up a menu from which one can selectively shut down these programs to find the offending one.)

Unlike S-Plus, R is not a value-added commercial application, and it lacks the GUI of S-Plus 4.0 and 4.5. R provides only a command interface and a graphics display. Additionally, it lacks quite a bit of the statistical functionality of its commercial counterpart. However, R is an ongoing project and its functionality increases steadily. For the purposes of a course on statistical programming, R is hard to beat for its value and breadth (R can also be more efficient than S-Plus in its use of memory). More importantly, R source code and binaries exist for a number of platforms: UNIX, LINUX, Macintosh, and WIN95/NT. At some point in the future the Research and Statistical Support Office will support a UNIX version of R on Sol.
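To make the earlier description of the language concrete, here is a minimal sketch of S code covering the features mentioned in this column: variables that hold scalars, vectors, matrices, and lists; calculations done through functions; and object oriented dispatch. It is written to run in R and uses only basic S constructs, so it is expected (though not verified here) to behave the same way in S-Plus. The names std.error, describe, and the class "measurements" are hypothetical, chosen only for this illustration.

```r
# Variables can hold scalars, vectors, matrices, or lists.
x <- 3.5                        # a scalar (a vector of length one)
v <- c(2, 4, 6, 8)              # a numeric vector
m <- matrix(1:6, nrow = 2)      # a 2 x 3 matrix, filled column by column
info <- list(label = "demo", values = v, table = m)  # a list of mixed objects

# Calculations are carried out by calling functions.
v.mean <- mean(v)
m.dim <- dim(m)

# A user-defined function: the standard error of the mean.
# (std.error is a hypothetical name used only in this example.)
std.error <- function(y) {
    sqrt(var(y) / length(y))
}
v.se <- std.error(v)

# Object oriented dispatch: a generic function selects a method
# according to the class attribute of its argument.
describe <- function(obj, ...) UseMethod("describe")

# Fallback method, used when no class-specific method exists.
describe.default <- function(obj, ...) "a plain object"

# Method for objects of the (hypothetical) class "measurements".
describe.measurements <- function(obj, ...) {
    paste("a set of", length(obj$values), "values with mean", mean(obj$values))
}

s <- list(values = v)
class(s) <- "measurements"

describe(s)   # dispatches to describe.measurements
describe(x)   # no matching class, so describe.default is used
```

Typing each line at the command prompt executes it immediately, which is the interactive style described above. Object names use periods rather than underscores because older versions of S treat the underscore as an assignment operator.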