

RSS Matters
The All-encompassing SAS 8 (1/2)
By Dr. Karl Ho,
Research and Statistical Support Services Manager
Last year, when I wrote an evaluation note on SAS 7 (a
transitional release from SAS's first Windows generation,
6.x, to the current version, SAS 8), I fell short of
giving full coverage because of SAS's enormity: it is
composed of numerous modules and procedures. Another
reason was that SAS 7 was still a developer's release (a
"post-beta" beta version). One year later, as I set out to
evaluate the new SAS version 8, I have to say I am still
shy of giving a satisfactory report. The best I can do is
split the evaluation into two articles, just to introduce
the features that are new in version 8 alone.
The New SAS
The new SAS not only demonstrates a higher level of
stability under the MS Windows operating system (it is
geared for Windows 2000)*, it also introduces a wave of
new functionality and features that give the software a
facelift from its previous mainframe-adapted look.
Windows users may have shied away from SAS 7 in favor of
GUI-based packages such as SPSS or Statistica, since SAS
is known for its syntax-based operation. With the three
new add-on modules (SAS/Analyst, SAS/LAB, SAS/INSIGHT)
plus the 3D graphics procedure PROC G3D, I would declare
that SAS is now fully gooey (GUI). For instance, with
Analyst (Solutions > Analysis > Analyst), users can simply
import data in various formats and start analyzing in the
spreadsheet-like, explorer-style interface. A wide variety
of procedures are ready to use in Analyst, including
bivariate analyses (e.g. t-test, correlations, ANOVA) and
multivariate analyses (GLM, regression, power analysis,
principal components and survival models). Users can also
easily select samples from an existing data set and create
charts by pointing and clicking.
However, the comparative advantage of SAS still lies in
its advances in research and development, as exemplified
by the new data analysis procedures. Below I briefly
introduce the procedures that are new to release 8.1, with
some sample outputs.
Survey Sampling
When starting a survey, particularly a large-scale or
national survey, researchers are concerned with how to
draw samples from the population, and whether and how
weighting should be applied to underrepresented groups
(e.g. a certain socioeconomic status group in some
geographic areas) or overrepresented groups (e.g. the
upper-middle class among email recipients). SAS 8
introduces a new series of procedures that enable survey
researchers to select their survey samples using
different designs:
simple random
stratified
clustering
unequal weighting
PROC SURVEYSELECT selects samples via a variety of
methods ranging from simple random sampling to complex
multistage designs. With two more new procedures,
SURVEYMEANS and SURVEYREG, researchers can easily
estimate sample and population means, variances,
confidence limits, and other descriptive statistics, as
well as sampling errors and regression models, taking
into account the sampling design and weighting scheme
introduced in the sample selection process. (sample
output)
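To make the idea of design weights concrete, here is a
minimal sketch in Python (not SAS, and not the algorithm
used by PROC SURVEYSELECT or PROC SURVEYMEANS). It draws a
simple random sample within each stratum and then weights
each sampled unit by the number of population units it
represents. The population, the stratum names, and the
function names are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, n_per_stratum, seed=0):
    """Draw a simple random sample (without replacement) in each stratum.

    Returns (unit, weight) pairs, where weight = N_h / n_h is the number
    of population units that each sampled unit stands for.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for h in sorted(strata):
        members = strata[h]
        n_h = min(n_per_stratum[h], len(members))  # assumed to be >= 1
        weight = len(members) / n_h
        for unit in rng.sample(members, n_h):
            sample.append((unit, weight))
    return sample


def weighted_mean(sample, value_of):
    """Design-weighted estimate of the population mean."""
    total_weight = sum(w for _, w in sample)
    return sum(w * value_of(u) for u, w in sample) / total_weight


# Hypothetical population: 900 urban and 100 rural records (stratum, score).
population = [("urban", 50.0)] * 900 + [("rural", 20.0)] * 100

# The rural stratum is deliberately oversampled (20 of 100 vs. 30 of 900).
sample = stratified_sample(
    population,
    stratum_of=lambda u: u[0],
    n_per_stratum={"urban": 30, "rural": 20},
)

unweighted = sum(u[1] for u, _ in sample) / len(sample)
weighted = weighted_mean(sample, value_of=lambda u: u[1])
print(unweighted, weighted)
```

In this toy population the true mean is 47. The unweighted
sample mean is pulled toward the oversampled rural stratum,
while the weighted estimate recovers the population value,
which is the reason the analysis step has to know the
sampling design.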
Nonparametric Modeling
SAS incorporates in the newest version, 8.1, one of the
latest techniques for modeling nonlinear relationships:
nonparametric regression. It encompasses a suite of
nonparametric techniques including kernel density
estimation and loess smoothing. PROC KDE computes
nonparametric density estimates using the method of
kernel density estimation, saving the estimate for
subsequent plotting and analysis. PROC LOESS and PROC
TPSPLINE provide various smoothing methods for
conducting exploratory data analysis and fitting
nonparametric or semiparametric models.
Sample output: [figure not reproduced]
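For readers unfamiliar with kernel density estimation, the
following Python sketch shows the basic idea with a
Gaussian kernel and a fixed bandwidth. It is only an
illustration of the technique, not a reproduction of PROC
KDE; the data values, the bandwidth, and the function name
are made up.

```python
import math


def gaussian_kde(data, bandwidth):
    """Return a function that estimates the density at a point x.

    The estimate is the average of Gaussian bumps, one centred on each
    observation, each with standard deviation equal to the bandwidth.
    """
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density


# Hypothetical observations with two clusters.
data = [1.2, 1.9, 2.1, 2.4, 3.0, 5.8, 6.1, 6.5]
f = gaussian_kde(data, bandwidth=0.5)

# Evaluate on a grid, as one would before plotting the estimate.
step = 0.05
grid = [i * step for i in range(-100, 300)]
area = sum(f(x) for x in grid) * step  # should be close to 1

print(round(f(2.0), 3), round(f(4.5), 3), round(area, 3))
```

The bandwidth plays the role of a smoothing parameter: a
larger value blurs the two clusters together, a smaller
one produces a spikier estimate.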
Spatial Prediction
Variogram and two-dimensional kriging are used for
spatial analyses in geology, petroleum exploration,
mining, and water pollution analysis. PROC VARIOGRAM and
PROC KRIGE2D implement the spatial prediction of values
at unsampled locations from two-dimensional data, based
on spatial continuity.
Sample plots: [figures not reproduced]
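The notion of spatial continuity can be made concrete with
an empirical semivariogram: half the average squared
difference between measurements, grouped by the distance
separating them. The Python sketch below computes it for a
small made-up grid. It illustrates the general technique
only and makes no claim about how PROC VARIOGRAM bins or
reports its results; the function name and the data are
hypothetical.

```python
import math
from collections import defaultdict


def empirical_semivariogram(points, lag_width):
    """points: list of (x, y, z) measurements.

    Returns {lag_index: (mean_distance, semivariance, pair_count)}, where
    a pair falls in lag k if k*lag_width <= distance < (k+1)*lag_width.
    """
    acc = defaultdict(lambda: [0.0, 0.0, 0])  # [sum dist, sum sq diff, count]
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            d = math.hypot(xi - xj, yi - yj)
            k = int(d // lag_width)
            acc[k][0] += d
            acc[k][1] += (zi - zj) ** 2
            acc[k][2] += 1
    return {
        k: (s[0] / s[2], s[1] / (2 * s[2]), s[2])
        for k, s in sorted(acc.items())
    }


# Hypothetical 4 x 4 grid whose measured value rises steadily eastward.
points = [(float(x), float(y), float(x)) for x in range(4) for y in range(4)]
variogram = empirical_semivariogram(points, lag_width=1.0)

for lag, (dist, gamma, count) in variogram.items():
    print(lag, round(dist, 2), round(gamma, 3), count)
```

Nearby points differ little and distant points differ
more, so the semivariance grows with the lag. Kriging uses
a model fitted to this curve to weight neighbouring
observations when predicting at an unsampled location.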
Qualitative and Limited Dependent Variable Models
Researchers are very often faced with dependent
variables that are not continuous. These discrete
variables (sometimes called categorical choices) include
the choice of political party or presidential candidate
and the decision to take a bus or a train. One of the
most renowned examples is what the 2000 Nobel Prize
laureate, Daniel L. McFadden, has been studying since
1974: commuters' choice of transportation mode (**).
Multinomial logit and probit models estimate the
probability of each outcome of a limited dependent
variable, such as a commuter's choice between taking a
bus and driving a car. A new procedure in SAS/ETS is
introduced to estimate the family of discrete choice
models. PROC QLIM can analyze not only the regular
binary (two-choice) probit and logit models but also
other models in this family.
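As a small illustration of what a multinomial logit model
predicts, the Python sketch below turns a utility score
for each travel mode into choice probabilities. The
coefficients and the travel costs and times are invented
for the example; in practice they would be estimated from
data, and nothing here reflects the syntax or internals of
PROC QLIM.

```python
import math


def choice_probabilities(utilities):
    """Multinomial logit: P(j) = exp(V_j) / sum_k exp(V_k).

    Subtracting the largest utility first avoids overflow and does not
    change the resulting probabilities.
    """
    top = max(utilities.values())
    exps = {mode: math.exp(v - top) for mode, v in utilities.items()}
    total = sum(exps.values())
    return {mode: e / total for mode, e in exps.items()}


# Hypothetical coefficients on cost (dollars) and time (minutes).
BETA_COST = -0.4
BETA_TIME = -0.05

# Hypothetical attributes of each mode for one commuter.
modes = {
    "bus": {"cost": 1.5, "time": 45.0},
    "train": {"cost": 3.0, "time": 30.0},
    "car": {"cost": 6.0, "time": 20.0},
}

utilities = {
    mode: BETA_COST * a["cost"] + BETA_TIME * a["time"]
    for mode, a in modes.items()
}
probs = choice_probabilities(utilities)

for mode, p in probs.items():
    print(mode, round(p, 3))
```

With only two alternatives the same formula reduces to the
ordinary binary logit model.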
Other new tests/features include:
Exact Logistic Regression (sample output)
Exact tests: generating exact p-values directly, or
using Monte Carlo simulation (10,000 samples) to
estimate exact p-values.
Numerically Precise Regression (PROC ORTHOREG***): The
new procedure produces more numerically accurate
estimates than other regression procedures (e.g. REG,
GLM) when the data are ill-conditioned or badly scaled.
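The Monte Carlo approach to exact p-values mentioned above
can be sketched in a few lines of Python. The example uses
a two-sample permutation test on made-up data: instead of
enumerating every possible regrouping, it draws 10,000
random regroupings and counts how often the difference in
means is at least as large as the one observed. This is a
generic illustration under those assumptions, not the
procedure SAS uses for any particular test.

```python
import random


def monte_carlo_p_value(a, b, n_samples=10000, seed=0):
    """Estimate the permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    pooled = list(a) + list(b)
    n_a = len(a)

    def mean_gap(xs, ys):
        return abs(sum(xs) / len(xs) - sum(ys) / len(ys))

    observed = mean_gap(a, b)
    hits = 0
    for _ in range(n_samples):
        rng.shuffle(pooled)  # random reassignment of group labels
        # Small tolerance so ties are not lost to floating-point rounding.
        if mean_gap(pooled[:n_a], pooled[n_a:]) >= observed - 1e-12:
            hits += 1
    # Count the observed arrangement itself so the estimate is never zero.
    return (hits + 1) / (n_samples + 1)


# Hypothetical measurements for two small groups.
group_a = [12.1, 14.3, 13.8, 15.0, 14.9, 13.2]
group_b = [10.2, 11.1, 9.8, 10.9, 11.5, 10.4]

p = monte_carlo_p_value(group_a, group_b)
print(p)
```

With groups this small the full enumeration (924 ways to
split 12 values into two groups of 6) would also be
feasible; the simulation becomes useful when the number of
arrangements is too large to list.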
Next?
In the next article, I will introduce the
following new features:
Partial Least Squares
IML workshop
Multiple Imputation for Missing
Data
Distribution analysis
Robust regression
* I should mention that SAS for UNIX (version 8) delivers
at least as much as its Windows version. Given the limited
space, I focus only on the latter.
** McFadden, D. 1974. "The Measurement of Urban Travel
Demand." Journal of Public Economics, 3:303-328. Another
laureate, the econometrician James Heckman, is known for
the selection bias model, also called the Heckman model.
*** Orthogonal regression minimizes the distance between
the X/Y points taken together and the regression line,
but PROC ORTHOREG, despite its name, uses least squares.
Reference
An, Anthony and Donna Watts. 1998. "New SAS Procedures
for Analysis of Sample Survey Data." SUGI Proceedings.
"What's New in Data Analysis," SAS Research and
Development communities Web (http://www.sas.com/rnd/app/da/danew.html)
