SAS code
/*Simple examples of creating a dataset*/
/*Example 1: reading data in 'format-free' (list input, values separated by spaces)*/
DATA cars1;
INPUT make $ model $ mpg weight price;
CARDS;
AMC Concord 22 2930 4099
AMC Pacer 17 3350 4749
AMC Spirit 22 2640 3799
Buick Century 20 3250 4816
Buick Electra 15 4080 7827
;
RUN;
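/*With list input, character variables get a default length of 8, so longer
values are cut off. A minimal sketch of one way around that: a LENGTH
statement placed before INPUT. The dataset name cars1b and the Oldsmobile
row (including its numbers) are made up purely for illustration.*/
DATA cars1b;
LENGTH make $ 12 model $ 12;
INPUT make $ model $ mpg weight price;
CARDS;
AMC Concord 22 2930 4099
Oldsmobile Cutlass 19 3300 4800
;
RUN;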
/*Example 2: fixed format (each variable read from set columns)*/
DATA cars2;
INPUT make $ 1-5 model $ 6-12 mpg 13-14 weight 15-18 price 19-22;
CARDS;
AMC  Concord2229304099
AMC  Pacer  1733504749
AMC  Spirit 2226403799
BuickCentury2032504816
BuickElectra1540807827
;
RUN;
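/*The same fixed layout can also be read with column pointers and informats
('formatted input'). A minimal sketch: cars2b is a made-up dataset name, @n
moves to column n, $5. reads 5 characters, and 2. or 4. read numeric fields
of that width. Spacing in the data lines matters here.*/
DATA cars2b;
INPUT @1 make $5. @6 model $7. @13 mpg 2. @15 weight 4. @19 price 4.;
CARDS;
AMC  Concord2229304099
BuickElectra1540807827
;
RUN;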
/*Create a permanent dataset. To write (save) a SAS system file from a raw
data file, a LIBNAME statement and a DATA step are required. While this may
seem unintuitive, it can actually speed processing for extremely large
datasets, and SAS's data management ability is what makes it attractive to the
corporate world (but it also means SAS carries a lot of bloat that is typically
unnecessary for academic research, with corners cut elsewhere).
First make a folder on your computer where you want the file to go. In this
case I created one called 'carsdata'.
LIBNAME points to that folder, and 'out' is the libref, a nickname used to
refer to the folder in later steps. The name itself is arbitrary: I chose 'out'
because we are sending data there, and a libref called 'in' pointing at the
same folder would work just as well for reading data back.*/
LIBNAME out "C:\Documents and Settings\mjc0016\Desktop\5700\Code\carsdata";
/*The DATA step below saves out.carsdata in that folder, reading a raw file
laid out like the fixed-format car data above. The resulting data file can be
used by other programs like R or SPSS, though SPSS is notorious for not getting
the 'scale' correct: it arbitrarily treats numeric variables as nominal, and
then does not read them in at all. I've seen SPSS do that for every file type
and via cutting and pasting.*/
DATA out.carsdata;
INFILE "c:\sas\cars1.dat";
INPUT make $ 1-5 model $ 6-12 mpg 13-14 weight 15-18 price 19-22;
RUN;
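/*A libref is only a nickname for the folder, so the name is arbitrary. A
minimal sketch, assuming the same folder as above: in a later session, point
any libref at it (here 'in') and use the saved file directly.*/
LIBNAME in "C:\Documents and Settings\mjc0016\Desktop\5700\Code\carsdata";
PROC CONTENTS DATA=in.carsdata;
RUN;
PROC PRINT DATA=in.carsdata;
RUN;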
/*Reading in external text and Excel files is perhaps more easily done via the
File menu. Once you do so, the dataset is created in the Work folder of
'Libraries'; clicking it there brings up a spreadsheet view that can then be
edited.*/
/*Reminder: as long as you have a straightforward dataset, importing is usually
easier via menus in my opinion. Excel example*/
PROC IMPORT OUT= WORK.excelexample
DATAFILE= "C:\Documents and Settings\mjc0016\Desktop\fitness.xls"
DBMS=EXCEL REPLACE;
SHEET="Sheet1$";
GETNAMES=YES;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;
/*Import an SPSS file*/
proc import datafile="C:\Documents and Settings\mjc0016\Desktop\ozone.sav"
out=ozone;
run;
Obtaining Frequency Information
/*Import an SPSS file*/
proc import datafile="C:\Program Files\SPSS16\Samples\University of Florida graduate salaries.sav" out=salary;
run;
/*Look at it if you want*/
proc print data=salary;
run;
/*Simple and grouped data*/
PROC FREQ DATA=salary;
TABLES college ;
RUN;
proc freq data=salary;
TABLES gender*college ;
RUN;
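/*The crosstab prints cell, row, and column percentages by default. A minimal
sketch of trimming that output and adding a chi-square test of association
through TABLES options.*/
PROC FREQ DATA=salary;
TABLES gender*college / norow nopercent chisq;
RUN;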
/*Bar chart. Hideous defaults, but at least you can tailor the graph to your
specs here, e.g. title, bar width and spacing. VBAR tells it I want vertical
bars. The imported variables keep their numeric codes (along with the labels),
so if you don't put 'discrete' you'll have numeric midpoints between the bars
too. 'patternid=midpoint' tells it I want a different color for each category.*/
TITLE 'Simple Vertical Bar Chart ';
PROC GCHART DATA=salary;
VBAR college/discrete patternid=midpoint;
RUN; quit;
/*Here I actually specify the colors I want and tell it to have very little
space between bars (within each group)*/
PROC GCHART DATA=salary;
VBAR college /discrete group= gender space = 0.5 patternid=midpoint;
pattern1 c=orange ;
pattern2 c=purple ;
pattern3 c=green;
pattern4 c=rose ;
pattern5 c=blue ;
pattern6 c=yellow;
pattern7 c=cream;
pattern8 c=black;
RUN; quit;
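/*If the GCHART defaults are too ugly, PROC SGPLOT is another option. This is
a sketch of a different procedure, not the one used above, and it assumes a
SAS version with ODS Graphics (GROUPDISPLAY= needs 9.3 or later).*/
PROC SGPLOT DATA=salary;
VBAR college / GROUP=gender GROUPDISPLAY=CLUSTER;
RUN;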
/*Create a dataset*/
DATA auto ;
input MAKE $ PRICE MPG REP78 FOREIGN ;
DATALINES;
AMC 4099 22 3 0
AMC 4749 17 3 0
AMC 3799 22 3 0
Audi 9690 17 5 1
Audi 6295 23 3 1
BMW 9735 25 4 1
Buick 4816 20 3 0
Buick 7827 15 4 0
Buick 5788 18 3 0
Buick 4453 26 3 0
Buick 5189 20 3 0
Buick 10372 16 3 0
Buick 4082 19 3 0
Cad. 11385 14 3 0
Cad. 14500 14 2 0
Cad. 15906 21 3 0
Chev. 3299 29 3 0
Chev. 5705 16 4 0
Chev. 4504 22 3 0
Chev. 5104 22 2 0
Chev. 3667 24 2 0
Chev. 3955 19 3 0
Datsun 6229 23 4 1
Datsun 4589 35 5 1
Datsun 5079 24 4 1
Datsun 8129 21 4 1
;
RUN;
/*See the data*/
PROC PRINT DATA=auto;
RUN;
/*Frequencies*/
PROC FREQ DATA=auto;
TABLES foreign ;
RUN;
/*Various summary measures for one of the variables*/
PROC UNIVARIATE DATA = auto;
VAR mpg;
HISTOGRAM mpg;
TITLE "Miles per Gallon";
run;
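/*For a compact table of a few statistics split by group, PROC MEANS is handy.
A minimal sketch using the auto data above; the title text is my own.*/
PROC MEANS DATA=auto N MEAN STD MIN MAX;
CLASS foreign;
VAR price mpg;
TITLE "Price and MPG by Foreign";
RUN;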
/*Correlation between price and mpg*/
PROC CORR DATA=auto;
var price mpg;
run;
/*Correlation tested against a specific value, in this case a population rho
of -.2*/
PROC CORR DATA=auto fisher (rho0=-.2) nosimple;
var price mpg;
run;
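/*Roughly what the fisher option does: r is transformed to
z = 0.5 x log((1+r)/(1-r)), and (z - z0) x sqrt(n-3) is compared to a standard
normal. A sketch by hand; corrout and fisher_check are made-up names, and
PROC CORR applies a bias adjustment by default, so its numbers may differ
slightly from these.*/
PROC CORR DATA=auto OUTP=corrout NOPRINT;
VAR price mpg;
RUN;
DATA fisher_check;
SET corrout END=last;
RETAIN n r;
/*The N row holds the sample size; the CORR row for price holds r under mpg*/
IF _TYPE_ = 'N' THEN n = price;
IF _TYPE_ = 'CORR' AND UPCASE(_NAME_) = 'PRICE' THEN r = mpg;
IF last THEN DO;
rho0 = -0.2;
z = 0.5*LOG((1 + r)/(1 - r));
z0 = 0.5*LOG((1 + rho0)/(1 - rho0));
zstat = (z - z0)*SQRT(n - 3);
p = 2*(1 - PROBNORM(ABS(zstat)));
OUTPUT;
END;
KEEP n r rho0 z z0 zstat p;
RUN;
PROC PRINT DATA=fisher_check;
RUN;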
Regression