This page is for introducing the SPSS package to those who are just about to begin their journey with it. It assumes no prior SPSS exposure. The following was put together by Mike Clark.
I. What is SPSS?
II. Who can use SPSS?
III. How does SPSS work?
IV. Getting data into SPSS
V. Menu options
VI. Relative Advantages and Disadvantages
VII. Summary
I. What is SPSS?
SPSS [1] is a popular statistics program used in a variety of scientific disciplines. It has two facets: the statistical package itself and the SPSS language, a system of syntax used to execute commands and procedures. Likewise, there are two approaches to using SPSS: (a) the point-and-click menu system and (b) SPSS programming syntax. Most users will find a combination of the two most effective in carrying out their data analyses. The University of North Texas has obtained licenses for the software for Windows and Mac OS X. In this series, we will focus on SPSS for Windows, which is a complete data analysis program with many capabilities and applications. The requirements for PCs and Macs are as follows.
For SPSS 16.0 for Windows
For SPSS 16.0 for Mac
While those requirements are for version 16, labs on campus do not yet use 16 (and from our point of view at RSS, you are better off for it).
II. Who can use SPSS?
SPSS software is distributed through the university's site license, which
allows students to use the software in any general access lab on campus.
For students who want to install the software on their own machines,
versions of the software are available for sale at the UNT bookstore at
discounted academic prices.
For the most current student prices, you may contact the UNT Bookstore at
940-565-2592. The price is about $200 for the 'Grad Pack' and $100 for a
crippled Student Version, which expires each year. The only reason to
purchase it is for private use; it is otherwise ubiquitous on campus.
ONLY full-time faculty and staff can request SPSS installation on their machines on campus or at home. Are you faculty? Are you someone who puts in 40 hours a week on campus in a completely non-student capacity? If your answer is no to both, then you do not qualify in any way, shape, or form for a personal copy from us at RSS. There is no ambiguity in using the word 'only' here, despite what your cousin's best friend who is a student in a department on the other side of campus may have told you. And no, we won't believe for a second that your major professor sent you over for their copy.
III. How does SPSS work?
SPSS works with three basic file types: data, syntax, and output files.
SPSS Data
The data window contains your SPSS system files and displays your data in spreadsheet format. As of version 14 you can have multiple data files open. As of version 16 SPSS is also Java-based, which seems to have cost SPSS part of the 'ease of use' advantage it had over some other stat packages, as even casual use may reveal some quirks. For simple data entry it works very well, but if you are already familiar with Excel you won't necessarily find much advantage.
This is where you will directly enter data into SPSS. The rows are typically considered to be observational units (e.g., the subjects under study), and the columns are considered to be variables for the observational units. You can cut, paste, and delete rows (observational units) and columns (variables) as desired from this window, as well as move cases and variables around by clicking and dragging. SPSS system files are by default stored with the *.sav extension, but they can be saved as many other types of files. In particular, we recommend saving a completed dataset as a *.por (portable) file at least once to guard against possible compatibility issues. The data window actually has two views, the actual data view above and the variable view, seen here.

It is in the variable view that you will be able to assign variable types, vary column widths, create variable labels, assign missing values, etc. We suggest you make everything that isn't actually a name (e.g. a country's or person's name) numeric with labels instead, unless you're the sadistic type, in which case go right ahead and add what will likely be a few hours of work later. As a final note, leave the data alone. Excel users in particular come to data analysis with bad habits like coloring cells and playing with font sizes and the like. If you want to do that with output, have at it, but leave the data file itself as it is unless you want a headache later.
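To illustrate the numeric-with-labels advice in a package-neutral way, here is a minimal Python sketch (not SPSS syntax). The variable names, the sample responses, and the missing-value code -9 are all assumptions made for illustration; the point is only that the stored data stay numeric while a separate label table supplies the text.

```python
# Hypothetical raw responses entered as text (illustrative data only).
raw_gender = ["female", "male", "female", "", "male"]

# Value labels: numeric code -> label, kept separate from the data.
value_labels = {1: "female", 2: "male"}
MISSING = -9  # assumed user-defined missing-value code

# Invert the label table so each text entry can be replaced by its code.
code_for = {label: code for code, label in value_labels.items()}

# Anything not in the label table (here, the empty string) becomes missing.
coded = [code_for.get(value, MISSING) for value in raw_gender]
print(coded)  # [1, 2, 1, -9, 2]

# Labels are looked up only for display; the stored values remain numeric.
display = [value_labels.get(code, "missing") for code in coded]
print(display)
```

Entering the codes directly, rather than typing the text and recoding afterwards, is what saves the "few hours work later".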
SPSS Syntax
SPSS is kind of odd to me. People like it because of its menus, but the menus are so limiting that one inevitably has to go to syntax to perform a worthwhile analysis (or more likely, to another stat package). However, if you're going to use syntax, SPSS is not flexible or efficient compared to other packages. Its language was developed when people only did this sort of stuff on mainframe computers, and it's never changed, even though computing continued to evolve. To make up for this, SPSS now has add-ons that allow one to use true programming languages like Python and R. But if you could use those as an applied academic researcher, there would be no reason to be using SPSS in the first place. In any event, if you still prefer SPSS, the best way to use it is through syntax or heavily supplemented with it. The syntax window is shown below.

As we'll see in the next course, syntax certainly makes for much more efficient data analysis in SPSS than the menus do, and there are tricks one can do there that are unavailable in the menus (e.g. the powerful MANOVA procedure). However, note that the menus are still available in the syntax window, so you can use them if needed. SPSS syntax files are *.sps. Also, with version 17.0 the syntax editor has changed quite a bit, mostly for the better, but there are potential compatibility issues when running syntax from 17 in prior versions.
SPSS Output
The third type of file common to SPSS is the output file (with 16 I guess we should call it a viewer file).
In some ways SPSS has an edge over other packages because its text output does come out a little easier on the eyes, and students I've had seem to fall in love with text in grids for some reason. Furthermore, it is very easy to export the output to HTML or PowerPoint for presentation. Unfortunately this comes at a price, namely that you can't do anything with those results in the output, for example, use them as input into a new analysis (at least not without some notable syntactical finagling). Novice users will not think that's a big deal. More experienced researchers know better. That said, with version 16 output is now very slow to come up, and SPSS graphics have lagged far behind most major statistical packages for a while now. In short, pretty text is not a reason to use a package, and while you can easily export the graphics, this is no longer an advantage it has over other packages. A final note: users of 16 cannot view the old *.spo files without installing the legacy viewer, which is not installed by default but is available to any SPSS user. The file extension is now *.spv.
IV. Getting data into SPSS
There are three main ways to get data into SPSS: (a) creating a new SPSS data file, (b) opening existing SPSS data files, and (c) importing data from another source such as an ASCII file, an Excel spreadsheet, etc.
1. Creating new SPSS data files
Data can be entered directly into SPSS, much as in an Excel spreadsheet. However, if you are going to enter data directly, you will need to name and define your variables.
2. Opening existing SPSS system files
Opening existing SPSS files is a fairly straightforward procedure, similar to opening other Windows files. Select "Open" from the File menu, and you will find a dialog box that looks similar to the figure below. You can also see that you can open any type of SPSS file, not just data files, as well as easily call up any files you've used recently (the number of recent files can be adjusted in Edit/Options).

As mentioned previously, SPSS system files are stored with the *.sav extension. By default, SPSS assumes that you want to open an SPSS system file, though there are many file types you may access for direct import, and this will always trip up someone who is instead looking for an Excel file ("I swear I put it on the desktop!!"). You can then move to the directory in which the data file you wish to open is stored and open the file.
3. Importing data from an ASCII file
For a number of reasons, data is often in ASCII or text format, the big one being that any program can read it. To use the data in SPSS, it must be converted to a file format that SPSS can recognize, namely *.sav. SPSS can read in ASCII data, which can then be saved in *.sav format. The basic approach through menus is shown below. However, if you are pulling huge data files off the web, e.g. through ICPSR, you will be using syntax that is typically provided with the data.
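As a rough illustration of what reading ASCII data involves, the following Python sketch (not SPSS syntax, and not the syntax ICPSR provides) parses a small comma-delimited text into cases and variables. The file contents, the variable names, and the helper `read_cases` are hypothetical, and the sketch assumes every field is numeric.

```python
import csv
import io

# Hypothetical contents of a comma-delimited text file (illustrative only).
ascii_text = """id,age,score
1,23,4.5
2,31,3.8
3,27,
"""

def read_cases(handle):
    """Return one dict per case (row), keyed by variable name (column).

    Numeric fields are converted to float; empty fields become None,
    standing in for a missing value.
    """
    cases = []
    for row in csv.DictReader(handle):
        case = {}
        for name, value in row.items():
            case[name] = float(value) if value.strip() else None
        cases.append(case)
    return cases

# io.StringIO stands in for an opened text file.
cases = read_cases(io.StringIO(ascii_text))
print(len(cases), "cases,", len(cases[0]), "variables")
print(cases[2])
```

The first line of the text supplies the variable names, each later line is one observational unit, and the empty last field of case 3 is carried through as missing rather than silently dropped.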

4. Importing data from other file formats
SPSS allows the user to open data directly into SPSS from many different file formats. For example, SPSS will directly open Excel, SAS, Stata and *.dbf (database) files. All the user needs to do is to go to the File Menu, select "Open", select the correct file type from the "Files of Type" drop down menu, and navigate to the file you wish to open.
V. Menu options
As can be seen above, there are several menus available and necessary over the course of analysis. To begin with, we suggest you spend some time customizing SPSS output and views to your liking with Edit/Options. The File menu is much like that of other (Windows) applications, as is the Edit menu; the menus most commonly used by the applied researcher will be Data, Transform, and Analyze. Note that many of the analyses come with plotting options that are specific to them and not available in the Graphs menu, but as was mentioned earlier, SPSS possesses fairly poor graphics capabilities in general.
A word of caution regarding menus. Just because you can easily click your way through to an analysis doesn't mean that (a) you've done any of it appropriately or (b) your analysis is worth any more than the paper you might print it out on. Menus can make it easy to get results, but that doesn't mean the results will be useful. In short, output does not equal analysis. At RSS we've had many clients come in who have clicked their way through to horrible results, which were poor because they went straight to analysis. That of course is to be avoided.
VI. Relative Advantages and Disadvantages
Advantages: SPSS offers a user friendliness that most packages are only now catching up to. It is popular, and though that is certainly not a reason for choosing a statistical package, many data sets are easily loaded into it and other programs can easily import SPSS files. As of versions 16 and 17 it is compatible with R and Python (assuming they are installed on the machine), which can give it functionality that it otherwise lacks or that would be too clunky in its own syntax.
Disadvantages: For academic use SPSS lags notably behind SAS, R, and perhaps even others that are on the more mathematical rather than statistical side for modern data analysis (e.g. robust and bootstrapping approaches easily conducted elsewhere are nonexistent or very difficult to do, and basic tests of analytical assumptions are often not available). Its menu offerings typically cover only the most basic version of an analysis and are sometimes lacking even then, and it makes doing an inappropriate analysis very easy. The default graphics are poor and not easily customized to make them better. It is expensive, sometimes ridiculously so (e.g. many of its add-ons are free elsewhere or part of the base install for other packages), and even when you do buy, you're really only leasing, and its license is definitely not user friendly. There are often compatibility issues with prior versions.
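For a sense of what a bootstrapping approach looks like outside SPSS, here is a minimal Python sketch of a percentile bootstrap confidence interval for a mean, using only the standard library. The function name `bootstrap_ci`, its defaults, and the sample scores are assumptions for illustration, not a recommended analysis.

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `stat` on `data`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Resample the data with replacement n_boot times and compute the
    # statistic on each resample.
    estimates = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the estimates.
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical scores (illustrative data only).
scores = [12, 15, 9, 22, 17, 14, 30, 11, 16, 13]
low, high = bootstrap_ci(scores)
print("mean:", statistics.mean(scores))
print("95% bootstrap CI:", low, "to", high)
```

Because the statistic is passed in as a function, swapping `statistics.mean` for `statistics.median` gives an interval for a more robust summary with no other changes.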
VII. Summary
SPSS offers quite a bit as a general statistics program, and is freely and widely available to everyone on campus via the labs and, for qualified faculty or staff, for personal use. There are three basic files to work with (though others are available), and SPSS has done a lot to develop its graphical user interface. If you are partial to GUI approaches, SPSS is certainly ahead of some, but not all, others in that department, but that is about the only thing it has over others that would likely appeal to the applied academic researcher. If you are looking for further help and info on SPSS, there is a great deal on the web because of its popularity, so feel free to do some searching on your own. You'll find that most books on using SPSS offer much less than what is freely and easily obtainable on the web.
Footnotes
1. Some refer to it as "statistical package for the social sciences". The company officially dropped that name a couple decades ago and it is in no way specific to social science research only, whatever that might have meant initially.
Questions, comments and corrections for this site: Rich Herrington, Patrick McLeod, Mike Clark