Dealing With Missing Data
By Patti Price, RSS Statistical Consultant
One way to deal with missing data is to take a Monte Carlo approach, using a program that creates multiple imputations. While such programs may be purchased as part of packages like SPSS, similar programs are available for free download. If you are using S-Plus, four packages are available that can be called as functions from within S-Plus: NORM (for multivariate continuous data), CAT (for multivariate categorical data), MIX (for mixed continuous and categorical data), and PAN (for panel or clustered data).
A stand-alone version of NORM is also available for those using Windows 95/98/NT. Work is in progress on stand-alone versions of the other programs listed above. Each of these programs was developed by Dr. Joseph Schafer and is available at http://www.stat.psu.edu/~jls/misoftwa.html.
Specific information on frequently asked questions concerning multiple imputation is available at http://www.stat.psu.edu/~jls/mifaq.html.
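Once several imputed data sets exist, the same analysis is run on each one and the results are combined into a single answer. This article does not cover that step, so the following is only a minimal sketch of the standard combining rules (usually called Rubin's rules), written in Python rather than S-Plus. The function name and the numbers are hypothetical and serve only as an illustration.

```python
from statistics import mean, variance


def pool_estimates(estimates, variances):
    """Combine results from m imputed data sets using Rubin's rules.

    estimates: one point estimate per imputed data set
    variances: the squared standard error of each of those estimates
    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)      # pooled point estimate
    u_bar = mean(variances)      # average within-imputation variance
    b = variance(estimates)      # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b
    return q_bar, total


# Hypothetical results from three imputed data sets (not real data).
estimates = [10.0, 12.0, 11.0]
variances = [1.0, 1.0, 1.0]
pooled, total_var = pool_estimates(estimates, variances)
print(pooled, total_var)
```

The total variance is larger than the average within-imputation variance because it also reflects how much the estimate changes from one imputed data set to the next.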
After downloading and installing the NORM program, you will find some example files to work with. Your own files will need to be saved in the .dat format. After opening a file, you will see four file folder tabs. In the data tab, you will see your data and will need to enter the value assigned to your missing data; the same tab also lets you enter variable names and obtain basic descriptive information. To run the complete program, work through the tabs in order: click on the EM Algorithm tab and click run, then the Data Augmentation tab and click run, and finally the Impute from parameters tab and click run to complete the process.
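Preparing your own .dat file can be sketched as follows. This example assumes that NORM reads plain text with one case per line, values separated by spaces, and a single numeric code standing in for every missing value. Check the program's help file before relying on that layout. The code -9, the file name mydata.dat, and the function name are all hypothetical choices made for illustration.

```python
MISSING_CODE = -9  # assumed code; enter the same value in NORM's data tab


def to_dat_lines(rows, missing_code=MISSING_CODE):
    """Turn rows of numbers (None = missing) into space-separated text lines."""
    lines = []
    for row in rows:
        fields = []
        for value in row:
            if value is None:
                fields.append(str(missing_code))
            elif value == missing_code:
                # A real observation equal to the code would be read as missing.
                raise ValueError("missing-value code collides with real data")
            else:
                fields.append(str(value))
        lines.append(" ".join(fields))
    return lines


# Hypothetical data: three cases, three variables, two missing values.
rows = [
    [1.2, 3.4, None],
    [2.0, None, 5.1],
    [0.7, 2.2, 4.8],
]
lines = to_dat_lines(rows)

# Write the file that would then be opened in NORM.
with open("mydata.dat", "w") as handle:
    handle.write("\n".join(lines) + "\n")
```

Pick a missing-value code that cannot occur as a real observation in any of your variables, since the program has no other way to tell the two apart.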