Thursday, August 5, 2010

Exporting

I have discussed how to import data into SAS from SPSS and Excel files, but what about when a user wishes to get the data out of SAS into another file format.  Exporting data is one of those times I really love SAS and the tools/wizards they provide.   Let’s start with the SAS Export Wizard.
The Export wizard:
                File -> Export Data…
Here you can see the Export Wizard is open and ready for you to select the file.  In the Library drop down menu, you simply select the library location, and the “member” is the file name.  In this example I would be exporting the SAS dataset “DATA” from the “WORK” library.
One thing you may quickly notice is the extensive list of file formats available to you by SAS.  A few include:  Excel, comma/tab delimited, JMP, SPSS, ect.  That is right SPSS!  I find it interesting that SAS will let users export straight into SPSS, and it works, but you can’t export an SPSS file into SAS without issues.
The final step after selecting the file format is to find the final folder destination of the data.  Once the destination is selected, you will assign a table name and hit FINISH, which will complete your export.
Simple, right? 
*************************************************************************************            
Option 2.  SAS Macro
When explaining how to import Excel files, I discussed the positives and negatives of using these macros for importing/exporting.  The only reason I use these macros is when importing the data, making quick changes, and exporting the data.  I won’t utilize this macro for my day to day exporting, but if I’m making summary variables or mean/mode/median imputations, this is the easiest way.
%macro expo (data=);
PROC EXPORT DATA= WORK.data
            OUTFILE= "C:\folder\datasets\Updated\&data"
            DBMS=EXCEL LABEL REPLACE;
     NEWFILE=YES;
RUN;
%mend;

*The first line creates a macro called “expo” where we are going to use the macro to define “data”.
The macro utilizes PROC EXPORT and states the data is in the work library and is called “data”.  Second note, I created a new file folder called “Updated”, so the data didn’t just overwrite the old data in the folder from where you imported the data.
DBMS=Excel, tells SAS it will be exporting the data as an Excel file with the labels printed instead of the names. ;

%mend;
*closes the macro.;
*The final step is calling the macro;

%expo(data=file);
This statement exports the SAS dataset as “file” into the folder on your drive specified in the macro statement.
These are just two of the forms of exporting data I use most often, but again, there are many ways to export. 
Cheers!

Tuesday, August 3, 2010

Excel Import

There are many options when importing Excel files, but today I will go over the three options I most commonly utilize.  These three options will move from easiest to hardest, but this doesn’t mean least to most useful.  The real benefit of SAS over some other programs is the flexibility it provides in all aspects of data management, including simple data importation.

The three options:
1.)    Import excel through SAS’s import wizard.
2.)    Import excel through code
3.)    Import excel through a macro
****************************************************************************
OPTION 1:
When importing excel through SAS’s import wizard I would always have the excel file saved in the ’97 – 2003 format, SAS hasn’t quite caught up to Excel 2007. 

Once in SAS, go to File-> Import Data… 

This opens the Import Wizard.  The default is “Standard Data Source” with “Microsoft Excel 97/2000/2002/2003 Workbook” in the drop down menu, click “Next” once you verify everything is correct.
 Hit “Browse…” to find the dataset you’re looking to import and click “OK”.

The next screen asks you to select the table you are importing.  This is the table in the original excel file where your data is saved.  Click “Next” once you have selected the correct table.

The final screen asks you to select the location you want to place the excel sheet once imported.  The default library is “WORK”.  The second drop down menu is “Member:”, here SAS is asking you to name the new SAS dataset, anything will do here, I prefer simple names like “d”, “data”, “a”, ect.  Once named you can select “Finish” and you are completed.

You will see in the SAS log “NOTE: WORK.D data set was successfully created. “  Now your file can be located in the Work Library, and is ready for manipulation/analysis.

LIMITATIONS:  The import wizard can’t handle variables where the values are mixture of numbers and characters.  It assigns the variable type for the column based on the first few rows.  You will run into problems if you have mixed variables, so please check your data after importation, or use the next option.
OPTION 2:
Option two saves out a permanent SAS dataset, but require code. 

Before you begin to write the code, the excel file *MUST* be saved as an Excel ’95 document.  This can be done in Excel through file->Save as ->Other formats, and place the “save as type” as “Microsoft Excel 5.0/95 Workbook”.

The first SAS code is to create a new library with a libname statement.
libname b "C:\folder\datasets";
*This creates a library named "b" where all files are saved into C:\folder\datasets; remember this is just an example, but the location your saving the data must already exist;

proc access dbms=xls;
   create b.xlsa.access;
   path = 'C:\folder\datasets\excelfile.xls';
   scantype  = yes;
   worksheet = deploy;
   getnames  = yes;
   assign    = yes;
   create b.xlsv.view;
   select all;
run;


*Some notes:
dbms=xls, is what tell SAS it is an excel file.
create=b.xlsa.access, creates your permanent dataset "xlsa.access" in the "b" library
path= 'C:\folder\datasets\excelfile.xls', is the exact location of the excel file.
scantype= no, this allows a column to not be assigned numeric if most entries are numeric.
mixed=yes, this allows columns with numbers and characters to be formatted as character.
worksheet=sheet1, is the name of the worksheet the data can be found in the excel file.
getnames=yes, SAS takes the first row from the excel file and uses it to write column names.
assign=yes, tells SAS to name the columns legal SAS names using excel column names
create b.xlsv.view,  creates a subfolder called xlsv, which we delete later.
select all, tells SAS all observations will be available to view.
;


data b.SASdataset;
      set b.xlsv;
run;

*This data step saves the imported excel file "b.xlsv" into "b.SASdataset", which is located in the library "b" and named SASdataset.  The dataset can now be found at "C:\folder\datasets\SASdataset.sas7bdat";

proc datasets library=b;
      delete xlsv/memtype=view;
      delete xlsa/memtype=access;
run;quit;

*This final data step deletes your temporary folders xlsv and xlsa that were created in the PROC ACCESS step;

LIMITATIONS:  This procedure has few/no limitations, once you write the code once, it can be re-used and customized to fit the file you’re importing.  It is the most common way I import files.


Option 3:

This final option is through writing a SAS macro, it is the most challenging but very useful.  I usually use this option when I am importing a dataset, making quick changes, i.e. imputation, and then exporting it right out as an excel file. 

%macro impo(data=);
 PROC IMPORT OUT= WORK.Data
            DATAFILE= "C:\folder\datasets\&data"
            DBMS=EXCEL REPLACE;
     SHEET="&data";
     GETNAMES=YES;
     MIXED=NO;
     SCANTEXT=YES;
     USEDATE=YES;
     SCANTIME=YES;
RUN;
%mend;

*You are writing a macro called impo by using the proc import option.  The file is imported into your work library so nothing needs to be done for renaming the file.
DATAFILE= "C:\folder\datasets\&data", tells     the location of your excel file.
SHEET="&data", tells SAS the sheet will be named when you call the macro. 
NOTE: The sheet and excel file must be named the same.
All of the other commands should be familiar.
The code is ended with the %mend to tell SAS the macro is completed.;

%impo(data=dataset);

*Here you are calling the macro you just created and saying the excel database is called "dataset".  Once these two steps are run, the sheet has been imported.;

This option is extremely useful when importing numerous datasets all in the same SAS program.  I use this primarily when doing quick steps, i.e. imputation, summary variables, ect.  Very useful when used for the right reasons, otherwise unnecessary steps.

LIMITATIONS:  writing a SAS macro takes a few extra steps, but any extra steps can create errors.  This is time consuming and if used for the wrong reasons, just unnecessary. 
There are many ways to import Excel files, but I just wanted to show the ones I use the most, and believe the most useful.  However, if you ask someone else, I’m sure they will mention other options.

Cheers!

Monday, August 2, 2010

Import from SPSS

One of the most common importation question asked is how to import SPSS files into SAS.  My longterm plan is to provide a quick guide on how to import every file format into SAS.  However, SPSS is by far the most frequently asked format, so it will be explained first.

Two excellent options exist:

1.) Through a program, like Stat-transfer

2.) Via an SPSS portable file straight into SAS.
*********************************************
Option 1, Stat-transfer is an excellent program for a simple conversion from an SPSS file into a SAS file.  It can be used for either an SPSS portable file (.por) or a normal SPSS data file (.sav).  This program does cost money, so it isn't a preferred option, unless you need to transfer a variety of file formats into SAS files (i.e. Access, ascii, excel, matlab, minitab, s-plus, STATA, ect.).  Stat-transfer is the only transfer program I have used.  It has always worked great, so I would recommend this program if you frequently need to import files into SAS from a variety of data formats.

Option 2, Import a portable SPSS file directly into SAS.

The first step is save the SPSS data file as a portable file (.por).

1.) Open the file in SPSS.
2.) Go to File -> Save As...
               A new window will open, you will type in the new file name and in the "Save as type" drop down            menu, choose "Portable (*.por)".  This will save the file into the same folder as the original
SPSS file unless otherwise stated.

***Note: You will see the option of saving the file as "SAS v7+ Windows long extention (*.sas7bdat).  This is the normal SAS file extension, but this will often not import into SAS properly.

Now your file is saved as a portable SPSS file, lets get it into SAS.
*****************************************************************************
libname a spss 'd:\SAS Folder\spssfile.por';






*ANNOTATION: With the libname statement we are creating a new library to import SPSS. In this example I am calling the SAS library "a". This is followed by the SPSS command.  Lastly, the file location, including the .por extension;                      

libname b 'C:\SAS Folder';

*ANNOTATION: Now you have imported the file into library "a", which can be found in the explorer tab. If you open the library you see the file as "_first_"  however, this file contains errors. So we want to save this into a permanent SAS dataset, so we create a new SAS library with the libname statement;                                                 


data b.data;
set a._first_;
run;



*ANNOTATION: Finally, we are saving the SPSS file "a._first_" into a permanent SAS file in the "b" library with the file name "data". Now you will be able to see your SAS dataset in the file-folder “C:\SAS Folder” as "data.sas7bdat";       

*****************************************************************************

If you follow these steps the file will be converted from an SPSS dataset into a working SAS dataset, which you may now use for any analysis.