Tuesday, August 3, 2010

Excel Import

There are many options when importing Excel files, but today I will go over the three options I most commonly utilize.  These three options will move from easiest to hardest, but this doesn’t mean least to most useful.  The real benefit of SAS over some other programs is the flexibility it provides in all aspects of data management, including simple data importation.

The three options:
1.)    Import excel through SAS’s import wizard.
2.)    Import excel through code
3.)    Import excel through a macro
****************************************************************************
OPTION 1:
When importing excel through SAS’s import wizard I would always have the excel file saved in the ’97 – 2003 format, SAS hasn’t quite caught up to Excel 2007. 

Once in SAS, go to File-> Import Data… 

This opens the Import Wizard.  The default is “Standard Data Source” with “Microsoft Excel 97/2000/2002/2003 Workbook” in the drop down menu, click “Next” once you verify everything is correct.
 Hit “Browse…” to find the dataset you’re looking to import and click “OK”.

The next screen asks you to select the table you are importing.  This is the table in the original excel file where your data is saved.  Click “Next” once you have selected the correct table.

The final screen asks you to select the location you want to place the excel sheet once imported.  The default library is “WORK”.  The second drop down menu is “Member:”, here SAS is asking you to name the new SAS dataset, anything will do here, I prefer simple names like “d”, “data”, “a”, ect.  Once named you can select “Finish” and you are completed.

You will see in the SAS log “NOTE: WORK.D data set was successfully created. “  Now your file can be located in the Work Library, and is ready for manipulation/analysis.

LIMITATIONS:  The import wizard can’t handle variables where the values are mixture of numbers and characters.  It assigns the variable type for the column based on the first few rows.  You will run into problems if you have mixed variables, so please check your data after importation, or use the next option.
OPTION 2:
Option two saves out a permanent SAS dataset, but require code. 

Before you begin to write the code, the excel file *MUST* be saved as an Excel ’95 document.  This can be done in Excel through file->Save as ->Other formats, and place the “save as type” as “Microsoft Excel 5.0/95 Workbook”.

The first SAS code is to create a new library with a libname statement.
libname b "C:\folder\datasets";
*This creates a library named "b" where all files are saved into C:\folder\datasets; remember this is just an example, but the location your saving the data must already exist;

proc access dbms=xls;
   create b.xlsa.access;
   path = 'C:\folder\datasets\excelfile.xls';
   scantype  = yes;
   worksheet = deploy;
   getnames  = yes;
   assign    = yes;
   create b.xlsv.view;
   select all;
run;


*Some notes:
dbms=xls, is what tell SAS it is an excel file.
create=b.xlsa.access, creates your permanent dataset "xlsa.access" in the "b" library
path= 'C:\folder\datasets\excelfile.xls', is the exact location of the excel file.
scantype= no, this allows a column to not be assigned numeric if most entries are numeric.
mixed=yes, this allows columns with numbers and characters to be formatted as character.
worksheet=sheet1, is the name of the worksheet the data can be found in the excel file.
getnames=yes, SAS takes the first row from the excel file and uses it to write column names.
assign=yes, tells SAS to name the columns legal SAS names using excel column names
create b.xlsv.view,  creates a subfolder called xlsv, which we delete later.
select all, tells SAS all observations will be available to view.
;


data b.SASdataset;
      set b.xlsv;
run;

*This data step saves the imported excel file "b.xlsv" into "b.SASdataset", which is located in the library "b" and named SASdataset.  The dataset can now be found at "C:\folder\datasets\SASdataset.sas7bdat";

proc datasets library=b;
      delete xlsv/memtype=view;
      delete xlsa/memtype=access;
run;quit;

*This final data step deletes your temporary folders xlsv and xlsa that were created in the PROC ACCESS step;

LIMITATIONS:  This procedure has few/no limitations, once you write the code once, it can be re-used and customized to fit the file you’re importing.  It is the most common way I import files.


Option 3:

This final option is through writing a SAS macro, it is the most challenging but very useful.  I usually use this option when I am importing a dataset, making quick changes, i.e. imputation, and then exporting it right out as an excel file. 

%macro impo(data=);
 PROC IMPORT OUT= WORK.Data
            DATAFILE= "C:\folder\datasets\&data"
            DBMS=EXCEL REPLACE;
     SHEET="&data";
     GETNAMES=YES;
     MIXED=NO;
     SCANTEXT=YES;
     USEDATE=YES;
     SCANTIME=YES;
RUN;
%mend;

*You are writing a macro called impo by using the proc import option.  The file is imported into your work library so nothing needs to be done for renaming the file.
DATAFILE= "C:\folder\datasets\&data", tells     the location of your excel file.
SHEET="&data", tells SAS the sheet will be named when you call the macro. 
NOTE: The sheet and excel file must be named the same.
All of the other commands should be familiar.
The code is ended with the %mend to tell SAS the macro is completed.;

%impo(data=dataset);

*Here you are calling the macro you just created and saying the excel database is called "dataset".  Once these two steps are run, the sheet has been imported.;

This option is extremely useful when importing numerous datasets all in the same SAS program.  I use this primarily when doing quick steps, i.e. imputation, summary variables, ect.  Very useful when used for the right reasons, otherwise unnecessary steps.

LIMITATIONS:  writing a SAS macro takes a few extra steps, but any extra steps can create errors.  This is time consuming and if used for the wrong reasons, just unnecessary. 
There are many ways to import Excel files, but I just wanted to show the ones I use the most, and believe the most useful.  However, if you ask someone else, I’m sure they will mention other options.

Cheers!

No comments:

Post a Comment