Store multiple strings of text as a macro variable in SAS with PROC SQL and the INTO statement

I often need to work with many variables at a time in SAS, but I don’t like to type all of their names manually – not only is it messy to read, it also induces errors in transcription, even when copying and pasting.  I recently learned of an elegant and efficient way to store multiple variable names into a macro variable that overcomes those problems.  This technique uses the INTO statement in PROC SQL.

To illustrate how this storage method can be applied in a practical context, suppose that we want to determine the factors that contribute to a baseball player’s salary in the built-in SASHELP.BASEBALL data setI will consider all continuous variables other than “Salary” and “logSalary”, but I don’t want to write them explicitly in any programming statements.  To do this, I first obtain the variable names and types of a data set using PROC CONTENTS.

* create a data set of the variable names;
proc contents
     data = sashelp.baseball
          noprint
     out = bvars (keep = name type);
run;

Then, I use the INTO statement in PROC SQL to convert the list of variable names into a macro variable; I use spaces to delimit the variable names.  I want the continuous (i.e. numeric) variables that are not “Salary” or “logSalary”; those are the potential covariates in a regression model to predict the salaries of the baseball players.

proc sql 
          noprint; 
     select name 
     into :bnumerics 
          separated by ' ' 
     from bvars 
     where type = 1 
               and 
           name not in ('Salary', 'logSalary'); 
quit;

Those variable names are now stored in the macro variable “bnumerics”.  We can check that it looks as expected in the log by using the “%PUT” statement.

%put &bnumerics;

Here is what the log shows.

%put &bnumerics;
CrAtBat CrBB CrHits CrHome CrRbi CrRuns YrMajor nAssts nAtBat nBB nError nHits nHome nOuts nRBI nRuns

Now, let’s apply my macro for sorting correlation coefficients by their absolute values to this data set.  Use the %INCLUDE statement to run the macro from the folder that stores this macro.

%include 'C:\MyScripts\macros\corr_sort';

%corr_sort(sashelp.baseball, salary, &bnumerics, bcorr);
proc print
     data = bcorr;
run;

Here are the results from PROC PRINT:

 

Obs Variable Label corr PValue NSalary abs_corr
1 CrRbi Career RBIs 0.61871 <.0001 263 0.61871
2 CrRuns Career Runs 0.61278 <.0001 263 0.61278
3 CrHits Career Hits 0.59221 <.0001 263 0.59221
4 CrHome Career Home Runs 0.58125 <.0001 263 0.58125
5 CrAtBat Career Times at Bat 0.57165 <.0001 263 0.57165
6 CrBB Career Walks 0.54574 <.0001 263 0.54574
7 nRBI RBIs in 1986 0.51723 <.0001 263 0.51723
8 nBB Walks in 1986 0.50462 <.0001 263 0.50462
9 nHits Hits in 1986 0.50136 <.0001 263 0.50136
10 nRuns Runs in 1986 0.47903 <.0001 263 0.47903
11 nAtBat Times at Bat in 1986 0.46167 <.0001 263 0.46167
12 YrMajor Years in the Major Leagues 0.44288 <.0001 263 0.44288
13 nHome Home Runs in 1986 0.39885 <.0001 263 0.39885
14 nOuts Put Outs in 1986 0.30048 <.0001 263 0.30048
15 nAssts Assists in 1986 0.02544 0.6814 263 0.02544
16 nError Errors in 1986 -0.00540 0.9305 263 0.00540

 

Throughout this entire exercise, I did not ever need to type the names of any of the above variable names, yet I performed a valuable computation involving all of them.  This illustrates the power of storing variable names in a macro variable in SAS!

Advertisements

Your thoughtful comments are much appreciated!

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: