A macro to execute PROC TTEST for multiple binary grouping variables in SAS (and sorting t-test statistics by their absolute values)

In SAS, you can perform PROC TTEST for multiple numeric variables in the same procedure.  Here is an example using the built-in data set SASHELP.BASEBALL; I will compare the number of at-bats and number of walks between the American League and the National League.

proc ttest
     data = sashelp.baseball;
     class League;
     var nAtBat nBB; 
     ods select ttests;
run;

Here are the resulting tables.

Method Variances DF t Value Pr > |t|
Pooled Equal 320 2.05 0.0410
Satterthwaite Unequal 313.66 2.06 0.04

Method Variances DF t Value Pr > |t|
Pooled Equal 320 0.85 0.3940
Satterthwaite Unequal 319.53 0.86 0.3884

 

What if you want to perform PROC TTEST for multiple grouping (a.k.a. classification) variables?  You cannot put more than one variable in the CLASS statement, so you would have to run PROC TTEST separately for each binary grouping variable.  If you do put LEAGUE and DIVISION in the same CLASS statement, here is the resulting log.

1303 proc ttest
1304 data = sashelp.baseball;
1305 class league division;
 --------
 22
 202
ERROR 22-322: Expecting ;.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
1306 var natbat;
1307 ods select ttests;
1308 run;

 

There is no syntax in PROC TTEST to use multiple grouping variables at the same time, so this tutorial provides a macro to do so.  There are several nice features about my macro:

  1. It allows you to use multiple grouping variables at the same time.
  2. It sorts the t-test statistics by their absolute values within each grouping variable.
  3. It shows the name of each continuous variable in the output table, unlike the above output.

Here is its basic skeleton.

Read more of this post

Advertisements

Use ODS EXCLUDE ALL to suppress printing output in SAS while producing output data sets

I regularly produce output data sets from a SAS procedure, such as getting the variable names from a data set in PROC CONTENTS.  In these instances, I often wish to suppress any printing of the output in HTML or TXT.  Such printing of the results is often unnecessary, and it can cost a lot of time and memory.

Some SAS procedures have the NOPRINT option that suppresses the printing of output, but this is limiting in several ways:

  1. Some SAS procedures do NOT have the NOPRINT option.  PROC TTEST is a prominent example.  I checked the high-performance procedures like PROC HPFOREST (random forest) and PROC HPSVM (support vector machine), and I could not find the NOPRINT option for these procedures.
  2. I cannot use ODS OUTPUT to produce output data sets while invoking the NOPRINT option.  Here is an example.

Read more of this post

Career Panel at the 2018 Canadian Statistics Student Conference – McGill University, Montreal, Quebec

I will speak on the career-advice panel at the 2018 Canadian Statistics Student Conference.  It will be held on Saturday, June 2, at McGill University.

cropped-eric-cai-head-shot-7.png

If you will attend this conference or the subsequent Annual Meeting of the Statistical Society of Canada, then I strongly recommend students to read my following advice articles in advance.