sas programming | The Chemical Statistician

Some SAS procedures (like PROC REG, GLM, ANOVA, SQL, and IML) end with “QUIT;”, not “RUN;”

August 1, 2018 Leave a comment

Most SAS procedures require the

RUN;

statement to signal their termination. However, there are some notable exceptions to this.

I have written about PROC SQL many times on my blog, and this procedure requires the

QUIT;

statement instead.

It turns out that there is another set of statistical procedures that require the QUIT statement, and some of them are very common. They are called interactive procedures, and they include PROC REG, PROC GLM, and PROC ANOVA. If you end them with RUN rather than QUIT, then you will run into problems with displaying further output. For example, if you try to output a data set from one such PROC and end it with the RUN statement, then you will get this error message:

ERROR: You cannot open WORK.MYDATA.DATA for input access with record-level 
control because WORK.MYDATA.DATA is in use by you in resource environment 
REG.

WORK.MYDATA cannot be opened.

You will also notice that the Program Editor says “PROC … running” in its banner when you end such a PROC with RUN rather than QUIT.

I don’t like this exception, but, alas, it does exist. You can find out more about these interactive procedures in SAS Usage Note #37105. As this note says, the ANOVA, ARIMA, CATMOD, FACTEX, GLM, MODEL, OPTEX, PLAN, and REG procedures are interactive procedures, and they all require the QUIT statement for termination.

PROC IML is not mentioned in that usage note, but this procedure also requires the QUIT statement. Rick Wicklin has written an article about this on his blog, The DO Loop.

Filed under Data Analysis, SAS Programming, Statistics, Tutorials Tagged with data analysis, interactive procedures, proc anova, proc arima, proc catmod, proc factex, proc glm, PROC IML, proc model, proc optex, proc plan, proc reg, PROC SQL, rick wicklin, SAS, sas programming, statistics

Beware of accidental replacement of data sets with PROC SORT in SAS

June 28, 2018 Leave a comment

PROC SORT is a very useful procedure in SAS. Not only can you sort a data set on one or more variables with it, but you can sort each variable in ascending or descending order, and you can use it to obtain unique observations or duplicated observations. However, there is a feature about PROC SORT that can be dangerous and deserves emphasis: If you are not careful, you can accidentally replace an existing, valuable data set.

Suppose that you wish to use PROC SORT to get only the duplicated records of a data set. Here is an example of how to do it.

data heights;
     input Name $ 
           Age 
           Height;
     datalines;
Amy 15 174
Amy 16 177
Bob 14 172
Cam 13 163
Cam 17 181
;
run;

proc sort
     data = heights
          nouniquekey;
     by Name;
run;

proc print
     data = heights;
run;

Obs	Name	Age	Height
1	Amy	15	174
2	Amy	16	177
3	Cam	13	163
4	Cam	17	181

Note that the record for “Bob” is gone from HEIGHTS, because it was a unique observation and, thus, removed in the above PROC SORT statement.

If the original data set is valuable, then this loss can be very damaging, especially if it took a lot of work and time to obtain the original data set. This shows the danger of accidental replacement of a data set in SAS when using PROC SORT.

Read more of this post

Filed under Data Analysis, SAS Programming, Tutorials Tagged with data analysis, duplicate, duplicates, nodupkey, PROC SORT, SAS, sas programming, sort, sorting

Use ODS EXCLUDE ALL to suppress printing output in SAS while producing output data sets

May 4, 2018 Leave a comment

I regularly produce output data sets from a SAS procedure, such as getting the variable names from a data set in PROC CONTENTS. In these instances, I often wish to suppress any printing of the output in HTML or TXT. Such printing of the results is often unnecessary, and it can cost a lot of time and memory.

Some SAS procedures have the NOPRINT option that suppresses the printing of output, but this is limiting in several ways:

Some SAS procedures do NOT have the NOPRINT option. PROC TTEST is a prominent example. I checked the high-performance procedures like PROC HPFOREST (random forest) and PROC HPSVM (support vector machine), and I could not find the NOPRINT option for these procedures.
I cannot use ODS OUTPUT to produce output data sets while invoking the NOPRINT option. Here is an example.

Read more of this post

Filed under Data Analysis, SAS Programming, Tutorials Tagged with data analysis, noprint, ods, ods exclude all, rick wicklin, SAS, sas programming

Convert multiple variables between character and numeric formats in SAS

April 27, 2018 5 Comments

Introduction

I often get data that are coded as character, but are actually meant to be numeric. Thus, converting them into the correct variable types is a common task, and SAS Note #24590 shows how to do so. However, I recently needed to do hundreds of these conversions, so I wanted some code to accomplish this quickly and accurately. This tutorial shows how to do so.

Let’s consider this small data set in SAS as an example. They are hypothetical statistics of 3 players from a basketball game.

data basketball1;
     input jersey points $ rebounds $ assists $;
     datalines;
21 10 14 1
4  11 3  12
23 29 4  5
;
run;

The 3 performance metrics (points, rebounds, and assists) are clearly numeric, but they are currently coded as character. (You can use PROC CONTENTS to confirm this if needed.)

The jersey number is really a character variable, because its magnitude has no real-life meaning. The National Basketball Association (NBA) allows “00” as a possible jersey number. (Robert Parish wore this jersey number; he won 4 NBA championships and reached the Naismith Basketball Hall of Fame.) If you code “00” as a numeric variable, then it will render as “0”. Thus, for NBA jersey numbers, it is best to save it as a character variable.

I can convert these variables into the correct types using the following code. Note that I chose “2.” for the length of “JERSEY”, because I know that jersey numbers in the NBA have, at most, 2 digits.

data basketball2;
     set basketball1;
 
     jersey2 = put(jersey, 2.);
     drop jersey;
     rename jersey2 = jersey;

     points2 = input(points, 8.);
     drop points;
     rename points2 = points;

     rebounds2 = input(rebounds, 8.);
     drop rebounds;
     rename rebounds2 = rebounds;

     assists2 = input(assists, 8.);
     drop assists;
     rename assists2 = assists;
run;

Despite this success, the above code can be very cumbersome when I need to do this for many variables, and this situation arose in my job recently. In this tutorial, I will show a fast way of doing these conversions for many variables at once. I will use this BASKETBALL1 data set as an example, and I will convert POINTS, REBOUNDS, and ASSISTS from character to numeric simultaneously.

Read more of this post

Filed under Data Analysis, SAS Programming Tagged with categorical data, categorical variable, character data, character variable, continuous data, continuous variable, data analysis, numeric data, numeric variable, SAS, sas programming

A macro to automate the creation of indicator variables in SAS

April 25, 2018 Leave a comment

In a recent blog post, I introduced an easy and efficient way to create indicator variables from categorical variables in SAS. This method pretends to run logistic regression, but it really is using PROC LOGISTIC to get the design matrix based on dummy-variable coding. I shared SAS code for how to do so, step-by-step.

I write this follow-up post to provide a macro that you can use to execute all of those steps in one line. If you have not read my previous post on this topic, then I strongly encourage you to do that first. Don’t use this macro blindly.

Here is the macro. The key steps are

Run PROC LOGISTIC to get the design matrix (which has the indicator variables)
Merge the original data with the newly created indicator variables
Delete the “INDICATORS” data set, which was created in an intermediate step

%macro create_indicators(input_data, target, covariates, output_data);

proc logistic
     data = &input_data
          noprint
          outdesign = indicators;
     class &covariates / param = glm;
     model &target = &covariates;
run;


data &output_data;
      merge    &input_data
               indicators (drop = Intercept &target);
run;


proc datasets 
     library = work
          noprint;
     delete indicators;
run;

%mend;

I will use the built-in data set SASHELP.CARS to illustrate the use of my macro. As you can see, my macro can accept multiple categorical variables as inputs for creating indicator variables. I will do that here for the variables TYPE, MAKE, and ORIGIN.

Read more of this post

Filed under Applied Statistics, Biostatistics, Categorical Data Analysis, Data Analysis, SAS Programming, Statistics, Tutorials Tagged with categorical data, Categorical Data Analysis, categorical variable, data analysis, dummy coding, dummy variables, indicator, indicator variable, indicator variables, indicators, SAS, sas programming, statistics

An easy and efficient way to create indicator variables (a.k.a. dummy variables) from a categorical variable in SAS

April 24, 2018 Leave a comment

Introduction

In statistics and biostatistics, the creation of binary indicators is a very useful practice.

They can be useful predictor variables in statistical models.
They can reduce the amount of memory required to store the data set.
They can treat a categorical covariate as a continuous covariate in regression, which has certain mathematical conveniences.

However, the creation of indicator variables can be a long, tedious, and error-prone process. This is especially true if there are many categorical variables, or if a categorical variable has many categories. In this tutorial, I will show an easy and efficient way to create indicator variables in SAS. I learned this technique from SAS usage note #23217: Saving the coded design matrix of a model to a data set.

The Example Data Set

Let’s consider the PRDSAL2 data set that is built into the SASHELP library. Here are the first 5 observations; due to a width constraint, I will show the first 5 columns and the last 6 columns separately. (I encourage you to view this data set using PROC PRINT in SAS by yourself.)

COUNTRY	STATE	ACTUAL	PREDICT
U.S.A.	California	$987.36	$692.24
U.S.A.	California	$1,782.96	$568.48
U.S.A.	California	$32.64	$16.32
U.S.A.	California	$1,825.12	$756.16
U.S.A.	California	$750.72	$723.52

PRODTYPE	PRODUCT	YEAR	QUARTER	MONTH	MONYR
FURNITURE	SOFA	1995	1	Jan	JAN95
FURNITURE	SOFA	1995	1	Feb	FEB95
FURNITURE	SOFA	1995	1	Mar	MAR95
FURNITURE	SOFA	1995	2	Apr	APR95
FURNITURE	SOFA	1995	2	May	MAY95

Read more of this post

Sort a data set by ascending or descending variables using PROC SORT in SAS

April 23, 2018 Leave a comment

Consider the built-in data set SASHELP.CLASS in SAS. Here are the first 5 observations from PROC PRINT.

Obs	Name	Sex	Age	Height	Weight
1	Joyce	F	11	51.3	50.5
2	Thomas	M	11	57.5	85.0
3	James	M	12	57.3	83.0
4	Jane	F	12	59.8	84.5
5	John	M	12	59.0	99.5

As you can clearly see, they are NOT sorted by weight. Here is how you can sort the data set by weight using PROC SORT.

Read more of this post

Filed under Data Analysis, SAS Programming, Tutorials Tagged with data analysis, PROC SORT, PROC SQL, SAS, sas programming, sort, sorting, SQL

Remove leading blanks when creating macro variables using PROC SQL in SAS

March 10, 2018 4 Comments

I regularly use PROC SQL to create macro variables in SAS, and I recently noticed a strange phenomenon when resolving a macro variable within double quotation marks in the title of a plot. Thankfully, I was able to replicate this problem using the SASHELP.BASEBALL data set, which is publicly available. I was then able to send the code and the strange result to SAS Technical Support for their examination.

proc sql;
     select count(name)
     into   :hitters_100plusHR
     from   sashelp.baseball
     where  CrHome > 100;
quit;


proc sgplot
     data = sashelp.baseball;
     histogram Salary;
     title1 'Distribution of salaries';
     title2 "Restricted to the &hitters_100plusHR hitters with more than 100 career home runs";
run;

Here is the resulting plot. Notice the extra spaces before “72” in the title of the plot.

SAS Technical Support informed me that

this problem is commonly known.
there is no way of predicting when it will occur
for now, the best way to deal with it is to remove the leading blanks using one of several ways.

Read more of this post

Filed under Data Analysis, SAS Programming, Tutorials Tagged with %cmpres, %let, %sysfunc, compbl, data analysis, leading blanks, leading spaces, macro, proc sgplot, PROC SQL, SAS, sas macro, sas programming, sashelp.baseball

Store multiple strings of text as a macro variable in SAS with PROC SQL and the INTO statement

September 8, 2017 Leave a comment

I often need to work with many variables at a time in SAS, but I don’t like to type all of their names manually – not only is it messy to read, it also induces errors in transcription, even when copying and pasting. I recently learned of an elegant and efficient way to store multiple variable names into a macro variable that overcomes those problems. This technique uses the INTO statement in PROC SQL.

To illustrate how this storage method can be applied in a practical context, suppose that we want to determine the factors that contribute to a baseball player’s salary in the built-in SASHELP.BASEBALL data set. I will consider all continuous variables other than “Salary” and “logSalary”, but I don’t want to write them explicitly in any programming statements. To do this, I first obtain the variable names and types of a data set using PROC CONTENTS.

* create a data set of the variable names;
proc contents
     data = sashelp.baseball
          noprint
     out = bvars (keep = name type);
run;

Read more of this post

Filed under Applied Statistics, Data Analysis, Descriptive Statistics, SAS Programming, Statistics Tagged with applied statistics, correlation, correlation coefficient, data analysis, data manipulation, into statement, macro, macro variable, PROC SQL, programming, SAS, sas programming, SQL, statistics

Use the LENGTH statement to pre-set the lengths of character variables in SAS – with a comparison to R

August 16, 2017 5 Comments

I often create character variables (i.e. variables with strings of text as their values) in SAS, and they sometimes don’t render as expected. Here is an example involving the built-in data set SASHELP.CLASS.

Here is the code:

data c1;
     set sashelp.class;
 
     * define a new character variable to classify someone as tall or short;
     if height > 60
     then height_class = 'Tall';
          else height_class = 'Short';
run;


* print the results for the first 5 rows;
proc print
     data = c1 (obs = 5);
run;

Here is the result:

Obs	Name	Sex	Age	Height	Weight	height_class
1	Alfred	M	14	69.0	112.5	Tall
2	Alice	F	13	56.5	84.0	Shor
3	Barbara	F	13	65.3	98.0	Tall
4	Carol	F	14	62.8	102.5	Tall
5	Henry	M	14	63.5	102.5	Tall

What happened? Why does the word “Short” render as “Shor”?

Read more of this post

Filed under Categorical Data Analysis, Data Analysis, R programming, SAS Programming, Statistics, Tutorials Tagged with categorical data, categorical variable, character data, character variable, length(), R, r programing, SAS, sas programming

New Job at the Bank of Montreal in Toronto

December 7, 2015 Leave a comment

I have accepted an offer from the Bank of Montreal to become a Manager of Operational Risk Analytics and Modelling at its corporate headquarter office in Toronto. Thus, I have resigned from my job at the British Columbia Cancer Agency. I will leave Vancouver at the end of December, 2015, and start my new job at the beginning of January, 2016.

I have learned some valuable skills and met some great people here in Vancouver over the past 2 years. My R programming skills have improved a lot, especially in text processing. My SAS programming skills have improved a lot, and I began a new section on my blog to SAS programming as a result of what I learned. I volunteered and delivered presentations for the Vancouver SAS User Group (VanSUG) – once on statistical genetics, and another on sampling strategies in analytical chemistry, ANOVA, and PROC TRANSPOSE. I have thoroughly enjoyed meeting some smart and helpful people at the Data Science, Machine Learning, and R Programming Meetups.

I lived in Toronto from 2011 to 2013 while pursuing my Master’s degree in statistics at the University of Toronto and working as a statistician at Predicum. I look forward to re-connecting with my colleagues there.

Filed under Presentations & Appearances Tagged with analytical chemistry, ANOVA, Bank of Montreal, chemistry, operational risk, proc transpose, R, R programming, SAS, sas programming, statistics, text processing, toronto, Vancouver, Vancouver SAS Users Group, VanSUG

Separating Unique and Duplicated Observations Using PROC SORT in SAS 9.3 and Newer Versions

April 10, 2015 5 Comments

As Fareeza Khurshed commented in my previous blog post, there is a new option in SAS 9.3 and later versions that allows sorting and the identification of duplicates to be done in one step. My previous trick uses FIRST.variable and LAST.variable to separate the unique observations from the duplicated observations, but that requires sorting the data set first before using the DATA step to do the separation. If you have SAS 9.3 or a newer version, here is an example of doing it in one step using PROC SORT.

There is a data set called ADOMSG in the SASHELP library that is built into SAS. It has an identifier called MSGID, and there are duplicates by MSGID. Let’s create 2 data sets out of SASHELP.ADOMSG:

DUPLICATES for storing the duplicated observations
SINGLES for storing the unique observations

proc sort
     data = sashelp.adomsg
          out = duplicates
          uniqueout = singles
          nouniquekey;
     by msgid;
run;

Here is the log:

NOTE: There were 459 observations read from the data set SASHELP.ADOMSG.
NOTE: 300 observations with unique key values were deleted.
NOTE: The data set WORK.DUPLICATES has 159 observations and 6 variables.
NOTE: The data set WORK.SINGLES has 300 observations and 6 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.28 seconds
cpu time 0.00 seconds

Note that the number of observations in WORK.DUPLICATES and WORK.SINGLES add to 459, the total number of observations in the original data set.

In addition to Fareeza, I also thank CB for sharing this tip.

Filed under Data Analysis, SAS Programming Tagged with data analysis, data manipulation, duplicate, first.variable, last.variable, nouniquekey, PROC SORT, SAS, sas programming, sorting, unique, uniqueout

Resources for Learning Data Manipulation in R, SAS and Microsoft Excel

February 23, 2015 Leave a comment

I had the great pleasure of speaking to the Department of Statistics and Actuarial Science at Simon Fraser University on last Friday to share my career advice with its students and professors. I emphasized the importance of learning skills in data manipulation during my presentation, and I want to supplement my presentation by posting some useful resources for this skill. If you are new to data manipulation, these are good guides for how to get started in R, SAS and Microsoft Excel.

For R, I recommend Winston Chang’s excellent web site, “Cookbook for R“. It has a specific section on manipulating data; this is a comprehensive list of the basic skills that every data analyst and statistician should learn.

For SAS, I recommend the UCLA statistical computing web page that is adapted from Oliver Schabenberger’s web site.

For Excel, I recommend Excel Easy, a web site that was started at the University of Amsterdam in 2010. It is a good resource for learning about Excel in general, and there is no background required. I specifically recommend the “Functions” and “Data Analysis” sections.

A blog called teachr has a good list of Top 10 skills in Excel to learn.

I like to document tips and tricks for R and SAS that I like to use often, especially if I struggled to find them on the Internet. I encourage you to check them out from time to time, especially in my “Data Analysis” category.

If you have any other favourite resources for learning data manipulation or data analysis, please share them in the comments!

Filed under Data Analysis, R programming, SAS Programming Tagged with data analysis, data manipulation, Excel, R, R programming, SAS, sas programming

Performing Logistic Regression in R and SAS

November 24, 2014 3 Comments

Introduction

My statistics education focused a lot on normal linear least-squares regression, and I was even told by a professor in an introductory statistics class that 95% of statistical consulting can be done with knowledge learned up to and including a course in linear regression. Unfortunately, that advice has turned out to vastly underestimate the variety and depth of problems that I have encountered in statistical consulting, and the emphasis on linear regression has not paid dividends in my statistics career so far. Wisdom from veteran statisticians and my own experience combine to suggest that logistic regression is actually much more commonly used in industry than linear regression. I have already started a series of short lessons on binary classification in my Statistics Lesson of the Day and Machine Learning Lesson of the Day. In this post, I will show how to perform logistic regression in both R and SAS. I will discuss how to interpret the results in a later post.

The Data Set

The data set that I will use is slightly modified from Michael Brannick’s web page that explains logistic regression. I copied and pasted the data from his web page into Excel, modified the data to create a new data set, then saved it as an Excel spreadsheet called heart attack.xlsx.

In R, I imported it using the “XLConnect” package.
In SAS, I imported it using PROC IMPORT.

This data set has 3 variables (I have renamed them for convenience in my R programming).

ha2 – Whether or not a patient had a second heart attack. If ha2 = 1, then the patient had a second heart attack; otherwise, if ha2 = 0, then the patient did not have a second heart attack. This is the response variable.
treatment – Whether or not the patient completed an anger control treatment program.
anxiety – A continuous variable that scores the patient’s anxiety level. A higher score denotes higher anxiety.

Read the rest of this post to get the full scripts and view the full outputs of this logistic regression model in both R and SAS!

Read more of this post

Filed under Applied Statistics, Biostatistics, Categorical Data Analysis, R programming, Statistics, Tutorials Tagged with applied statistics, binary classification, deviance residual, deviance residuals, Excel, fisher scoring, fisher scoring algorithm, logistic regression, null deviance, ods graphics, ods pdf, proc import, proc logistic, R, R programming, regression, residual deviance, SAS, sas programming, statistics

Vancouver SAS User Group Meeting – Wednesday, November 26, 2014, at Holiday Inn Vancouver-Centre (West Broadway)

November 19, 2014 Leave a comment

I am pleased to have recently joined the executive organizing team of the Vancouver SAS User Group. We hold meetings twice per year to allow Metro Vancouver users of all kinds of SAS products to share their knowledge, tips and advice with others. These events are free to attend, but registration is required.

Our next meeting will be held on Wednesday, November 26, 2014. Starting from 8:30 am, a free breakfast will be served while registration takes place. The session will begin at 9:00 am and end at 12:30 pm with a prize draw.

Please note that there is a new location for this meeting: the East and Centre Ballrooms at Holiday Inn Vancouver-Centre at 711 West Broadway in Vancouver. We will also experiment with holding a half-day session by ending at 12:30 pm at this meeting. Visit our web site for more information and to register for this free event!

If you will attend this event, please feel free to come and say “Hello”!

Read the rest of this post for the full agenda!

Read more of this post

Filed under Presentations & Appearances Tagged with JMP, PROC IML, PROC MEANS, PROC SQL, SAS, SAS Enterprise Guide, SAS Enterprise Miner, sas programming, VanSUG

Calculating the sum or mean of a numeric (continuous) variable by a group (categorical) variable in SAS

October 28, 2014 3 Comments

Introduction

A common task in data analysis and statistics is to calculate the sum or mean of a continuous variable. If that variable can be categorized into 2 or more classes, you may want to get the sum or mean for each class.

This sounds like a simple task, yet I took a surprisingly long time to learn how to do this in SAS and get exactly what I want – a new data with with each category as the identifier and the calculated sum/mean as the value of a second variable. Here is an example to show you how to do it using PROC MEANS.

Read more to see an example data set and get the SAS code to calculate the sum or mean of a continuous variable by a categorical variable!

Read more of this post

Filed under Data Analysis, Descriptive Statistics, SAS Programming, Statistics, Tutorials Tagged with average, by statement, class statement, data analysis, mean by group, mean(), nway, PROC MEANS, PROC PRINT, programming, SAS, sas programming, statistics, sum by group, sum(), summation

Useful Options For Every SAS Program – Lessons and Resources from Dr. Jerry Brunner

June 24, 2014 Leave a comment

Introduction

Today, I want to share some useful options to put at the beginning of every SAS program that I write. These options will make the practicality of using SAS much easier. My applied statistics professor from the University of Toronto, Jerry Brunner*, taught me some of these options when I first learned SAS in our class, and I’m grateful for that. In later instances of using SAS in team projects, I have met SAS programmers who were delightfully surprised by the existence of these options and desperately wished that they had learned them earlier. I hope that they will help you with your SAS programming. I have also learned some useful options by posting questions on the SAS Support Communities online forum.

Clearing Output

After running your SAS program many times to test and debug, you will have accumulated numerous pages of old and useless output and log. Scrolling through and searching for the desired portion to read in either file can be tedious and difficult. Thus, it’s really helpful to have the option of clearing all of the output and the log whenever you run your script. I put the following commands on top of every one of my SAS scripts.

/* 
Useful Options For Every SAS Program 
- With Some Tips Learned From Dr. Jerry Brunner
by Eric Cai - The Chemical Statistician
*/

dm 'cle log; cle out;';
ods html close; 
ods html;

dm 'odsresults; clear';
ods listing close;
ods listing;

Read more of this post

Filed under SAS Programming, Tutorials Tagged with formdlim, line size, linesize, list file, log, lst file, noovp, output, ovp, pageno, results window, SAS, sas programming

	Eric Cai - The Chemi… on Convert multiple variables bet…
	Jack on Convert multiple variables bet…
	Eric Cai - The Chemi… on Getting the names, types, form…
	Emily V on Getting the names, types, form…
	Lauren McClain on Convert multiple variables bet…
	Eric Cai - The Chemi… on Convert multiple variables bet…
	Lauren McClain on Convert multiple variables bet…
	Eric Cai - The Chemi… on Exploratory Data Analysis: Com…
	CK on Exploratory Data Analysis: Com…
	Eric Cai - The Chemi… on Video Tutorial: Breaking Down…

The Chemical Statistician

Some SAS procedures (like PROC REG, GLM, ANOVA, SQL, and IML) end with “QUIT;”, not “RUN;”

Beware of accidental replacement of data sets with PROC SORT in SAS

Use ODS EXCLUDE ALL to suppress printing output in SAS while producing output data sets

Convert multiple variables between character and numeric formats in SAS

Introduction

A macro to automate the creation of indicator variables in SAS

An easy and efficient way to create indicator variables (a.k.a. dummy variables) from a categorical variable in SAS

Introduction

The Example Data Set

Sort a data set by ascending or descending variables using PROC SORT in SAS

Remove leading blanks when creating macro variables using PROC SQL in SAS

Store multiple strings of text as a macro variable in SAS with PROC SQL and the INTO statement

Use the LENGTH statement to pre-set the lengths of character variables in SAS – with a comparison to R

New Job at the Bank of Montreal in Toronto

Separating Unique and Duplicated Observations Using PROC SORT in SAS 9.3 and Newer Versions

Resources for Learning Data Manipulation in R, SAS and Microsoft Excel

Performing Logistic Regression in R and SAS

Introduction

The Data Set

Vancouver SAS User Group Meeting – Wednesday, November 26, 2014, at Holiday Inn Vancouver-Centre (West Broadway)

Calculating the sum or mean of a numeric (continuous) variable by a group (categorical) variable in SAS

Introduction

Eric’s Twitter Feed (@chemstateric)

Recent Comments

Popular Topics

Recent Posts

About Eric

Blogs and Web Sites That I Like to Read

Archives

Categories