#### Introduction

My statistics education focused heavily on normal linear least-squares regression. A professor in an introductory statistics class even told me that 95% of statistical consulting can be done with knowledge learned up to and including a course in linear regression. *Unfortunately, that advice has turned out to vastly underestimate the variety and depth of problems that I have encountered in statistical consulting, and the emphasis on linear regression has not paid dividends in my statistics career so far.* Wisdom from veteran statisticians and my own experience combine to suggest that **logistic regression is actually much more commonly used in industry than linear regression**. I have already started a series of short lessons on binary classification in my Statistics Lesson of the Day and Machine Learning Lesson of the Day. In this post, I will show how to perform logistic regression in both R and SAS. I will discuss how to interpret the results in a later post.

#### The Data Set

The data set that I will use is slightly modified from Michael Brannick’s web page that explains logistic regression. I copied and pasted the data from his web page into Excel, **modified the data to create a new data set**, then saved it as an Excel spreadsheet called heart attack.xlsx.

This data set has 3 variables (I have renamed them for convenience in my R programming).

**ha2** – Whether or not a patient had a second heart attack. If ha2 = 1, then the patient had a second heart attack; otherwise, if ha2 = 0, then the patient did not have a second heart attack. This is the response variable.
**treatment** – Whether or not the patient completed an anger control treatment program.
**anxiety** – A continuous variable that scores the patient’s anxiety level. A higher score denotes higher anxiety.
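
As a rough illustration of how these three variables fit together, here is a minimal sketch of a logistic regression in R using `glm()` with the binomial family. It is not the script from this post: the data frame `heart` is simulated purely so that the example runs, and the 0/1 coding of treatment, the anxiety score range, and the coefficients used to generate ha2 are all assumptions rather than values from heart attack.xlsx.

```r
# Simulated placeholder data, NOT the values from heart attack.xlsx
set.seed(1)
n <- 40
heart <- data.frame(
  treatment = rbinom(n, 1, 0.5),       # assumed coding: 1 = completed program, 0 = did not
  anxiety   = round(runif(n, 40, 80))  # assumed score range, for illustration only
)

# Generate the response from an arbitrary logistic model (assumed coefficients)
eta <- -6 + 0.1 * heart$anxiety - 1 * heart$treatment
heart$ha2 <- rbinom(n, 1, plogis(eta))

# Logistic regression: binomial family, which uses the logit link by default
fit <- glm(ha2 ~ treatment + anxiety, family = binomial, data = heart)
summary(fit)

# Fitted probabilities of a second heart attack for each simulated patient
heart$p_hat <- predict(fit, type = "response")
```

With the real data, the simulated data frame would be replaced by the contents of the spreadsheet, and the model formula would stay the same as long as the column names match.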

**Read the rest of this post to get the full scripts and view the full outputs of this logistic regression model in both R and SAS!**
