## Machine Learning Lesson of the Day – Supervised Learning: Classification and Regression

January 5, 2014

**Supervised learning** has 2 categories:

- In **classification**, the target variable is categorical.
- In **regression**, the target variable is continuous.

Thus, **regression in statistics** is different from **regression in supervised learning**.

In statistics,

- regression is used to model relationships between predictors and targets, and *the targets could be continuous or categorical*.
- a regression model usually includes 2 components to describe such relationships:
  - a **systematic** component
  - a **random** component. The random component of this relationship is mathematically described by some **probability distribution**.
- most regression models in statistics also have assumptions about the **statistical independence** or **dependence** between the predictors and/or between the observations.
- many statistical models also aim to provide interpretable relationships between the predictors and targets.
  - For example, in simple linear regression, the slope parameter predicts the change in the target for every unit increase in the predictor.
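To make the systematic and random components concrete, here is a minimal sketch in standard-library Python. It simulates data from a simple linear regression with a linear systematic part and a Gaussian random part, then estimates the slope by ordinary least squares. The names `beta0`, `beta1`, and `sigma`, their values, and the helper `fit_simple_linear_regression` are illustrative assumptions, not taken from the post.

```python
import random


def fit_simple_linear_regression(x, y):
    """Ordinary least squares for y = beta0 + beta1 * x + error."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    beta1_hat = sxy / sxx
    beta0_hat = y_mean - beta1_hat * x_mean
    return beta0_hat, beta1_hat


# Assumed "true" parameters, chosen only for illustration.
beta0, beta1, sigma = 2.0, 0.5, 1.0

rng = random.Random(0)
x = [i / 10 for i in range(200)]
# Systematic component: beta0 + beta1 * x. Random component: Gaussian noise.
y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]

beta0_hat, beta1_hat = fit_simple_linear_regression(x, y)
print(f"estimated intercept: {beta0_hat:.2f}")
print(f"estimated slope:     {beta1_hat:.2f}")
```

The estimated slope is the interpretable quantity here: it is the estimated average change in the target for a one-unit increase in the predictor.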

In supervised learning,

- target variables in regression must be continuous
  - categorical target variables are modelled in classification
- regression has less or even no emphasis on using probability to describe the random variation between the predictor and the target
  - Random forests are powerful tools for both classification and regression, but they do not use probability to describe the relationship between the predictors and the target.
- regression has less or even no emphasis on providing interpretable relationships between the predictors and targets.
  - Neural networks are powerful tools for both classification and regression, but they do not provide interpretable relationships between the predictors and the target.

*The last 2 points are applicable to classification, too.*
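The supervised learning view can be sketched with a small prediction-only method. The example below uses k-nearest neighbours, which the post does not mention; it is chosen only because it fits in a few lines of standard-library Python and, like random forests, handles both kinds of task. A categorical target makes the problem classification (majority vote), a continuous target makes it regression (average), and in neither case is there a probability distribution for the noise or a slope-like parameter to interpret. The function names and toy data are hypothetical.

```python
from collections import Counter


def nearest_targets(train_x, train_y, query, k):
    """Return the targets of the k training points closest to query (1-D predictor)."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - query))
    return [target for _, target in ranked[:k]]


def knn_classify(train_x, train_y, query, k=3):
    """Categorical target: predict the most common class among the neighbours."""
    votes = Counter(nearest_targets(train_x, train_y, query, k))
    return votes.most_common(1)[0][0]


def knn_regress(train_x, train_y, query, k=3):
    """Continuous target: predict the mean target among the neighbours."""
    neighbours = nearest_targets(train_x, train_y, query, k)
    return sum(neighbours) / len(neighbours)


# Toy data (made up): one predictor, two kinds of target.
x = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
labels = ["low", "low", "low", "high", "high", "high"]  # categorical
values = [1.1, 1.9, 3.2, 5.8, 7.1, 8.0]                 # continuous

print(knn_classify(x, labels, query=2.5))  # a class label
print(knn_regress(x, values, query=2.5))   # a number
```

The only thing that changes between the two functions is how the neighbours' targets are combined, which mirrors the post's point that the type of target variable decides whether a supervised learning problem is classification or regression.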

In general, supervised learning puts much more emphasis on **accurate prediction** than statistics does.

Since regression in supervised learning includes only continuous targets, this results in some confusing terminology between the 2 fields. For example, logistic regression is a commonly used technique in both statistics and supervised learning. *However, despite its name, it is a classification technique in supervised learning, because the response variable in logistic regression is categorical.*
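A minimal sketch of why logistic regression counts as classification: the model is fitted to a categorical (here 0/1) response, it produces a probability, and the prediction is a class label. The code fits a one-predictor logistic regression by plain gradient descent in standard-library Python. The learning rate, iteration count, 0.5 threshold, function names, and toy data are assumptions for illustration; a real analysis would more likely rely on an established library.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, learning_rate=0.1, n_iter=5000):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by gradient descent on the mean log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(n_iter):
        # Difference between fitted probability and observed 0/1 response.
        errors = [sigmoid(b0 + b1 * xi) - yi for xi, yi in zip(x, y)]
        b0 -= learning_rate * sum(errors) / n
        b1 -= learning_rate * sum(e * xi for e, xi in zip(errors, x)) / n
    return b0, b1


def predict_class(b0, b1, x_new, threshold=0.5):
    """Turn the fitted probability into a categorical prediction (0 or 1)."""
    return 1 if sigmoid(b0 + b1 * x_new) >= threshold else 0


# Toy data (made up): the response is categorical, coded as 0 or 1.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(x, y)
print(predict_class(b0, b1, 1.0))  # a class label, not a continuous value
print(predict_class(b0, b1, 3.5))
```

The fitted coefficients `b0` and `b1` are what a statistician would interpret, while `predict_class` is the part a supervised learning workflow would evaluate for predictive accuracy.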
