**Supervised learning** has 2 categories:

- In **classification**, the target variable is categorical.
- In **regression**, the target variable is continuous.

Thus, **regression in statistics** is different from **regression in supervised learning**.

In statistics,

- regression is used to model relationships between predictors and targets, and *the targets could be continuous or categorical*.
- a regression model usually includes 2 components to describe such relationships:
  - a **systematic** component
  - a **random** component. The random component of this relationship is mathematically described by some **probability distribution**.

- most regression models in statistics also make assumptions about the **statistical independence** or **dependence** between the predictors and/or between the observations.
- many statistical models also aim to provide interpretable relationships between the predictors and targets.
  - For example, in simple linear regression, the slope parameter, $\beta_1$, gives the expected change in the target, $Y$, for every unit increase in the predictor, $X$.
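
As a rough illustration of the two components and of how the slope is read, the sketch below simulates data from a simple linear regression and recovers the slope by ordinary least squares. The parameter names and values (`beta0`, `beta1`, `sigma`), the Gaussian choice for the random component, and the helper `fit_simple_linear_regression` are assumptions made for this example only.

```python
import random


def fit_simple_linear_regression(x, y):
    """Ordinary least squares estimates (intercept, slope) for y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical "true" parameters, chosen only for illustration.
beta0, beta1, sigma = 2.0, 0.5, 1.0

rng = random.Random(0)  # fixed seed so the sketch is reproducible
x = [rng.uniform(0, 10) for _ in range(2000)]

# Systematic component: beta0 + beta1 * x
# Random component: Gaussian noise with mean 0 and standard deviation sigma
y = [beta0 + beta1 * xi + rng.gauss(0, sigma) for xi in x]

b0_hat, b1_hat = fit_simple_linear_regression(x, y)

# b1_hat estimates the expected change in y per unit increase in x.
print(f"estimated intercept: {b0_hat:.2f}, estimated slope: {b1_hat:.2f}")
```

With enough observations, the estimated slope should land close to the value used to simulate the data, and it can be read directly as "expected change in the target per unit of the predictor".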

In supervised learning,

- target variables in regression must be continuous
  - categorical target variables are modelled in classification

- regression places less, or even no, emphasis on using probability to describe the random variation in the target given the predictors
  - Random forests are powerful tools for both classification and regression, but they do not use probability to describe the relationship between the predictors and the target.

- regression places less, or even no, emphasis on providing interpretable relationships between the predictors and targets.
  - Neural networks are powerful tools for both classification and regression, but they do not provide readily interpretable relationships between the predictors and the target.

*The last 2 points apply to classification, too.*

In general, supervised learning puts much more emphasis on **accurate prediction** than statistics does.

Because regression in supervised learning covers only continuous targets, some terminology is confusing across the 2 fields. For example, logistic regression is a commonly used technique in both statistics and supervised learning. *However, despite its name, it is a classification technique in supervised learning, because the response variable in logistic regression is categorical.*
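
To make the naming point concrete, here is a minimal sketch of logistic regression fitted by plain gradient descent on a toy data set. The data, learning rate, iteration count, 0.5 threshold, and function names are all assumptions for illustration; a real analysis would normally rely on an established library. The model outputs a probability, and thresholding that probability yields a categorical label, which is why supervised learning files it under classification.

```python
import math


def sigmoid(z):
    """Map a real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_regression(x, y, learning_rate=0.1, n_iter=5000):
    """Fit intercept b0 and slope b1 by gradient descent on the mean log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(n_iter):
        # Residuals between predicted probabilities and observed 0/1 labels
        errors = [sigmoid(b0 + b1 * xi) - yi for xi, yi in zip(x, y)]
        grad_b0 = sum(errors) / n
        grad_b1 = sum(e * xi for e, xi in zip(errors, x)) / n
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1


def predict_probability(b0, b1, x_new):
    """Estimated probability that the label is 1."""
    return sigmoid(b0 + b1 * x_new)


def predict_label(b0, b1, x_new, threshold=0.5):
    """Categorical prediction: 1 if the probability reaches the threshold, else 0."""
    return 1 if predict_probability(b0, b1, x_new) >= threshold else 0


# Toy data (hypothetical): one predictor and a binary, i.e. categorical, target.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

b0, b1 = fit_logistic_regression(x, y)

for x_new in (1.0, 4.5):
    p = predict_probability(b0, b1, x_new)
    print(f"x = {x_new}: probability = {p:.3f}, label = {predict_label(b0, b1, x_new)}")
```

The intermediate probability is continuous, but the quantity being predicted, the label, is categorical.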
