## Machine Learning Lesson of the Day – K-Nearest Neighbours Regression

I recently introduced the K-nearest neighbours classifier. A slight adjustment to the same algorithm turns it into a regression technique.

Given a training set and a new input $X$, we can predict the target of the new input by

1. identifying the K data (the K “neighbours”) in the training set that are closest to $X$ by Euclidean distance
2. predicting the target for $X$ as a weighted average (a linear combination) of the neighbours’ targets
• the targets of the K neighbours are the values being combined
• the reciprocals of the neighbours’ distances to $X$, rescaled to sum to 1, are their respective coefficients (the “weights”), so closer neighbours count for more

Validation or cross-validation can be used to determine the best value of K.
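
The prediction step described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `knn_regress` and its parameters are hypothetical, the weights are normalised to sum to 1, and a neighbour at distance zero (whose reciprocal is undefined) simply returns its own target.

```python
import math

def knn_regress(train_X, train_y, x, k=3):
    """Inverse-distance-weighted average of the targets of the k nearest neighbours."""
    # Pair each training point's Euclidean distance to x with its target,
    # then keep the k closest pairs.
    neighbours = sorted(
        (math.dist(p, x), y) for p, y in zip(train_X, train_y)
    )[:k]

    # Assumption for this sketch: an exact match (distance 0) has no finite
    # reciprocal, so its target is returned directly.
    for d, y in neighbours:
        if d == 0.0:
            return y

    # Reciprocal distances act as weights; dividing by their sum
    # normalises them so that they add up to 1.
    weights = [1.0 / d for d, _ in neighbours]
    total = sum(weights)
    return sum(w * y for w, (_, y) in zip(weights, neighbours)) / total


# Hypothetical one-dimensional training set
train_X = [(0.0,), (1.0,), (2.0,), (10.0,)]
train_y = [0.0, 1.0, 2.0, 10.0]
prediction = knn_regress(train_X, train_y, (1.25,), k=2)
```

Here the two nearest neighbours of 1.25 are the points at 1.0 and 2.0; the closer one receives the larger weight, so the prediction is pulled towards its target.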

## Machine Learning Lesson of the Day: The K-Nearest Neighbours Classifier

The K-nearest neighbours (KNN) classifier is a non-parametric classification technique that classifies an input $X$ by

1. identifying the K data (the K “neighbours”) in the training set that are closest to $X$
2. counting the number of “neighbours” that belong to each class of the target variable
3. classifying $X$ by the most common class to which its neighbours belong

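For a two-class target, K is usually chosen to be odd so that the vote cannot end in a tie. With more than two classes, ties can still occur and need a tie-breaking rule.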

The proximity of the neighbours to $X$ is usually defined by Euclidean distance.
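
The three steps above, with Euclidean distance as the measure of proximity, might look like the following in plain Python. The name `knn_classify` and the toy data are hypothetical, and the tie-breaking behaviour is just whatever `Counter.most_common` returns, which is an arbitrary choice for this sketch.

```python
import math
from collections import Counter

def knn_classify(train_X, train_labels, x, k=3):
    """Classify x by majority vote among its k nearest neighbours."""
    # Step 1: find the k training points closest to x (Euclidean distance).
    nearest = sorted(
        zip((math.dist(p, x) for p in train_X), train_labels),
        key=lambda pair: pair[0],
    )[:k]

    # Step 2: count how many neighbours belong to each class.
    votes = Counter(label for _, label in nearest)

    # Step 3: return the most common class among the neighbours.
    return votes.most_common(1)[0][0]


# Hypothetical two-class training set with two well-separated clusters
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_labels = ["a", "a", "a", "b", "b", "b"]
label_near_origin = knn_classify(train_X, train_labels, (0.2, 0.2), k=3)
label_far_corner = knn_classify(train_X, train_labels, (5.5, 5.5), k=3)
```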

Validation or cross-validation can be used to determine the best value of K.
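
As one possible way to choose K, the sketch below uses leave-one-out cross-validation: each training point is held out in turn, classified from the remaining points, and the K with the highest accuracy is kept. The helper names (`knn_classify`, `loocv_accuracy`, `best_k`) and the toy data are hypothetical, and leave-one-out is only one of several cross-validation schemes that could be used.

```python
import math
from collections import Counter

def knn_classify(train_X, train_labels, x, k):
    """Majority vote among the k nearest neighbours (Euclidean distance)."""
    nearest = sorted(
        zip((math.dist(p, x) for p in train_X), train_labels),
        key=lambda pair: pair[0],
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def loocv_accuracy(X, labels, k):
    """Fraction of points classified correctly when each is held out in turn."""
    hits = 0
    for i in range(len(X)):
        # Train on every point except the i-th, then predict the i-th.
        rest_X = X[:i] + X[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        if knn_classify(rest_X, rest_labels, X[i], k) == labels[i]:
            hits += 1
    return hits / len(X)

def best_k(X, labels, candidates):
    """Return the candidate K with the highest leave-one-out accuracy.

    When several candidates tie, the first one in `candidates` is kept.
    """
    return max(candidates, key=lambda k: loocv_accuracy(X, labels, k))


# Hypothetical data: two clusters of three points each
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
chosen_k = best_k(X, labels, [1, 3, 5])
```

On this tiny data set K = 5 performs badly: once a point is held out, only two members of its own class remain, so the other class always wins the vote.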