Machine Learning Lesson of the Day – K-Nearest Neighbours Regression

I recently introduced the K-nearest neighbours classifier.  A few slight adjustments turn the same algorithm into a regression technique.

Given a training set and a new input X, we can predict the target of the new input by

  1. identifying the K data (the K “neighbours”) in the training set that are closest to X by Euclidean distance
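  2. predicting the target for X as a linear combination (a weighted average) of the neighbours’ target values
  • the target values of the K data are the terms being combined
  • the reciprocals of the neighbours’ distances to X, normalized to sum to 1, are their respective coefficients (the “weights”), so closer neighbours count for more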
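
The two steps above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not a reference implementation: the function name `knn_regress`, the `eps` tolerance, and the handling of a zero distance (returning the mean target of any training points that coincide with X) are assumptions made for this example, and the weights are assumed to be normalized so that the prediction is a weighted average.

```python
import math

def knn_regress(train_X, train_y, x, k, eps=1e-12):
    """Inverse-distance-weighted average of the targets of the k nearest points."""
    # Step 1: Euclidean distance from x to every training point, nearest first
    pairs = sorted(
        ((math.dist(xi, x), yi) for xi, yi in zip(train_X, train_y)),
        key=lambda p: p[0],
    )
    neighbours = pairs[:k]

    # Assumption: if x coincides with a neighbour, 1/distance is undefined,
    # so return the mean target of the coinciding points instead
    exact = [y for d, y in neighbours if d < eps]
    if exact:
        return sum(exact) / len(exact)

    # Step 2: weights are reciprocal distances, normalized by their sum
    weights = [1.0 / d for d, _ in neighbours]
    weighted_sum = sum(w * y for w, (_, y) in zip(weights, neighbours))
    return weighted_sum / sum(weights)


train_X = [(0.0,), (1.0,), (3.0,)]
train_y = [0.0, 1.0, 3.0]
prediction = knn_regress(train_X, train_y, (2.0,), k=2)
```

With K = 2, the input 2.0 is equally far from the training points at 1.0 and 3.0, so both get the same weight and the prediction is the plain average of their targets.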

Validation or cross-validation can be used to choose the best value of K.
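
As one possible way to do this, the sketch below picks K by leave-one-out cross-validation, scoring each candidate by mean squared error. The helper names (`knn_predict`, `loo_mse`, `choose_k`), the choice of leave-one-out rather than a held-out validation set or k-fold scheme, and the small constant added to each distance are all assumptions made for illustration.

```python
import math

def knn_predict(train_X, train_y, x, k):
    # Inverse-distance-weighted average of the k nearest targets
    nearest = sorted(
        ((math.dist(xi, x), yi) for xi, yi in zip(train_X, train_y)),
        key=lambda p: p[0],
    )[:k]
    # Assumption: a tiny constant keeps the weight finite when a distance is zero
    weights = [1.0 / (d + 1e-9) for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)

def loo_mse(X, y, k):
    # Leave-one-out: predict each point from all the other points
    squared_errors = []
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        squared_errors.append((knn_predict(rest_X, rest_y, X[i], k) - y[i]) ** 2)
    return sum(squared_errors) / len(squared_errors)

def choose_k(X, y, candidates):
    # Keep the candidate K with the lowest cross-validated error
    return min(candidates, key=lambda k: loo_mse(X, y, k))


X = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]
y = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
best_k = choose_k(X, y, candidates=[1, 2, 3])
```

On real data the candidate list would normally cover a wider range of K, and the error curve is worth inspecting rather than relying on the minimum alone.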

Machine Learning Lesson of the Day: The K-Nearest Neighbours Classifier

The K-nearest neighbours (KNN) classifier is a non-parametric classification technique that classifies an input X by

  1. identifying the K data (the K “neighbours”) in the training set that are closest to X
  2. counting the number of “neighbours” that belong to each class of the target variable
  3. classifying X by the most common class to which its neighbours belong

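K is usually chosen to be an odd number so that, when the target has two classes, the vote cannot end in a tie.  With more than two classes, ties are still possible and need a tie-breaking rule.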

The proximity of the neighbours to X is usually defined by Euclidean distance.
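
The three steps of the classifier can be sketched as follows, using Euclidean distance. The function name `knn_classify` and the toy data are hypothetical, and the tie-breaking behaviour (the tied class seen first among the sorted neighbours wins) is just a consequence of how `Counter.most_common` orders equal counts, not a rule stated in this lesson.

```python
import math
from collections import Counter

def knn_classify(train_X, train_y, x, k):
    # Step 1: the k training points closest to x by Euclidean distance
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    # Step 2: count how many neighbours belong to each class
    votes = Counter(label for _, label in nearest)
    # Step 3: return the most common class among the neighbours
    return votes.most_common(1)[0][0]


train_X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["a", "a", "a", "b", "b", "b"]
label_near_origin = knn_classify(train_X, train_y, (0.5, 0.5), k=3)
label_far_corner = knn_classify(train_X, train_y, (5.5, 5.5), k=3)
```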

Validation or cross-validation can be used to choose the best value of K.