Machine Learning Lesson of the Day – Linear Gaussian Basis Function Models

I recently introduced the use of linear basis function models for supervised learning problems that involve non-linear relationships between the predictors and the target.  A common type of basis function for such models is the Gaussian basis function.  This type of model uses the kernel of the normal (or Gaussian) probability density function (PDF) as the basis function.

$\phi_j(x) = \exp\left[-\frac{(x - \mu_j)^2}{2\sigma^2}\right]$

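The $\mu_j$ in this basis function sets the location of the $j$th basis function along the predictor's axis, and the $\sigma$ sets its width (its spatial scale), i.e. how far from $\mu_j$ it remains appreciably non-zero.  The spacing between the different basis functions that combine to form the model is determined by how far apart the $\mu_j$ are placed.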
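To make the formula concrete, here is a minimal sketch in Python with NumPy that evaluates the Gaussian basis functions for a handful of predictor values.  The function name `gaussian_basis`, the centres and the value of $\sigma$ are all illustrative choices of mine, not anything prescribed by the model.

```python
import numpy as np

def gaussian_basis(x, centres, sigma):
    """Evaluate phi_j(x) = exp(-(x - mu_j)^2 / (2 sigma^2)).

    Returns an array with one row per value of x and one column per centre mu_j.
    """
    x = np.asarray(x, dtype=float)[:, None]              # shape (n, 1)
    centres = np.asarray(centres, dtype=float)[None, :]  # shape (1, m)
    return np.exp(-(x - centres) ** 2 / (2.0 * sigma ** 2))

# Illustrative (assumed) values: three centres one unit apart, sigma = 0.5.
centres = np.array([0.0, 1.0, 2.0])
sigma = 0.5
x = np.array([0.0, 0.5, 1.0])

Phi = gaussian_basis(x, centres, sigma)
print(Phi.round(4))
```

Each basis function equals 1 at its own centre, and at a distance of one $\sigma$ from the centre it has fallen to $\exp(-1/2)$.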

Notice that the Gaussian basis function is just the normal PDF without the scaling factor of $1/\sqrt{2\pi \sigma^2}$; the scaling factor ensures that the normal PDF integrates to 1 over its support.  In a linear basis function model, the regression coefficients are the weights for the basis functions, and these weights will scale the Gaussian basis functions to fit the data that are local to $\mu_j$.  Thus, there is no need to include that scaling factor of $1/\sqrt{2\pi \sigma^2}$, because the scaling is already being handled by the regression coefficients.
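A quick numerical check of this point, sketched under some assumptions of mine: a made-up sine-shaped target, six evenly spaced centres, ordinary least squares via `np.linalg.lstsq`, and no intercept column for brevity.  Fitting with and without the scaling factor should give the same fitted values, with the coefficients differing only by that constant factor.

```python
import numpy as np

def gaussian_basis(x, centres, sigma):
    # Unnormalised Gaussian basis: one column per centre mu_j.
    return np.exp(-(x[:, None] - centres[None, :]) ** 2 / (2.0 * sigma ** 2))

# Illustrative (assumed) data and settings.
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2.0 * np.pi * x)
centres = np.linspace(0.0, 1.0, 6)
sigma = 0.2
c = 1.0 / np.sqrt(2.0 * np.pi * sigma ** 2)  # the normal PDF's scaling factor

Phi = gaussian_basis(x, centres, sigma)  # without the scaling factor
Phi_pdf = c * Phi                        # with the scaling factor

# Least-squares coefficients for each design matrix.
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
w_pdf, *_ = np.linalg.lstsq(Phi_pdf, y, rcond=None)

print(np.allclose(Phi @ w, Phi_pdf @ w_pdf))  # same fitted values?
print(np.allclose(w, c * w_pdf))              # coefficients absorb the factor?
```

Because the same $\sigma$ is shared by all of the basis functions, the scaling factor is a single constant, so the coefficients simply shrink or grow by that constant to compensate.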

The Gaussian basis function model is useful because

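• it can model many non-linear relationships between the predictor and the target surprisingly well, and
• each basis function is appreciably non-zero only over a small interval around $\mu_j$ and is practically zero everywhere else.  When $\sigma$ is small relative to the range of the predictor, these local basis functions result in a design matrix whose entries are mostly zero or negligibly small; treating the negligible entries as zero gives a very sparse design matrix (i.e. one with mostly zeros), which can lead to much faster computation.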
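The locality of the basis functions can be seen directly in the design matrix.  The sketch below is only an illustration under assumed settings (200 predictor values, 20 centres, a narrow $\sigma$, and a hypothetical cutoff `tol`).  Entries far from $\mu_j$ are vanishingly small rather than exactly zero, so here they are set to zero once they drop below the cutoff, and the fraction of zero entries is then measured.

```python
import numpy as np

# Illustrative (assumed) settings: narrow basis functions on [0, 1].
x = np.linspace(0.0, 1.0, 200)
centres = np.linspace(0.0, 1.0, 20)
sigma = 0.02
tol = 1e-8  # hypothetical cutoff below which an entry is treated as zero

# Dense design matrix: row i, column j holds phi_j(x_i).
Phi = np.exp(-(x[:, None] - centres[None, :]) ** 2 / (2.0 * sigma ** 2))

# Zero out the negligible entries and measure how sparse the result is.
Phi_thresholded = np.where(Phi < tol, 0.0, Phi)
sparsity = np.mean(Phi_thresholded == 0.0)
max_error = np.max(np.abs(Phi - Phi_thresholded))

print(f"fraction of zero entries: {sparsity:.2f}")
print(f"largest entry discarded:  {max_error:.2e}")
```

A matrix like this could then be stored in a sparse format so that later computations skip the zeros.  How much sparsity you get depends on $\sigma$ relative to the range of the predictor: wider basis functions overlap more and leave fewer negligible entries.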