Machine Learning Lesson of the Day – Babies and Non-Statisticians Practice Unsupervised Learning All the Time!

My recent lesson on unsupervised learning may make it seem like a rather esoteric field, with attempts to categorize it using words like “clustering”, “density estimation”, or “dimensionality reduction”.  However, unsupervised learning is actually how we as human beings often learn about the world that we live in – whether you are a baby learning what to eat or someone reading this blog.

  • Babies use their mouths and their sense of taste to explore the world, and they can probably determine what satisfies their hunger and what doesn’t pretty quickly.  As they expose themselves to different objects – a formula bottle, a pacifier, a mother’s breast, their own fingers – their taste and digestive system are recognizing these inputs and detecting patterns of what satisfies their hunger and what doesn’t.  This all happens before they even fully understand what “food” or “hunger” means.  This will probably happen before someone says “This is food” to them and they have the language capacity to know what those 3 words mean.
    • When a baby finally realizes what hunger feels like and develops the initiative to find something to eat, then that becomes a supervised learning problem: What attributes of an object will help me to determine if it’s food or not?
  • I recently wrote a page called “About this Blog” to categorize the different types of posts that I have written on this blog so far.  I did not aim to predict anything about any blog post; I simply wanted to organize the 50-plus blog posts into a few categories and make it easier for you to find them.  I ultimately clustered my blog posts into 4 categories (originally mutually exclusive, now with some overlaps).  You can think of each blog post as a vector-valued input, and I chose 2 elements of each vector – the length and the topic – to group the posts into classes that are very similar in length and topic within each class and very different in length and topic between the classes.  (I used those 2 elements – or features – to maximize the similarities within each category and maximize the dissimilarities between the 4 categories.)  There were other features that I could have used – whether the post had an image (binary feature), the number of font colours (integer-valued feature), the time of publication of the post (continuous feature) – but length and topic were sufficient for me to arrive at the 4 categories of “Tutorials”, “Lessons”, “Advice”, and “Notifications about Presentations and Appearances at Upcoming Events”.

Presentation Slides: Machine Learning, Predictive Modelling, and Pattern Recognition in Business Analytics

I recently delivered a presentation entitled “Using Advanced Predictive Modelling and Pattern Recognition in Business Analytics” at the Statistical Society of Canada’s (SSC’s) Southern Ontario Regional Association (SORA) Business Analytics Seminar Series.  In this presentation, I

– discussed how traditional statistical techniques often fail in analyzing large data sets

– defined and described machine learning, supervised learning, unsupervised learning, and the many classes of techniques within these fields, as well as common examples in business analytics to illustrate these concepts

– introduced partial least squares regression and bootstrap forest (or random forest) as two examples of supervised learning (or predictive modelling) techniques that can effectively overcome the common failures of traditional statistical techniques and can be easily implemented in JMP

– illustrated how partial least squares regression and bootstrap forest were successfully used to solve some major problems for 2 different clients at Predictum, where I currently work as a statistician


Presentation Slides – Finding Patterns in Data with K-Means Clustering in JMP and SAS

My slides on K-means clustering at the Toronto Area SAS Society (TASS) meeting on December 14, 2012, can be found here.


This image is slightly enhanced from an image created by Weston.pace from Wikimedia Commons.

My Presentation on K-Means Clustering

I was very pleased to be invited for the second time by the Toronto Area SAS Society (TASS) to deliver a presentation on machine learning.  (I previously presented on partial least squares regression.)  At its recent meeting on December 14, 2012, I introduced an unsupervised learning technique called K-means clustering.

I first defined clustering as a set of techniques for identifying groups of objects by maximizing a similarity criterion or, equivalently, minimizing a dissimilarity criterion.  I then defined K-means clustering specifically as a clustering technique that uses Euclidean proximity to a group mean as its similarity criterion.  I illustrated how this technique works with a simple 2-dimensional example; you can follow along with this example in the slides by watching the sequence of images of the clusters as they move toward convergence.  As with many other machine learning techniques, some arbitrary decisions need to be made to initiate the algorithm for K-means clustering:

  1. How many clusters should there be?
  2. What is the initial mean of each cluster?

I provided some guidelines on how to make these decisions in these slides.
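For readers who would like to see the mechanics outside of JMP or SAS, here is a minimal Python sketch of the standard K-means iteration: assign each point to the nearest mean by Euclidean distance, recompute each mean, and repeat until the assignments stop changing. It is not taken from the slides. The function name `k_means`, the six 2-dimensional points, and the two initial means are all made up for illustration, and the two arbitrary decisions listed above show up as the number and values of `initial_means`.

```python
import math


def k_means(points, initial_means, max_iter=100):
    """Illustrative K-means: returns (means, assignments).

    points        -- list of equal-length numeric tuples
    initial_means -- one starting mean per cluster (an arbitrary choice)
    """
    means = [tuple(m) for m in initial_means]
    assignments = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster whose mean is
        # closest in Euclidean distance.
        new_assignments = [
            min(range(len(means)), key=lambda k: math.dist(p, means[k]))
            for p in points
        ]
        # Converged: no point changed cluster since the last pass.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Update step: move each mean to the centroid of its members.
        for k in range(len(means)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:  # leave an empty cluster's mean where it is
                means[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return means, assignments


# Hypothetical data: two visibly separated groups in 2 dimensions.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
means, labels = k_means(points, initial_means=[(0.0, 0.0), (10.0, 10.0)])
print(labels)
print(means)
```

With well-separated groups like these, the choice of initial means matters little; with overlapping groups, different starting means can lead to different final clusters, which is why those initial decisions deserve some care.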
