Machine Learning Lesson of the Day – Babies and Non-Statisticians Practice Unsupervised Learning All the Time!
January 9, 2014
My recent lesson on unsupervised learning may make it seem like a rather esoteric field, with attempts to categorize it using words like “clustering”, “density estimation”, or “dimensionality reduction”. However, unsupervised learning is actually how we as human beings often learn about the world that we live in – whether you are a baby learning what to eat or someone reading this blog.
- Babies use their mouths and their sense of taste to explore the world, and they can probably determine quite quickly what satisfies their hunger and what doesn’t. As they expose themselves to different objects – a formula bottle, a pacifier, a mother’s breast, their own fingers – their sense of taste and their digestive systems recognize these inputs and detect patterns of what satisfies their hunger and what doesn’t. All of this happens before they fully understand what “food” or “hunger” means – before someone says “This is food” to them, and before they have the language capacity to know what those 3 words mean.
- When a baby finally realizes what hunger feels like and takes the initiative to find something to eat, the task becomes a supervised learning problem: what attributes of an object will help me to determine whether it is food or not?
- I recently wrote a page called “About this Blog” to categorize the different types of posts that I have written on this blog so far. I did not aim to predict anything about any blog post; I simply wanted to organize the 50-plus blog posts into a few categories and make it easier for you to find them. I ultimately clustered my blog posts into 4 mutually exclusive categories (now with some overlaps). You can think of each blog post as a vector-valued input, and I chose 2 elements – the length and the topic – of each vector to group the posts into classes that are very similar in length and topic within each class and very different in length and topic between the classes. (In other words, I used those 2 elements – or features – to maximize the similarities within each category and the dissimilarities between the 4 categories.) There were other features that I could have used – whether a post had an image (binary feature), the number of colours of the fonts (integer-valued feature), the time of publication of the post (continuous feature) – but length and topic were sufficient for me to arrive at the 4 categories of “Tutorials”, “Lessons”, “Advice”, and “Notifications about Presentations and Appearances at Upcoming Events”.
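To make the blog-post example concrete, here is a minimal sketch of the clustering idea in Python. The data are invented (length, topic-code) feature vectors – not my actual posts – and the cluster seeds are chosen by hand for determinism; a real analysis would use a proper library routine such as k-means with random restarts.

```python
import math

# Hypothetical posts as (word_count, topic_code) feature vectors.
# Both the counts and the topic codes are made up for illustration.
posts = [
    (1500, 1.0), (1600, 1.1), (1450, 0.9),  # long tutorials
    (400, 2.0), (350, 2.1),                 # short lessons
    (800, 3.0), (900, 3.1),                 # advice
    (150, 4.0), (120, 4.1),                 # event notifications
]

def kmeans(points, seeds, iters=20):
    """A bare-bones k-means loop: assign each point to its nearest
    centroid, then move each centroid to the mean of its points."""
    centroids = list(seeds)
    k = len(centroids)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Euclidean distance to each centroid; join the nearest.
            j = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[j].append(p)
        # Recompute each centroid as the coordinate-wise mean.
        centroids = [
            tuple(sum(c) / len(c) for c in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return clusters

# Seed one centroid per apparent group (hand-picked for this toy sketch).
clusters = kmeans(posts, seeds=[posts[0], posts[3], posts[5], posts[7]])
print(sorted(len(c) for c in clusters))  # → [2, 2, 2, 3]
```

Note that word count dominates the distance here because the two features are on very different scales; with real data you would standardize the features first so that both contribute to the similarity measure.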