Video Tutorial – Obtaining the Expected Value of the Exponential Distribution Using the Moment Generating Function

In this video tutorial on YouTube, I use the exponential distribution’s moment generating function (MGF) to obtain the expected value of this distribution.  Visit my YouTube channel to watch more video tutorials!
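
For readers who want the calculation in writing, here is a brief sketch of the derivation.  It assumes the rate parameterization of the exponential distribution, with density f(x) = \lambda e^{-\lambda x} for x \geq 0; the symbol \lambda and this choice of parameterization are my assumptions here and may differ from the notation used in the video.

```latex
\begin{aligned}
M_X(t) &= E\left[e^{tX}\right]
        = \int_0^\infty e^{tx} \, \lambda e^{-\lambda x} \, dx
        = \frac{\lambda}{\lambda - t}, \qquad t < \lambda, \\
M_X'(t) &= \frac{\lambda}{(\lambda - t)^2}, \\
E[X] &= M_X'(0) = \frac{1}{\lambda}.
\end{aligned}
```

The first derivative of the MGF, evaluated at t = 0, gives the first moment, which is the expected value.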


Video Tutorial – Rolling 2 Dice: An Intuitive Explanation of The Central Limit Theorem

According to the central limit theorem, if

  • n random variables, X_1, ..., X_n, are independent and identically distributed,
  • n is sufficiently large,

then the distribution of their sample mean, \bar{X}_n, is approximately normal, and this approximation improves as n increases.

One of the most remarkable aspects of the central limit theorem (CLT) is its validity for any parent distribution of X_1, ..., X_n, as long as that distribution has a finite variance.  On my new YouTube channel, you will find a video tutorial that provides an intuitive explanation of why this is true by considering a thought experiment of rolling 2 dice.  This video focuses on the intuition rather than the mathematics of the CLT.  In a later video, I will discuss the technical details of the CLT and how it applies to this example.
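
The dice thought experiment can also be checked by direct enumeration.  The sketch below is only an illustration and may not follow the video's walk-through step by step; the function name sum_distribution is hypothetical.  It computes the exact distribution of the sum of fair six-sided dice, so that the flat distribution of 1 die can be compared with the peaked distribution of 2 dice.

```python
from fractions import Fraction
from itertools import product


def sum_distribution(n_dice, sides=6):
    """Return the exact distribution of the sum of n_dice fair dice.

    Hypothetical helper for illustration: it enumerates every equally
    likely outcome and counts how many outcomes give each total.
    """
    counts = {}
    for roll in product(range(1, sides + 1), repeat=n_dice):
        total = sum(roll)
        counts[total] = counts.get(total, 0) + 1
    n_outcomes = sides ** n_dice
    return {total: Fraction(count, n_outcomes)
            for total, count in sorted(counts.items())}


one_die = sum_distribution(1)    # uniform over 1, ..., 6
two_dice = sum_distribution(2)   # triangular over 2, ..., 12

# Print a text histogram for 2 dice; each '#' is 1 of the 36 outcomes.
for total, prob in two_dice.items():
    bar = "#" * int(prob * 36)
    print(f"sum = {total:2d}   mean = {total / 2:3.1f}   {bar}")
```

With 1 die, every value is equally likely.  With 2 dice, there are more ways to obtain a middle sum such as 7 than an extreme sum such as 2 or 12, so the sample mean already concentrates around the centre.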


Machine Learning Lesson of the Day – Supervised Learning: Classification and Regression

Supervised learning has 2 categories:

  • In classification, the target variable is categorical.
  • In regression, the target variable is continuous.

Thus, regression in statistics is different from regression in supervised learning.

In statistics,

  • regression is used to model relationships between predictors and targets, and the targets could be continuous or categorical.  
  • a regression model usually includes 2 components to describe such relationships:
    • a systematic component
    • a random component.  The random component of this relationship is mathematically described by some probability distribution.  
  • most regression models in statistics also have assumptions about the statistical independence or dependence between the predictors and/or between the observations.  
  • many statistical models also aim to provide interpretable relationships between the predictors and targets.  
    • For example, in simple linear regression, the slope parameter, \beta_1, is the expected change in the target, Y, for every unit increase in the predictor, X.
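
As a concrete sketch of the 2 components, consider simple linear regression.  The normal distribution for the random component is the usual textbook assumption rather than something stated above, and the symbols \beta_0, \varepsilon and \sigma^2 are introduced here for illustration.

```latex
\begin{aligned}
Y &= \underbrace{\beta_0 + \beta_1 X}_{\text{systematic component}}
   + \underbrace{\varepsilon}_{\text{random component}},
   \qquad \varepsilon \sim N(0, \sigma^2), \\
E[Y \mid X = x + 1] - E[Y \mid X = x]
  &= \left(\beta_0 + \beta_1 (x + 1)\right) - \left(\beta_0 + \beta_1 x\right)
   = \beta_1.
\end{aligned}
```

The second line shows why \beta_1 is interpreted as the expected change in Y for every unit increase in X.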

In supervised learning,

  • target variables in regression must be continuous
    • categorical target variables are modelled in classification
  • regression has less or even no emphasis on using probability to describe the random variation in the target given the predictors
    • Random forests are powerful tools for both classification and regression, but they do not specify a probability distribution to describe the relationship between the predictors and the target.
  • regression has less or even no emphasis on providing interpretable relationships between the predictors and targets.  
    • Neural networks are powerful tools for both classification and regression, but they do not provide interpretable relationships between the predictors and the target.

***The last 2 points are applicable to classification, too.

In general, supervised learning puts much more emphasis on accurate prediction than statistics does.

Since regression in supervised learning includes only continuous targets, some terminology is confusing when moving between the 2 fields.  For example, logistic regression is a commonly used technique in both statistics and supervised learning.  However, despite its name, it is a classification technique in supervised learning, because the response variable in logistic regression is categorical.
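
To make the logistic regression example concrete, here is a minimal sketch in plain Python.  The toy data, the function names and the use of gradient descent are all my assumptions for illustration; statistical software typically fits this model by other methods, such as iteratively reweighted least squares.  The point is only that the target takes the categorical values 0 and 1, and the fitted model is used to predict a class.

```python
import math


def sigmoid(z):
    """Map a real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, n_iterations=5000):
    """Fit P(Y = 1 | x) = sigmoid(b0 + b1 * x) by gradient descent.

    Hypothetical helper: minimizes the average negative log-likelihood
    of a Bernoulli target, which is the random component of the model.
    """
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iterations):
        grad_b0 = grad_b1 = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(b0 + b1 * x) - y
            grad_b0 += error
            grad_b1 += error * x
        b0 -= learning_rate * grad_b0 / n
        b1 -= learning_rate * grad_b1 / n
    return b0, b1


def predict_class(b0, b1, x, threshold=0.5):
    """Turn the fitted probability into a categorical prediction."""
    return 1 if sigmoid(b0 + b1 * x) >= threshold else 0


# Made-up toy data: 1 continuous predictor and a binary (categorical) target.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)
predictions = [predict_class(b0, b1, x) for x in xs]
print("intercept:", b0, "slope:", b1)
print("predicted classes:", predictions)
```

A statistician would focus on the fitted probabilities and the interpretation of the slope, while supervised learning would focus on how accurately the predicted classes match new data.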