In my statistics classes, I learned to use the **variance** or the **standard deviation** to measure the **variability** or **dispersion** of a data set. However, consider the following 2 hypothetical cases:

- the standard deviation for the **incomes of households** in Canada is $2,000
- the standard deviation for the **incomes of the 5 major banks** in Canada is $2,000

Even though this measure of dispersion has the same value for both sets of income data, $2,000 is a significant amount for a household, whereas $2,000 is not a lot of money for one of the “Big Five” banks. Thus, **the standard deviation alone does not give an accurate sense of the relative variability of the 2 data sets**. One way to overcome this limitation is to **take the mean of each data set** into account.

A useful statistic for measuring the variability of a data set while scaling by the mean is the **sample coefficient of variation**:

$$\widehat{c_v} = \frac{s}{\bar{x}}$$

where $s$ is the **sample standard deviation** and $\bar{x}$ is the **sample mean**.
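To make the household-versus-bank comparison concrete, here is a minimal Python sketch that divides the sample standard deviation by the sample mean. The helper name `sample_cv` and all of the income figures are my own assumptions, invented purely for illustration; they are not real data.

```python
import statistics


def sample_cv(data):
    """Sample coefficient of variation: sample standard deviation / sample mean."""
    mean = statistics.mean(data)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined when the mean is zero")
    return statistics.stdev(data) / mean


# Hypothetical figures, made up for illustration only (not real data).
household_incomes = [58_000, 60_000, 62_000, 59_000, 61_000]

# Shift every value by the same large constant: the spread is unchanged,
# but the mean becomes far larger, as it would be for a major bank.
bank_incomes = [x + 1_000_000_000 for x in household_incomes]

for label, data in [("households", household_incomes), ("banks", bank_incomes)]:
    print(label, "sd =", statistics.stdev(data), "cv =", sample_cv(data))
```

Both data sets have the same standard deviation by construction, yet the coefficient of variation for the hypothetical banks is several orders of magnitude smaller, which matches the intuition that the same dollar spread matters much less relative to a much larger mean.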

Analogously, the **coefficient of variation** for a random variable is

$$c_v = \frac{\sigma}{\mu}$$

where $\sigma$ is the random variable’s **standard deviation** and $\mu$ is the random variable’s **expected value**.
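As a quick worked illustration (the choice of distribution is my own assumption, not part of the definition), suppose $X$ follows an exponential distribution with rate $\lambda > 0$. Its expected value and its standard deviation are both $1/\lambda$, so the ratio of standard deviation to expected value is

$$\frac{\sqrt{\mathrm{Var}(X)}}{\mathrm{E}(X)} = \frac{1/\lambda}{1/\lambda} = 1.$$

In other words, every exponential random variable has a coefficient of variation of 1, no matter how large or small its mean is.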

The coefficient of variation is a very useful statistic that I, unfortunately, never learned in my introductory statistics classes. I hope that all new statistics students get to learn this alternative measure of dispersion.
