Odds and Probability: Commonly Misused Terms in Statistics – An Illustrative Example in Baseball

Yesterday, all 15 home teams in Major League Baseball won on the same day – the first such occurrence in history.  CTV News published an article written by Mike Fitzpatrick from The Associated Press that reported on this event.  The article states, “Viewing every game as a 50-50 proposition independent of all others, STATS figured the odds of a home sweep on a night with a full major league schedule was 1 in 32,768.”  (Emphases added)

[Screenshot of the article's statement about the odds of all 15 home teams winning on the same day, captured at 5:35 pm Vancouver time on Wednesday, August 12, 2015.]

Out of curiosity, I wanted to reproduce this result.  This event is the intersection of 15 independent events, one per game, each modelled as a Bernoulli random variable in which the home team wins with probability 0.5.

P[(\text{Winner}_1 = \text{Home Team}_1) \cap (\text{Winner}_2 = \text{Home Team}_2) \cap \ldots \cap (\text{Winner}_{15}= \text{Home Team}_{15})]

Since all 15 games are assumed to be mutually independent, the probability of all 15 home teams winning is just

P(\text{All 15 Home Teams Win}) = \prod_{i = 1}^{15} P(\text{Winner}_i = \text{Home Team}_i)

P(\text{All 15 Home Teams Win}) = 0.5^{15} = \frac{1}{32768} \approx 0.00003051757
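As a quick check of this arithmetic, here is a minimal Python sketch. The function name `prob_all_home_wins` and its parameters are my own illustrative choices, and it relies on the same assumptions as above: independent games and a 0.5 chance of a home win in each.

```python
from fractions import Fraction


def prob_all_home_wins(p_home=Fraction(1, 2), n_games=15):
    """Probability that every home team wins.

    Assumes the games are mutually independent and that each home
    team wins with the same probability p_home.
    """
    return p_home ** n_games


p = prob_all_home_wins()
print(p)         # 1/32768
print(float(p))  # 3.0517578125e-05
```

Using `Fraction` keeps the result exact, so the link to "1 in 32,768" is visible before any rounding.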

Now, let’s connect this probability to odds.

It is important to note that

  • odds is only applicable to Bernoulli random variables (i.e. binary events)
  • odds is the ratio of the probability of success to the probability of failure

For our example,

\text{Odds}(\text{All 15 Home Teams Win}) = P(\text{All 15 Home Teams Win}) \ \div \ P(\text{At least 1 Home Team Loses})

\text{Odds}(\text{All 15 Home Teams Win}) = 0.00003051757 \div (1 - 0.00003051757)

\text{Odds}(\text{All 15 Home Teams Win}) \approx 0.0000305185
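The same conversion can be sketched in Python. The helper name `odds_from_probability` is hypothetical; it simply applies the definition of odds as the probability of success divided by the probability of failure.

```python
from fractions import Fraction


def odds_from_probability(p):
    """Odds of a binary event: P(success) / P(failure)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1 for the odds to be finite")
    return p / (1 - p)


p = Fraction(1, 32768)           # probability that all 15 home teams win
odds = odds_from_probability(p)
print(odds)                      # 1/32767
print(float(odds))               # decimal form of the odds
```

The exact result, 1/32767, differs from the probability 1/32768 only in the denominator, which is why the two are so easily mistaken for each other when the event is rare.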

The above article states that the odds is 1 in 32,768.  The fraction 1/32768 is approximately equal to 0.00003051757, which is NOT the odds as I just calculated.  Instead, 0.00003051757 is the probability of all 15 home teams winning.  Thus, the article incorrectly states 0.00003051757 as the odds rather than the probability.  Expressed as a ratio, the correct odds would be 1 to 32,767.

This is an example of a confusion between probability and odds that is common in the media and among the general public.  Probability and odds are two different concepts and are calculated differently, and my calculations above illustrate their differences.  Thus, exercise caution when reading statements about probability and odds, and make sure that the communicator of such statements knows exactly how they are calculated and which one is more applicable.


8 Responses to Odds and Probability: Commonly Misused Terms in Statistics – An Illustrative Example in Baseball

  1. Kevin Brogle says:

    The probability is greater than presented as the likelihood of the home team winning in baseball is ~55% rather than the estimated 50%.

  2. Pingback: Distilled News | Data Analytics & R

  3. Stephen Denton says:

    Hello Eric,

    I don’t know that I would make a big deal about this. The statement 1 in 32,768 is the correct frequency-based interpretation of the likelihood of the event (1 state out of 2^15 possible states was realized). So it comes down to how the word “in” is translated into mathematics. You are correct that if I interpret “in” strictly as “divided by” then I am reporting a probability (and labelling it odds seems wrong). And technically odds would be the number of states realized over the number of states not realized (1/(2^15 – 1) = 1/32767 = 0.0000305185). But if your suggestion is to report these odds as 1 in 32767, then you are no longer conforming to the frequency-based interpretation (which most people find more intuitive and interpretable). So, perhaps the correct mathematical conversion of the word “in” when odds is wanted is to subtract the realized states (n_A) from the total number of possible states (n_T) and use that as the denominator (i.e., if you see “n_A in n_T” then odds is calculated as n_A / (n_T – n_A) ).

    Regardless, the media is perfectly right to use the frequency-based interpretation. Whether they call it odds or probabilities seems kind of hopeless to decry; both can be easily calculated from the reported frequency information.

  4. Stephen Denton says:

    I guess you could of course just say the odds are 1 TO 32767 and eliminate the ambiguity…

    • Hi Stephen,

      1) My objective in this blog post is to distinguish between probability and odds, which are 2 different concepts. This distinction holds regardless of how you interpret probability.

      2) I agree – we should use the preposition “in” for probability and “to” for odds. I did this in my blog post; however, I appreciate you pointing this out.

      Eric

      • Stephen Denton says:

        I guess my point is that probability and odds are not two ‘different’ concepts, they are two closely related concepts and confusion is inevitable.

      • Hi Stephen,

        1) Probability applies to many types of variables, not just binary variables. Odds applies to binary variables only. Thus, I argue that they are still 2 different concepts, however related they may be.

        2) The inevitability of confusion does not excuse improper use of language. Let’s correct wrong word usage when we see it, regardless of how easy or inevitable it may be.

        Eric
