# Odds and Probability: Commonly Misused Terms in Statistics – An Illustrative Example in Baseball

August 12, 2015

Yesterday, all 15 home teams in Major League Baseball won on the same day – the first such occurrence in history. CTV News published an article written by Mike Fitzpatrick from The Associated Press that reported on this event. The article states, “Viewing every game as a 50-50 proposition independent of all others, STATS figured the **odds** of a home sweep on a night with a full major league schedule was **1 in 32,768**.” (Emphases added)

Out of curiosity, I wanted to reproduce this result. This event is the intersection of 15 independent events, one per game; each game is modelled as a Bernoulli random variable with a probability of 0.5 that the home team wins.

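Since all 15 games are assumed to be mutually independent, the probability of all 15 home teams winning is just the product of the 15 individual probabilities:

$$P(\text{all 15 home teams win}) = \prod_{i=1}^{15} P(\text{home team wins game } i) = 0.5^{15} = \frac{1}{32768} \approx 0.00003051757$$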
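
As a quick check of this arithmetic, here is a minimal Python sketch. It takes the article's assumptions at face value (15 independent games, each a 50-50 proposition); the variable names `p_home_win`, `n_games` and `p_sweep` are my own illustrative choices, not anything from the article or from STATS.

```python
from fractions import Fraction

# Assumption stated in the article: every game is a 50-50 proposition.
p_home_win = Fraction(1, 2)
n_games = 15

# Mutual independence lets us multiply the per-game probabilities,
# which here is the same as raising 1/2 to the 15th power.
p_sweep = p_home_win ** n_games

print(p_sweep)         # 1/32768
print(float(p_sweep))  # 3.0517578125e-05
```

Using exact fractions rather than floating-point numbers keeps the "1 in 32,768" form visible in the result.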

Now, let’s connect this probability to odds.

It is important to note that

- odds is only applicable to Bernoulli random variables (i.e. binary events)
- odds is the ratio of the probability of success to the probability of failure

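For our example,

$$\text{odds} = \frac{P(\text{all 15 home teams win})}{1 - P(\text{all 15 home teams win})} = \frac{1/32768}{32767/32768} = \frac{1}{32767} \approx 0.0000305185$$

That is, the odds are 1 to 32,767.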

The above article states that the odds is **1 in 32,768**. The fraction 1/32768 is equal to 0.00003051757, **which is NOT the odds as I just calculated.** Instead, 0.00003051757 is the probability of all 15 home teams winning. Thus, the article incorrectly states 0.00003051757 as the odds rather than the probability.
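
To make the difference concrete, the following Python sketch converts a probability into odds using the definition given earlier (probability of success divided by probability of failure). The helper name `odds_from_probability` is hypothetical, chosen only for this illustration, and the 50-50 assumption is the article's.

```python
from fractions import Fraction


def odds_from_probability(p: Fraction) -> Fraction:
    """Odds of success: P(success) / P(failure)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


# Probability of a home sweep under the article's 50-50 assumption.
p_sweep = Fraction(1, 2) ** 15
odds_sweep = odds_from_probability(p_sweep)

print(p_sweep)     # 1/32768  (probability: "1 in 32,768")
print(odds_sweep)  # 1/32767  (odds: "1 to 32,767")

# For a single 50-50 game the two quantities differ much more visibly.
print(odds_from_probability(Fraction(1, 2)))  # 1  (odds of 1 to 1)
```

For a rare event like this one, the probability and the odds are numerically close, which may be part of why the two terms get swapped so easily; for a single 50-50 game, the probability is 0.5 while the odds are 1.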

This is an example of a confusion between probability and odds that is common in the media and among the general public. Probability and odds are two different concepts and are calculated differently, and my calculations above illustrate their differences. **Thus, exercise caution when reading statements about probability and odds, and make sure that the communicator of such statements knows exactly how they are calculated and which one is more applicable.**

The probability is greater than presented, as the likelihood of the home team winning in baseball is ~55% rather than the assumed 50%.

Hi Kevin,

You may be right (source please!), but the original assumption in the article is that the probability of winning is the same for both teams, so I wanted to reproduce the calculation based on that statement.

Hello Eric,

I don’t know that I would make a big deal about this. The statement 1 in 32,768 is the correct frequency-based interpretation of the likelihood of the event (1 state out of 2^15 possible states was realized). So it comes down to how the word “in” is translated into mathematics. You are correct that if I interpret “in” strictly as “divided by” then I am reporting a probability (and labelling it odds seems wrong). And technically odds would be the number of states realized over the number of states not realized (1/(2^15 – 1) = 1/32767 ≈ 0.0000305185). But if your suggestion is to report these odds as 1 in 32767, then you are no longer conforming to the frequency-based interpretation (which most people find more intuitive and interpretable). So, perhaps the correct mathematical conversion of the word “in” when odds are wanted is to subtract the realized states (n_A) from the total number of possible states (n_T) and use that as the denominator (i.e., if you see “n_A in n_T” then odds is calculated as n_A / (n_T – n_A)).

Regardless, the media is perfectly right to use the frequency-based interpretation. Whether they call it odds or probability seems kind of hopeless to decry; both can be easily calculated from the reported frequency information.

I guess you could of course just say the odds are 1 TO 32767 and eliminate the ambiguity…

Hi Stephen,

1) My objective in this blog post is to distinguish between probability and odds, which are 2 different concepts. This distinction holds regardless of how you interpret probability.

2) I agree – we should use the preposition “in” for probability and “to” for odds. I did this in my blog post; however, I appreciate you pointing this out.

Eric

I guess my point is that probability and odds are not two ‘different’ concepts; they are two closely related concepts, and confusion is inevitable.

Hi Stephen,

1) Probability applies to many types of variables, not just binary variables. Odds applies to binary variables only. Thus, I argue that they are still 2 different concepts, however related they may be.

2) The inevitability of confusion does not excuse improper use of language. Let’s correct wrong word usage when we see it, regardless of how easy or inevitable it may be.

Eric