In day-to-day life people use the words ‘odds’ and ‘probability’ interchangeably. Both terms imply an estimate of chance. I also see them used interchangeably in the workplace. People say that the ‘odds are twice as high’ and take that to mean ‘the probability is double’. Well, that’s wrong!
Odds and probability are related concepts, but they differ greatly in scale and meaning. Mixing them up can lead to mistaken estimates of chance and, in turn, to erroneous decisions.
In this article, I want to illustrate what those differences are and how confusing the two can distort analysis and research.
What is the difference between probability and odds?
Imagine you are putting your hand inside a black bag. Inside that bag are five red balls, three blue balls and two yellow balls. This sounds like a high-school math problem right? And in high school they tend to teach us about probabilities, not odds.
A probability is defined as the number of occurrences of a certain event expressed as a proportion of all events that could occur. In our black bag there are three blue balls, but there are ten balls in total, so the probability that you pull out a blue ball is three divided by ten which is 30% or 0.3.
Odds are defined as the number of occurrences of a certain event expressed relative to the number of non-occurrences of that event. In our black bag there are three blue balls, but there are seven balls which are not blue, so the odds of drawing a blue ball are 3:7. Odds are often expressed as odds for, which in this case would be three divided by seven, or about 0.43 (43%), or as odds against, which would be seven divided by three, or about 2.33 (233%).
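To make the bag example concrete, here is a minimal Python sketch that computes the same three numbers. The variable names are my own, chosen for illustration, and exact fractions are used so nothing is lost to rounding.

```python
from fractions import Fraction

# Contents of the black bag from the example above.
bag = {"red": 5, "blue": 3, "yellow": 2}

total = sum(bag.values())   # 10 balls in total
blue = bag["blue"]          # 3 blue balls (occurrences)
not_blue = total - blue     # 7 balls that are not blue (non-occurrences)

probability = Fraction(blue, total)      # occurrences / all events
odds_for = Fraction(blue, not_blue)      # occurrences / non-occurrences
odds_against = Fraction(not_blue, blue)  # non-occurrences / occurrences

print(f"probability  = {probability} (about {float(probability):.2f})")
print(f"odds for     = {odds_for} (about {float(odds_for):.2f})")
print(f"odds against = {odds_against} (about {float(odds_against):.2f})")
```

The only difference between the probability and the odds is the denominator: all ten balls in one case, only the seven non-blue balls in the other.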
How does odds relate to probability?
It’s not hard to convert between odds and probability. Here are two simple conversion formulas. See if you can understand why they work based on the simple example I gave above:
- If the probability of something happening is P, then the odds of it happening is P/(1 - P).
- If the odds of something happening is O, then the probability of it happening is O/(1 + O).
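The two conversion formulas can be sketched as a pair of small Python functions. The function names are hypothetical, and the input checks are my own assumption about what counts as a sensible input. Note that a probability of exactly one has no finite odds.

```python
def probability_to_odds(p: float) -> float:
    """Convert a probability P into odds using P / (1 - P)."""
    if not 0 <= p < 1:
        raise ValueError("probability must satisfy 0 <= p < 1")
    return p / (1 - p)


def odds_to_probability(o: float) -> float:
    """Convert odds O into a probability using O / (1 + O)."""
    if o < 0:
        raise ValueError("odds must be non-negative")
    return o / (1 + o)


# Round trip with the blue ball example: probability 0.3, odds 3/7.
blue_odds = probability_to_odds(0.3)
blue_probability = odds_to_probability(blue_odds)
print(f"odds = {blue_odds:.4f}, back to probability = {blue_probability:.4f}")
```

Converting in one direction and then back again should return the starting value, which is a quick way to check that the two formulas are inverses of each other.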
It’s also helpful to think of how odds and probability differ in their properties:
- Probability is bounded between zero and one. Odds start at zero and have no upper limit.
- The probability of something happening is always less than the odds of it happening (assuming the probability is greater than zero and less than one).
- The smaller the probability, the more similar probability and odds will be. For example, the probability of winning the UK National Lottery is 0.0000000221938762. The odds are 0.0000000221938767.
- The larger the probability, the larger the difference with the odds. High probabilities have astronomical odds. A probability of 90% equates to odds of 900%, 99% equates to 9,900% and 99.999% equates to 9,999,900%.
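The properties above are easy to check numerically. The following is a rough sketch rather than anything rigorous: the helper name and the list of probabilities are my own choices, and exact fractions are used to avoid floating-point noise at the extremes.

```python
from fractions import Fraction


def odds_from_probability(p: Fraction) -> Fraction:
    """Odds corresponding to a probability strictly between 0 and 1."""
    return p / (1 - p)


# Illustrative probabilities, from small to very large.
for text in ["0.01", "0.1", "0.5", "0.9", "0.99", "0.99999"]:
    p = Fraction(text)  # exact value of the decimal string
    odds = odds_from_probability(p)
    gap = odds - p      # always positive for 0 < p < 1
    print(f"probability {float(p):.5f} -> odds {float(odds):.5f} (gap {float(gap):.5f})")
```

For small probabilities the gap is tiny, while for probabilities close to one the odds run away, matching the 900%, 9,900% and 9,999,900% figures quoted above.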
Why does this matter?
One area where this can cause substantial errors in understanding is when logistic regression is used to study a problem. Logistic regression is used to model how certain input variables might influence a binary outcome (e.g. Yes or No, Group Member or Not Group Member). For example, a logistic regression might be used to determine how a number of lifestyle factors influenced a five-year survival outcome for a disease. It might be used to determine how a variety of employment factors might influence whether or not an employee is performing well at a certain point in time.
Logistic regression models use odds ratios to quantify the degree of influence of a given input variable. The odds ratio describes the change in odds of the outcome based on a one unit increase in the input variable. You might calculate, for example, that each additional day spent in a learning program increases the odds of high performance of an employee by 10%.
This does not mean that ‘for every day spent learning the probability of high performance increases by 10%’. If half of the organization are high performers, then the baseline odds are 1, or 100%. A 10% increase takes those odds to 1.1, which corresponds to a probability of about 52.4%. That is an increase in probability of only around 2.4 percentage points. As you can see, this confusion could lead to wild overestimates of the effect of the input variable, which could then lead to all sorts of business decisions being made on erroneous statistical conclusions.
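Here is a small Python sketch of that calculation. The 10% odds increase and the 50% baseline come from the example above. The other baseline probabilities are hypothetical values I have added to show that the same odds ratio moves the probability by different amounts depending on where you start.

```python
ODDS_RATIO = 1.10  # a 10% increase in odds per extra day, as in the example


def probability_after_odds_ratio(baseline_p: float, odds_ratio: float) -> float:
    """Apply an odds ratio to a baseline probability and return the new probability."""
    baseline_odds = baseline_p / (1 - baseline_p)  # probability -> odds
    new_odds = baseline_odds * odds_ratio          # scale the odds
    return new_odds / (1 + new_odds)               # odds -> probability


# 0.50 is the baseline from the text; the others are illustrative assumptions.
for baseline_p in (0.05, 0.25, 0.50, 0.75, 0.95):
    new_p = probability_after_odds_ratio(baseline_p, ODDS_RATIO)
    change_in_points = (new_p - baseline_p) * 100
    print(f"baseline {baseline_p:.2f} -> new {new_p:.4f} "
          f"(+{change_in_points:.2f} percentage points)")
```

In none of these cases does the probability rise by anything close to 10 percentage points, which is the mistake the odds and probability mix-up invites.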
So, when you hear that word ‘odds’ being used, be careful. It’s odds-on that an erroneous decision could be in the works.