Lecture 10: Probability#
STATS60 so far#
Unit 1: Thinking About Scale.
Putting numbers in context.
Fermi estimates.
Cost benefit analysis.
Unit 2: Exploratory Data Analysis.
Data terminology.
Data visualization.
Summaries of center (mean and median).
Summaries of variability (standard deviation, quantiles).
Summaries of association (correlation coefficient).
Looking ahead#
Unit 3: Probability.
The mathematics of uncertainty.
One of the foundations of statistics.
Probability will help us:
Assess how likely/unlikely coincidences are.
Make decisions when we don’t know all the relevant information.
Generalize findings from data to a broader group of people.
Probability#
Games of chance#
Probability theory started with people trying to understand games of chance.
The Book on Games of Chance is perhaps the first mathematical text on probability, and it contains a section on how to use probability to cheat!
The book’s author, Cardano, had trouble holding down an academic position and made money by gambling and playing chess.

Coins#
Tossing a coin is a simple game of chance.

What is the probability that the coin lands with heads facing up? Why?
Coins#
The probability is \(1/2\) or \(50\%\).
Two possible reasons why:
There are two possible outcomes that are equally likely. The outcome of “heads facing up” has probability \(\frac{1}{2}\).
If the coin is flipped many, many times, it will land with heads facing up in roughly \(50\%\) of the flips.
We can ask a computer to simulate flipping many coins and count the number of heads.
| Number of tosses | 10 | 100 | 1 000 | 10 000 |
|---|---|---|---|---|
| Number of heads | 7 | 46 | 534 | 5003 |
| Fraction of heads | 0.7 | 0.46 | 0.534 | 0.5003 |
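A simulation like the one summarized in the table could be sketched as follows. This is a minimal illustration, not the code used to produce the numbers above: the function name `fraction_of_heads` and the seed are assumptions, and a different seed gives different counts.

```python
import random

def fraction_of_heads(num_tosses, seed=0):
    """Simulate fair coin tosses and return the fraction that land heads."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    # Each toss is heads with probability 1/2.
    heads = sum(rng.random() < 0.5 for _ in range(num_tosses))
    return heads / num_tosses

# Same numbers of tosses as in the table.
for n in [10, 100, 1_000, 10_000]:
    print(n, fraction_of_heads(n))
```

With few tosses the fraction can be far from \(1/2\); with many tosses it is typically close.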
Flipping many coins#
As more and more coins are flipped, the fraction of heads “settles down” near \(1/2\).

Dice#
A die (plural, “dice”) is a cube with six faces labelled 1, 2, 3, 4, 5, 6.
When a die is rolled, the number on the top face “shows”.
What is the probability that when a die is rolled it shows a 1? Why?
Dice#
The probability is \(1/6\) or approximately \(16.7\%\).
Again there are two reasons why:
There are six possible outcomes (one for each face) and they are all equally likely. The probability the die shows a 1 is therefore \(1/6\).
If the die is rolled over and over again, a 1 will show in about \(1/6\) of the rolls.
Again we can get the computer to throw many dice and count the number of 1’s.
| Number of throws | 10 | 100 | 1 000 | 10 000 |
|---|---|---|---|---|
| Number of 1s | 1 | 16 | 189 | 1759 |
| Fraction of 1s | 0.1 | 0.16 | 0.189 | 0.1759 |
Throwing many dice#
The fraction of throws that show a 1 settles down near \(1/6\).

Dice again#
What is the probability that the die shows a 1 or shows a 2?
The probability is \(1/3\) or roughly \(33.3\%\).
There are six possible outcomes (one for each face) and they are all equally likely. There is one outcome where the die shows a 1 and one outcome where the die shows a 2. The probability is therefore \(2/6=1/3\).
If the die is rolled over and over again, a 1 will be the top face in about \(1/6\) of the rolls and a 2 will be the top face in about \(1/6\) of the rolls. This means that a 1 or a 2 will be the top face in about \(1/6+1/6=1/3\) of the rolls.
| Number of throws | 10 | 100 | 1 000 | 10 000 |
|---|---|---|---|---|
| Number of 1s or 2s | 4 | 38 | 335 | 3335 |
| Fraction of 1s or 2s | 0.4 | 0.38 | 0.335 | 0.3335 |
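The "add the two fractions" argument can be checked on simulated rolls. The sketch below is illustrative only (the seed and variable names are assumptions, and it will not reproduce the exact counts in the table): within one set of rolls, the number of 1s plus the number of 2s always equals the number of rolls showing a 1 or a 2.

```python
import random

rng = random.Random(0)  # seeded so that a run can be reproduced
num_rolls = 10_000
rolls = [rng.randint(1, 6) for _ in range(num_rolls)]  # fair six-sided die

count_1 = rolls.count(1)
count_2 = rolls.count(2)
count_1_or_2 = sum(r in (1, 2) for r in rolls)

print("fraction of 1s:      ", count_1 / num_rolls)
print("fraction of 2s:      ", count_2 / num_rolls)
print("fraction of 1s or 2s:", count_1_or_2 / num_rolls)
```

This works because a single roll cannot show a 1 and a 2 at the same time, so no roll is counted twice.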
Throwing many dice#
The fraction of throws that show a 1 or a 2 settles down near \(1/3\).

Definition of probability#
Random process and sample space#
A random process is something that results in a random outcome.
The set of possible outcomes is called the sample space.
Examples:
Flipping a coin is a random process. The sample space contains two possible outcomes “Heads” and “Tails”.
Rolling a die is a random process. The sample space contains six possible outcomes: 1, 2, 3, 4, 5, 6.
Events#
An event is a collection of some of the possible outcomes.
Examples:
Rolling a die: the die shows 1 or 2 is an event (it contains two possible outcomes).
Tossing a coin: the coin landing on heads is an event (it contains just a single outcome).
Probability#
If all outcomes are equally likely, then the probability of an event is equal to the number of outcomes in the event divided by the total number of possible outcomes.
The probability of an event is written as \(\mathrm{Pr}[\text{event}]\).
Diagram - sample space#
The sample space is the set of all possible outcomes.

Diagram - event#
An event is a collection of some of the possible outcomes.

Die example#
Rolling a die example:
There are 6 total possible outcomes.
The event “die shows 1 or 2” contains two outcomes.
The probability is \(2/6=1/3\).
Calculating probabilities#
There are two main methods for calculating the probability of an event:
Direct: count the number of outcomes in the event and divide by the total number of outcomes.
Simulation: repeat the random process many times and compute the fraction of times when the event occurs.
When all outcomes are equally likely, these two methods agree: with enough repetitions, the simulated fraction will be close to the direct answer!
We will use both methods to calculate probabilities.
For simulations, there are a few options:
You can use websites that do probability simulations.
You can do a tactile simulation (flip real coins/roll real dice).
We will give you the output of a computer simulation.
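The two methods can be compared side by side. The following is a minimal sketch under the assumption that all outcomes are equally likely; the function names `direct` and `simulate` are hypothetical and chosen only for this illustration.

```python
import random
from fractions import Fraction

def direct(event, sample_space):
    """Direct method: outcomes in the event divided by total outcomes."""
    return Fraction(len(event), len(sample_space))

def simulate(event, sample_space, repetitions, seed=0):
    """Simulation method: fraction of repetitions in which the event occurs."""
    rng = random.Random(seed)      # seeded so that a run can be reproduced
    outcomes = sorted(sample_space)  # fixed order, each outcome equally likely
    hits = sum(rng.choice(outcomes) in event for _ in range(repetitions))
    return hits / repetitions

die = {1, 2, 3, 4, 5, 6}
event = {1, 2}  # "the die shows 1 or 2"

print("direct:    ", direct(event, die))  # 1/3
for repetitions in [100, 10_000, 100_000]:
    print("simulation:", repetitions, simulate(event, die, repetitions))
```

The direct method gives an exact fraction, while the simulation gives an approximation that tends to improve as the number of repetitions grows.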
Examples#
Balls in a bag#
You are playing a game where you draw a single ball from a bag that contains a mix of white and red balls.
You win a dollar if you draw a red ball.
Which of the two bags would you prefer to use? Why?
Bag A: 2 white balls and 3 red balls.
Bag B: 20 white balls and 30 red balls.
Balls in a bag#
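Both bags give the same chance of winning.
For bag A: \(\mathrm{Pr}[\text{red ball}] = \frac{3}{2+3} = \frac{3}{5}\).
For bag B: \(\mathrm{Pr}[\text{red ball}] = \frac{30}{20+30} = \frac{3}{5}\).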
In probability, the relative number of outcomes is what matters!
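The comparison of the two bags can be checked with exact fractions. This is a small sketch assuming every ball is equally likely to be drawn; the helper name `pr_red` is hypothetical.

```python
from fractions import Fraction

def pr_red(white, red):
    """Probability of drawing a red ball when every ball is equally likely."""
    return Fraction(red, white + red)

bag_a = pr_red(white=2, red=3)
bag_b = pr_red(white=20, red=30)

print(bag_a)           # 3/5
print(bag_b)           # 3/5
print(bag_a == bag_b)  # True
```

`Fraction` reduces \(30/50\) to \(3/5\) automatically, which makes the equality of the two bags easy to see.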
Random outfits#
Suppose that each day you pick a random outfit by:
First picking one of 3 pairs of shoes.
Then picking one of 2 pairs of pants.
What is the total number of possible outfits?
Random outfits#
The total number of possible outfits is \(3 \times 2 = 6\).
This can be visualized with a “decision tree.”

Multiplication rule#
Consider a compound random process that consists of two smaller random processes: first process \(A\) and then process \(B\).
Suppose that process \(A\) has \(a\) possible outcomes, and for each of these outcomes, process \(B\) has \(b\) possible outcomes.
Then the compound process has \(a \times b\) possible outcomes.
For example:
There were \(3\) possible outcomes for the choice of shoes (random process \(A\)).
For each of these outcomes, there were \(2\) possible outcomes for the choice of pants (random process \(B\)).
The total number of shoes and pants combinations is \(3 \times 2=6\) (compound process).
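One way to see the multiplication rule is to list every combination and count them. In this sketch the item labels are placeholders invented for illustration.

```python
from itertools import product

# Placeholder labels: 3 pairs of shoes (process A), 2 pairs of pants (process B).
shoes = ["shoes 1", "shoes 2", "shoes 3"]
pants = ["pants 1", "pants 2"]

# product() pairs every choice of shoes with every choice of pants.
outfits = list(product(shoes, pants))

for outfit in outfits:
    print(outfit)
print(len(outfits))  # 6, the same as 3 * 2
```

Each pair of shoes appears once with each pair of pants, which is exactly the branching shown in a decision tree.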
Random outfits again#
The multiplication rule can also be used when there are more than two processes.
Suppose that each day you pick a random outfit by:
First picking one of 3 pairs of shoes.
Then picking one of 2 pairs of pants.
Then picking one of 4 possible shirts.
What is the total number of possible outfits?
Random outfits#
The multiplication rule says that the total number of outfits is \(3 \times 2 \times 4 = 24\).

Random outfits#
What is the probability that the person wears flip-flops and a long sleeved shirt?

There are \(1 \times 2 \times 2 = 4\) outcomes that correspond to flip-flops and a long sleeved shirt. The probability is \(\mathrm{Pr}[\text{flip-flops and long sleeved shirt}] = \frac{4}{24} = \frac{1}{6}\).
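The count of \(4\) out of \(24\) can be reproduced by listing all outfits. The item names below are hypothetical; the only assumptions taken from the calculation above are that exactly one of the 3 pairs of shoes is flip-flops and exactly 2 of the 4 shirts are long sleeved.

```python
from fractions import Fraction
from itertools import product

# Hypothetical labels. One pair of shoes is flip-flops.
shoes = ["flip-flops", "sneakers", "boots"]
pants = ["jeans", "shorts"]
# Each shirt is (label, is_long_sleeved); two of the four are long sleeved.
shirts = [("shirt 1", True), ("shirt 2", True),
          ("shirt 3", False), ("shirt 4", False)]

outfits = list(product(shoes, pants, shirts))

# Keep the outfits with flip-flops and a long sleeved shirt (any pants).
matching = [
    (shoe, pant, shirt)
    for shoe, pant, shirt in outfits
    if shoe == "flip-flops" and shirt[1]
]

print(len(outfits))                           # 24
print(len(matching))                          # 4
print(Fraction(len(matching), len(outfits)))  # 1/6
```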
Polling#
In the course survey there were:
45 students who think we don’t live in a simulation.
15 students who think we might live in a simulation.
8 students who think we do live in a simulation.
Suppose I select one of those students at random. What is the probability that they think we live in a simulation?
Polling two students#
Now suppose that I pick two different students. What is the probability that they both think we live in a simulation?
By the multiplication rule: the total number of possible outcomes is \(68 \times 67\).
There are \(8 \times 7\) outcomes where both students think we live in a simulation.
\[\mathrm{Pr}[\text{Both think we live in a simulation}] = \frac{8 \times 7}{68 \times 67} \approx 0.012\]
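Both polling calculations can be checked by brute-force counting. This sketch assumes that only the 8 students who answered that we do live in a simulation count as thinking we live in a simulation, and it treats a pair of students as ordered (first pick, second pick), matching the \(68 \times 67\) count above. The labels `"no"`, `"maybe"`, and `"yes"` are placeholders.

```python
from fractions import Fraction
from itertools import permutations

# One label per student, using the survey counts.
students = ["no"] * 45 + ["maybe"] * 15 + ["yes"] * 8

# One student chosen at random.
pr_one = Fraction(students.count("yes"), len(students))
print(pr_one)  # 2/17

# Two different students, in order: all (first, second) pairs of positions.
pairs = list(permutations(range(len(students)), 2))
both_yes = sum(
    students[i] == "yes" and students[j] == "yes" for i, j in pairs
)
pr_both = Fraction(both_yes, len(pairs))

print(len(pairs), both_yes)       # 4556 56
print(pr_both)                    # 14/1139
print(round(float(pr_both), 3))   # 0.012
```

Counting unordered pairs instead would halve both the numerator and the denominator, so the probability would be the same.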
Birthdays#
On Wednesday, we will compute the probability that two people in class have the same birthday.
Today, we will compute the probability of some simpler events related to birthdays.
What is the probability that a randomly selected person has their birthday today (April 20)?
What assumptions do you need to make?
Birthdays#
Assuming that all birthdays are equally likely and that we can ignore leap years, the probability is:
\[\mathrm{Pr}[\text{Birthday on April 20}] = \frac{1}{365} \approx 0.002740\]
If we account for leap years and still assume that all birthdays are equally likely:
Is it reasonable to assume that all birthdays are equally likely? How could we make our answer more realistic?
Birthdays#
What is the probability that a randomly chosen person has their birthday on the 20th day of the month?
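Under the same assumptions as before (all birthdays equally likely, leap years ignored), birthday events can be counted directly from a calendar. The sketch below uses 2023 only as an arbitrary non-leap year; that choice is an assumption for illustration.

```python
from datetime import date, timedelta
from fractions import Fraction

YEAR = 2023  # arbitrary non-leap year, so the sample space has 365 days
start = date(YEAR, 1, 1)
days = [start + timedelta(days=k) for k in range(365)]

# Event: birthday on April 20.
april_20 = [d for d in days if (d.month, d.day) == (4, 20)]
# Event: birthday on the 20th day of any month.
twentieth = [d for d in days if d.day == 20]

print(Fraction(len(april_20), len(days)))   # 1/365
print(Fraction(len(twentieth), len(days)))  # 12/365
```

Every month has a 20th day, so the second event contains 12 outcomes and is 12 times as likely as a birthday on one specific date.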
Complements#
Complements#
Remember: an event is a collection of possible outcomes.
The complement of an event is the collection of all outcomes not in the original event.
For the event “the die shows 1 or 2”, the complement is “the die shows 3, 4, 5 or 6”.
You can think of the complement event as the opposite event.
Sample space and complements#

Probability of complements#
Calculate the probabilities of the following events. What do you notice?
The die shows 1 or 2.
The die shows 3, 4, 5, or 6.
The probability of the complement is always 1 minus the probability of the original event.
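The complement rule can be verified for the die example with Python sets. This is a minimal sketch that assumes all six faces are equally likely.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}
event = {1, 2}                     # "the die shows 1 or 2"
complement = sample_space - event  # every outcome not in the event

pr_event = Fraction(len(event), len(sample_space))
pr_complement = Fraction(len(complement), len(sample_space))

print(sorted(complement))             # [3, 4, 5, 6]
print(pr_event, pr_complement)        # 1/3 2/3
print(pr_complement == 1 - pr_event)  # True
```

The rule holds because every outcome is in exactly one of the event and its complement, so the two counts add up to the size of the sample space.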
Birthdays again#
What is the complement of the event “a randomly chosen person has their birthday on April 20”?
What is the probability of this event?
The complement is “a randomly chosen person has their birthday on a day other than April 20.”
The probability is \(1 - \frac{1}{365} = \frac{364}{365} \approx 0.9973\).
Birthdays again#
Suppose now that we randomly choose three people.
What is the complement of the event “at least one person has their birthday on April 20”?
Both of these are correct:
None of the three people have their birthday on April 20.
All three people have their birthday on a day other than April 20.
Birthdays again#
What is the complement of “at least one person in this class has their birthday today”?
Both of these are correct:
No one in the class has their birthday on April 20.
Everyone in the class has their birthday on a day other than April 20.
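Roughly 120 people come to class. Assuming their birthdays are unrelated to one another, each person has their birthday on a day other than April 20 with probability \(\frac{364}{365}\), so:
\[\mathrm{Pr}[\text{all birthdays in class not on April 20}] = \left(\frac{364}{365}\right)^{120} \approx 0.72\]
Take-away: although it is unlikely that a specific person has their birthday today, there is a decent chance (about \(1 - 0.72 = 0.28\)) that in a large group it is someone’s birthday.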
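The class calculation can be reproduced for any group size. This sketch assumes 365 equally likely birthdays and that people's birthdays are unrelated to one another; the function name `pr_nobody_today` is hypothetical.

```python
def pr_nobody_today(num_people, days_in_year=365):
    """Probability that none of num_people has their birthday on a given day.

    Assumes equally likely birthdays that are unrelated across people.
    """
    return ((days_in_year - 1) / days_in_year) ** num_people

pr_none = pr_nobody_today(120)
pr_at_least_one = 1 - pr_none  # complement rule

print(round(pr_none, 2))          # 0.72
print(round(pr_at_least_one, 2))  # 0.28

# The chance that it is someone's birthday grows with the group size.
for n in [1, 3, 30, 120, 365]:
    print(n, round(1 - pr_nobody_today(n), 3))
```

Working with the complement is what makes this easy: "nobody has their birthday today" is a single product, while "at least one person does" would otherwise require combining many separate cases.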
Probability recap#
A random process is something that results in a random outcome.
The set of possible outcomes is called the sample space.
An event is a collection of some of the possible outcomes.
If all outcomes are equally likely, then the probability of an event is equal to the number of outcomes in the event divided by the total number of possible outcomes.
You can use the multiplication rule to calculate the number of outcomes in the sample space or event.
The complement of an event is the collection of all outcomes not in the original event.
The probability of a complement is one minus the probability of the original event.