Section 3.1 Introduction to Probability
¶We all have some intuitive understanding of the notions of “chance” and “probability”. When buying a lottery ticket, we know that there is a chance of winning the jackpot, but we also know that this chance is very small. Before leaving home in the morning, we check the weather forecast and see that, with probability \(80\%\text{,}\) we get \(3\) inches of rain in Honolulu. In this chapter, we will give a formal definition of this notion of “probability”.
Subsection 3.1.1 Probability Spaces
¶In this section, we give a formal definition of the notion of “probability” in terms of sets and functions.
Definition 3.1.1. Sample Space.
A sample space \(S\) is a non-empty countable set. Each element of \(S\) is called an outcome and each subset of \(S\) is called an event.
In daily life, we express probabilities in terms of percentages. For example, the weather forecast may tell us that, with \(80\%\) probability, we will be getting a snowstorm today. In probability theory, probabilities are expressed in terms of numbers in the interval \([0,1]\text{.}\) A probability of \(80\%\) becomes a probability of \(0.8\text{.}\)
Definition 3.1.2. Probability Function or Distribution.
Let \(S\) be a sample space. A probability function on \(S\) is a function \(P : S \rightarrow \mathbb{R}\) such that
- for all \(x \in S\text{,}\) \(0 \leq P(x) \leq 1\text{,}\) and
- \(\sum_{x \in S} P(x) = 1\text{.}\)
For any outcome \(x\) in the sample space \(S\text{,}\) we will refer to \(P(x)\) as the probability that the outcome is equal to \(x\text{.}\)
Definition 3.1.3. Probability Space.
A probability space is a pair \((S,P)\text{,}\) where \(S\) is a sample space and \(P : S \rightarrow \mathbb{R}\) is a probability function on \(S\text{.}\)
Definition 3.1.4. Probability of an Event.
A probability function \(P : S \rightarrow \mathbb{R}\) maps each element of the sample space \(S\) (i.e., each outcome) to a real number in the interval \([0,1]\text{.}\) It turns out to be useful to extend this function so that it maps any event to a real number in \([0,1]\text{.}\) If \(A\) is an event (i.e., \(A \subseteq S\)), then we define
\begin{equation*} P(A) = \sum_{x \in A} P(x) . \end{equation*}
We will refer to \(P(A)\) as the probability that the event \(A\) occurs.
Note that since \(S \subseteq S\text{,}\) the entire sample space \(S\) is an event and
\begin{equation*} P(S) = \sum_{x \in S} P(x) = 1 , \end{equation*}
where the last equality follows from the second condition in Definition 3.1.2.
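The definitions so far translate directly into code for a finite sample space. The following is a minimal Python sketch, not a definitive implementation: the names `is_probability_function` and `prob`, and the biased-coin values, are hypothetical choices made only for illustration. A probability function is stored as a dictionary from outcomes to exact fractions, and an event is a set of outcomes.

```python
from fractions import Fraction


def is_probability_function(P):
    """Check the two conditions of Definition 3.1.2 on a finite sample space."""
    values_in_range = all(0 <= p <= 1 for p in P.values())
    return values_in_range and sum(P.values()) == 1


def prob(P, A):
    """Probability of an event A (a set of outcomes), as in Definition 3.1.4."""
    return sum((P[x] for x in A), Fraction(0))


# Hypothetical example (not from the text): a biased coin.
P = {"H": Fraction(2, 3), "T": Fraction(1, 3)}

print(is_probability_function(P))  # both conditions hold
print(prob(P, {"H", "T"}))         # probability of the whole sample space
print(prob(P, set()))              # probability of the empty event
```

Using `Fraction` instead of floating-point numbers keeps the check `sum(P.values()) == 1` exact.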
Example 3.1.5. Flipping a Coin.
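Assume we flip a coin. Since there are two possible outcomes (the coin comes up either heads (\(H\)) or tails (\(T\))), the sample space is the set \(S = \{H,T \}\text{.}\) If the coin is fair, i.e., the probabilities of \(H\) and \(T\) are equal, then the probability function \(P : S \rightarrow \mathbb{R}\) is given by
\begin{equation*} P(H) = 1/2 \text{ and } P(T) = 1/2 . \end{equation*}
Observe that this function \(P\) satisfies the two conditions in Definition 3.1.2. Since this sample space has two elements, there are four events, one event for each subset. These events are
\begin{equation*} \emptyset , \{H\} , \{T\} , \{H,T\} , \end{equation*}
and it follows from Definition 3.1.4 that
\begin{equation*} \begin{array}{rl} P(\emptyset) & = 0 , \\ P(\{H\}) & = P(H) = 1/2 , \\ P(\{T\}) & = P(T) = 1/2 , \\ P(\{H,T\}) & = P(H) + P(T) = 1 . \end{array} \end{equation*}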
Example 3.1.6. Flipping a Coin Twice.
If we flip a fair coin twice, then there are four possible outcomes, and the sample space becomes \(S = \{ HH , HT , TH , TT \}\text{.}\) For example, \(HT\) indicates that the first flip resulted in heads, whereas the second flip resulted in tails. In this case, the probability function \(P : S \rightarrow \mathbb{R}\) is given by \[ P(HH) = P(HT) = P(TH) = P(TT) = 1/4 . \] Observe again that this function \(P\) satisfies the two conditions in Definition 3.1.2. Since the sample space consists of \(4\) elements, the number of events is equal to \(2^4 = 16\text{.}\) For example, \(A = \{ HT , TH \}\) is an event and it follows from Definition 3.1.4 that \[ P(A) = P(HT) + P(TH) = 1/4 + 1/4 = 1/2 . \]
In words, when flipping a fair coin twice, the probability that we see one heads and one tails (without specifying the order) is equal to \(1/2\text{.}\)
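The count of \(2^4 = 16\) events in this example can be checked by listing every subset of the sample space. The sketch below is only an illustration; the helper name `all_events` is an assumption, not notation from the text.

```python
from fractions import Fraction
from itertools import combinations, product

# Sample space for two flips of a fair coin: HH, HT, TH, TT.
S = ["".join(flips) for flips in product("HT", repeat=2)]
P = {s: Fraction(1, len(S)) for s in S}


def all_events(sample_space):
    """Return every subset of the sample space (hypothetical helper)."""
    items = list(sample_space)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


events = all_events(S)
A = frozenset({"HT", "TH"})      # one heads and one tails, in either order
p_A = sum(P[x] for x in A)       # Definition 3.1.4

print(len(events))  # number of events
print(p_A)          # probability of A
```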
Example 3.1.7. Rolling a Die Twice.
If we roll a fair die, then there are six possible outcomes (\(1\text{,}\) \(2\text{,}\) \(3\text{,}\) \(4\text{,}\) \(5\text{,}\) and \(6\)), each one occurring with probability \(1/6\text{.}\) If we roll this die twice, we obtain the sample space
\begin{equation*} S = \{ (i,j) : 1 \leq i \leq 6 , 1 \leq j \leq 6 \} , \end{equation*}
where \(i\) is the result of the first roll and \(j\) is the result of the second roll. Note that \(|S| = 6 \times 6 = 36\text{.}\) Since the die is fair, each outcome has the same probability. Therefore, in order to satisfy the two conditions in Definition 3.1.2, we must have
\begin{equation*} P(i,j) = 1/36 \end{equation*}
for each outcome \((i,j)\) in \(S\text{.}\)
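If we are interested in the sum of the results of the two rolls, then we define the event
\begin{equation*} A_k = \text{“the sum of the results of the two rolls is equal to } k \text{”} , \end{equation*}
which, using the notation of sets, is the same as
\begin{equation*} A_k = \{ (i,j) \in S : i + j = k \} . \end{equation*}
Consider, for example, the case when \(k=4\text{.}\) There are three possible outcomes of two rolls that result in a sum of \(4\text{.}\) These outcomes are \((1,3)\text{,}\) \((2,2)\text{,}\) and \((3,1)\text{.}\) Thus, the event \(A_4\) is equal to
\begin{equation*} A_4 = \{ (1,3) , (2,2) , (3,1) \} . \end{equation*}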
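In the matrix below, the leftmost column indicates the result of the first roll, the top row indicates the result of the second roll, and each entry is the sum of the results of the two corresponding rolls.
\begin{equation*} \begin{array}{|c||c|c|c|c|c|c|} \hline &1 &2 &3 &4 &5 &6 \\ \hline \hline 1 &2 &3 &4 &5 &6 &7 \\ 2 &3 &4 &5 &6 &7 &8 \\ 3 &4 &5 &6 &7 &8 &9 \\ 4 &5 &6 &7 &8 &9 &10 \\ 5 &6 &7 &8 &9 &10 &11 \\ 6 &7 &8 &9 &10 &11 &12 \\ \hline \end{array} \end{equation*}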
As can be seen from this matrix, the event \(A_k\) is non-empty only if \(k \in \{2,3,\ldots,12\}\text{.}\) For any other \(k\text{,}\) the event \(A_k\) is empty, which means that it can never occur.
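It follows from Definition 3.1.4 that
\begin{equation*} P \left( A_k \right) = \sum_{(i,j) \in A_k} P(i,j) = \sum_{(i,j) \in A_k} 1/36 = |A_k| / 36 . \end{equation*}
For example, the number \(4\) occurs three times in the matrix and, therefore, the event \(A_4\) has size three. Observe that we have already seen this above. It follows that
\begin{equation*} P \left( A_4 \right) = 3/36 = 1/12 . \end{equation*}
In a similar way, we see that
\begin{equation*} \begin{array}{rlrl} P(A_2) & = 1/36 & P(A_3) & = 2/36 = 1/18 \\ P(A_4) & = 3/36 = 1/12 & P(A_5) & = 4/36 = 1/9 \\ P(A_6) & = 5/36 & P(A_7) & = 6/36 = 1/6 \\ P(A_8) & = 5/36 & P(A_9) & = 4/36 = 1/9 \\ P(A_{10}) & = 3/36 = 1/12 & P(A_{11}) & = 2/36 = 1/18 \\ P(A_{12}) & = 1/36 \\ \end{array} \end{equation*}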
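A sample space is not necessarily uniquely defined. In the last example, where we were interested in the sum of the results of two rolls of a die, we could also have taken the sample space to be the set
\begin{equation*} S' = \{ 2 , 3 , \ldots , 12 \} . \end{equation*}
The probability function \({P}'\) corresponding to this sample space \(S'\) is given by
\begin{equation*} {P}'(k) = P \left( A_k \right) \text{ for each } k \in S' , \end{equation*}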
because \({P}'(k)\) is the probability that we get the outcome \(k\) in the sample space \(S'\text{,}\) which is the same as the probability that event \(A_k\) occurs in the sample space \(S\text{.}\) You should verify that this function \({P}'\) satisfies the two conditions in Definition 3.1.2 and, thus, is a valid probability function on \(S'\text{.}\)
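The verification suggested here can also be carried out mechanically. The following Python sketch, offered as one possible check rather than the only way to do it, builds the two-roll sample space \(S\text{,}\) forms the events \(A_k\text{,}\) and derives \({P}'\) from them. The names `event_sum` and `P_prime` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space S: all pairs (i, j) of results of two rolls of a fair die.
S = list(product(range(1, 7), repeat=2))
P = {outcome: Fraction(1, 36) for outcome in S}


def event_sum(k):
    """The event A_k: the two results sum to k."""
    return {(i, j) for (i, j) in S if i + j == k}


# P'(k) is the probability of A_k, computed with Definition 3.1.4.
P_prime = {k: sum(P[o] for o in event_sum(k)) for k in range(2, 13)}

for k, p in P_prime.items():
    print(k, p)

# The two conditions of Definition 3.1.2 for P'.
print(all(0 <= p <= 1 for p in P_prime.values()))
print(sum(P_prime.values()) == 1)
```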
Subsection 3.1.2 Basic Rules of Probability
¶In this section, we prove some basic properties of probability functions. As we will see, all these properties follow from Definition 3.1.2. Throughout this section, \((S,P)\) is a probability space.
Theorem 3.1.8. \(P(\emptyset) = 0\).
Recall that an event is a subset of the sample space \(S\text{.}\) In particular, the empty set \(\emptyset\) is an event. Intuitively, \(P(\emptyset)\) must be zero, because it is the probability that there is no outcome, which can never occur.
Proof.
By Definition 3.1.4, we have
\begin{equation*} P(\emptyset) = \sum_{x \in \emptyset} P(x) . \end{equation*}
Since there are zero terms in this summation, its value is equal to zero.
Definition 3.1.9. Disjoint Events.
We say that two events \(A\) and \(B\) are disjoint, if \(A \cap B = \emptyset\text{.}\) A sequence \(A_1,A_2,\ldots,A_n\) of events is pairwise disjoint, if any pair in this sequence is disjoint.
Theorem 3.1.10. Probability of a Sequence of Disjoint Events.
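If \(A_1,A_2,\ldots,A_n\) is a sequence of pairwise disjoint events, then
\begin{equation*} P \left( A_1 \cup A_2 \cup \cdots \cup A_n \right) = \sum_{i=1}^n P \left( A_i \right) . \end{equation*}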
Proof.
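Define \(A = A_1 \cup A_2 \cup \cdots \cup A_n\text{.}\) Using Definition 3.1.4, we have
\begin{equation*} P(A) = \sum_{x \in A} P(x) = \sum_{i=1}^n \sum_{x \in A_i} P(x) = \sum_{i=1}^n P \left( A_i \right) . \end{equation*}
The second equality holds because the events are pairwise disjoint, so each outcome in \(A\) belongs to exactly one of the sets \(A_i\text{.}\)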
Example 3.1.11. Probability of an Even Sum When Rolling a Die Twice.
Assume we roll a fair die twice. What is the probability that the sum of the two results is even? If you look at the matrix in Example 3.1.7, then you see that there are \(18\) entries out of \(36\) that are even. Therefore, the probability of having an even sum is equal to \(18/36=1/2\text{.}\) Below we will give a different way to determine this probability.
The sample space is the set
\begin{equation*} S = \{ (i,j) : 1 \leq i \leq 6 , 1 \leq j \leq 6 \} , \end{equation*}
where \(i\) is the result of the first roll and \(j\) is the result of the second roll. Each element of \(S\) has the same probability \(1/36\) of being an outcome of rolling the die twice.
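The event we are interested in is
\begin{equation*} A = \{ (i,j) \in S : i + j \text{ is even} \} . \end{equation*}
Observe that \(i+j\) is even if and only if both \(i\) and \(j\) are even or both \(i\) and \(j\) are odd. Therefore, we split the event \(A\) into two disjoint events
\begin{equation*} A_1 = \{ (i,j) \in S : \text{both } i \text{ and } j \text{ are even} \} \end{equation*}
and
\begin{equation*} A_2 = \{ (i,j) \in S : \text{both } i \text{ and } j \text{ are odd} \} . \end{equation*}
By Theorem 3.1.10, we have
\begin{equation*} P(A) = P \left( A_1 \right) + P \left( A_2 \right) . \end{equation*}
The set \(A_1\) has \(3 \cdot 3 = 9\) elements, because there are \(3\) choices for \(i\) and \(3\) choices for \(j\text{.}\) Similarly, the set \(A_2\) has \(9\) elements. It follows that
\begin{equation*} P(A) = |A_1| / 36 + |A_2| / 36 = 9/36 + 9/36 = 1/2 . \end{equation*}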
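The decomposition used in this example can be checked by enumeration. The Python sketch below is illustrative only (the helper name `prob` is an assumption). It confirms that the two parts are disjoint, that together they make up the event of an even sum, and that their probabilities add up as Theorem 3.1.10 states.

```python
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))


def prob(event):
    """Each outcome of two fair rolls has probability 1/36."""
    return sum((Fraction(1, 36) for _ in event), Fraction(0))


A = {(i, j) for (i, j) in S if (i + j) % 2 == 0}             # even sum
A1 = {(i, j) for (i, j) in S if i % 2 == 0 and j % 2 == 0}   # both even
A2 = {(i, j) for (i, j) in S if i % 2 == 1 and j % 2 == 1}   # both odd

print(A1.isdisjoint(A2), A1 | A2 == A)
print(prob(A1), prob(A2), prob(A))
```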
Definition 3.1.12. Complement of an event.
If \(A\) is an event, then \(\overline{A}\) denotes its complement, i.e., \(\overline{A} = S - A\text{.}\) Intuitively, the sum of \(P(A)\) and \(P \left( \overline{A} \right)\) must be equal to one, because the event \(A\) either occurs or does not occur. Observe that this is similar to Definition 1.2.10.
Theorem 3.1.13. Probability of Complements.
For any event \(A\text{,}\)
\begin{equation*} P(A) = 1 - P \left( \overline{A} \right) . \end{equation*}
Proof.
Since \(A\) and \(\overline{A}\) are disjoint and \(S = A \cup \overline{A}\text{,}\) it follows from Theorem 3.1.10 that
\begin{equation*} P(S) = P(A) + P \left( \overline{A} \right) . \end{equation*}
We have seen in Definition 3.1.4 that \(P(S) = 1\text{.}\)
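As a small numerical illustration of the complement rule, the following Python sketch reuses the two-roll sample space of Example 3.1.7. The choice of event (the sum equals \(7\)) and the helper name `prob` are assumptions made for this illustration only.

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))


def prob(event):
    """Each outcome of two fair rolls has probability 1/36."""
    return sum((Fraction(1, 36) for _ in event), Fraction(0))


A = {(i, j) for (i, j) in S if i + j == 7}   # illustrative event
A_complement = S - A                          # the complement S - A

print(prob(A), prob(A_complement), prob(A) + prob(A_complement))
```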
Theorem 3.1.14. Probability of Unions of Events.
If \(A\) and \(B\) are events, then
\begin{equation*} P(A \cup B) = P(A) + P(B) - P(A \cap B) . \end{equation*}
Proof.
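This is similar to the Law of Inclusion-Exclusion (Theorem 2.3.9). Since \(B - A\) and \(A \cap B\) are disjoint and \(B = (B - A) \cup (A \cap B)\text{,}\) it follows from Theorem 3.1.10 that
\begin{equation*} P(B) = P(B - A) + P(A \cap B) . \end{equation*}
Next observe that \(A\) and \(B - A\) are disjoint. Since \(A \cup B = A \cup (B - A)\text{,}\) we again apply Theorem 3.1.10 and obtain
\begin{equation*} P(A \cup B) = P(A) + P(B - A) . \end{equation*}
By combining these two equations, we obtain
\begin{equation*} P(A \cup B) = P(A) + P(B) - P(A \cap B) . \end{equation*}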
Example 3.1.15. Probability a number is divisible by 2 or 3.
Assume we choose a number \(x\) in the sample space \(S=\{1,2,\ldots,1000\}\text{,}\) such that each element has the same probability \(1/1000\) of being chosen. What is the probability that \(x\) is divisible by \(2\) or \(3\text{?}\)
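Define the events
\begin{equation*} A = \{ i \in S : i \text{ is divisible by } 2 \} \end{equation*}
and
\begin{equation*} B = \{ i \in S : i \text{ is divisible by } 3 \} . \end{equation*}
Then we want to determine \(P(A \cup B)\text{,}\) which, by Theorem 3.1.14, is equal to
\begin{equation*} P(A \cup B) = P(A) + P(B) - P(A \cap B) . \end{equation*}
Since there are \(\lfloor 1000/2 \rfloor = 500\) even numbers in \(S\text{,}\) we have
\begin{equation*} P(A) = 500/1000 . \end{equation*}
Since there are \(\lfloor 1000/3 \rfloor = 333\) elements in \(S\) that are divisible by \(3\text{,}\) we have
\begin{equation*} P(B) = 333/1000 . \end{equation*}
Observe that \(i\) belongs to \(A \cap B\) if and only if \(i\) is divisible by \(6\text{,}\) i.e.,
\begin{equation*} A \cap B = \{ i \in S : i \text{ is divisible by } 6 \} . \end{equation*}
Since there are \(\lfloor 1000/6 \rfloor = 166\) elements in \(S\) that are divisible by \(6\text{,}\) we have
\begin{equation*} P(A \cap B) = 166/1000 . \end{equation*}
We conclude that
\begin{equation*} P(A \cup B) = 500/1000 + 333/1000 - 166/1000 = 667/1000 . \end{equation*}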
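The counts in this example are easy to confirm by brute force. The following Python sketch, a check under the example's assumption that every element of \(S\) has probability \(1/1000\text{,}\) compares a direct count of \(A \cup B\) with the value given by Theorem 3.1.14. The helper name `prob` is illustrative.

```python
from fractions import Fraction

S = range(1, 1001)
A = {i for i in S if i % 2 == 0}   # divisible by 2
B = {i for i in S if i % 3 == 0}   # divisible by 3


def prob(event):
    """Each element of S has probability 1/1000."""
    return Fraction(len(event), len(S))


direct = prob(A | B)                                # count the union directly
by_theorem = prob(A) + prob(B) - prob(A & B)        # Theorem 3.1.14

print(len(A), len(B), len(A & B))
print(direct, by_theorem)
```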
Definition 3.1.16. Uniform Distributions or Probability Spaces.
A uniform distribution or uniform probability space is a pair \((S,P)\text{,}\) where \(S\) is a finite sample space and the probability function \(P : S \rightarrow \mathbb{R}\) satisfies
\begin{equation*} P(x) = \frac{1}{|S|} \end{equation*}
for each outcome \(x\) in \(S\text{.}\)
Theorem 3.1.17. Probability of Events in a Uniform Distribution.
If \((S,P)\) is a uniform probability space and \(A\) is an event, then
\begin{equation*} P(A) = \frac{|A|}{|S|} . \end{equation*}
In a uniform probability space \((S,P)\text{,}\) the probability of an event \(A\) is the ratio of the size of \(A\) and the size of \(S\text{.}\)
Proof.
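By using Definition 3.1.4 and Definition 3.1.16, we get
\begin{equation*} P(A) = \sum_{x \in A} P(x) = \sum_{x \in A} \frac{1}{|S|} = \frac{|A|}{|S|} . \end{equation*}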
Example 3.1.18. Probability of a Full House.
In a standard deck of \(52\) cards, each card has a suit and a rank. There are four suits (spades, hearts, clubs, and diamonds), and 13 ranks (Ace, \(2,3,4,5,6,7,8,9,10\text{,}\) Jack, Queen, and King). A hand of five cards is called a full house, if three of the cards are of the same rank and the other two cards are also of the same (but necessarily different) rank. For example, the hand consisting of the 7 of spades, 7 of hearts, 7 of diamonds, Queen of spades, and Queen of clubs is a full house, because it consists of three sevens and two Queens.
Assume we get a uniformly random hand of five cards. What is the probability that this hand is a full house? To answer this question, first observe that a hand of five cards is a subset of the set of all \(52\) cards. Thus, the sample space is the set \(S\) consisting of all \(5\)-element subsets of the set of \(52\) cards and, therefore,
\begin{equation*} |S| = {52 \choose 5} = 2{,}598{,}960 . \end{equation*}
Each hand of five cards has a probability of \(1/|S|\) of being chosen.
Since we are interested in the probability of a random hand being a full house, we define the event \(A\) to be the set of all elements in \(S\) that are full houses. By Theorem 3.1.17, we have
\begin{equation*} P(A) = \frac{|A|}{|S|} = \frac{|A|}{2{,}598{,}960} . \end{equation*}
Thus, to determine \(P(A)\text{,}\) it remains to determine the size of the set \(A\text{,}\) i.e., the total number of full houses. For this, we will use the Rule of Products (Subsection 2.1.2). The procedure is “choose a full house”:
- First task: Choose the rank of the three cards in the full house. There are \(13\) ways to do this.
- Second task: Choose the suits of these three cards. There are \({4 \choose 3}\) ways to do this.
- Third task: Choose the rank of the other two cards in the full house. There are \(12\) ways to do this.
- Fourth task: Choose the suits of these two cards. There are \({4 \choose 2}\) ways to do this.
Thus, the number of full houses is equal to
\begin{equation*} 13 \cdot {4 \choose 3} \cdot 12 \cdot {4 \choose 2} = 13 \cdot 4 \cdot 12 \cdot 6 = 3744 . \end{equation*}
We conclude that the probability of getting a full house is equal to
\begin{equation*} P(A) = \frac{3744}{2{,}598{,}960} \approx 0.00144 . \end{equation*}
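The four-task count in this example can be reproduced with Python's `math.comb` (available in Python 3.8 and later). This is a sketch of the arithmetic only; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Size of the sample space: all 5-element subsets of a 52-card deck.
num_hands = comb(52, 5)

# Rule of Products: rank of the triple, suits of the triple,
# rank of the pair, suits of the pair.
num_full_houses = 13 * comb(4, 3) * 12 * comb(4, 2)

# Uniform probability space: P(A) = |A| / |S| (Theorem 3.1.17).
p_full_house = Fraction(num_full_houses, num_hands)

print(num_hands, num_full_houses)
print(p_full_house, float(p_full_house))
```

Keeping the result as a `Fraction` shows the exact value before it is rounded to a decimal.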
Example 3.1.19. The Monty Hall Problem.
The Monty Hall Problem is a well-known puzzle in probability theory. It is named after the host, Monty Hall, of the American television game show Let's Make a Deal. The problem became famous in 1990, when (part of) a reader's letter was published in Marilyn vos Savant's column Ask Marilyn in the magazine Parade:
Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, “Do you want to switch?” Is it to your advantage to switch your choice?
Note that the host can always open a door that has a goat behind it. After the host has opened No. 3, we know that the car is either behind No. 1 or No. 2, and it seems that both these doors have the same probability (i.e., \(50\%\)) of having the car behind them. We will prove below, however, that this is not true: It is indeed to our advantage to switch our choice.
We assume that the car is equally likely to be behind any of the three doors. Moreover, the host knows what is behind each door.
- We initially choose one of the three doors uniformly at random; this door remains closed.
- The host opens one of the other two doors that has a goat behind it.
- Our final choice is to switch to the other door that is still closed.
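Let \(A\) be the event that we win the car and let \(B\) be the event that the initial door has a goat behind it. Then it is not difficult to see that event \(A\) occurs if and only if event \(B\) occurs. Therefore, the probability that we win the car is equal to
\begin{equation*} P(A) = P(B) = 2/3 , \end{equation*}
because two of the three doors have a goat behind them and the initial door is chosen uniformly at random.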
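The argument can be double-checked by enumerating every way the game can unfold. The Python sketch below is one possible model, under stated assumptions: the car position and the initial pick are independent and uniform, and when the host has two goat doors to choose from, he opens either with equal probability. That last tie-breaking rule is an assumption of this sketch and does not affect the result. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

doors = (1, 2, 3)
p_win_switch = Fraction(0)
p_win_stay = Fraction(0)

for car, pick in product(doors, repeat=2):
    p_start = Fraction(1, 9)  # car and initial pick are uniform and independent
    # The host opens a door that is neither our pick nor the car.
    host_options = [d for d in doors if d != pick and d != car]
    for opened in host_options:
        p = p_start / len(host_options)  # assumed uniform tie-breaking
        # Switching means taking the remaining closed door.
        switched = next(d for d in doors if d != pick and d != opened)
        if switched == car:
            p_win_switch += p
        if pick == car:
            p_win_stay += p

print(p_win_switch, p_win_stay)
```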
Exercises 3.1.3 Exercises for Section 3.1
1.
Let \(S = \{a,b,c\}\) be a sample space. Let \(P(a) = 1/2\text{,}\) \(P(b) = 1/3\text{,}\) and \(P(c) = 1/6\text{.}\) Find the probabilities for all eight subsets of \(S\text{.}\)
Answer. \(P(\{a, b, c\}) = 1\text{,}\) \(P(\{a\}) = \frac{1}{2}\text{,}\) \(P(\{a, b\}) = \frac{5}{6}\text{,}\) \(P(\{b\}) = \frac{1}{3}\text{,}\) \(P(\{b, c\}) = \frac{1}{2}\text{,}\) \(P(\{c\}) = \frac{1}{6}\text{,}\) \(P(\{a, c\}) = \frac{2}{3}\text{,}\) \(P(\emptyset) = 0\text{.}\)
2.
Consider the sample space \(S = \{a,b,c,d\}\) and a probability function \(P : S \rightarrow \mathbb{R}\) on \(S\text{.}\) Define the events \(A=\{a\}\text{,}\) \(B=\{a,b\}\text{,}\) \(C=\{a,b,c\}\text{,}\) and \(D=\{b,d\}\text{.}\) You are given that \(P(A) = 1/10\text{,}\) \(P(B)=1/2\text{,}\) and \(P(C)=7/10\text{.}\) Determine \(P(D)\text{.}\)
3.
Give a possible sample space \(S\) for each of the following experiments:
- An election decides between two candidates \(A\) and \(B\text{.}\)
- A two-sided coin is tossed.
- A student is asked for the month of the year and the day of the week on which her birthday falls.
- A student is chosen at random from a class of ten students.
- You receive a grade in this course.
Answer.
- There are three possible outcomes: \(A\) wins \((A_w)\text{,}\) \(B\) wins \((B_w)\text{,}\) or it is a tie \((T)\text{.}\) The sample space would be the set \(S = \{A_w, B_w, T\}\text{.}\)
- The sample space is \(S = \{H,T\}\text{.}\)
- The sample space would be the set of all possible combinations of a month name paired with a day name, such as Jan-Mon, Jan-Tue, Jan-Wed, ... There are \(12 \times 7 = 84\) possible outcomes in total.
- The sample space would be the set of all ten students.
- Unless you are doing this problem before the drop date, you are guaranteed to get some kind of grade. Therefore, there is only one outcome: get-grade.
4.
For which of the cases in Exercise 3 would it be reasonable to assign the uniform distribution function?
5.
Consider a coin that has \(0\) on one side and \(1\) on the other side. We flip this coin once and roll a die twice, and are interested in the product of the three numbers.
- What is the sample space?
- How many possible events are there?
- If both the coin and the die are fair, how would you define the probability function \(P\) for this sample space?
Answer.
- The sample space consists of all combinations of one coin flip and two die rolls; for each outcome, we are interested in the product of the three numbers. We can use a matrix similar to the one in Example 3.1.7 for the case where the coin flip results in \(0\text{.}\) The product for all entries is \(0\text{:}\)
\begin{equation*} \begin{array}{|c||c|c|c|c|c|c|} \hline c=0 &1 &2 &3 &4 &5 &6 \\ \hline \hline 1 &0 &0 &0 &0 &0 &0 \\ 2 &0 &0 &0 &0 &0 &0 \\ 3 &0 &0 &0 &0 &0 &0 \\ 4 &0 &0 &0 &0 &0 &0 \\ 5 &0 &0 &0 &0 &0 &0 \\ 6 &0 &0 &0 &0 &0 &0 \\ \hline \end{array} \end{equation*}
Then we can make another matrix of the same size for the case where the coin flip results in \(1\text{.}\) The product is just the product of the two die rolls:
\begin{equation*} \begin{array}{|c||c|c|c|c|c|c|} \hline c=1 &1 &2 &3 &4 &5 &6 \\ \hline \hline 1 &1 &2 &3 &4 &5 &6\\ 2 &2 &4 &6 &8 &10 &12\\ 3 &3 &6 &9 &12 &15 &18 \\ 4 &4 &8 &12 &16 &20 &24\\ 5 &5 &10 &15 &20 &25 &30\\ 6 &6 &12 &18 &24 &30 &36 \\ \hline \end{array} \end{equation*}
There are \(2 \times 6 \times 6 = 72\) possible outcomes, so \(|S| = 72\text{.}\)
- Using the \(A_k\) notation from Example 3.1.7, there are 19 different non-empty events \(A_k\text{,}\) one for each possible product \(k\text{:}\) 0, 1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24, 25, 30, 36. (Counting every subset of \(S\) as an event, as in Definition 3.1.1, there are \(2^{72}\) events in total.)
- Let \(i,j,m\) be the outcomes of the coin flip, the first die roll, and the second die roll, respectively. Since the coin and the die are fair, each outcome \((i,j,m)\) has probability \(1/72\text{,}\) and the probability function could be defined:
\begin{equation*} P \left( A_k \right) = \sum_{(i,j,m) \in A_k} P(i,j,m) = \sum_{(i,j,m) \in A_k} 1/72 = |A_k| / 72 . \end{equation*}
Counting the entries in the two matrices gives
\begin{equation*} \begin{array}{rlrl} P(A_0) & = \frac{36}{72} & P(A_1) & = \frac{1}{72}\\ P(A_2) & = \frac{2}{72} & P(A_3) & = \frac{2}{72}\\ P(A_4) & = \frac{3}{72} & P(A_5) & = \frac{2}{72}\\ P(A_6) & = \frac{4}{72} & P(A_8) & = \frac{2}{72}\\ P(A_9) & = \frac{1}{72} & P(A_{10}) & = \frac{2}{72}\\ P(A_{12}) & = \frac{4}{72} & P(A_{15}) & = \frac{2}{72}\\ P(A_{16}) & = \frac{1}{72} & P(A_{18}) & = \frac{2}{72}\\ P(A_{20}) & = \frac{2}{72} & P(A_{24}) & = \frac{2}{72}\\ P(A_{25}) & = \frac{1}{72} & P(A_{30}) & = \frac{2}{72}\\ P(A_{36}) & = \frac{1}{72}\\ \end{array} \end{equation*}
6.
Describe in words the events specified by the following subsets of
\begin{equation*} S = \{ HHH , HHT , HTH , HTT , THH , THT , TTH , TTT \} \end{equation*}
(see Example 3.1.5 and Example 3.1.6).
\(E = \{HHH,HHT,HTH,HTT\}\text{.}\)
\(E = \{HHH,TTT\}\text{.}\)
\(E = \{HHT,HTH,THH\}\text{.}\)
\(E = \{HHT,HTH,HTT,THH,THT,TTH,TTT\}\text{.}\)
What are the probabilities of each of these events?
7.
Let \(n\) be a positive integer. We flip a fair coin \(2n\) times and consider the possible outcomes, which are strings of length \(2n\) with each character being \(H\) (= heads) or \(T\) (= tails). Thus, we take the sample space \(S\) to be the set of all such strings. Since our coin is fair, each string of \(S\) should have the same probability. Thus, we define \(P(s) = 1/|S|\) for each string \(s\) in \(S\text{.}\) In other words, we have a uniform probability space.
You are asked to determine the probability that in the sequence of \(2n\) flips, the coin comes up heads exactly \(n\) times:
- What is the event \(A\) that describes this?
- Determine \(P(A)\text{.}\)
Answer.
- \(|S| = 2^{2n} = (2^2)^n = 4^n\) because each of the \(2n\) flips has 2 possible outcomes. Let \(A\) be the set of all sequences in \(S\) that have exactly \(n\) heads.
- We must determine \(|A|\text{.}\) To do this we must choose \(n\) flips to come up heads, out of \(2n\) flips total:
\begin{equation*} |A| = {2n \choose n} . \end{equation*}
Therefore \(P(A)=\frac{|A|}{|S|} = \frac{{2n \choose n}}{4^n}\text{.}\)
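For small \(n\text{,}\) this formula can be compared against a brute-force enumeration of all strings of length \(2n\text{.}\) The Python sketch below is a sanity check rather than a proof; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb


def p_exactly_n_heads(n):
    """Closed form: (2n choose n) / 4^n."""
    return Fraction(comb(2 * n, n), 4 ** n)


def p_brute_force(n):
    """Enumerate all strings of length 2n and count those with n heads."""
    S = list(product("HT", repeat=2 * n))
    A = [s for s in S if s.count("H") == n]
    return Fraction(len(A), len(S))


for n in range(1, 6):
    print(n, p_exactly_n_heads(n), p_brute_force(n))
```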
8.
A cup contains two pennies (P), one nickel (N), and one dime (D). You choose one coin uniformly at random, and then you choose a second coin from the remaining coins, again uniformly at random.
- Let \(S\) be the sample space consisting of all ordered pairs of letters P, N, and D that represent the possible outcomes. Write out all elements of \(S\text{.}\)
- Determine the probability for each element in this sample space.
9.
Let \(A\) and \(B\) be events such that \(P(A \cap B) = 1/4\text{,}\) \(P \left( \overline{A} \right) = 1/3\text{,}\) and \(P(B) = 1/2\text{.}\) What is \(P(A \cup B)\text{?}\)
Answer. \(\frac{11}{12}\)
10.
A die is loaded in such a way that the probability of each face turning up is proportional to the number of dots on that face. (For example, a six is three times as probable as a two.) What is the probability of getting an even number in one throw?