Download - SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

SDS 321: Introduction to Probability andStatistics

Lecture 3: Conditional probability and Bayes rule

Purnamrita SarkarDepartment of Statistics and Data Science

The University of Texas at Austin

www.cs.cmu.edu/∼psarkar/teaching.html

1

Page 2: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

Example: Umbrella salesAnita works for an umbrella company. She gets a bonus if and only if shesells more than 10 umbrellas in a day.

I If it is raining, the probability that she sells more than 10 umbrellas is0.8.

I W := # umbrellas sold > 10, R := rain andB := Anita gets bonus

I W = B.I P(W |R) = 0.8

I If it isn’t raining, the probability that she sells more than 10 umbrellasis 0.25.

I P(W |Rc) = 0.25

I The probability that it rains tomorrow is 0.1.

I P(R) = 0.1.

What is the probability that it doesn’t rain tomorrow, and she gets herbonus?

What is P(Rc ∩ B)?

2

Page 3: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I W = B.I P(W |R) = 0.8

I P(W |Rc) = 0.25

I P(R) = 0.1.

2

Page 4: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I W = B.

I P(W |R) = 0.8

I P(W |Rc) = 0.25

I P(R) = 0.1.

2

Page 5: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I W = B.I P(W |R) = 0.8

I P(W |Rc) = 0.25

I P(R) = 0.1.

2

Page 6: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I W = B.I P(W |R) = 0.8

I P(W |Rc) = 0.25

I P(R) = 0.1.

2

Page 7: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I W = B.I P(W |R) = 0.8

I P(W |Rc) = 0.25

I The probability that it rains tomorrow is 0.1.I P(R) = 0.1.

2

Page 8: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I W = B.I P(W |R) = 0.8

I P(W |Rc) = 0.25

I The probability that it rains tomorrow is 0.1.I P(R) = 0.1.

What is the probability that it doesn’t rain tomorrow, and she gets herbonus? What is P(Rc ∩ B)?

2

Page 9: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

Example: Umbrella sales

I P(R) = 0.1

I P(W |R) = 0.8

I P(W |Rc ) = 0.25

We can rearrange our formula for conditional probability. Since

P(W |Rc ) =P(W ∩ Rc )

P(Rc ), we also have:

P(W ∩ Rc ) =P(W |Rc )P(Rc ) = P(W |Rc )(1− P(R)) = 0.25× 0.9 = 0.225

I But I wanted P(B ∩ Rc )! Isn’t this the same as P(W ∩ Rc )?

3

Page 10: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I P(R) = 0.1

I P(W |R) = 0.8

I P(W |Rc ) = 0.25

P(W |Rc ) =P(W ∩ Rc )

P(W ∩ Rc ) =P(W |Rc )P(Rc )

= P(W |Rc )(1− P(R)) = 0.25× 0.9 = 0.225

3

Page 11: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I P(R) = 0.1

I P(W |R) = 0.8

I P(W |Rc ) = 0.25

P(W |Rc ) =P(W ∩ Rc )

3

Page 12: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I P(R) = 0.1

I P(W |R) = 0.8

I P(W |Rc ) = 0.25

P(W |Rc ) =P(W ∩ Rc )

3

Page 13: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

Representing conditional probabilities using a tree

We can represent conditional probabilities using a tree structure.

Rc

Rc ∩W c

P(W c|R c) = 0.75

Rc ∩W

P(W |Rc ) = 0.25

P(R c) =

0.9

R

R ∩W c

P(W c|R) = 0.2

R ∩W

P(W |R) = 0.8

P(R) =

0.1

4

Page 14: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

The probability at a leaf node is the product of the probabilities alongeach path.

Rc

Rc ∩W c

P(W c|R c) = 0.75

P(Rc ∩ W )= 0.9 × 0.25= 0.225

P(W |Rc ) = 0.25

P(R c) =

0.9

R

R ∩W c

P(W c|R) = 0.2

R ∩W

P(W |R) = 0.8

P(R) =

0.1

5

Page 15: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

The probability at a leaf node is the product of the probabilities alongeach path.

Rc

Rc ∩W c

P(W c|R c) = 0.75

Rc ∩W

P(W |Rc ) = 0.25

P(R c) =

0.9

R

R ∩W c

P(W c|R) = 0.2

P(R ∩ W )= 0.1 × 0.8= 0.08

P(W |R) = 0.8

P(R) =

0.1

6

Page 16: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

Multiplication Rule

This is known as the multiplication rule.

I We know that P(A ∩ B) = P(A|B)P(B). What is P(A ∩ B ∩ C)?

I Treat (B ∩ C) as an event. Call this R.

I Now P(A ∩ B ∩ C) = P(A ∩ R) = P(A|R)P(R) = P(A|B ∩ C)P(B ∩ C).

I But P(R) = P(B ∩ C) = P(B|C)P(C).

I Using induction you can prove that:

P(∩ni=1Ai ) = P(A1|A2∩· · ·∩An)P(A2|A3∩· · ·∩An) · · ·P(An−1|An)P(An)

7

Page 17: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

Multiplication Rule

7

Page 18: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

Multiplication Rule

7

Page 19: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

Multiplication Rule

7

Page 20: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

Multiplication Rule

7

Page 21: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

Multiplication Rule

7

Page 22: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

Revisit Example: Umbrella sales

Anita works for an umbrella company. She gets a bonus (event B) iff shesells more than 10 umbrellas in a day (event W ).

I Now W = B. We have P(R) = 0.1, P(W |R) = 0.8 andP(W |Rc ) = 0.25.

I If you knew that Anita got a bonus, then what is the probability thatit rained?

I We are interested in P(R|B), which is the same as P(R|W ). First weneed P(R ∩W ) and then we need P(W ).

I P(R ∩W ) = P(W |R)P(R) = 0.8× 0.1 = 0.08.

I Now what about P(W )?

8

Page 23: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I P(R ∩W ) = P(W |R)P(R) = 0.8× 0.1 = 0.08.

8

Page 24: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I P(R ∩W ) = P(W |R)P(R) = 0.8× 0.1 = 0.08.

8

Page 25: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I P(R ∩W ) = P(W |R)P(R) = 0.8× 0.1 = 0.08.

8

Page 26: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

Anita works for an umbrella company. She gets a bonus iff she sells morethan 10 umbrellas in a day.

I First write W as an union of two disjoint events. Guesses?I W = W ∩ Ω = W ∩ (R ∪ Rc) = (W ∩ R) ∪ (W ∩ Rc).I Now additivity gives

P(W ) = P(W ∩ R) + P(W ∩ Rc)

Theorem of total probability

= P(W |R)P(R) + P(W |Rc)P(Rc) = 0.08 + 0.25× 0.9 ≈ 0.3

P(R|B) = P(R|W ) =P(R ∩W )

P(W )

=P(W |R)P(R)

P(W |R)P(R) + P(W |Rc)P(Rc)= 0.08/0.3 ≈ 0.27

The last step is also known as Bayes Rule, which we will study next.

9

Page 27: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I Now what about P(W )?I First write W as an union of two disjoint events. Guesses?

I W = W ∩ Ω = W ∩ (R ∪ Rc) = (W ∩ R) ∪ (W ∩ Rc).I Now additivity gives

P(W ) = P(W ∩ R) + P(W ∩ Rc)

= P(W |R)P(R) + P(W |Rc)P(Rc) = 0.08 + 0.25× 0.9 ≈ 0.3

P(R|B) = P(R|W ) =P(R ∩W )

P(W )

=P(W |R)P(R)

P(W |R)P(R) + P(W |Rc)P(Rc)= 0.08/0.3 ≈ 0.27

9

Page 28: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I Now what about P(W )?I First write W as an union of two disjoint events. Guesses?I W = W ∩ Ω = W ∩ (R ∪ Rc) = (W ∩ R) ∪ (W ∩ Rc).

I Now additivity gives

P(W ) = P(W ∩ R) + P(W ∩ Rc)

= P(W |R)P(R) + P(W |Rc)P(Rc) = 0.08 + 0.25× 0.9 ≈ 0.3

P(R|B) = P(R|W ) =P(R ∩W )

P(W )

=P(W |R)P(R)

P(W |R)P(R) + P(W |Rc)P(Rc)= 0.08/0.3 ≈ 0.27

9

Page 29: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I Now what about P(W )?I First write W as an union of two disjoint events. Guesses?I W = W ∩ Ω = W ∩ (R ∪ Rc) = (W ∩ R) ∪ (W ∩ Rc).I Now additivity gives

P(W ) = P(W ∩ R) + P(W ∩ Rc)

= P(W |R)P(R) + P(W |Rc)P(Rc) = 0.08 + 0.25× 0.9 ≈ 0.3

P(R|B) = P(R|W ) =P(R ∩W )

P(W )

=P(W |R)P(R)

P(W |R)P(R) + P(W |Rc)P(Rc)= 0.08/0.3 ≈ 0.27

9

Page 30: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

P(W ) = P(W ∩ R) + P(W ∩ Rc)

= P(W |R)P(R) + P(W |Rc)P(Rc) = 0.08 + 0.25× 0.9 ≈ 0.3

P(R|B) = P(R|W ) =P(R ∩W )

P(W )

=P(W |R)P(R)

P(W |R)P(R) + P(W |Rc)P(Rc)= 0.08/0.3 ≈ 0.27

9

Page 31: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

P(W ) = P(W ∩ R) + P(W ∩ Rc) Theorem of total probability

= P(W |R)P(R) + P(W |Rc)P(Rc) = 0.08 + 0.25× 0.9 ≈ 0.3

P(R|B) = P(R|W ) =P(R ∩W )

P(W )

=P(W |R)P(R)

P(W |R)P(R) + P(W |Rc)P(Rc)= 0.08/0.3 ≈ 0.27

9

Page 32: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

P(W ) = P(W ∩ R) + P(W ∩ Rc) Theorem of total probability

= P(W |R)P(R) + P(W |Rc)P(Rc) = 0.08 + 0.25× 0.9 ≈ 0.3

P(R|B) = P(R|W ) =P(R ∩W )

P(W )

=P(W |R)P(R)

P(W |R)P(R) + P(W |Rc)P(Rc)= 0.08/0.3 ≈ 0.27

9

Page 33: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

Multiplication rule: examples

Three cards are drawn from an ordinary 52 card deck withoutreplacement. What is the probability that there is no heart among thethree?

I Without replacement: drawn cards are not placed back into thedeck.

I Notation: Ai = ithcard is not a heartI Remember: There are thirteen cards with hearts.

I We want: P(A1 ∩ A2 ∩ A3).

I Use multiplication rule:P(A1 ∩ A2 ∩ A3) = P(A1)P(A2|A1)P(A3|A1 ∩ A2).

I P(A1) =

39

52

. P(A2|A1) =

38

51

. P(A3|A1 ∩ A2) =

37

50

.

10

Page 34: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I P(A1) =

39

52

. P(A2|A1) =

38

51

. P(A3|A1 ∩ A2) =

37

50

.

10

Page 35: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I P(A1) =

39

52

. P(A2|A1) =

38

51

. P(A3|A1 ∩ A2) =

37

50

.

10

Page 36: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I P(A1) =39

52. P(A2|A1) =

38

51

. P(A3|A1 ∩ A2) =

37

50

.

10

Page 37: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I P(A1) =39

52. P(A2|A1) =

38

51. P(A3|A1 ∩ A2) =

37

50

.

10

Page 38: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I P(A1) =39

52. P(A2|A1) =

38

51. P(A3|A1 ∩ A2) =

37

50.

10

Page 39: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I have two black balls, and one red ball. I pick two balls randomlywithout replacement.

I What is the probability that the first ball is black?

2/3

I What is the probability that the second ball is black?

I Notation: Xi is color of i th ball. We want P(X2 = B).

I

P(X2 = B) = P(X2 = B ∩ X1 = B) + P(X2 = B ∩ X1 = R)

= P(X2 = B|X1 = B)P(X1 = B) + P(X2 = B|X1 = R)P(X1 = R)

=1

2× 2

3+ 1× 1

3=

2

3

I Is this obvious? If you know nothing about the first ball, then thesecond ball can be any one of the 3 balls you have.

11

Page 40: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I What is the probability that the first ball is black?2/3

I What is the probability that the second ball is black?

I Notation: Xi is color of i th ball. We want P(X2 = B).

I

P(X2 = B) = P(X2 = B ∩ X1 = B) + P(X2 = B ∩ X1 = R)

=1

2× 2

3+ 1× 1

3=

2

3

11

Page 41: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I What is the probability that the second ball is black?I Notation: Xi is color of i th ball. We want P(X2 = B).

I

P(X2 = B) = P(X2 = B ∩ X1 = B) + P(X2 = B ∩ X1 = R)

=1

2× 2

3+ 1× 1

3=

2

3

11

Page 42: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

I What is the probability that the second ball is black?I Notation: Xi is color of i th ball. We want P(X2 = B).

I

P(X2 = B) = P(X2 = B ∩ X1 = B) + P(X2 = B ∩ X1 = R)

=1

2× 2

3+ 1× 1

3=

2

3

11

Page 43: SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

Reading

I Read Sections 1.1, 1.2 and 1.3 of Bertsekas and Tsitsiklis.

12

Download - SDS 321: Introduction to Probability and Statistics ...psarkar/sds321/lecture3-ps.pdf · SDS 321: Introduction to Probability and Statistics Lecture 3: Conditional probability and

Top Related