CSE 321 Discrete Structures Winter 2008 Lecture 19 Probability Theory

Upload: alison-singleton

Post on 03-Jan-2016

Page 1: CSE 321 Discrete Structures Winter 2008 Lecture 19 Probability Theory TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

CSE 321 Discrete Structures

Winter 2008

Lecture 19

Probability Theory

Announcements

• Readings
  – Probability Theory
    • 6.1, 6.2 (5.1, 5.2) Probability Theory
    • 6.3 (New material!) Bayes’ Theorem
    • 6.4 (5.3) Expectation
  – Advanced Counting Techniques – Ch 7
    • Not covered

Discrete Probability

Experiment: Procedure that yields an outcome

Sample space: Set of all possible outcomes

Event: subset of the sample space

If S is a sample space of equally likely outcomes and E is an event, then the probability of E is p(E) = |E|/|S|
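As a hedged illustration of p(E) = |E|/|S|, the sketch below enumerates the sample space for two fair six-sided dice and counts the outcomes in an event. The helper name `probability` and the chosen event (the sum is 7) are assumptions made for illustration; the slide's own dice example is not preserved in this transcript.

```python
from fractions import Fraction
from itertools import product

def probability(event, sample_space):
    """p(E) = |E| / |S| for a sample space of equally likely outcomes."""
    favorable = [s for s in sample_space if event(s)]
    return Fraction(len(favorable), len(sample_space))

# Sample space: all ordered pairs of faces for two fair dice (assumed example).
two_dice = list(product(range(1, 7), repeat=2))

p_sum_seven = probability(lambda roll: sum(roll) == 7, two_dice)
print(p_sum_seven)  # 1/6
```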

Example: Dice

Example: Poker

Probability of 4 of a kind
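The slide's computation is not preserved in the transcript. One standard way to count, sketched here under the assumption of a five-card hand dealt from a standard 52-card deck, is to pick the rank of the four matching cards and then any fifth card:

```python
from fractions import Fraction
from math import comb

# |S|: number of 5-card hands from a standard 52-card deck (assumed setting).
total_hands = comb(52, 5)

# |E|: choose the rank of the four matching cards (13 ways),
# then any one of the remaining 48 cards as the fifth card.
four_of_a_kind_hands = 13 * 48

p_four_of_a_kind = Fraction(four_of_a_kind_hands, total_hands)
print(p_four_of_a_kind)  # 1/4165
```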

Combinations of Events

$E^C$ is the complement of $E$

$P(E^C) = 1 - P(E)$

$P(E_1 \cup E_2) = P(E_1) + P(E_2) - P(E_1 \cap E_2)$

Probability Concepts

• Probability Distribution

• Conditional Probability

• Independence

• Bernoulli Trials / Binomial Distribution

• Random Variable

Discrete Probability Theory

• Set S

• Probability distribution $p : S \to [0,1]$
  – For $s \in S$, $0 \le p(s) \le 1$
  – $\sum_{s \in S} p(s) = 1$

• Event $E$, $E \subseteq S$

• $p(E) = \sum_{s \in E} p(s)$
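A minimal sketch of these definitions, assuming a hypothetical loaded die in which face 6 is twice as likely as each other face. The names `is_distribution` and `prob` are illustrative, not from the lecture.

```python
from fractions import Fraction

# Hypothetical loaded die: face 6 is twice as likely as each other face.
p = {face: Fraction(1, 7) for face in range(1, 6)}
p[6] = Fraction(2, 7)

def is_distribution(p):
    """Check 0 <= p(s) <= 1 for every s and that the p(s) sum to 1."""
    return all(0 <= v <= 1 for v in p.values()) and sum(p.values()) == 1

def prob(event, p):
    """p(E) = sum of p(s) over the outcomes s in E."""
    return sum(p[s] for s in event)

even = {2, 4, 6}
print(is_distribution(p), prob(even, p))  # True 4/7
```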

Examples

Conditional Probability

Let E and F be events with p(F) > 0. The conditional probability of E given F, denoted by p(E | F), is defined as:

$p(E \mid F) = \frac{p(E \cap F)}{p(F)}$
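To make the definition concrete, here is a small sketch for equally likely outcomes, where p(E | F) reduces to |E ∩ F| / |F|. The three-coin-flip example and the function name `conditional` are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

def conditional(e, f, sample_space):
    """p(E | F) = p(E and F) / p(F), for equally likely outcomes with p(F) > 0."""
    f_outcomes = [s for s in sample_space if f(s)]
    if not f_outcomes:
        raise ValueError("p(F) must be positive")
    both = [s for s in f_outcomes if e(s)]
    return Fraction(len(both), len(f_outcomes))

# Assumed example: three fair coin flips.
flips = list(product("HT", repeat=3))

def at_least_two_heads(s):
    return s.count("H") >= 2

def first_is_head(s):
    return s[0] == "H"

p_given = conditional(at_least_two_heads, first_is_head, flips)
print(p_given)  # 3/4
```

Without the condition, the same event has probability 1/2, so knowing that the first flip is a head raises it to 3/4.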

Examples

Independence

The events E and F are independent if and only if $p(E \cap F) = p(E)\,p(F)$

Equivalently, when p(F) > 0, E and F are independent if and only if p(E | F) = p(E)

Are these independent?

• Flip a coin three times
  – E: the first coin is a head
  – F: the second coin is a head

• Roll two dice
  – E: the sum of the two dice is 5
  – F: the first die is a 1

• Roll two dice
  – E: the sum of the two dice is 7
  – F: the first die is a 1

• Deal two five card poker hands
  – E: hand one has four of a kind
  – F: hand two has four of a kind
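The coin and dice questions above can be checked by brute force against the product test for independence. The sketch below assumes fair coins and fair dice; the helper names are illustrative. The poker question is left out because enumerating pairs of hands is too large for a short example.

```python
from fractions import Fraction
from itertools import product

def prob(event, space):
    """p(E) = |E| / |S| for equally likely outcomes."""
    return Fraction(sum(1 for s in space if event(s)), len(space))

def independent(e, f, space):
    """E and F are independent iff p(E and F) = p(E) * p(F)."""
    p_both = prob(lambda s: e(s) and f(s), space)
    return p_both == prob(e, space) * prob(f, space)

coins = list(product("HT", repeat=3))
dice = list(product(range(1, 7), repeat=2))

coin_case = independent(lambda s: s[0] == "H", lambda s: s[1] == "H", coins)
sum5_case = independent(lambda s: sum(s) == 5, lambda s: s[0] == 1, dice)
sum7_case = independent(lambda s: sum(s) == 7, lambda s: s[0] == 1, dice)
print(coin_case, sum5_case, sum7_case)  # True False True
```

The sum-is-5 case fails because a first die of 6 rules the event out, whereas every value of the first die leaves exactly one way to reach a sum of 7.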

Bernoulli Trials and Binomial Distribution

• Bernoulli Trial
  – Success probability p, failure probability q = 1 - p

The probability of exactly k successes in n independent Bernoulli trials is

$\binom{n}{k} p^k q^{n-k}$
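A short sketch of the binomial formula C(n, k) p^k q^(n-k) with q = 1 - p. The function name `binomial_pmf` and the fair-coin example are assumptions for illustration.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli trials."""
    q = 1 - p
    return comb(n, k) * p**k * q**(n - k)

p_success = Fraction(1, 2)  # assumed: a fair coin, success = heads
print(binomial_pmf(2, 4, p_success))  # 3/8

# The probabilities over k = 0..n form a distribution, so they sum to 1.
total = sum(binomial_pmf(k, 4, p_success) for k in range(5))
print(total)  # 1
```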

Random Variables

A random variable is a function from a sample space to the real numbers

Bayes’ Theorem

Suppose that E and F are events from a sample space S such that p(E) > 0 and p(F) > 0. Then

$p(F \mid E) = \frac{p(E \mid F)\,p(F)}{p(E \mid F)\,p(F) + p(E \mid F^C)\,p(F^C)}$

False Positives, False Negatives

Let D be the event that a person has the disease

Let Y be the event that a person tests positive for the disease

Testing for disease

Disease is very rare: p(D) = 1/100,000

Testing is accurate:
False negative: 1%
False positive: 0.5%

Suppose you get a positive result, what do you conclude?

P(D|Y)
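The slide's working is not preserved, so the following is a sketch of the calculation with Bayes' theorem. It assumes that "false negative: 1%" means p(not Y | D) = 0.01 and "false positive: 0.5%" means p(Y | not D) = 0.005; the variable names are illustrative.

```python
from fractions import Fraction

p_d = Fraction(1, 100_000)           # p(D): the disease is very rare
p_y_given_d = 1 - Fraction(1, 100)   # p(Y | D) = 1 - false negative rate (assumed reading)
p_y_given_not_d = Fraction(5, 1000)  # p(Y | not D) = false positive rate (assumed reading)

# Bayes' theorem: p(D | Y) = p(Y | D) p(D) / (p(Y | D) p(D) + p(Y | not D) p(not D))
numerator = p_y_given_d * p_d
p_d_given_y = numerator / (numerator + p_y_given_not_d * (1 - p_d))

print(p_d_given_y)         # 22/11133
print(float(p_d_given_y))  # roughly 0.002
```

Under these assumptions a positive result still leaves only about a 0.2% chance of having the disease, because false positives among the many healthy people far outnumber true positives.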

Spam Filtering

From: Zambia Nation Farmers Union [[email protected]]
Subject: Letter of assistance for school installation
To: Richard Anderson

Dear Richard,
I hope you are fine, Iam through talking to local headmen about the possible assistance of school installation. the idea is and will be welcome.
I trust that you will do your best as i await for more from you.
Once again
Thanking you very much
Sebastian Mazuba.

Bayesian Spam filters

• Classification domain
  – Cost of false negative
  – Cost of false positive

• Criteria for spam
  – v1agra, ONE HUNDRED MILLION USD

• Basic question: given an email message, based on spam criteria, what is the probability that it is spam?

Email message with phrase “Account Review”

• The phrase appears in 250 of 20000 messages known to be spam
• The phrase appears in 5 of 10000 messages known not to be spam
• Assuming 50% of messages are spam, what is the probability that a message with “Account Review” is spam?
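A sketch of the calculation using Bayes' theorem, assuming the two counts are used directly as estimates of p(phrase | spam) and p(phrase | not spam). The variable names are illustrative.

```python
from fractions import Fraction

p_phrase_given_spam = Fraction(250, 20_000)  # estimated from known spam
p_phrase_given_ham = Fraction(5, 10_000)     # estimated from known non-spam
p_spam = Fraction(1, 2)                      # stated prior: 50% of messages are spam

p_spam_given_phrase = (p_phrase_given_spam * p_spam) / (
    p_phrase_given_spam * p_spam + p_phrase_given_ham * (1 - p_spam)
)

print(p_spam_given_phrase)         # 25/26
print(float(p_spam_given_phrase))  # about 0.96
```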

Proving Bayes’ Theorem
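The proof itself is not preserved in the transcript. One standard derivation, sketched here with the assumption that $p(E) > 0$, $p(F) > 0$ and $p(F^C) > 0$, applies the definition of conditional probability twice and then splits $E$ over $F$ and its complement:

```latex
\begin{align*}
p(F \mid E) &= \frac{p(E \cap F)}{p(E)} = \frac{p(E \mid F)\,p(F)}{p(E)}, \\
p(E) &= p(E \cap F) + p(E \cap F^C)
      = p(E \mid F)\,p(F) + p(E \mid F^C)\,p(F^C), \\
p(F \mid E) &= \frac{p(E \mid F)\,p(F)}{p(E \mid F)\,p(F) + p(E \mid F^C)\,p(F^C)}.
\end{align*}
```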

Expectation

The expected value of random variable X(s) on sample space S is:

$E(X) = \sum_{s \in S} p(s)\,X(s)$

Flip a coin until the first head

Expected number of flips?

Probability Space:

Computing the expectation:
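The slide's probability space and computation are not preserved. Assuming a fair coin, the outcome "first head on flip k" has probability (1/2)^k, so the expectation is the sum of k (1/2)^k over k ≥ 1. The sketch below evaluates partial sums exactly to show them approaching 2; `expected_flips` is an illustrative name.

```python
from fractions import Fraction

def expected_flips(max_flips):
    """Partial sum of k * (1/2)**k for k = 1..max_flips (fair coin assumed)."""
    return sum(k * Fraction(1, 2**k) for k in range(1, max_flips + 1))

for m in (5, 10, 20):
    print(m, float(expected_flips(m)))
```

Each partial sum equals 2 - (m + 2)/2^m, which tends to 2 as m grows, so the expected number of flips is 2.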

Linearity of Expectation

$E(X_1 + X_2) = E(X_1) + E(X_2)$

E(aX) = aE(X)

Hashing

$H : M \to [0..n-1]$

If k elements have been hashed to random locations, what is the expected number of elements in bucket j?

What is the expected number of collisions when hashing k elements to random locations?

Hashing analysis

Sample space: $[0..n-1] \times [0..n-1] \times \cdots \times [0..n-1]$

Random Variables
$X_j$ = number of elements hashed to bucket j
$C$ = total number of collisions

$B_{ij}$ = 1 if element i is hashed to bucket j
$B_{ij}$ = 0 if element i is not hashed to bucket j

$C_{ab}$ = 1 if element a is hashed to the same bucket as element b
$C_{ab}$ = 0 if element a is hashed to a different bucket than element b
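Writing the bucket count as a sum of the indicators for bucket j, and the collision count as a sum of the pair indicators over pairs a < b, linearity of expectation suggests E(X_j) = k/n and E(C) = C(k, 2)/n. The sketch below checks both by exact enumeration for small n and k. It assumes each element is hashed independently and uniformly, and that a collision means an unordered pair of elements in the same bucket; `exact_expectations` is an illustrative name.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb

def exact_expectations(n, k, j=0):
    """Average X_j and C over all n**k equally likely hash assignments."""
    outcomes = list(product(range(n), repeat=k))
    total_in_j = sum(assignment.count(j) for assignment in outcomes)
    total_collisions = sum(
        1
        for assignment in outcomes
        for a, b in combinations(range(k), 2)
        if assignment[a] == assignment[b]
    )
    return (Fraction(total_in_j, len(outcomes)),
            Fraction(total_collisions, len(outcomes)))

n, k = 3, 4
e_xj, e_c = exact_expectations(n, k)
print(e_xj, Fraction(k, n))          # both 4/3
print(e_c, Fraction(comb(k, 2), n))  # both 2
```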

Counting inversions

Let $p_1, p_2, \ldots, p_n$ be a permutation of $1, \ldots, n$

$(p_i, p_j)$ is an inversion if $i < j$ and $p_i > p_j$

4, 2, 5, 1, 3

1, 6, 4, 3, 2, 5

7, 6, 5, 4, 3, 2, 1
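A direct count of the inversions in the three example permutations, written as a quadratic-time sketch that follows the definition literally. The name `count_inversions` is illustrative.

```python
from itertools import combinations

def count_inversions(perm):
    """Number of index pairs (i, j) with i < j and perm[i] > perm[j]."""
    return sum(1 for i, j in combinations(range(len(perm)), 2) if perm[i] > perm[j])

examples = ([4, 2, 5, 1, 3], [1, 6, 4, 3, 2, 5], [7, 6, 5, 4, 3, 2, 1])
for perm in examples:
    print(perm, count_inversions(perm))  # 6, 7 and 21 respectively
```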

Expected number of inversions for a random permutation
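The slide's derivation is not preserved. A standard argument is that each of the C(n, 2) pairs of positions is inverted with probability 1/2 in a uniformly random permutation, so linearity of expectation gives n(n - 1)/4. The sketch below checks that value by enumerating all permutations for small n; `expected_inversions` is an illustrative name.

```python
from fractions import Fraction
from itertools import combinations, permutations

def expected_inversions(n):
    """Average inversion count over all n! permutations of 1..n."""
    perms = list(permutations(range(1, n + 1)))
    total = sum(
        1
        for perm in perms
        for i, j in combinations(range(n), 2)
        if perm[i] > perm[j]
    )
    return Fraction(total, len(perms))

for n in range(1, 7):
    print(n, expected_inversions(n), Fraction(n * (n - 1), 4))
```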

Insertion sort

for i := 1 to n-1 {
    j := i;
    while (j > 0 and A[j - 1] > A[j]) {
        swap(A[j - 1], A[j]);
        j := j - 1;
    }
}

4 2 5 1 3

Expected number of swaps for Insertion Sort
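Each swap in insertion sort exchanges two adjacent out-of-order elements and so removes exactly one inversion; the number of swaps therefore equals the number of inversions in the input, and for a uniformly random permutation the expected number is n(n - 1)/4. The sketch below is a Python rendering of the pseudocode with a swap counter added; `insertion_sort_swaps` is an illustrative name.

```python
def insertion_sort_swaps(values):
    """Sort a copy of values with insertion sort; return (sorted_list, swap_count)."""
    a = list(values)
    swaps = 0
    for i in range(1, len(a)):
        j = i
        # Move a[j] left until it is no smaller than its left neighbour.
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]
            swaps += 1
            j -= 1
    return a, swaps

print(insertion_sort_swaps([4, 2, 5, 1, 3]))  # ([1, 2, 3, 4, 5], 6)
```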

Left to right maxima

max_so_far := A[0];
for i := 1 to n-1
    if (A[i] > max_so_far)
        max_so_far := A[i];

5, 2, 9, 14, 11, 18, 7, 16, 1, 20, 3, 19, 10, 15, 4, 6, 17, 12, 8

What is the expected number of left-to-right maxima in a random permutation?
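The slide's answer is not preserved. A standard argument is that position i holds a left-to-right maximum with probability 1/i, so linearity of expectation gives the harmonic number H_n = 1 + 1/2 + ... + 1/n. The sketch below counts the first element as a maximum (an assumption about the convention) and checks the claim by enumeration for a small n; all function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def left_to_right_maxima(values):
    """Count elements larger than everything before them (the first element counts)."""
    count = 0
    max_so_far = None
    for v in values:
        if max_so_far is None or v > max_so_far:
            max_so_far = v
            count += 1
    return count

def expected_maxima(n):
    """Average number of left-to-right maxima over all permutations of 1..n."""
    perms = list(permutations(range(1, n + 1)))
    return Fraction(sum(left_to_right_maxima(p) for p in perms), len(perms))

def harmonic(n):
    return sum(Fraction(1, i) for i in range(1, n + 1))

example = [5, 2, 9, 14, 11, 18, 7, 16, 1, 20, 3, 19, 10, 15, 4, 6, 17, 12, 8]
print(left_to_right_maxima(example))    # 5 (the maxima are 5, 9, 14, 18, 20)
print(expected_maxima(5), harmonic(5))  # both 137/60
```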