
Post on 20-Dec-2015


Discrete Math CS 2800

Prof. Bart Selman
selman@cs.cornell.edu

Module Probability --- Part d)

1) Probability Distributions
2) Markov and Chebyshev Bounds

Discrete Random variable

Discrete random variable
– Takes on one of a finite (or at least countable) number of different values.

– X = 1 if heads, 0 if tails

– Y = 1 if male, 0 if female (phone survey)

– Z = # of spots on face of thrown die

Continuous Random variable

Continuous random variable (r.v.)
– Takes on one of an infinite range of different values
– W = % GDP grows (shrinks?) this year
– V = hours until light bulb fails
– Ranges of values have a probability

For a discrete r.v., we have Prob(X=x), i.e., the probability that r.v. X takes on a given value x.

What is the probability that a continuous r.v. takes on a specific value? E.g. Prob(X_light_bulb_fails = 3.14159265 hrs) = ?? Answer: 0.

However, ranges of values can have non-zero probability.

E.g. Prob(3 hrs <= X_light_bulb_fails <= 4 hrs) = 0.1


Probability Distribution

The probability distribution is a complete probabilistic description of a random variable.

All other statistical concepts (expectation, variance, etc) are derived from it.

Once we know the probability distribution of a random variable, we know everything we can learn about it from statistics.

Probability Distribution

Probability function
– One form in which the probability distribution of a discrete random variable may be expressed.
– Expresses the probability that X takes the value x as a function of x (as we saw before):

$P_X(x) = P(X = x)$

Probability Distribution

The probability function
– May be tabular:

$$X = \begin{cases} 1 & \text{w.p. } 1/2 \\ 2 & \text{w.p. } 1/3 \\ 3 & \text{w.p. } 1/6 \end{cases}$$

Probability Distribution

The probability function
– May be graphical:

[Figure: bar chart of P(X = x) with bars of height .50, .33, .17 at x = 1, 2, 3]

Probability Distribution

The probability function
– May be formulaic:

$P(X = x) = \frac{4 - x}{6}$ for $x = 1, 2, 3$

Probability Distribution: Fair die

$$X = \begin{cases} 1 & \text{w.p. } 1/6 \\ 2 & \text{w.p. } 1/6 \\ 3 & \text{w.p. } 1/6 \\ 4 & \text{w.p. } 1/6 \\ 5 & \text{w.p. } 1/6 \\ 6 & \text{w.p. } 1/6 \end{cases}$$

[Figure: bar chart of P(X = x) for x = 1, ..., 6; vertical axis marked at .17, .33, .50]

Probability Distribution

The probability function, properties

$P_X(x) \ge 0$ for each x

$\sum_x P_X(x) = 1$
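The two properties of a probability function can be checked mechanically. The following is a minimal sketch, not part of the original slides; the helper name `is_valid_pmf` and the dictionary representation of a probability function are assumptions made for illustration. It uses exact fractions so that the sum-to-one test is not disturbed by rounding.

```python
from fractions import Fraction


def is_valid_pmf(pmf):
    """Return True if all probabilities are >= 0 and they sum to exactly 1.

    `pmf` is a hypothetical representation: a dict mapping value -> probability.
    """
    return all(p >= 0 for p in pmf.values()) and sum(pmf.values()) == 1


# Formulaic example from the slides: P(X = x) = (4 - x)/6 for x = 1, 2, 3.
formula_pmf = {x: Fraction(4 - x, 6) for x in (1, 2, 3)}

# Fair die: each face has probability 1/6.
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

print(is_valid_pmf(formula_pmf), is_valid_pmf(die_pmf))
```

A table whose probabilities sum to less than 1, such as `{1: Fraction(1, 2)}`, fails the check.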

Cumulative Probability Distribution

Cumulative probability distribution
– The cdf is a function which describes the probability that a random variable does not exceed a value.

$F_X(x) = P(X \le x)$

Does this make sense for a continuous r.v.? Yes!

Cumulative Probability Distribution

Cumulative probability distribution
– The relationship between the cdf and the probability function:

$F_X(x) = P(X \le x) = \sum_{y \le x} P(X = y)$

Cumulative Probability Distribution

Die-throwing

$P_X(x) = P(X = x) = 1/6$

$F_X(x) = P(X \le x) = \sum_{y \le x} P(X = y)$

tabular:

$$F_X(x) = \begin{cases} 0 & x < 1 \\ 1/6 & 1 \le x < 2 \\ 2/6 & 2 \le x < 3 \\ 3/6 & 3 \le x < 4 \\ 4/6 & 4 \le x < 5 \\ 5/6 & 5 \le x < 6 \\ 6/6 & 6 \le x \end{cases}$$

graphical:

[Figure: step plot of $F_X(x)$ over x = 1, ..., 6, rising to 1]

Cumulative Probability Distribution

The cumulative distribution function
– May be formulaic (die-throwing):

$P(X \le x) = \frac{\min(\max(\lfloor x \rfloor, 0), 6)}{6}$
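The closed-form cdf for die-throwing can be compared against the defining sum over the probability function. This is a small sketch under the assumption of a fair six-sided die; the function names `die_cdf_formula` and `die_cdf_sum` are made up for this illustration.

```python
import math
from fractions import Fraction


def die_cdf_formula(x):
    """cdf of a fair die via the formula min(max(floor(x), 0), 6) / 6."""
    return Fraction(min(max(math.floor(x), 0), 6), 6)


def die_cdf_sum(x):
    """cdf of a fair die via the definition: sum of P(X = y) over faces y <= x."""
    return sum((Fraction(1, 6) for y in range(1, 7) if y <= x), Fraction(0))


# The two agree below the range, inside it, between faces, and above the range.
for x in (-1, 0, 0.5, 1, 2.7, 3.5, 6, 9):
    print(x, die_cdf_formula(x), die_cdf_sum(x))
```

The value between faces (e.g. x = 2.7) equals the value at the face just below it, which is the "continuous from the right" step behaviour of a discrete cdf.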

Cumulative Probability Distribution

The cdf, properties

$0 \le F_X(x) \le 1$ for each x

$F_X(x)$ is non-decreasing

$F_X(x)$ is continuous from the right

Example CDFs

[Figure: three example cdf plots]
– Of a discrete probability distribution
– Of a continuous probability distribution
– Of a distribution which has both a continuous part and a discrete part.

Functions of a random variable

It is possible to calculate expectations and variances of functions of random variables

$E[g(X)] = \sum_x g(x) \, P(X = x)$

$V[g(X)] = \sum_x \left( g(x) - E[g(X)] \right)^2 P(X = x)$

Functions of a random variable

Example
– You are paid a number of dollars equal to the square root of the number of spots on a die.
– What is a fair bet to get into this game?

x     √x      P(X=x)   Product
1     1       1/6      0.167
2     1.414   1/6      0.236
3     1.732   1/6      0.289
4     2       1/6      0.333
5     2.236   1/6      0.373
6     2.449   1/6      0.408
Tot                    1.805

A fair bet is the expected payoff, $E[\sqrt{X}] \approx 1.805$ dollars.
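The square-root payoff example is a direct application of the formula for $E[g(X)]$. Below is a minimal sketch, assuming a fair die; the helper names `expectation` and `variance` are illustrative, not from the slides.

```python
import math

# Fair die: each face has probability 1/6.
die_pmf = {x: 1 / 6 for x in range(1, 7)}


def expectation(g, pmf):
    """E[g(X)] = sum over x of g(x) * P(X = x)."""
    return sum(g(x) * p for x, p in pmf.items())


def variance(g, pmf):
    """V[g(X)] = sum over x of (g(x) - E[g(X)])^2 * P(X = x)."""
    mean = expectation(g, pmf)
    return sum((g(x) - mean) ** 2 * p for x, p in pmf.items())


# Payoff is sqrt(number of spots); the fair entry price is the expected payoff.
fair_bet = expectation(math.sqrt, die_pmf)
print(round(fair_bet, 3))
```

Passing the identity function as `g` gives the ordinary mean and variance of the die itself (3.5 and 35/12).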

Functions of a random variable

Linear functions
– If a and b are constants and X is a random variable
– It can be shown that:

$E[aX + b] = aE[X] + b$

$V[aX + b] = a^2 V[X]$

Intuitively, why does b not appear in variance? And, why $a^2$?
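The two linear-function rules can be verified exactly on a small case. This sketch assumes a fair die and the arbitrary constants a = 3, b = 10; the helper names are hypothetical. Adding b shifts every value and the mean by the same amount, so deviations from the mean are unchanged, while multiplying by a scales each deviation by a and hence each squared deviation by a².

```python
from fractions import Fraction

die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}


def mean(pmf):
    """E[X] for a pmf given as a dict value -> probability."""
    return sum(x * p for x, p in pmf.items())


def var(pmf):
    """V[X] = E[(X - E[X])^2]."""
    m = mean(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())


def linear_transform(pmf, a, b):
    """pmf of aX + b; assumes a != 0 so distinct values stay distinct."""
    return {a * x + b: p for x, p in pmf.items()}


a, b = 3, 10  # illustrative constants
y_pmf = linear_transform(die_pmf, a, b)
print(mean(y_pmf), a * mean(die_pmf) + b)   # both sides of E[aX + b] = aE[X] + b
print(var(y_pmf), a ** 2 * var(die_pmf))    # both sides of V[aX + b] = a^2 V[X]
```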

The Most Common Discrete Probability Distributions

(some discussed before)

1) --- Bernoulli distribution
2) --- Binomial
3) --- Geometric
4) --- Poisson

Bernoulli distribution

The Bernoulli distribution is the “coin flip” distribution.

X is Bernoulli if its probability function is:

$$X = \begin{cases} 1 & \text{w.p. } p \\ 0 & \text{w.p. } 1 - p \end{cases}$$

X=1 is usually interpreted as a “success.” E.g.:
– X=1 for heads in coin toss
– X=1 for male in survey
– X=1 for defective in a test of product
– X=1 for “made the sale” tracking performance

Bernoulli distribution

Expectation:

$E[X] = 1 \cdot p + 0 \cdot (1 - p) = p$

Variance:

$V[X] = E[X^2] - (E[X])^2 = 1^2 \cdot p + 0^2 \cdot (1 - p) - p^2 = p - p^2 = p(1 - p)$

Binomial distribution

The binomial distribution is just n independent Bernoullis added up.

It is the number of “successes” in n trials.

If Z1, Z2, …, Zn are independent Bernoulli with the same p, then X is binomial:

$X = Z_1 + Z_2 + \cdots + Z_n$

Binomial distribution

The binomial distribution is just n independent Bernoullis added up.

Testing for defects “with replacement.”

– Have many light bulbs

– Pick one at random, test for defect, put it back

– Pick one at random, test for defect, put it back

– If there are many light bulbs, do not have to replace

Binomial distribution

Let’s figure out a binomial r.v.’s probability function. Suppose we are looking at a binomial with n=3.

We want P(X=0):
– Can happen one way: 000
– $(1-p)(1-p)(1-p) = (1-p)^3$

We want P(X=1):
– Can happen three ways: 100, 010, 001
– $p(1-p)(1-p) + (1-p)p(1-p) + (1-p)(1-p)p = 3p(1-p)^2$

We want P(X=2):
– Can happen three ways: 110, 011, 101
– $pp(1-p) + (1-p)pp + p(1-p)p = 3p^2(1-p)$

We want P(X=3):
– Can happen one way: 111
– $ppp = p^3$

Binomial distribution

So, binomial r.v.’s probability function (n=3):

$$X = \begin{cases} 0 & \text{w.p. } (1-p)^3 \\ 1 & \text{w.p. } 3p(1-p)^2 \\ 2 & \text{w.p. } 3p^2(1-p) \\ 3 & \text{w.p. } p^3 \end{cases}$$

In general:

$P_X(x) = (\text{\# of ways}) \cdot p^x (1-p)^{n-x} = \frac{n!}{x!\,(n-x)!} \, p^x (1-p)^{n-x}$
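The general binomial formula can be checked against the n = 3 cases worked out by enumeration. A minimal sketch follows; the function name `binomial_pmf` and the choice p = 0.3 are illustrative assumptions. It relies on `math.comb`, available in Python 3.8 and later, for the "number of ways" factor.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for X binomial(n, p): (# of ways) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


p = 0.3  # illustrative success probability
n = 3

# Closed forms obtained by listing outcomes for n = 3.
by_enumeration = [(1 - p) ** 3, 3 * p * (1 - p) ** 2, 3 * p ** 2 * (1 - p), p ** 3]

for x in range(n + 1):
    print(x, binomial_pmf(x, n, p), by_enumeration[x])
```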

Binomial distribution

Typical shape of binomial:
– Symmetric (exactly so when p = 1/2)

Expectation:

$E[X] = E\left[\sum_{i=1}^{n} Z_i\right] = \sum_{i=1}^{n} E[Z_i] = \sum_{i=1}^{n} p = np$

Variance:

$V[X] = V\left[\sum_{i=1}^{n} Z_i\right] = \sum_{i=1}^{n} V[Z_i] = \sum_{i=1}^{n} p(1-p) = np(1-p)$

Aside: $V(X + Y) = V(X) + V(Y) = 2V(X)$ if V(X) = V(Y). And? (X and Y must be independent.)

But $V(X + X) = V(2X) = 4V(X)$. Hmm…


Binomial distribution

A salesman claims that he closes a deal 40% of the time.

This month, he closed 1 out of 10 deals.

How likely is it that he did 1/10 or worse given his claim?

Binomial distribution

$P_X(0) = \frac{10!}{0!\,10!} \, 0.4^0 \, 0.6^{10} = 1 \cdot 1 \cdot 0.006 = 0.006$

$P_X(1) = \frac{10!}{1!\,9!} \, 0.4^1 \, 0.6^9 = 10 \cdot 0.4 \cdot 0.010 = 0.040$

$P(X \le 1) = 0.046$

Less than 5% or 1 in 20. So, it’s unlikely that his success rate is 0.4.

Note: $P_X(4) = \frac{10!}{4!\,6!} \, 0.4^4 \, 0.6^6 = \frac{10 \cdot 9 \cdot 8 \cdot 7}{4 \cdot 3 \cdot 2} \cdot 0.0256 \cdot 0.0467 = 0.251$
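The salesman calculation can be reproduced in a few lines. This is a sketch under the slide's assumptions (10 independent deals, claimed success rate 0.4); the function and variable names are made up for illustration.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for X binomial(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n_deals = 10
p_claimed = 0.4

# Probability of closing 1 deal or fewer if the claimed rate were true.
p_at_most_one = sum(binomial_pmf(x, n_deals, p_claimed) for x in (0, 1))

# For reference: the probability of exactly 4 closed deals, the expected count.
p_exactly_four = binomial_pmf(4, n_deals, p_claimed)

print(round(p_at_most_one, 3), round(p_exactly_four, 3))
```

The first number stays below 0.05, which is the "less than 1 in 20" threshold used in the argument.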

Binomial and normal / Gaussian distribution

The normal distribution is a good approximation to the binomial distribution B(n, p) (“large” n, small skew).

Prob. density function: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, with $\mu = np$ and $\sigma^2 = np(1-p)$ when approximating B(n, p).


Geometric Distribution

A geometric distribution is usually interpreted as number of time periods until a failure occurs.

Imagine a sequence of coin flips, and the random variable X is the flip number on which the first tails occurs.

The probability of a head (a success) is p.

Geometric

Let’s find the probability function for the geometric distribution:

$P_X(1) = 1 - p$

$P_X(2) = p(1 - p)$

$P_X(3) = p \cdot p \cdot (1 - p) = p^2 (1 - p)$

etc.

So, $P_X(x) = p^{x-1}(1 - p)$ (x is a positive integer)

Geometric

Notice, there is no upper limit on how large X can be

Let’s check that these probabilities add to 1:

$\sum_{x=1}^{\infty} P_X(x) = \sum_{x=1}^{\infty} p^{x-1}(1-p) = (1-p) \sum_{x=1}^{\infty} p^{x-1} = (1-p) \sum_{x=0}^{\infty} p^{x} = (1-p) \cdot \frac{1}{1-p} = 1$

(Geometric series)

Geometric

Expectation:

$E[X] = \sum_{x=1}^{\infty} x \, P_X(x) = \sum_{x=1}^{\infty} x \, p^{x-1}(1-p) = (1-p) \sum_{x=1}^{\infty} x \, p^{x-1} = (1-p) \cdot \frac{1}{(1-p)^2} = \frac{1}{1-p}$

This uses the geometric series $\sum_{x=0}^{\infty} p^x = \frac{1}{1-p}$ $(|p| < 1)$; differentiate both sides w.r.t. p:

$\sum_{x=0}^{\infty} x \, p^{x-1} = \frac{1}{(1-p)^2}$

Variance:

$V[X] = \frac{p}{(1-p)^2}$

See Rosen page 158, example 17.
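The geometric results can be sanity-checked numerically by truncating the infinite sums. This sketch follows the slides' convention that p is the probability of heads and X is the flip on which the first tails occurs; the value p = 0.75 and the cutoff of 2000 terms are arbitrary choices for illustration (the neglected tail is far below floating-point precision for this p).

```python
def geometric_pmf(x, p):
    """P(X = x) = p^(x-1) * (1 - p): x - 1 heads followed by the first tails."""
    return p ** (x - 1) * (1 - p)


p = 0.75                  # illustrative probability of heads
xs = range(1, 2001)       # truncation of x = 1, 2, 3, ...

total = sum(geometric_pmf(x, p) for x in xs)
mean = sum(x * geometric_pmf(x, p) for x in xs)
variance = sum((x - mean) ** 2 * geometric_pmf(x, p) for x in xs)

print(total)                       # close to 1
print(mean, 1 / (1 - p))           # close to 1/(1-p)
print(variance, p / (1 - p) ** 2)  # close to p/(1-p)^2
```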


Poisson distribution

The Poisson distribution is typical of random variables which represent counts.

– Number of requests to a server in 1 hour.

– Number of sick days in a year for an employee.

The Poisson distribution is derived from the following underlying arrival time model:

– The probability of a unit arriving is uniform through time.

– Two items never arrive at exactly the same time.

– Arrivals are independent --- the arrival of one unit does not make the next unit more or less likely to arrive quickly.

Poisson distribution

The probability function for the Poisson distribution with parameter $\lambda$ is:

$P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$ for $x = 0, 1, 2, 3, \dots$

$\lambda$ is like the arrival rate --- higher means more/faster arrivals

$E[X] = V[X] = \lambda$
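The Poisson probability function, and the fact that its mean and variance both equal λ, can be checked numerically. This is a sketch only: the rate λ = 3 and the truncation at x < 100 are assumptions chosen so that the neglected tail is negligible.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) = e^(-lam) * lam^x / x! for x = 0, 1, 2, ..."""
    return exp(-lam) * lam ** x / factorial(x)


lam = 3.0            # illustrative arrival rate
xs = range(0, 100)   # truncation of x = 0, 1, 2, ...

total = sum(poisson_pmf(x, lam) for x in xs)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)

print(total, mean, variance)  # close to 1, lam, lam
```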

Poisson distribution

Shape

[Figure: three plots of the Poisson probability function, labeled Low, Med, High]

Markov and Chebyshev bounds

Often, you don’t know the exact probability distribution of a random variable.

We still would like to say something about the probabilities involving that random variable…

E.g., what is the probability of X being larger (or smaller) than some given value?

We often can, by bounding the probability of events based on partial information about the underlying probability distribution:

Markov and Chebyshev bounds.

Theorem Markov Inequality

Let X be a nonnegative random variable with E[X] = $\mu$.

Then, for any t > 0, $P(X \ge t) \le \frac{\mu}{t}$

Hmm. What if $t \le \mu$? Then $\mu / t \ge 1$, so the bound says no more than $P(X \ge t) \le 1$. Sure!

Note: relates cumulative distribution to expected value.

$t = 2\mu$ gives $P(X \ge 2\mu) \le \frac{1}{2}$

“Can’t have too much prob. to the right of E[X]”

Proof. We show $t \cdot P(X \ge t) \le E[X]$:

$t \cdot P(X \ge t) = t \cdot \sum_{x : x \ge t} P(X = x)$

$\le \sum_{x : x \ge t} x \cdot P(X = x)$

$\le \sum_{x} x \cdot P(X = x) = E[X]$

Where did we use X >= 0? 3rd line

I.e. $P(X \ge t) \le \frac{E[X]}{t}$

[Figure: plot of P(X = x) against x, with t marked on the x axis]

Alt. proof Markov Inequality

$P(X \ge t) \le \frac{\mu}{t}$, where $X \ge 0$, $\mu = E[X]$, $t > 0$

Define

$$Y = \begin{cases} 0 & X < t \\ t & X \ge t \end{cases}$$

A discrete random variable, with

$$p_Y(y) = \begin{cases} P(X < t) & y = 0 \\ P(X \ge t) & y = t \end{cases}$$

$E[Y] = 0 \cdot P(X < t) + t \cdot P(X \ge t) = t \cdot P(X \ge t) \le E[X]$

The last step holds because $Y \le X$ always, so $E[Y] \le E[X]$.

Example:

Consider a system with mean time to failure = 100 hours.

Use the Markov inequality to bound the reliability of the system, R(t), for t = 90, 100, 110, 200.

X – time to failure of the system; E[X] = 100

R(t) = P[X > t], with t = 90, 100, 110, 200

By Markov, $P(X \ge t) \le \frac{\mu}{t}$:

$P(X \ge 90) \le 100/90 \approx 1.11$

$P(X \ge 100) \le 100/100 = 1$

$P(X \ge 110) \le 100/110 \approx 0.91$

$P(X \ge 200) \le 100/200 = 0.5$

Markov inequality is somewhat crude, since only the mean is assumed to be known.
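The reliability bounds can be tabulated with a short script. The sketch below computes μ/t for each t. To illustrate how crude the bound can be, it also prints the exact P(X > t) under the purely hypothetical assumption that the lifetime is exponential with mean 100; the example itself does not state the distribution, so that column is an assumption, not part of the problem.

```python
from math import exp


def markov_bound(mean, t):
    """Markov upper bound mean / t on P(X >= t), for nonnegative X and t > 0."""
    if t <= 0:
        raise ValueError("t must be positive")
    return mean / t


MEAN_TTF = 100.0  # mean time to failure in hours, from the example

for t in (90, 100, 110, 200):
    bound = markov_bound(MEAN_TTF, t)
    # Hypothetical comparison only: exact tail if X were exponential(mean 100).
    exact_if_exponential = exp(-t / MEAN_TTF)
    # Bounds above 1 carry no information, so cap them for display.
    print(t, round(min(bound, 1.0), 3), round(exact_if_exponential, 3))
```

Under that assumed distribution the exact tail at t = 200 is roughly 0.135, well below the Markov bound of 0.5.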

Theorem Chebyshev's Inequality

Assume that mean and variance are given.

Better estimate of probability of events of interest using Chebyshev inequality: $P(|X - \mu| \ge t) \le \frac{\sigma^2}{t^2}$ for any $t > 0$.

Proof: Apply Markov inequality to non-negative r.v. $(X - \mu)^2$ and number $t^2$ to obtain $P((X - \mu)^2 \ge t^2) \le \frac{E[(X - \mu)^2]}{t^2} = \frac{\sigma^2}{t^2}$.

Theorem Chebyshev's Inequality

Markov: $P(X \ge t) \le \frac{\mu}{t}$, with $t > 0$, $X \ge 0$, $\mu = E[X]$

Alternate form, with $\mu_X = E[X]$ and $\sigma_X^2 = V[X]$:

$P(|X - \mu_X| \ge t) \le \frac{\sigma_X^2}{t^2}$

Theorem Chebyshev's Inequality

Markov: $P(X \ge t) \le \frac{\mu}{t}$, with $t > 0$, $X \ge 0$, $\mu = E[X]$

$P(|X - \mu_X| \ge t) \le \frac{\sigma_X^2}{t^2}$

Because

$P(|X - \mu_X| \ge t) = P((X - \mu_X)^2 \ge t^2) \le \frac{E[(X - \mu_X)^2]}{t^2} = \frac{\sigma_X^2}{t^2}$

Chebyshev inequality: Alternate forms

$P(|X - \mu_X| \ge t) \le \frac{\sigma_X^2}{t^2}$

Yet two other forms of Chebyshev’s inequality (set $t = k\sigma_X$):

$P(|X - \mu_X| \ge k\sigma_X) \le \frac{1}{k^2}$

$P(|X - \mu_X| < k\sigma_X) \ge 1 - \frac{1}{k^2}$

Says something about the probability of being “k standard deviations from the mean”.

Theorem Chebyshev's Inequality

$P(|X - \mu_X| \ge t) \le \frac{\sigma_X^2}{t^2}$

$P(|X - \mu_X| \ge k\sigma_X) \le \frac{1}{k^2}$

$P(|X - \mu_X| < k\sigma_X) \ge 1 - \frac{1}{k^2}$

For k = 1, 2, 3, 4 the lower bound $1 - \frac{1}{k^2}$ is 0, 0.75, 0.889, 0.938.

[Figure: axis centered at $\mu_X$ with marks at $\mu_X \pm k\sigma_X$]

Theorem Chebyshev's Inequality

Facts: for comparison, if $X \sim N(\mu, \sigma^2)$, the exact probabilities of falling within 1, 2, 3 standard deviations of the mean are about 0.683, 0.954, 0.997, well above the Chebyshev lower bounds 0, 0.75, 0.889.
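The Chebyshev lower bound $1 - 1/k^2$ holds for any distribution with finite variance, so it is usually far from tight for a specific one. The sketch below compares it with the exact probability for a normal random variable, computed from the standard identity P(|Z| < k) = erf(k/√2); the function names are made up for this illustration.

```python
from math import erf, sqrt


def chebyshev_lower_bound(k):
    """Lower bound on P(|X - mu| < k*sigma) valid for any finite-variance X."""
    return 1 - 1 / k ** 2


def normal_within(k):
    """Exact P(|X - mu| < k*sigma) when X is normally distributed."""
    return erf(k / sqrt(2))


for k in (1, 2, 3, 4):
    print(k, round(chebyshev_lower_bound(k), 4), round(normal_within(k), 4))
```

For every k the exact normal probability sits above the Chebyshev bound, as it must.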


Example

$\mu_X = 60$, $\sigma_X = 6$

$P(|X - \mu_X| \ge k\sigma_X) \le \frac{1}{k^2}$

$84 - \mu_X = 24 = 4\sigma_X$

$P(|X - \mu_X| \ge 4\sigma_X) \le 1/16$

$\frac{1}{16} \cdot 1000 = 62.5 < 70$

Aside “just” Markov: $P(X \ge t) \le \frac{\mu}{t}$

$P(X \ge 84) \le \frac{70}{84} \approx 0.8$