poisson statistics

84

Upload: narendra-prasad-uniyal

Post on 26-Jan-2015

153 views

Category:

Documents


3 download

DESCRIPTION

POISSON PROBABILITYDISTRIBUTION COVARIANCE BINOMIAL DISTRIBUTIION

TRANSCRIPT

Page 1: Poisson statistics
Page 2: Poisson statistics

Measures of Distribution Shape,Relative Location, and Detecting Outliers

Distribution Shape z-Scoresz-Scores Empirical RuleEmpirical Rule

Detecting OutliersDetecting Outliers

22

Page 3: Poisson statistics

An important measure of the shape of a An important measure of the shape of a distribution is called distribution is called skewnessskewness..

The formula for the skewness of sample data The formula for the skewness of sample data isis

Skewness can be easily computed using Skewness can be easily computed using statistical software.statistical software.

3

)2)(1(Skewness

s

xx

nn

n i

3

)2)(1(Skewness

s

xx

nn

n i

Distribution Shape: SkewnessDistribution Shape: Skewness

33

Page 4: Poisson statistics

Distribution Shape: SkewnessDistribution Shape: Skewness

Rela

tive F

req

uen

cyR

ela

tive F

req

uen

cy

.05.05

.10.10

.15.15

.20.20

.25.25

.30.30

.35.35

00

Skewness = Skewness = 0 0

• Skewness is zero.Skewness is zero.

• Mean and median are equal.Mean and median are equal.

Symmetric (not skewed)Symmetric (not skewed)

44

Page 5: Poisson statistics

Rela

tive F

req

uen

cyR

ela

tive F

req

uen

cy

.05.05

.10.10

.15.15

.20.20

.25.25

.30.30

.35.35

00

Skewness = Skewness = .31 .31

• Skewness is negative.Skewness is negative.

• Mean will usually be less than the median.Mean will usually be less than the median.

Distribution Shape: SkewnessDistribution Shape: Skewness

Moderately Skewed LeftModerately Skewed Left

55

Page 6: Poisson statistics

Rela

tive F

req

uen

cyR

ela

tive F

req

uen

cy

.05.05

.10.10

.15.15

.20.20

.25.25

.30.30

.35.35

00

Skewness = .31 Skewness = .31

• Skewness is positive.Skewness is positive.

• Mean will usually be more than the median.Mean will usually be more than the median.

Distribution Shape: SkewnessDistribution Shape: Skewness

Moderately Skewed RightModerately Skewed Right

66

Page 7: Poisson statistics

Distribution Shape: SkewnessDistribution Shape: Skewness

Highly Skewed RightHighly Skewed RightR

ela

tive F

req

uen

cyR

ela

tive F

req

uen

cy

.05.05

.10.10

.15.15

.20.20

.25.25

.30.30

.35.35

00

Skewness = 1.25 Skewness = 1.25

• Skewness is positive (often above 1.0).Skewness is positive (often above 1.0).

• Mean will usually be more than the median.Mean will usually be more than the median.

77

Page 8: Poisson statistics

Seventy efficiency apartments were randomlySeventy efficiency apartments were randomly

sampled in a college town. The monthly rent sampled in a college town. The monthly rent pricesprices

for the apartments are listed below in for the apartments are listed below in ascending order. ascending order.

Distribution Shape: SkewnessDistribution Shape: Skewness

Example: Apartment RentsExample: Apartment Rents

425 430 430 435 435 435 435 435 440 440440 440 440 445 445 445 445 445 450 450450 450 450 450 450 460 460 460 465 465465 470 470 472 475 475 475 480 480 480480 485 490 490 490 500 500 500 500 510510 515 525 525 525 535 549 550 570 570575 575 580 590 600 600 600 600 615 615

88

Page 9: Poisson statistics

Rela

tive F

req

uen

cyR

ela

tive F

req

uen

cy

.05.05

.10.10

.15.15

.20.20

.25.25

.30.30

.35.35

00

Skewness = .92 Skewness = .92

Distribution Shape: SkewnessDistribution Shape: Skewness

Example: Apartment RentsExample: Apartment Rents

99

Page 10: Poisson statistics

The The z-scorez-score is often called the standardized value. is often called the standardized value. The The z-scorez-score is often called the standardized value. is often called the standardized value.

It denotes the number of standard deviations a dataIt denotes the number of standard deviations a data value value xxii is from the mean. is from the mean. It denotes the number of standard deviations a dataIt denotes the number of standard deviations a data value value xxii is from the mean. is from the mean.

z-Scoresz-Scores

zx xsii

zx xsii

Excel’s STANDARDIZE function can be used toExcel’s STANDARDIZE function can be used to compute the z-score.compute the z-score. Excel’s STANDARDIZE function can be used toExcel’s STANDARDIZE function can be used to compute the z-score.compute the z-score.

1010

Page 11: Poisson statistics

z-Scoresz-Scores

A data value less than the sample mean will have aA data value less than the sample mean will have a z-score less than zero.z-score less than zero. A data value greater than the sample mean will haveA data value greater than the sample mean will have a z-score greater than zero.a z-score greater than zero. A data value equal to the sample mean will have aA data value equal to the sample mean will have a z-score of zero.z-score of zero.

An observation’s z-score is a measure of the relativeAn observation’s z-score is a measure of the relative location of the observation in a data set.location of the observation in a data set.

1111

Page 12: Poisson statistics

• z-Score of Smallest Value (425)

425 490.80 1.20

54.74ix x

zs

425 490.80

1.2054.74

ix xz

s

Standardized Values for Apartment RentsStandardized Values for Apartment Rents-1.20 -1.11 -1.11 -1.02 -1.02 -1.02 -1.02 -1.02 -0.93 -0.93-0.93 -0.93 -0.93 -0.84 -0.84 -0.84 -0.84 -0.84 -0.75 -0.75-0.75 -0.75 -0.75 -0.75 -0.75 -0.56 -0.56 -0.56 -0.47 -0.47-0.47 -0.38 -0.38 -0.34 -0.29 -0.29 -0.29 -0.20 -0.20 -0.20-0.20 -0.11 -0.01 -0.01 -0.01 0.17 0.17 0.17 0.17 0.350.35 0.44 0.62 0.62 0.62 0.81 1.06 1.08 1.45 1.451.54 1.54 1.63 1.81 1.99 1.99 1.99 1.99 2.27 2.27

Example: Apartment RentsExample: Apartment Rents

z-Scoresz-Scores

1212

Page 13: Poisson statistics

Empirical RuleEmpirical Rule

When the data are believed to approximate aWhen the data are believed to approximate a bell-shaped distribution with moderate skew …bell-shaped distribution with moderate skew …

The empirical rule is based on the normalThe empirical rule is based on the normal distribution, which we will discuss later.distribution, which we will discuss later. The empirical rule is based on the normalThe empirical rule is based on the normal distribution, which we will discuss later.distribution, which we will discuss later.

The The empirical ruleempirical rule can be used to determine the can be used to determine the percentage of data values that must be within apercentage of data values that must be within a specified number of standard deviations of thespecified number of standard deviations of the mean.mean.

The The empirical ruleempirical rule can be used to determine the can be used to determine the percentage of data values that must be within apercentage of data values that must be within a specified number of standard deviations of thespecified number of standard deviations of the mean.mean.

1313

Page 14: Poisson statistics

For data having a bell-shaped distribution, approximately

of the values are withinof the values are within of its mean.of its mean.

of the values are withinof the values are within of its mean.of its mean.

68.26%68.26%68.26%68.26%

+/- 1 standard deviation+/- 1 standard deviation+/- 1 standard deviation+/- 1 standard deviation

of the values are within of the values are within of its mean.of its mean.

of the values are within of the values are within of its mean.of its mean.

95.44%95.44%95.44%95.44%

+/- 2 standard deviations+/- 2 standard deviations+/- 2 standard deviations+/- 2 standard deviations

of the values are within of the values are within of its mean.of its mean. of the values are within of the values are within of its mean.of its mean.

99.72%99.72%99.72%99.72%

+/- 3 standard deviations+/- 3 standard deviations+/- 3 standard deviations+/- 3 standard deviations

Empirical RuleEmpirical Rule

1414

Page 15: Poisson statistics

xx – – 33 – – 11

– – 22 + 1+ 1

+ 2+ 2 + 3+ 3

68.26%68.26%95.44%95.44%99.72%99.72%

Empirical RuleEmpirical Rule

1515

Page 16: Poisson statistics

An An outlieroutlier is an unusually small or unusually large is an unusually small or unusually large value in a data set.value in a data set. A data value with a z-score less than -3 or greaterA data value with a z-score less than -3 or greater than +3 might be considered an outlier.than +3 might be considered an outlier.

It might be:It might be:• an incorrectly recorded data valuean incorrectly recorded data value• a data value that was incorrectly included in thea data value that was incorrectly included in the data setdata set• a data value that has occurred by chancea data value that has occurred by chance

Detecting OutliersDetecting Outliers

1616

Page 17: Poisson statistics

• The most extreme z-scores are -1.20 and 2.27The most extreme z-scores are -1.20 and 2.27

• Using |Using |zz| | >> 3 as the criterion for an outlier, there 3 as the criterion for an outlier, there are no outliers in this data set.are no outliers in this data set.

-1.20 -1.11 -1.11 -1.02 -1.02 -1.02 -1.02 -1.02 -0.93 -0.93-0.93 -0.93 -0.93 -0.84 -0.84 -0.84 -0.84 -0.84 -0.75 -0.75-0.75 -0.75 -0.75 -0.75 -0.75 -0.56 -0.56 -0.56 -0.47 -0.47-0.47 -0.38 -0.38 -0.34 -0.29 -0.29 -0.29 -0.20 -0.20 -0.20-0.20 -0.11 -0.01 -0.01 -0.01 0.17 0.17 0.17 0.17 0.350.35 0.44 0.62 0.62 0.62 0.81 1.06 1.08 1.45 1.451.54 1.54 1.63 1.81 1.99 1.99 1.99 1.99 2.27 2.27

Standardized Values for Apartment RentsStandardized Values for Apartment Rents

Example: Apartment RentsExample: Apartment Rents

Detecting OutliersDetecting Outliers

1717

Page 18: Poisson statistics

Exploratory Data Analysis

Exploratory data analysisExploratory data analysis is looking at methods is looking at methods toto summarize data.summarize data. Exploratory data analysisExploratory data analysis is looking at methods is looking at methods toto summarize data.summarize data.

For now we simply sort the data values into ascending For now we simply sort the data values into ascending order and identify the order and identify the five-number summaryfive-number summary and then and thenconstruct a construct a box plotbox plot..

For now we simply sort the data values into ascending For now we simply sort the data values into ascending order and identify the order and identify the five-number summaryfive-number summary and then and thenconstruct a construct a box plotbox plot..

1818

Page 19: Poisson statistics

Five-Number Summary

1111 Smallest ValueSmallest Value Smallest ValueSmallest Value

First QuartileFirst Quartile First QuartileFirst Quartile

MedianMedian MedianMedian

Third QuartileThird Quartile Third QuartileThird Quartile

Largest ValueLargest Value Largest ValueLargest Value

2222

3333

4444

5555

1919

Page 20: Poisson statistics

Five-Number Summary

425 430 430 435 435 435 435 435 440 440440 440 440 445 445 445 445 445 450 450450 450 450 450 450 460 460 460 465 465465 470 470 472 475 475 475 480 480 480480 485 490 490 490 500 500 500 500 510510 515 525 525 525 535 549 550 570 570575 575 580 590 600 600 600 600 615 615

425 430 430 435 435 435 435 435 440 440440 440 440 445 445 445 445 445 450 450450 450 450 450 450 460 460 460 465 465465 470 470 472 475 475 475 480 480 480480 485 490 490 490 500 500 500 500 510510 515 525 525 525 535 549 550 570 570575 575 580 590 600 600 600 600 615 615

Lowest Value = 425Lowest Value = 425 First Quartile = 445First Quartile = 445

Median = 475Median = 475

Third Quartile = 525Third Quartile = 525Largest Value = 615Largest Value = 615

Example: Apartment RentsExample: Apartment Rents

2020

Page 21: Poisson statistics

Box PlotBox Plot

A A box plotbox plot is a graphical summary of data that is is a graphical summary of data that is based on a five-number summary.based on a five-number summary. A A box plotbox plot is a graphical summary of data that is is a graphical summary of data that is based on a five-number summary.based on a five-number summary.

A key to the development of a box plot is theA key to the development of a box plot is the computation of the median and the quartiles computation of the median and the quartiles QQ11 and and QQ33..

A key to the development of a box plot is theA key to the development of a box plot is the computation of the median and the quartiles computation of the median and the quartiles QQ11 and and QQ33..

Box plots provide another way to identify outliers.Box plots provide another way to identify outliers. Box plots provide another way to identify outliers.Box plots provide another way to identify outliers.

2121

They also tell us whether the data are skewed.They also tell us whether the data are skewed. They also tell us whether the data are skewed.They also tell us whether the data are skewed.

Page 22: Poisson statistics

400400

425425

450450

475475

500500

525525

550550

575575

600600

625625

• A box is drawn with its ends located at the first andA box is drawn with its ends located at the first and

third quartiles (Qthird quartiles (Q11 & Q & Q33).).

Box PlotBox Plot

• A vertical line is drawn in the box at the location ofA vertical line is drawn in the box at the location of the median (second quartile).the median (second quartile).

Q1 = 445Q1 = 445 Q3 = 525Q3 = 525

Q2 = 475Q2 = 475

Example: Apartment RentsExample: Apartment Rents

2222

Page 23: Poisson statistics

Box PlotBox Plot

Limits are located (not drawn) using the Limits are located (not drawn) using the interquartile range (IQR = Qinterquartile range (IQR = Q33-Q-Q11): they are ): they are 1.5IQR below Q1.5IQR below Q11 and 1.5 IQR above Q and 1.5 IQR above Q33..

Data outside these limits are considered Data outside these limits are considered outliersoutliers..

The locations of each outlier is shown with the The locations of each outlier is shown with the

symbolsymbol * * ..

2323

Page 24: Poisson statistics

Box PlotBox Plot

Lower Limit: Q1 - 1.5(IQR) = 445 - 1.5(80) = 325Lower Limit: Q1 - 1.5(IQR) = 445 - 1.5(80) = 325

Upper Limit: Q3 + 1.5(IQR) = 525 + 1.5(80) = 645Upper Limit: Q3 + 1.5(IQR) = 525 + 1.5(80) = 645

• The lower limit is located 1.5(IQR) below The lower limit is located 1.5(IQR) below QQ1.1.

• The upper limit is located 1.5(IQR) above The upper limit is located 1.5(IQR) above QQ3.3.

• There are no outliers (values less than 325 orThere are no outliers (values less than 325 or greater than 645) in the apartment rent data.greater than 645) in the apartment rent data.

Example: Apartment RentsExample: Apartment Rents

2424

Page 25: Poisson statistics

• Whiskers (dashed lines) are drawn from the ends of the box to the smallest and largest data values inside the limits.

400400

425425

450450

475475

500500

525525

550550

575575

600600

625625

Smallest valueSmallest valueinside limits = 425inside limits = 425

Largest valueLargest valueinside limits = 615inside limits = 615

Example: Apartment RentsExample: Apartment Rents

Box PlotBox Plot

2525

Page 26: Poisson statistics

An excellent graphical technique for An excellent graphical technique for makingmaking

comparisons among two or more comparisons among two or more groups.groups.

Box PlotBox Plot

2626

Page 27: Poisson statistics

Measures of Association Between Two Variables

Thus far we have examined numerical methods usedThus far we have examined numerical methods used to summarize the data for one variable at a time.to summarize the data for one variable at a time. Thus far we have examined numerical methods usedThus far we have examined numerical methods used to summarize the data for one variable at a time.to summarize the data for one variable at a time.

Often a manager or decision maker is interested inOften a manager or decision maker is interested in the the relationship between two variablesrelationship between two variables.. Often a manager or decision maker is interested inOften a manager or decision maker is interested in the the relationship between two variablesrelationship between two variables..

Two descriptive measures of the relationship Two descriptive measures of the relationship between two variables are between two variables are covariancecovariance and and correlationcorrelation coefficientcoefficient..

Two descriptive measures of the relationship Two descriptive measures of the relationship between two variables are between two variables are covariancecovariance and and correlationcorrelation coefficientcoefficient..

2727

Page 28: Poisson statistics

Positive values indicate a positive relationship.Positive values indicate a positive relationship. Positive values indicate a positive relationship.Positive values indicate a positive relationship.

Negative values indicate a negative relationship.Negative values indicate a negative relationship. Negative values indicate a negative relationship.Negative values indicate a negative relationship.

The The covariancecovariance is a measure of the linear association is a measure of the linear association between two variables.between two variables. The The covariancecovariance is a measure of the linear association is a measure of the linear association between two variables.between two variables.

CovarianceCovariance

2828

Page 29: Poisson statistics

CovarianceCovariance

The covariance is computed as follows:The covariance is computed as follows:

The covariance is computed as follows:The covariance is computed as follows:

forforsamplessamples

forforpopulationspopulations

1 1 n nxy

(x x)(y y) (x x)(y y)s

n 1

1 1 n n

xy

(x x)(y y) (x x)(y y)s

n 1

1 x 1 y N x N yxy

(x )(y ) (x )(y )

N

1 x 1 y N x N yxy

(x )(y ) (x )(y )

N

2929

Page 30: Poisson statistics

Correlation CoefficientCorrelation Coefficient

There are also other types of associations not captured There are also other types of associations not captured by correlation.by correlation. There are also other types of associations not captured There are also other types of associations not captured by correlation.by correlation.

Correlation is a measure of linear association.Correlation is a measure of linear association. Correlation is a measure of linear association.Correlation is a measure of linear association.

3030

Page 31: Poisson statistics

The correlation coefficient is computed as follows:The correlation coefficient is computed as follows:

The correlation coefficient is computed as follows:The correlation coefficient is computed as follows:

forforsamplessamples

forforpopulationspopulations

rs

s sxyxy

x yrs

s sxyxy

x y

xyxy

x y

xyxy

x y

Correlation CoefficientCorrelation Coefficient

3131

2 22 1 x N x

x

(x ) (x )

N

2 22 1 x N x

x

(x ) (x )

N

2 22 1 nx

(x x) (x x)s

n 1

2 2

2 1 nx

(x x) (x x)s

n 1

2 21 y N y2

y

(y ) (y )

N

2 21 y N y2

y

(y ) (y )

N

2 22 1 ny

(y y) (y y)s

n 1

2 2

2 1 ny

(y y) (y y)s

n 1

Page 32: Poisson statistics

Values near +1 indicate a Values near +1 indicate a strong positive linearstrong positive linear relationshiprelationship.. Values near +1 indicate a Values near +1 indicate a strong positive linearstrong positive linear relationshiprelationship..

Values near -1 indicate a Values near -1 indicate a strong negative linearstrong negative linear relationshiprelationship. . Values near -1 indicate a Values near -1 indicate a strong negative linearstrong negative linear relationshiprelationship. .

The coefficient can take on values between -1 and +1.The coefficient can take on values between -1 and +1. The coefficient can take on values between -1 and +1.The coefficient can take on values between -1 and +1.

The closer the correlation is to zero, the weaker theThe closer the correlation is to zero, the weaker the relationship.relationship. The closer the correlation is to zero, the weaker theThe closer the correlation is to zero, the weaker the relationship.relationship.

Correlation CoefficientCorrelation Coefficient

3232

Page 33: Poisson statistics

CorrelationCorrelation

A Positive Relationship: correlation close to 1

xx

yy

3333

Page 34: Poisson statistics

CorrelationCorrelation

xx

yy

3434

A Negative Relationship: correlation close to -1

Page 35: Poisson statistics

CorrelationCorrelation

No Apparent Relationship: Correlation near 0

xx

yy

3535

Page 36: Poisson statistics

A golfer is interested in investigating theA golfer is interested in investigating the

relationship, if any, between driving distance relationship, if any, between driving distance and 18-hole score.and 18-hole score.

277.6277.6259.5259.5269.1269.1267.0267.0255.6255.6272.9272.9

696971717070707071716969

Average DrivingAverage DrivingDistance (yds.)Distance (yds.)

AverageAverage18-Hole Score18-Hole Score

Covariance and Correlation CoefficientCovariance and Correlation Coefficient

Example: Golfing StudyExample: Golfing Study

3636

Page 37: Poisson statistics

Covariance and Correlation CoefficientCovariance and Correlation Coefficient

277.6277.6259.5259.5269.1269.1267.0267.0255.6255.6272.9272.9

696971717070707071716969

xx yy

10.6510.65 -7.45-7.45 2.152.15 0.050.05-11.35-11.35 5.955.95

-1.0-1.0 1.01.0 00 00 1.01.0-1.0-1.0

-10.65-10.65 -7.45-7.45 00 00-11.35-11.35 -5.95-5.95

( )ix x( )ix x ( )( )i ix x y y ( )( )i ix x y y ( )iy y( )iy y

AverageAverageStd. Dev.Std. Dev.

267.0267.0 70.070.0 -35.40-35.408.21928.2192.8944.8944

TotalTotal

Example: Golfing StudyExample: Golfing Study

3737

Page 38: Poisson statistics

• Sample CovarianceSample Covariance

• Sample Correlation CoefficientSample Correlation Coefficient

Covariance and Correlation CoefficientCovariance and Correlation Coefficient

7.08 -.9631

(8.2192)(.8944)xy

xyx y

sr

s s

7.08

-.9631(8.2192)(.8944)

xyxy

x y

sr

s s

( )( ) 35.40 7.08

1 6 1i i

xy

x x y ys

n

( )( ) 35.40

7.081 6 1

i ixy

x x y ys

n

Example: Golfing StudyExample: Golfing Study

3838

So, increasing driving distance decreases score, and the relationis really strong.

Page 39: Poisson statistics

A A random variablerandom variable is a numerical description of the is a numerical description of the outcome of an experiment.outcome of an experiment. A A random variablerandom variable is a numerical description of the is a numerical description of the outcome of an experiment.outcome of an experiment.

Random VariablesRandom Variables

A A discrete random variablediscrete random variable may assume either a may assume either a finite number of values or an infinite sequence offinite number of values or an infinite sequence of values.values.

A A discrete random variablediscrete random variable may assume either a may assume either a finite number of values or an infinite sequence offinite number of values or an infinite sequence of values.values.

A A continuous random variablecontinuous random variable may assume any may assume any numerical value in an interval or collection ofnumerical value in an interval or collection of intervals.intervals.

A A continuous random variablecontinuous random variable may assume any may assume any numerical value in an interval or collection ofnumerical value in an interval or collection of intervals.intervals.

3939

Page 40: Poisson statistics

Random VariablesRandom Variables

QuestionQuestion Random Variable Random Variable xx TypeType

FamilyFamilysizesize

xx = Number of dependents = Number of dependents reported on tax returnreported on tax return

DiscreteDiscrete

Distance fromDistance fromhome to storehome to store

xx = Distance in miles from = Distance in miles from home to the store sitehome to the store site

ContinuousContinuous

Own dogOwn dogor cator cat

xx = 1 if own no pet; = 1 if own no pet; = 2 if own dog(s) only; = 2 if own dog(s) only; = 3 if own cat(s) only; = 3 if own cat(s) only; = 4 if own dog(s) and cat(s)= 4 if own dog(s) and cat(s)

DiscreteDiscrete

4040

Page 41: Poisson statistics

The The probability distributionprobability distribution for a random variable for a random variable describes how probabilities are distributed overdescribes how probabilities are distributed over the values of the random variable.the values of the random variable.

The The probability distributionprobability distribution for a random variable for a random variable describes how probabilities are distributed overdescribes how probabilities are distributed over the values of the random variable.the values of the random variable.

Discrete Probability DistributionsDiscrete Probability Distributions

4141

Page 42: Poisson statistics

The probability distribution is defined by aThe probability distribution is defined by a probability functionprobability function, denoted by , denoted by ff((xx), which provides), which provides the probability for each value of the random variable.the probability for each value of the random variable.

The probability distribution is defined by aThe probability distribution is defined by a probability functionprobability function, denoted by , denoted by ff((xx), which provides), which provides the probability for each value of the random variable.the probability for each value of the random variable.

The required conditions for a discrete probabilityThe required conditions for a discrete probability function are:function are: The required conditions for a discrete probabilityThe required conditions for a discrete probability function are:function are:

Discrete Probability DistributionsDiscrete Probability Distributions

ff((xx) ) >> 0 0 (probabilities are not negative)ff((xx) ) >> 0 0 (probabilities are not negative)

ff((xx) = 1 ) = 1 (sum of all probabilities =1)ff((xx) = 1 ) = 1 (sum of all probabilities =1)

Remember that any probability is a number between 0 and 1.

4242

Page 43: Poisson statistics

Expected Value

The The expected valueexpected value, or mean, of a random variable, or mean, of a random variable is a measure of its central location.is a measure of its central location. The The expected valueexpected value, or mean, of a random variable, or mean, of a random variable is a measure of its central location.is a measure of its central location.

The expected value does not have to be a value theThe expected value does not have to be a value the random variable can assume.random variable can assume. The expected value does not have to be a value theThe expected value does not have to be a value the random variable can assume.random variable can assume.

EE((xx) = ) = = = xfxf((xx))EE((xx) = ) = = = xfxf((xx))

4343

Page 44: Poisson statistics

Variance and Standard DeviationVariance and Standard Deviation

The The variancevariance summarizes the variability in the summarizes the variability in the values of a random variable.values of a random variable. The The variancevariance summarizes the variability in the summarizes the variability in the values of a random variable.values of a random variable.

Var(Var(xx) = ) = 22 = = ((xx - - ))22ff((xx))Var(Var(xx) = ) = 22 = = ((xx - - ))22ff((xx))

The The standard deviationstandard deviation, , , is defined as the positive, is defined as the positive square root of the variance.square root of the variance. The The standard deviationstandard deviation, , , is defined as the positive, is defined as the positive square root of the variance.square root of the variance.

4444

Page 45: Poisson statistics

Four Properties of a Binomial Experiment

3. The probability of a success, denoted by 3. The probability of a success, denoted by pp, does, does not change from trial to trial.not change from trial to trial.3. The probability of a success, denoted by 3. The probability of a success, denoted by pp, does, does not change from trial to trial.not change from trial to trial.

4. The trials are independent.4. The trials are independent.4. The trials are independent.4. The trials are independent.

2. Two outcomes, 2. Two outcomes, successsuccess and and failurefailure, are possible, are possible on each trial.on each trial.2. Two outcomes, 2. Two outcomes, successsuccess and and failurefailure, are possible, are possible on each trial.on each trial.

1. The experiment consists of a sequence of 1. The experiment consists of a sequence of nn identical trials.identical trials.1. The experiment consists of a sequence of 1. The experiment consists of a sequence of nn identical trials.identical trials.

Binomial Probability DistributionBinomial Probability Distribution

4545

Page 46: Poisson statistics

Our interest is in the Our interest is in the number of successesnumber of successes occurring in the occurring in the nn trials. trials. Our interest is in the Our interest is in the number of successesnumber of successes occurring in the occurring in the nn trials. trials.

Binomial Probability DistributionBinomial Probability Distribution

4646

Page 47: Poisson statistics

where:where:

xx = the number of successes = the number of successes

pp = the probability of a success on one trial = the probability of a success on one trial

nn = the number of trials = the number of trials ff((xx) = the probability of ) = the probability of xx successes in successes in nn trials trials

( )( ) (1 ) n xxn

f x p px

( )( ) (1 ) n xxn

f x p px

Binomial Probability DistributionBinomial Probability Distribution

Binomial Probability FunctionBinomial Probability Function

1 2 3! =

( )! ! 1 2 3 ( ) 1 2 3

n nnx n x x n x x

= `n choose x’ = number of ways x people can be chosen out of n 4747

Page 48: Poisson statistics

( )( ) (1 ) n xxn

f x p px

( )( ) (1 ) n xxn

f x p px

Binomial Probability DistributionBinomial Probability Distribution

Binomial Probability Binomial Probability FunctionFunction

Probability of a particularProbability of a particular sequence of trial outcomessequence of trial outcomes with x successes in with x successes in nn trials trials

Probability of a particularProbability of a particular sequence of trial outcomessequence of trial outcomes with x successes in with x successes in nn trials trials

Number of experimentalNumber of experimental outcomes providing exactlyoutcomes providing exactlyxx successes in successes in nn trials trials

Number of experimentalNumber of experimental outcomes providing exactlyoutcomes providing exactlyxx successes in successes in nn trials trials

These values are available in Table 5 of our textbook.

4848

Page 49: Poisson statistics

Binomial Probability DistributionBinomial Probability Distribution

Example: IIT EntranceExample: IIT Entrance

It is known that about 10% of the examinees It is known that about 10% of the examinees taking the IIT entrance qualify.taking the IIT entrance qualify.

Choosing 3 examinees at random, what isChoosing 3 examinees at random, what is

the probability that exactly 1 of them will the probability that exactly 1 of them will qualify?qualify?

Thus, for any examinee chosen at random, Thus, for any examinee chosen at random, there is a probability of 0.1 that the person there is a probability of 0.1 that the person will qualify.will qualify.

4949

Page 50: Poisson statistics

Binomial Probability DistributionBinomial Probability Distribution

f xn

x n xp px n x( )

!!( )!

( )( )

1f xn

x n xp px n x( )

!!( )!

( )( )

1

1 23!(1) (0.1) (0.9) 3(.1)(.81) .243

1!(3 1)!f

1 23!

(1) (0.1) (0.9) 3(.1)(.81) .2431!(3 1)!

f

pp = .10, = .10, nn = 3, = 3, xx = 1 = 1

Example: IIT EntranceExample: IIT EntranceUsing theUsing theprobabilityprobabilityfunctionfunction

Using theUsing theprobabilityprobabilityfunctionfunction

You can just check the binomial probability table in textbook forn= 3, p = 0.1, x = 1.

Or, in Excel, use ‘=BINOMDIST(1,3,0.1,FALSE)’

Just f(1) if FALSE,

f(0)+f(1) if TRUE 5050

Page 51: Poisson statistics

(1 )np p (1 )np p

EE((xx) = ) = = = npnp

Var(Var(xx) = ) = 22 = = npnp(1 (1 pp))

Expected ValueExpected Value

VarianceVariance

Standard DeviationStandard Deviation

Binomial Probability DistributionBinomial Probability Distribution

5151

Page 52: Poisson statistics

3(.1)(.9) .52 em ployees 3(.1)(.9) .52 em ployees

EE((xx) = ) = npnp = 3(.1) = .3 employees out of 3 = 3(.1) = .3 employees out of 3

Var(Var(xx) = ) = npnp(1 – (1 – pp) = 3(.1)(.9) = .27) = 3(.1)(.9) = .27

• Expected ValueExpected Value

• VarianceVariance

• Standard DeviationStandard Deviation

Example: Evans ElectronicsExample: Evans Electronics

Binomial Probability DistributionBinomial Probability Distribution

5252

Page 53: Poisson statistics

A Poisson distributed random variable is oftenA Poisson distributed random variable is often useful in estimating the number of occurrencesuseful in estimating the number of occurrences over a over a specified interval of time or spacespecified interval of time or space

A Poisson distributed random variable is oftenA Poisson distributed random variable is often useful in estimating the number of occurrencesuseful in estimating the number of occurrences over a over a specified interval of time or spacespecified interval of time or space

It is a discrete random variable that may assumeIt is a discrete random variable that may assume an an infinite sequence of valuesinfinite sequence of values (x = 0, 1, 2, . . . ). (x = 0, 1, 2, . . . ). It is a discrete random variable that may assumeIt is a discrete random variable that may assume an an infinite sequence of valuesinfinite sequence of values (x = 0, 1, 2, . . . ). (x = 0, 1, 2, . . . ).

Poisson Probability DistributionPoisson Probability Distribution

5353

Page 54: Poisson statistics

Examples of a Poisson distributed random variable:Examples of a Poisson distributed random variable: Examples of a Poisson distributed random variable:Examples of a Poisson distributed random variable:

the number of defects in 14 pages of a bookthe number of defects in 14 pages of a book the number of defects in 14 pages of a bookthe number of defects in 14 pages of a book

the number of customers arriving at the postthe number of customers arriving at the postoffice in one houroffice in one hour the number of customers arriving at the postthe number of customers arriving at the postoffice in one houroffice in one hour

Poisson Probability DistributionPoisson Probability Distribution

Bell Labs used the Poisson distribution to model theBell Labs used the Poisson distribution to model the arrival of phone calls.arrival of phone calls. Bell Labs used the Poisson distribution to model theBell Labs used the Poisson distribution to model the arrival of phone calls.arrival of phone calls.

5454

Page 55: Poisson statistics

Poisson Probability DistributionPoisson Probability Distribution

Two Properties of a Poisson ExperimentTwo Properties of a Poisson Experiment

2.2. The occurrence or nonoccurrence in any timeThe occurrence or nonoccurrence in any time interval is independent of the occurrence orinterval is independent of the occurrence or nonoccurrence in any other time interval.nonoccurrence in any other time interval.

2.2. The occurrence or nonoccurrence in any timeThe occurrence or nonoccurrence in any time interval is independent of the occurrence orinterval is independent of the occurrence or nonoccurrence in any other time interval.nonoccurrence in any other time interval.

1.1. The probability of an occurrence is the sameThe probability of an occurrence is the same for any two time intervals of equal length.for any two time intervals of equal length.1.1. The probability of an occurrence is the sameThe probability of an occurrence is the same for any two time intervals of equal length.for any two time intervals of equal length.

5555

Page 56: Poisson statistics

Poisson Probability Function

f xex

x( )

!

f x

ex

x( )

!

where:where:

xx = the number of occurrences in an interval = the number of occurrences in an interval

f(x) f(x) = the probability of = the probability of xx occurrences in an interval occurrences in an interval

= mean number of occurrences in an interval= mean number of occurrences in an interval

ee = 2.71828 = 2.71828

Poisson Probability DistributionPoisson Probability Distribution

These values are available in Table 7 of our textbook.

5656

Page 57: Poisson statistics

Poisson Probability FunctionPoisson Probability Function

In practical applications, In practical applications, xx will eventually become will eventually becomelarge enough so that large enough so that ff((xx) is very small and negligible.) is very small and negligible.In practical applications, In practical applications, xx will eventually become will eventually becomelarge enough so that large enough so that ff((xx) is very small and negligible.) is very small and negligible.

Since there is no stated upper limit for the numberSince there is no stated upper limit for the numberof occurrences, the probability function of occurrences, the probability function ff((xx) is) isapplicable for values applicable for values xx = 0, 1, 2, … without limit. = 0, 1, 2, … without limit.

Since there is no stated upper limit for the numberSince there is no stated upper limit for the numberof occurrences, the probability function of occurrences, the probability function ff((xx) is) isapplicable for values applicable for values xx = 0, 1, 2, … without limit. = 0, 1, 2, … without limit.

Poisson Probability DistributionPoisson Probability Distribution

5757

Page 58: Poisson statistics

Example: Mercy HospitalExample: Mercy Hospital

Patients arrive at the emergency room of Patients arrive at the emergency room of MercyMercy

Hospital at the average rate of 6 per hour onHospital at the average rate of 6 per hour on

weekend evenings.weekend evenings.What is the probability of 4 arrivals in 30 What is the probability of 4 arrivals in 30 minutesminutes

on a weekend evening?on a weekend evening?

Poisson Probability DistributionPoisson Probability Distribution

5858

Page 59: Poisson statistics

Poisson Probability DistributionPoisson Probability Distribution

4 33 (2.71828)

(4) .168014!

f

4 33 (2.71828)

(4) .168014!

f

= 6/hour = 3/half-hour, = 6/hour = 3/half-hour, xx = 4 = 4

Example: Mercy HospitalExample: Mercy Hospital Using theUsing theprobabilityprobabilityfunctionfunction

Using theUsing theprobabilityprobabilityfunctionfunction

Or, simply check the table for Poisson probabilities in the bookfor μ = 3, x = 4.

In Excel, use ‘=POISSON(4,3,FALSE)’Just f(4) if FALSE, f(0)+f(1)+...+f(4)

if TRUE

5959

Page 60: Poisson statistics

Poisson Probability DistributionPoisson Probability Distribution

Poisson Probabilities

0.00

0.05

0.10

0.15

0.20

0.25

0 1 2 3 4 5 6 7 8 9 10

Number of Arrivals in 30 Minutes

Pro

bab

ilit

y

actually, actually, the the sequencesequencecontinues:continues:11, 12, …11, 12, …

Example: Mercy HospitalExample: Mercy Hospital

6060

Page 61: Poisson statistics

Poisson Probability DistributionPoisson Probability Distribution

A property of the Poisson distribution is thatA property of the Poisson distribution is thatthe mean and variance are equal.the mean and variance are equal.

A property of the Poisson distribution is thatA property of the Poisson distribution is thatthe mean and variance are equal.the mean and variance are equal.

= = 22

6161

Page 62: Poisson statistics

Continuous Probability Distributions

A A continuous random variablecontinuous random variable can assume any can assume any value in an interval on the real line or in a value in an interval on the real line or in a collection of intervals.collection of intervals.

It is not possible to talk about the probability It is not possible to talk about the probability of the random variable assuming a particular of the random variable assuming a particular value.value. Instead, we talk about the probability of the Instead, we talk about the probability of the random variable assuming a value within a random variable assuming a value within a given interval.given interval.

6262

Page 63: Poisson statistics

6363

We denote the ‘density function’ by f(x). AlsoWe denote the ‘density function’ by f(x). Also

2

( ) 0; ( ) 1

( ) ( )

( ) ( ) ( )

f x f x dx

E X xf x dx

Var X x E X f x dx

2

( ) 0; ( ) 1

( ) ( )

( ) ( ) ( )

f x f x dx

E X xf x dx

Var X x E X f x dx

Page 64: Poisson statistics

Area as a Measure of ProbabilityArea as a Measure of Probability

The area under the graph of The area under the graph of ff((xx) and ) and probability are identical.probability are identical.

This is valid for all continuous random This is valid for all continuous random variables.variables. The probability that The probability that xx takes on a value takes on a value between some lower value between some lower value xx11 and some higher and some higher value value xx22 can be found by computing the area can be found by computing the area under the graph of under the graph of ff((xx) over the interval from ) over the interval from xx11 to to xx22. .

6464

Page 65: Poisson statistics

The normal probability distribution is the most important distribution for describing a continuous random variable.

It is used in a wide variety of applicationsIt is used in a wide variety of applications

including:including:

• Heights of Heights of peoplepeople• Rainfall Rainfall amountsamounts

• Test scoresTest scores• Scientific Scientific measurementsmeasurements

Normal Probability DistributionNormal Probability Distribution

For a large number of similar variables that are For a large number of similar variables that are unrelated, sum and average are approximately unrelated, sum and average are approximately normal.normal.

6565

Page 66: Poisson statistics

Normal Probability Density Function

2 2( ) / 21( )

2xf x e

2 2( ) / 21( )

2xf x e

= mean= mean

= standard deviation= standard deviation

= 3.14159= 3.14159

ee = 2.71828 = 2.71828

where:where:

Normal Probability DistributionNormal Probability Distribution

6666

Page 67: Poisson statistics

The distribution is The distribution is symmetricsymmetric; its skewness; its skewness measure is zero.measure is zero. The distribution is The distribution is symmetricsymmetric; its skewness; its skewness measure is zero.measure is zero.

Normal Probability DistributionNormal Probability Distribution

CharacteristicsCharacteristics

xx

6767

Page 68: Poisson statistics

The The highest pointhighest point on the normal curve is at the on the normal curve is at the meanmean, the middle point., the middle point. The The highest pointhighest point on the normal curve is at the on the normal curve is at the meanmean, the middle point., the middle point.

Normal Probability DistributionNormal Probability Distribution

CharacteristicsCharacteristics

xx

6868

Page 69: Poisson statistics

Normal Probability DistributionNormal Probability Distribution

CharacteristicsCharacteristics

-10-10 00 2525

The mean can be any numerical value: negative,The mean can be any numerical value: negative, zero, or positive.zero, or positive. The mean can be any numerical value: negative,The mean can be any numerical value: negative, zero, or positive.zero, or positive.

xx

6969

Page 70: Poisson statistics

Normal Probability DistributionNormal Probability Distribution

CharacteristicsCharacteristics

= 15= 15

= 25= 25

The standard deviation determines the width of theThe standard deviation determines the width of thecurve: larger values result in wider, flatter curves.curve: larger values result in wider, flatter curves.The standard deviation determines the width of theThe standard deviation determines the width of thecurve: larger values result in wider, flatter curves.curve: larger values result in wider, flatter curves.

xx

7070

Page 71: Poisson statistics

Probabilities for the normal random variable areProbabilities for the normal random variable are given by given by areas under the curveareas under the curve. The total area. The total area under the curve is 1 (.5 to the left of the mean andunder the curve is 1 (.5 to the left of the mean and .5 to the right)..5 to the right).

Probabilities for the normal random variable areProbabilities for the normal random variable are given by given by areas under the curveareas under the curve. The total area. The total area under the curve is 1 (.5 to the left of the mean andunder the curve is 1 (.5 to the left of the mean and .5 to the right)..5 to the right).

Normal Probability DistributionNormal Probability Distribution

CharacteristicsCharacteristics

.5.5 .5.5

xx

7171

Page 72: Poisson statistics

Normal Probability DistributionNormal Probability Distribution

Characteristics (basis for the empirical rule)Characteristics (basis for the empirical rule)

of values of a normal random variableof values of a normal random variable are within of its mean.are within of its mean. of values of a normal random variableof values of a normal random variable are within of its mean.are within of its mean.68.26%68.26%68.26%68.26%

+/- 1 standard deviation+/- 1 standard deviation+/- 1 standard deviation+/- 1 standard deviation

of values of a normal random variableof values of a normal random variable are within of its mean.are within of its mean. of values of a normal random variableof values of a normal random variable are within of its mean.are within of its mean.95.44%95.44%95.44%95.44%

+/- 2 standard deviations+/- 2 standard deviations+/- 2 standard deviations+/- 2 standard deviations

of values of a normal random variableof values of a normal random variable are within of its mean.are within of its mean. of values of a normal random variableof values of a normal random variable are within of its mean.are within of its mean.99.72%99.72%99.72%99.72%

+/- 3 standard deviations+/- 3 standard deviations+/- 3 standard deviations+/- 3 standard deviations

7272

Page 73: Poisson statistics

Normal Probability DistributionNormal Probability Distribution

Characteristics (basis for the empirical rule)Characteristics (basis for the empirical rule)

xx – – 33 – – 11

– – 22 + 1+ 1

+ 2+ 2 + 3+ 3

68.26%68.26%95.44%95.44%99.72%99.72%

7373

Page 74: Poisson statistics

A random variable having a normal distributionA random variable having a normal distribution with a mean of 0 and a standard deviation of 1 iswith a mean of 0 and a standard deviation of 1 is said to have a said to have a standard normal probabilitystandard normal probability distributiondistribution..

A random variable having a normal distributionA random variable having a normal distribution with a mean of 0 and a standard deviation of 1 iswith a mean of 0 and a standard deviation of 1 is said to have a said to have a standard normal probabilitystandard normal probability distributiondistribution..

CharacteristicsCharacteristics

Standard Normal Probability DistributionStandard Normal Probability Distribution

7474

Page 75: Poisson statistics

00zz

The letter The letter z z is used to designate the standardis used to designate the standard normal random variable.normal random variable. The letter The letter z z is used to designate the standardis used to designate the standard normal random variable.normal random variable.

Standard Normal Probability DistributionStandard Normal Probability Distribution

CharacteristicsCharacteristics

7575

Page 76: Poisson statistics

Converting to the Standard Normal Converting to the Standard Normal DistributionDistribution

Standard Normal Probability DistributionStandard Normal Probability Distribution

zx

zx

7676

Page 77: Poisson statistics

The daily demand of the new ipad in a store The daily demand of the new ipad in a store seems to follow a normal distribution with an seems to follow a normal distribution with an average of 15 and a standard deviation of 6. average of 15 and a standard deviation of 6.

Example: DemandExample: Demand

The manager, who does not want to keep The manager, who does not want to keep more than 20 ipads in his store at a time, more than 20 ipads in his store at a time, would like to know the probability of a would like to know the probability of a stockout, i.e. that the demand in a day will stockout, i.e. that the demand in a day will exceed 20. exceed 20.

PP((xx > 20) = ? > 20) = ?

7777

Page 78: Poisson statistics

zz = ( = (xx - - )/)/ = (20 - 15)/6= (20 - 15)/6 = .83= .83

zz = ( = (xx - - )/)/ = (20 - 15)/6= (20 - 15)/6 = .83= .83

Solving for the Stockout ProbabilitySolving for the Stockout Probability

Step 1: Convert Step 1: Convert xx to the standard normal distribution. to the standard normal distribution.Step 1: Convert Step 1: Convert xx to the standard normal distribution. to the standard normal distribution.

7878

Page 79: Poisson statistics

z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09

. . . . . . . . . . .

.5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224

.6 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549

.7 .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852

.8 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133

.9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389

. . . . . . . . . . .

Cumulative Probability Table for Standard Normal Cumulative Probability Table for Standard Normal DistributionDistribution

PP((zz << .83) .83)

These values are available in Table 1 of our textbook.

Step 2: Find the area under the standard normalStep 2: Find the area under the standard normal curve to the left of curve to the left of zz = .83. = .83.Step 2: Find the area under the standard normalStep 2: Find the area under the standard normal curve to the left of curve to the left of zz = .83. = .83.

7979

Page 80: Poisson statistics

8080

In Excel, use ‘=NORMDIST(0.83,0,1,TRUE)’

Just f(0.83) if FALSE, area upto 0.83 if

TRUE

In fact, you can straightaway use ‘=NORMDIST(20,15,6,TRUE)’

P(X ≤ 20) with μ = 15,

σ = 6

Page 81: Poisson statistics

PP((z z > .83) = 1 – > .83) = 1 – PP((zz << .83) .83) = 1- .7967= 1- .7967

= .2033= .2033

PP((z z > .83) = 1 – > .83) = 1 – PP((zz << .83) .83) = 1- .7967= 1- .7967

= .2033= .2033

Solving for the Stockout ProbabilitySolving for the Stockout Probability

Step 3: Compute the area under the standard normalStep 3: Compute the area under the standard normal curve to the right of curve to the right of zz = .83. = .83.Step 3: Compute the area under the standard normalStep 3: Compute the area under the standard normal curve to the right of curve to the right of zz = .83. = .83.

ProbabilityProbability of a of a stockoutstockout

PP((xx > > 20)20)

Standard Normal Probability DistributionStandard Normal Probability Distribution

8181

Page 82: Poisson statistics

Solving for the Stockout ProbabilitySolving for the Stockout Probability

00 .83.83

Area = .7967Area = .7967Area = 1 - .7967Area = 1 - .7967

= .2033= .2033

zz

Standard Normal Probability DistributionStandard Normal Probability Distribution

These values are available in Table 1 of our textbook.

8282

Page 83: Poisson statistics

Standard Normal Probability DistributionStandard Normal Probability Distribution

If the manager of wants the probability of a If the manager of wants the probability of a stockout during replenishment lead-time to stockout during replenishment lead-time to be no more than .05, what should the reorder be no more than .05, what should the reorder point be?point be?

------------------------------------------------------------------------------------------------------------------------------(Hint: Given a probability, we can use the (Hint: Given a probability, we can use the

standardstandard

normal table in an inverse fashion to find thenormal table in an inverse fashion to find the

corresponding corresponding zz value. Give it a try.) value. Give it a try.)

8383

Page 84: Poisson statistics

End of Lecture

8484