
Upload: gervase-thomas

Post on 29-Dec-2015


Statistical Review

• We will be working with two types of probability distributions:

• Discrete distributions

– If the random variable of interest can take a countable number of values (e.g., number of defects), it is modeled with a discrete distribution.

– We will use two such distributions:

• Binomial Distribution

• Poisson Distribution

• Continuous distributions

– If the random variable of interest can take any value within a continuous range (e.g., the diameter of a machined part), it is modeled with a continuous distribution.

– We will use one such distribution:

• Normal Distribution


The Binomial Distribution

• If each trial can result in one of two outcomes (e.g., heads or tails, defective or not defective), the binomial distribution is appropriate for the number of successes.

• The binomial distribution is described by two parameters:

p = probability of a success on a given trial

n = the number of trials

• If we denote A as the number of successes in n trials, then A is said to have a binomial distribution with:

Mean = E[A] = np

Variance[A] = np(1-p)

Standard Deviation[A] = √(np(1-p))


The Excel Function for the Binomial Distribution is BINOMDIST

• BINOMDIST(A,n,p,cumulative)

• A is the number of successes in n trials.

• n is the number of independent trials.

• p is the probability of success on each trial.

• Cumulative is a logical value that determines the form of the function. If cumulative is TRUE, then BINOMDIST returns the cumulative distribution function, which is the probability that there are A or fewer successes; if FALSE, it returns the probability mass function, which is the probability that there are exactly A successes.
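
For readers working outside Excel, the behavior described above can be sketched in Python. The function name binomdist and its argument order are chosen here only to mirror the Excel call; this is an illustration of the definition, not Excel's implementation.

```python
from math import comb


def binomdist(a: int, n: int, p: float, cumulative: bool) -> float:
    """Rough analogue of BINOMDIST(A, n, p, cumulative) (hypothetical helper)."""

    def pmf(k: int) -> float:
        # Probability of exactly k successes in n independent trials.
        return comb(n, k) * p**k * (1 - p) ** (n - k)

    if cumulative:
        # Probability of A or fewer successes.
        return sum(pmf(k) for k in range(a + 1))
    # Probability of exactly A successes.
    return pmf(a)


# Four tosses of a fair coin (numbers chosen for illustration only).
exactly_two_heads = binomdist(2, 4, 0.5, False)  # 6/16 = 0.375
at_most_two_heads = binomdist(2, 4, 0.5, True)   # 11/16 = 0.6875
```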


Examples

• A manufacturing process is estimated to produce 5% nonconforming items. If a random sample of 5 items is chosen:

– What is the probability of getting 2 nonconforming items in the sample?

– What is the probability of getting between 1 and 3 nonconforming items?

– What is the probability of getting one or more nonconforming items?
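
One way to work these three questions is sketched below in Python, using the binomial probability mass function directly. It assumes "between 1 and 3" is inclusive at both ends; the variable names are illustrative.

```python
from math import comb

n, p = 5, 0.05  # sample size and nonconforming rate from the example


def pmf(k: int) -> float:
    # Probability of exactly k nonconforming items in the sample.
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_exactly_2 = pmf(2)                          # about 0.0214
p_1_to_3 = sum(pmf(k) for k in range(1, 4))   # about 0.2262 (1, 2, or 3)
p_at_least_1 = 1 - pmf(0)                     # about 0.2262
```

The last two answers are close because four or five nonconforming items in a sample of 5 is extremely unlikely at a 5% rate.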


Proportion Defective and the Binomial Distribution

• We often use the proportion defective rather than the number defective in SPC. This is expressed as:

p(bar) = x/n

where X follows a binomial distribution with parameters p and n, and x is an observed value of X.

• The mean of p(bar) is p and the variance is p(1-p)/n.

• The probability that p(bar) <= a equals the probability that X <= na.


Example

• A process produces an average of 2% defective units. What is the probability that a sample of 10 will have more than 5% defective?
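
A sketch of this example in Python follows. It uses the relation P(p(bar) <= a) = P(X <= na) with a = 0.05 and n = 10; since na = 0.5 and X is a whole number, "more than 5% defective" means at least one defective unit. Variable names are illustrative.

```python
from math import comb, floor

n, p = 10, 0.02  # sample size and process fraction defective
a = 0.05         # proportion-defective threshold


def binom_pmf(k: int) -> float:
    # Probability of exactly k defective units in the sample.
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Largest defect count whose proportion does not exceed the threshold.
max_count_within_limit = floor(n * a)  # floor(0.5) = 0

p_within_limit = sum(binom_pmf(k) for k in range(max_count_within_limit + 1))
p_more_than_5_percent = 1 - p_within_limit  # about 0.183
```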


The Poisson Distribution

• As n becomes large and p becomes small (with np staying roughly constant), the binomial distribution approaches the Poisson distribution. Therefore, the Poisson distribution is often used to model the number of defects within a product (e.g., where there is a potential for a large number of defects).

• The Poisson distribution is described by one parameter:

λ = the average number of defects

• If we denote A as the number of defects on a given product, then A is said to have a Poisson distribution with:

Mean = E[A] = λ

Variance[A] = λ

Standard Deviation[A] = √λ


The Excel Function for the Poisson Distribution is POISSON

• POISSON(A,λ,cumulative)

• A is the number of events.

• λ is the mean.

• Cumulative is a logical value that determines the form of the probability distribution returned. If cumulative is TRUE, POISSON returns the cumulative Poisson probability that the number of random events occurring will be between zero and A inclusive; if FALSE, it returns the Poisson probability mass function that the number of events occurring will be exactly A.


Examples

• The average number of defects in a computer produced by an assembly process is known to be 10.

• What is the probability of finding 6 defects?

• What is the probability of finding between 2 and 12 defects?

• What is the probability of finding less than 3 defects?
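
These questions can be sketched in Python with a small helper that mirrors the POISSON(A, mean, cumulative) call. The helper name poisson is hypothetical, and "between 2 and 12" is assumed to be inclusive at both ends.

```python
from math import exp, factorial


def poisson(a: int, lam: float, cumulative: bool) -> float:
    """Rough analogue of POISSON(A, mean, cumulative) (hypothetical helper)."""

    def pmf(k: int) -> float:
        # Probability of exactly k events when the mean is lam.
        return exp(-lam) * lam**k / factorial(k)

    if cumulative:
        # Probability of between zero and A events, inclusive.
        return sum(pmf(k) for k in range(a + 1))
    return pmf(a)


lam = 10  # average number of defects per computer

p_exactly_6 = poisson(6, lam, False)                       # about 0.0631
p_2_to_12 = poisson(12, lam, True) - poisson(1, lam, True)  # P(2 <= A <= 12)
p_less_than_3 = poisson(2, lam, True)                      # about 0.0028
```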


The Normal Distribution

• The normal distribution has two parameters:

μ = mean

σ² = variance

(σ = standard deviation, the square root of the variance)

• A number of functions are available in Excel for working with the normal distribution:

NORMDIST

NORMSDIST

NORMINV

NORMSINV


NORMDIST

• NORMDIST(x,mean,standard_dev,cumulative)

• X is the value for which you want the distribution.

• Mean is the arithmetic mean of the distribution.

• Standard_dev is the standard deviation of the distribution.

• Cumulative is a logical value that determines the form of the function. If cumulative is TRUE, NORMDIST returns the cumulative distribution function; if FALSE, it returns the probability density function.

• NORMDIST(42,40,1.5,TRUE) equals 0.908789


NORMSDIST

• NORMSDIST(z)

• Z is the value for which you want the distribution.

• NORMSDIST(1.333333) equals 0.908789


NORMINV

• NORMINV(probability,mean,standard_dev)

• Probability is a probability corresponding to the normal distribution.

• Mean is the arithmetic mean of the distribution.

• Standard_dev is the standard deviation of the distribution.

• NORMINV(0.908789,40,1.5) equals 42


NORMSINV

• NORMSINV(probability)

• Probability is a probability corresponding to the normal distribution.

• NORMSINV(0.908789) equals 1.3333
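
The four Excel examples above (mean 40, standard deviation 1.5, x = 42) can be cross-checked with Python's standard library. This is a sketch of the equivalent calculations using statistics.NormalDist, not a description of how Excel computes them; the variable names are illustrative.

```python
from statistics import NormalDist

dist = NormalDist(mu=40, sigma=1.5)  # distribution used in the Excel examples
standard = NormalDist()              # standard normal: mean 0, std dev 1

# Like NORMDIST(42,40,1.5,TRUE): cumulative probability below 42.
p_below_42 = dist.cdf(42)            # about 0.908789

# Like NORMSDIST(1.333333): standardize first, then use the standard normal.
z = (42 - 40) / 1.5
p_below_z = standard.cdf(z)          # same probability as above

# Like NORMINV(0.908789,40,1.5): recover x from a cumulative probability.
x_back = dist.inv_cdf(p_below_42)    # about 42

# Like NORMSINV(0.908789): recover z from a cumulative probability.
z_back = standard.inv_cdf(p_below_z)  # about 1.3333
```

The standardized and unstandardized routes give the same probability because z = (x - mean) / standard_dev maps one distribution onto the other.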


Enumerative vs. Analytical

• Enumerative Studies: Statistical investigations that lead to action on static populations (e.g., calculate income rates by area)

– Time specific and static

– There is no reference to the future

• Analytic Studies: Statistical investigations that lead to action on dynamic populations (e.g., why productivity is low and how can it be increased?)

– If a 100% sample of the population answers the question under investigation, the study is enumerative; otherwise the study is analytic

– Focuses on causes of patterns and variations that take place over different areas, over periods of time, etc.

– Focus is on the future not the present

– Since future process output does not exist, it cannot be part of the population.


Sampling Distributions

• Sample statistics are used to draw conclusions about population parameters

• The sample mean is used to draw conclusions about the population mean

• The sample variance is used to draw conclusions about the population variance

• The behavior of these statistics over repeated samples is referred to as the sampling distribution of the statistic.
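
The idea of a sampling distribution can be made concrete with a small simulation. The sketch below draws many samples from a hypothetical normal population (mean 50, standard deviation 5, sample size 25, all chosen only for illustration) and looks at how the sample means behave across repeated samples.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 50.0, 5.0           # hypothetical population parameters
SAMPLE_SIZE, NUM_SAMPLES = 25, 2000    # hypothetical sampling plan

# One sample mean per repeated sample.
sample_means = [
    mean([rng.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)])
    for _ in range(NUM_SAMPLES)
]

center_of_means = mean(sample_means)   # should sit close to POP_MEAN
spread_of_means = stdev(sample_means)  # should sit close to the value below
theoretical_spread = POP_SD / SAMPLE_SIZE**0.5  # sigma / sqrt(n) = 1.0
```

The collection of sample means is an empirical picture of the sampling distribution of the mean: it is centered near the population mean and is narrower than the population by a factor of the square root of the sample size.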


Control Charts are Representations of Sampling Distributions

• The center line on a control chart is an estimate of the mean of the sampling distribution.

– The mean of the sampling distribution of the sample mean equals the population mean, so it is a good point estimate of the mean of the population. The central limit theorem adds that this sampling distribution is approximately normal.

• Interval Estimation

– An interval estimate (or confidence interval) is defined by two endpoints such that the probability of the parameter of interest being contained in the interval is of some value (e.g., 99%).

– The control limits on a control chart are an example of an interval estimate. They are a function of the point estimates of the mean and standard deviation of the sampling distribution.


Example

• The mean of the sampling distribution of the width of a machined shaft is estimated to be 10 inches with a standard deviation of 0.003 inches. Construct a confidence interval such that you would expect 95% of the means of samples of shafts to be contained within its limits.
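
A sketch of this calculation in Python follows. It assumes the sample means are normally distributed and that the stated 0.003 inches is the standard deviation of the sampling distribution itself, as the example words it; variable names are illustrative.

```python
from statistics import NormalDist

center = 10.0      # estimated mean of the sampling distribution (inches)
std_error = 0.003  # standard deviation of the sampling distribution (inches)
confidence = 0.95

# Two-sided interval: split the remaining 5% equally between the two tails.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96

lower_limit = center - z * std_error  # about 9.99412
upper_limit = center + z * std_error  # about 10.00588
```

Replacing z with 3 gives the familiar three-sigma limits used on many control charts, which correspond to a wider interval than 95%.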


Hypothesis Testing

• Each time we plot a value on a control chart we are testing a hypothesis.

• Classical hypothesis testing involves 4 steps:

– Formulate the null and alternative hypotheses

– Determine the test statistic

– Determine the rejection region of the null hypothesis based on a chosen level of significance, α

– Make a decision.


Formulating the Null Hypothesis

• In any hypothesis testing problem there are two hypotheses:

– The null hypothesis, Ho, is the proposition being tested.

– The alternative hypothesis, Ha, is formulated as a contradiction to the null hypothesis.

• Example: We expect the mean length of a shaft to be 30 mm. We are interested in determining whether the mean length of a sample of shafts differs from 30 mm:

Ho: μ = 30 mm

Ha: μ ≠ 30 mm


Determining the Test Statistic

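• There are many possible test statistics depending on assumptions made and the parameter being tested. We will use the following statistic to test our hypothesis that the sample mean does not differ from the population mean:

$$z_o = \frac{\bar{X} - \mu_o}{\sigma_{\bar{X}}}$$

where $\bar{X}$ is the sample mean, $\mu_o$ is the mean stated in Ho, and $\sigma_{\bar{X}}$ is the standard deviation of the sample mean.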


Determining the Rejection Region

• The rejection region is determined by the choice of α, the significance of the test. Assuming the sample measurements are distributed normally about the population mean (central limit theorem), the normal distribution is referenced to determine which points constitute a (1 - α) confidence interval about the population mean. The area beyond these points is considered the rejection region.


Make a Decision

• If the test statistic falls in the rejection region: Reject Ho

• If the test statistic does not fall in the rejection region: Fail to reject Ho

• Example: Suppose the estimate of the population standard deviation is 2mm and a sample produces a mean of 25mm. At the .05 level of significance what do you conclude?
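
The four steps can be sketched in Python for this example, continuing the 30 mm shaft hypothesis. The example gives no sample size, so the sketch assumes the 2 mm figure can be used directly as the standard deviation of the sample mean; with a known sample size n it would be divided by the square root of n first. Variable names are illustrative.

```python
from statistics import NormalDist

mu_0 = 30.0       # mean stated in Ho (mm)
x_bar = 25.0      # observed sample mean (mm)
sigma_xbar = 2.0  # ASSUMPTION: 2 mm used as the std dev of the sample mean
alpha = 0.05      # level of significance

# Step 2: test statistic.
z_0 = (x_bar - mu_0) / sigma_xbar  # (25 - 30) / 2 = -2.5

# Step 3: two-sided rejection region, |z| greater than the critical value.
z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96

# Step 4: decision.
reject_null = abs(z_0) > z_critical  # True under these assumptions
decision = "Reject Ho" if reject_null else "Fail to reject Ho"
```

Under the stated assumption the statistic of -2.5 falls beyond -1.96, so it lies in the rejection region.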


Errors in Hypothesis Testing

• There are two types of errors in hypothesis testing:

– A Type I error is rejecting the null hypothesis when it is actually true. Its probability is α, the level of significance.

What is the type I error associated with the hypothesis we tested?

What are the implications to quality control?

– A Type II error is failing to reject the null hypothesis when it is actually false. Its probability is usually written β and depends on the true value of the parameter.

What is the type II error associated with the hypothesis we tested?

What are the implications to quality control?
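
As a starting point for these questions, the sketch below computes both error probabilities for the 30 mm shaft test in Python. It reuses the assumption that 2 mm is the standard deviation of the sample mean. The Type II error needs a specific true mean, so the values 25 mm and 28 mm are purely hypothetical; the function name type_ii_error is also illustrative.

```python
from statistics import NormalDist

mu_0 = 30.0       # mean stated in Ho (mm)
sigma_xbar = 2.0  # ASSUMPTION: std dev of the sample mean (mm)
alpha = 0.05      # level of significance

# Sample means between these limits lead to "fail to reject Ho".
z_critical = NormalDist().inv_cdf(1 - alpha / 2)
lower_limit = mu_0 - z_critical * sigma_xbar  # about 26.08
upper_limit = mu_0 + z_critical * sigma_xbar  # about 33.92


def type_ii_error(true_mean: float) -> float:
    # Probability that the sample mean lands inside the limits
    # even though the process mean has moved to true_mean.
    sampling = NormalDist(mu=true_mean, sigma=sigma_xbar)
    return sampling.cdf(upper_limit) - sampling.cdf(lower_limit)


type_i_error = alpha                    # probability of a false alarm
beta_if_mean_25 = type_ii_error(25.0)   # hypothetical shift to 25 mm, about 0.29
beta_if_mean_28 = type_ii_error(28.0)   # hypothetical shift to 28 mm (larger)
```

In quality control terms, a Type I error is a false alarm on a process that is actually in control, and a Type II error is a missed shift; smaller shifts are harder to detect, so their β is larger.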