mit opencourseware ://ocw.mit.edu/courses/mechanical-engineering/2-830j-control-of... · • random...

Post on 22-Mar-2020

15 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

MIT OpenCourseWare _________http://ocw.mit.edu___

2.830J / 6.780J / ESD.63J Control of Manufacturing Processes (SMA 6303)Spring 2008

For information about citing these materials or our Terms of Use, visit: ________________http://ocw.mit.edu/terms.

1Manufacturing

Control of Manufacturing Processes

Subject 2.830/6.780/ESD.63Spring 2008Lecture #5

Probability Models, Parameter Estimation, and Sampling

February 21, 2008

2Manufacturing

The Normal Distribution

p(x) =1

σ 2πe

− 12

x − μσ

⎛ ⎝ ⎜

⎞ ⎠ ⎟

2

z

0

0.1

0.2

0.3

0.4

-4 -3 -2 -1 0 1 2 3 4

z =x − μ

σ

“Standard normal”

μz=0σz=1

3Manufacturing

Properties of the Normal pdf

• Symmetric about mean• Only two parameters:

μ and σ

• Mean (μ) and Variance ( σ2 ) have well known “estimators” (average and sample variance)

p(x) =1

σ 2πe

− 12

x − μσ

⎛ ⎝ ⎜

⎞ ⎠ ⎟

2

4Manufacturing

Testing the Model: e.g. Is the Process “Normal” ?

• Is the underlying distribution really normal?– Look at histogram– Look at curve fit to histogram– Look at % of data in 1, 2 and 3σ bands

• Confidence Intervals– Look at “kurtosis”

• Measure of deviation from normal– Probability (or qq) plots (see Mont. 3-3.7 or MATLAB

stats toolbox)

5Manufacturing

Kurtosis: Deviation from Normal

k =n(n +1)

(n −1)(n − 2)(n − 3)xi − x

s⎛ ⎝ ⎜

⎞ ⎠ ⎟ ∑

4⎡

⎣ ⎢ ⎢

⎦ ⎥ ⎥

−3(n −1)2

(n − 2)(n − 3)

For sampled data:

k = 0 - normalk > 0 - more “peaked”k < 0 - more “flat”

6Manufacturing

Kurtosis for Some Common Distributions

D: Laplace (k = 3)L: logistic (k = 1.2)N: normal (k = 0)U: uniform (k = -1.2)

Source: Wikimedia Commons, http://commons.wikimedia.org

7Manufacturing

• Plot– normalized (mean

centered and scaled to s)

vs. – theoretical position

of unit normal distribution for ordered data

• Normal distribution: data should fall along line

Quantile-Quantile (qq) Plots

Source: Wikimedia Commons, http://commons.wikimedia.org

8Manufacturing

Guaranteeing “Normality”The Central Limit Theorem

– If x1, x2 ,x3 ...xN … are N independent observations of a random variable with “moments” μx and σ2

x,

– The distribution of the sum of all the samples will tend toward normal.

9Manufacturing

Example: Uniformly Distributed Data

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

10

20

30

40

50

60

70

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

10

20

30

40

50

60

70

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

10

20

30

40

50

60

70

x1

x2

x100

+

+ . . .

Sum of 100 sets of 1000 points each

35 40 45 50 55 60 650

50

100

150

y = xii=1

100

10Manufacturing

Sampling: Using Measurements (Data) to Model the Random Process

• In general p(x) is unknown• Data can suggest form of p(x)

– e.g.. uniform, normal, weibull, etc.• Data can be used to estimate parameters of distributions

– e.g. μ and σ for normal distribution: p(x) = N(μ ,σ2)

• How to estimate– Sample Statistics

• Uncertainty in estimates– Sample Statistic pdf’s

p(x) =1

σ 2πe

− 12

x − μσ

⎛ ⎝ ⎜

⎞ ⎠ ⎟

2

11Manufacturing

Sample Statistics

Average or sample mean

Sample variance

Sample standard deviation

12Manufacturing

Sample Mean Uncertainty

• If all xi come from a distribution with μx and σ2x, and

we divide the sum by n:

x =1n

xii =1

n

μx = μx σ x2 =

1n

σ x2 or σ x =

1n

σ x

x = c1x1 + c2 x2 + c3x3 + K cn xn

ci =1n

Then: and

13Manufacturing

Manufacturing as Random Processes

• All physical processes have a degree of natural randomness

• We can model this behavior using probability distribution functions

• We can calibrate and evaluate the quality of this model from measurement data

14Manufacturing

Formal Use of Statistical Models

• Discrete Variable Distributions and Uses– Attribute Modeling

• Sampling: Key distributions arising in sampling• Chi-square, t, and F distributions

• Estimation: – Reasoning about the population based on a sample

• Some basic confidence intervals• Estimate of mean with variance known• Estimate of mean with variance not known• Estimate of variance

• Hypothesis tests

15Manufacturing

Discrete Distribution: Bernoulli

Bernoulli trial: an experiment with two outcomes

Probability density function (pdf):

x

f(x)

p1 - p

0 1

¼

¾

16Manufacturing

Discrete Distribution: Binomial

Repeated random Bernoulli trials

• n is the number of trials• p is the probability of “success” on any one trial• x is the number of successes in n trials

17Manufacturing

Binomial DistributionBinomial Distribution

0

0.05

0.1

0.15

0.2

0.25

1 3 5 7 9 11 13 15 17 19 21 23 25 27 29

Number of "Successes"

0

0.2

0.4

0.6

0.8

1

1.2

1 3 5 7 9 11 13 15 17 19 21 23 25 27 29

Series1

e.g. expected number of “rejects”

18Manufacturing

Discrete Distribution: Poisson

• Poisson is a good approximation to Binomial when n is large and p is small (< 0.1)

Example applications:# misprints on page(s) of a book# defects on a wafer

Mean:Variance:

19Manufacturing

Poisson Distribution

0

0.01

0.02

0.03

0.04

0.05

0.06

0.07

0.08

0.09

0.1

1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41

Events per unit

c

Poisson Distribution

0

0.02

0.04

0.06

0.08

0.1

0.12

0.14

0.16

0.18

0.2

1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41

Events per unit

c

Poisson Distribution

0

0.01

0.02

0.03

0.04

0.05

0.06

0.07

0.08

1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41

Events per unit

λ=5

λ=20

λ=30

Poisson Distributions

e.g. defects/device

20Manufacturing

Back to Continuous Distributions

• Uniform Distribution• Normal Distribution

– Unit (Standard) Normal Distribution

21Manufacturing

Continuous Distribution: Uniform

pdf

xa

1

b

xa b

cdf

22Manufacturing

Standard Questions For a Known cdf or pdf

xa

1

b

xa b• Probability x sits within

• Probability x less than orequal to some value

some range

23Manufacturing

Continuous Distribution: Normal or Gaussian

cdf

pdf

0

0.16

0.5

0.84

1

0.9770.99865

0.02270.00135

24Manufacturing

Continuous Distribution: Unit Normal

• Normalization

cdf

Mean

Variance

pdf

25Manufacturing

Using the Unit Normal pdf and cdf

• We often want to talk about “percentage points”of the distribution – portion in the tails

0.1

0.5

0.9

1

26Manufacturing

Use of the pdf:Location of Data

• How likely are certain values of the random variable?

• For a “Standard Normal” Distribution:

z = (x − μ)σ

N(0,1)μ=0

σ=1

z =1 ⇒ x =1σz = 2 ⇒ x = 2σz = 3 ⇒ x = 3σ

0

0.1

0.2

0.3

0.4

-4 -3 -2 -1 0 1 2 3 4

27Manufacturing

Location of Data

P(-1≤ z ≤ 1) = P(z ≤1) - P(z ≤-1) = Φ(1) - Φ(-1) (+ 1σ) = 0.841 - (1-0.841) = 0.682

P(-2≤ z ≤ 2) = P(z ≤2) - P(z ≤-2) = 0.977 - (1-0.977) = 0.954(+ 2σ)

P(-3≤ z ≤ 3) = P(z ≤3) - P(z ≤-3) = 0.998 - (1-0.998) = 0.997(+ 3σ)

Φ(z) tabulated (e.g. p. 752 of Montgomery)

29Manufacturing

Statistics

The field of statistics is about reasoning in the face of uncertainty, based on evidence from

observed data

• Beliefs:– Probability distribution or probabilistic model form– Distribution/model parameters

• Evidence:– Finite set of observations or data drawn from a

population (experimental measurements or observations)

• Models:– Seek to explain data wrt a model of their probability

30Manufacturing

Sampling to Determine Parameters of the Parent Probability Distribution

• Assume Process Under Study has a Parent Distribution p(x)• Take “n” Samples From the Process Output (xi)

• Look at Sample Statistics (e.g. sample mean and sample variance)

• Relationship to Parent• Both are Random Variables• Both Have Their Own Probability Distributions

• Inferences about the process (the parent distribution) via Inferences about the derived sampling distribution

31Manufacturing

Moments of the Population vs. Sample Statistics

• Mean

• Variance

• StandardDeviation

• Covariance

• CorrelationCoefficient

Underlying model or Population Probability

Sample Statistics

32Manufacturing

Sampling and Estimation

• Sampling: act of making observations from populations

• Random sampling: when each observation is identically and independently distributed (IID)

• Statistic: a function of sample data; a value that can be computed from data (contains no unknowns) – Average, median, standard deviation– Statistics are by definition also random variables

33Manufacturing

Population vs. Sampling Distribution

Population(“true”)probability density function)

Sample Mean(statistic)

n = 20

n = 10

n = 2

Sample Mean Distribution(sampling distribution)

n = 1

34Manufacturing

Sampling and Estimation, cont.• Sampling• Random sampling• Statistic• A statistic is a random variable, which itself has a

sampling (probability) distribution– I.e., if we take multiple random samples, the value for the

statistic will be different for each set of samples, but will begoverned by the same sampling distribution

• If we know the appropriate sampling distribution, we can reason about the underlying population based on the observed value of a statistic– E.g. we calculate a sample mean from a random sample; in

what range do we think the actual (population) mean sits?

35Manufacturing

Sampling and Estimation – An Example

• Suppose we know that the thickness of a part is normally distributed with std. dev. of 10:

• We sample n = 50 random parts and compute the mean part thickness:

• First question: What is distribution of the mean of T =

• Second question: can we use knowledge of distribution to reason about the actual (population) mean μgiven observed (sample) mean?

36Manufacturing

Estimation and Confidence Intervals• Point Estimation:

– Find best values for parameters of a distribution– Should be

• Unbiased: expected value of estimate should be true value• Minimum variance: should be estimator with smallest variance

• Interval Estimation: – Give bounds that contain actual value with a given

probability– Must know sampling distribution!

37Manufacturing

Confidence Intervals: Variance Known

• We know σ, e.g. from historical data• Estimate mean in some interval to (1-α)100% confidence

0.1

0.5

0.9

1

Remember the unit normal percentage points

Apply to the sampling distribution for the sample mean

38Manufacturing

95% confidence interval, α = 0.05

Example, Cont’d

• Second question: can we use knowledge of distribution to reason about the actual (population) mean μ given observed (sample) mean?

n = 50

~95% of distributionlies within +/- 2σ of mean

39Manufacturing

Summary• Process as Random Variable

– Histograms to pdf’s• Different Distributions for Different Processes

– Discrete or Binary (e.g. Defects)– Continuous (e.g. Dimensional Variation)

• Parent Distributions and Sampling– Estimating the Parent from Data

• Use of Distributions to establish “Confidence” on Parameter Estimates

top related