Hae-Jin Choi School of Mechanical Engineering, Chung-Ang University 2. Experiments with One Factor (Ch.3. Analysis of Variance ANOVA) 1 DOE and Optimization

Upload: lamcong

Post on 06-Mar-2018


  • Hae-Jin Choi School of Mechanical Engineering,

    Chung-Ang University

    2. Experiments with One Factor

    (Ch.3. Analysis of Variance ANOVA)

    1 DOE and Optimization

  • Experiments with One Factor


    Plasma etching process:

  • Plasma etching process:

    An engineer is interested in investigating the relationship between the RF power setting and the etch rate for this tool. The objective of an experiment like this is to model the relationship between etch rate and RF power, and to specify the power setting that will give a desired target etch rate.

    The response variable is etch rate.

    She is interested in a particular gas (C2F6) and gap (0.80 cm), and wants to test four levels of RF power: 160W, 180W, 200W, and 220W. She decided to test five wafers at each level of RF power.

    The experimenter chooses 4 levels of RF power: 160 W, 180 W, 200 W, and 220 W.

    The experiment is replicated 5 times, with the runs made in random order.

  • Plasma etching process:


  • Does changing the power change the mean etch rate?

    We would like to have a systematic way to answer these questions:

    Analysis of Variance (ANOVA)

  • The Analysis of Variance (ANOVA)

    In general, there will be a levels of the factor (or a treatments) and n replicates of the experiment, run in random order: a completely randomized design (CRD).

    N = an total runs

    We consider the fixed effects case; the random effects case will be discussed later.

    The objective is to test hypotheses about the equality of the a treatment means.

  • Statistical Hypothesis

    A statistical hypothesis is a statement about the parameters of a probability distribution. For example, we may think that the mean values of the distributions are equal. This may be stated formally as

    $$H_0: \mu_1 = \mu_2 = \cdots = \mu_a \quad \text{(null hypothesis)}$$

    $$H_1: \text{at least one mean is different} \quad \text{(alternative hypothesis)}$$

    Type I error: the null hypothesis is rejected when it is true.

    Type II error: the null hypothesis is not rejected when it is false.

  • Basic Single-factor ANOVA model

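    $y_{i.}$: the total of the observations for the ith factor level

    $\bar{y}_{i.}$: the average of the observations under the ith factor level

    $y_{..}$: the grand total of all observations

    $\bar{y}_{..}$: the grand average of all observations

    $$y_{i.} = \sum_{j=1}^{n} y_{ij}, \qquad \bar{y}_{i.} = y_{i.}/n, \qquad i = 1, 2, \ldots, a$$

    $$y_{..} = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}, \qquad \bar{y}_{..} = y_{..}/N$$

    The factor effects $\tau_i$ (deviations from the overall mean) satisfy

    $$\sum_{i=1}^{a} \tau_i = 0$$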
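
    A minimal Python sketch of this notation. The data table on the slide did not survive extraction, so the observations below are an assumption: they are the etch-rate values from the textbook version of this example, and they reproduce the factor-level averages 551.2, 587.4, 625.4 and 707.0 used in the contrast slides. Variable names are illustrative.

```python
# Assumed etch-rate observations (textbook values, not recoverable from the slide).
# Key = RF power in watts, value = the n = 5 replicate observations at that level.
etch_rate = {
    160: [575, 542, 530, 539, 570],
    180: [565, 593, 590, 579, 610],
    200: [600, 651, 610, 637, 629],
    220: [725, 700, 715, 685, 710],
}

a = len(etch_rate)           # number of factor levels
n = len(etch_rate[160])      # replicates per level
N = a * n                    # total number of runs

level_totals = {p: sum(obs) for p, obs in etch_rate.items()}   # y_i.
level_means = {p: level_totals[p] / n for p in etch_rate}      # ybar_i.
grand_total = sum(level_totals.values())                       # y_..
grand_mean = grand_total / N                                   # ybar_..

for p in etch_rate:
    print(f"{p} W: total = {level_totals[p]}, average = {level_means[p]:.1f}")
print(f"grand total = {grand_total}, grand average = {grand_mean:.2f}")
```

    The dot in a subscript marks the index that has been summed over, so `level_totals` sums over replicates j and `grand_total` sums over both i and j.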

  • Basic Single-factor ANOVA model

    $$y_{ij} = \mu + \tau_i + \varepsilon_{ij}$$

    [Figure: observations at factor levels i = 1, 2, 3, 4]

  • Basic Single-factor ANOVA model

    $$y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \qquad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, n$$

    $\mu$: a parameter common to all factor levels, called the overall mean

    $\tau_i$: the deviation at the ith factor level from the overall mean, called the ith factor effect

    $\varepsilon_{ij}$: random error, $\mathrm{NID}(0, \sigma^2)$

    The random error typically results from measurement error, the effects of variables not included in the experiment, and so on. The errors are assumed to be (1) normally and independently distributed random variables, (2) with zero mean, and (3) with a variance $\sigma^2$ that is constant for all levels of the factor.

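  • The Analysis of Variance

    Total variability is measured by the total sum of squares:

    $$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 \quad\text{or}\quad SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - \frac{y_{..}^2}{an}$$

    The basic ANOVA partitioning is:

    $$\sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{a}\sum_{j=1}^{n} [(\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.})]^2 = n\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2$$

    $$SS_T = SS_{\text{Treatments}} + SS_E$$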
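
    The partition can be checked numerically; the cross-product term drops out because the deviations within each factor level sum to zero. The sketch below assumes the textbook etch-rate observations for this example (the slide's own data table did not survive extraction), and the variable names are illustrative.

```python
# Assumed etch-rate observations (textbook values); key = RF power in watts.
etch_rate = {
    160: [575, 542, 530, 539, 570],
    180: [565, 593, 590, 579, 610],
    200: [600, 651, 610, 637, 629],
    220: [725, 700, 715, 685, 710],
}

all_obs = [y for obs in etch_rate.values() for y in obs]
N = len(all_obs)
grand_mean = sum(all_obs) / N
level_means = {p: sum(obs) / len(obs) for p, obs in etch_rate.items()}

# Definitional forms of the three sums of squares.
ss_total = sum((y - grand_mean) ** 2 for y in all_obs)
ss_treatments = sum(
    len(obs) * (level_means[p] - grand_mean) ** 2 for p, obs in etch_rate.items()
)
ss_error = sum((y - level_means[p]) ** 2 for p, obs in etch_rate.items() for y in obs)

# Computational ("or") form of SS_T: sum of y_ij^2 minus y_..^2 / (an).
ss_total_shortcut = sum(y ** 2 for y in all_obs) - sum(all_obs) ** 2 / N

print(f"SS_T = {ss_total:.2f} (shortcut form: {ss_total_shortcut:.2f})")
print(f"SS_Treatments + SS_E = {ss_treatments:.2f} + {ss_error:.2f}")
```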

  • The Analysis of Variance

    $SS_{\text{Treatments}}$ is the sum of squares due to the factor, that is, the sum of squared differences between the factor-level averages and the grand average. It measures the differences between factor levels.

    $$SS_{\text{Treatments}} = n\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 \quad\text{or}\quad SS_{\text{Treatments}} = \sum_{i=1}^{a} \frac{y_{i.}^2}{n} - \frac{y_{..}^2}{an}$$

    A large value of $SS_{\text{Treatments}}$ reflects large differences in treatment means.

    A small value of $SS_{\text{Treatments}}$ likely indicates no differences in treatment means.

  • The Analysis of Variance

    $SS_E$ is the sum of squares due to error, that is, the sum of squared differences between the observations within a factor level and that level's average. It measures the random error.

    $$SS_E = \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2 \quad\text{or}\quad SS_E = SS_T - SS_{\text{Treatments}}$$

    If $SS_{\text{Treatments}}$ is large, it is due to differences among the means at the different factor levels.

    Thus, by comparing the magnitude of $SS_{\text{Treatments}}$ to $SS_E$ we can see how much variability is due to changing factor levels and how much is due to error.

    This comparison is facilitated if we first divide these sums of squares by their degrees of freedom.

  • The Analysis of Variance

    While sums of squares cannot be directly compared to test the hypothesis of equal means, mean squares can be compared.

    A mean square is a sum of squares divided by its degrees of freedom:

    $$df_{\text{Total}} = df_{\text{Treatments}} + df_{\text{Error}}$$

    $$an - 1 = (a - 1) + a(n - 1)$$

    $$MS_{\text{Treatments}} = \frac{SS_{\text{Treatments}}}{a - 1}, \qquad MS_E = \frac{SS_E}{a(n - 1)}$$

    If the treatment means are equal, the treatment and error mean squares will be (theoretically) equal, because both then estimate $\sigma^2$.

    If the treatment means differ, the treatment mean square will tend to be larger than the error mean square.
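
    As a rough numerical illustration of the degrees-of-freedom split and the two mean squares, for a = 4 levels and n = 5 replicates. The two sums of squares are assumed values: they are what the textbook etch-rate data for this example give, since the slide's own table did not survive extraction.

```python
a, n = 4, 5                      # factor levels and replicates per level

# Assumed sums of squares (from the textbook etch-rate data, not from the slide).
ss_treatments = 66870.55
ss_error = 5339.20

df_total = a * n - 1             # an - 1
df_treatments = a - 1
df_error = a * (n - 1)

ms_treatments = ss_treatments / df_treatments
ms_error = ss_error / df_error

print(f"df: {df_total} = {df_treatments} + {df_error}")
print(f"MS_Treatments = {ms_treatments:.2f}, MS_E = {ms_error:.2f}")
```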

  • ANOVA Table

    Under the null hypothesis, the ratio of $MS_{\text{Treatments}}$ to $MS_E$ follows an $F_{u,v}$ distribution, which is completely determined by $u$, the numerator degrees of freedom, and $v$, the denominator degrees of freedom.

    In analysis of variance, $u$ is equal to $a - 1$, the degrees of freedom of $MS_{\text{Treatments}}$, and $v$ is equal to $a(n - 1)$, the degrees of freedom of $MS_E$.

    The test statistic is

    $$F_0 = \frac{MS_{\text{Treatments}}}{MS_E}$$

    If $F_0 > F_{\alpha,\,a-1,\,a(n-1)}$, we may conclude that the factor-level means are different.
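
    The whole test can be sketched in a few lines of Python. The function name `one_way_anova` is hypothetical, the observations are the assumed textbook etch-rate values (the slide's data table was lost), and the critical value 3.24 is the standard tabulated $F_{0.05,\,3,\,16}$ rather than something computed here.

```python
def one_way_anova(groups):
    """Single-factor ANOVA quantities for equally replicated factor levels."""
    a = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes n replicates at every level")
    level_means = [sum(g) / n for g in groups]
    grand_mean = sum(level_means) / a            # valid because all levels have n runs
    ss_treatments = n * sum((m - grand_mean) ** 2 for m in level_means)
    ss_error = sum((y - m) ** 2 for g, m in zip(groups, level_means) for y in g)
    df_treatments, df_error = a - 1, a * (n - 1)
    ms_treatments = ss_treatments / df_treatments
    ms_error = ss_error / df_error
    return {
        "df": (df_treatments, df_error),
        "SS": (ss_treatments, ss_error),
        "MS": (ms_treatments, ms_error),
        "F0": ms_treatments / ms_error,
    }


# Assumed etch-rate observations at 160, 180, 200 and 220 W (textbook values).
groups = [
    [575, 542, 530, 539, 570],
    [565, 593, 590, 579, 610],
    [600, 651, 610, 637, 629],
    [725, 700, 715, 685, 710],
]

table = one_way_anova(groups)
F_CRIT = 3.24                    # tabulated F_{0.05, 3, 16}
reject_h0 = table["F0"] > F_CRIT
print(f"F0 = {table['F0']:.2f}, reject H0 at alpha = 0.05: {reject_h0}")
```

    With these assumed data the statistic is far above the critical value, so the sketch rejects the hypothesis of equal mean etch rates.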

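  • ANOVA Table

    The reference distribution for $F_0$ is the $F_{a-1,\,a(n-1)}$ distribution.

    Reject the null hypothesis (equal treatment means) if

    $$F_0 > F_{\alpha,\,a-1,\,a(n-1)}$$

    The significance level $\alpha$ is the probability of Type I error, i.e., the probability of rejecting the null hypothesis when it is actually true. The P-value is the probability, computed assuming the null hypothesis is true, of obtaining an $F$ value at least as large as the observed $F_0$; the null hypothesis is rejected when the P-value is smaller than $\alpha$.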
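
    One way to see what the Type I error probability means is a small simulation in which the null hypothesis is true by construction. The sketch below is illustrative only: the mean 600 and standard deviation 20 are arbitrary assumptions, `f_statistic` is a hypothetical helper, and 3.24 is the tabulated $F_{0.05,\,3,\,16}$.

```python
import random


def f_statistic(groups):
    """F0 = MS_Treatments / MS_E for equally replicated groups."""
    a, n = len(groups), len(groups[0])
    means = [sum(g) / n for g in groups]
    grand_mean = sum(means) / a
    ms_treatments = n * sum((m - grand_mean) ** 2 for m in means) / (a - 1)
    ms_error = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g) / (a * (n - 1))
    return ms_treatments / ms_error


random.seed(1)                   # fixed seed so the run is repeatable
A_LEVELS, N_REPS = 4, 5
F_CRIT = 3.24                    # tabulated F_{0.05, 3, 16}
trials = 4000

rejections = 0
for _ in range(trials):
    # H0 is true: every level is drawn from the same normal population.
    groups = [[random.gauss(600.0, 20.0) for _ in range(N_REPS)] for _ in range(A_LEVELS)]
    if f_statistic(groups) > F_CRIT:
        rejections += 1

type_i_rate = rejections / trials
print(f"fraction of false rejections: {type_i_rate:.3f} (nominal alpha = 0.05)")
```

    The observed fraction of false rejections should land close to the nominal 0.05, up to simulation noise.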

  • Plasma etching process:

  • Why Does the ANOVA Work?

    We are sampling from normal populations, so

    $$\frac{SS_{\text{Treatments}}}{\sigma^2} \sim \chi^2_{a-1} \ \text{ if } H_0 \text{ is true, and } \quad \frac{SS_E}{\sigma^2} \sim \chi^2_{a(n-1)}$$

    Cochran's theorem gives the independence of these two chi-square random variables.

    $$\text{So } F_0 = \frac{SS_{\text{Treatments}}/(a-1)}{SS_E/[a(n-1)]} \sim \frac{\chi^2_{a-1}/(a-1)}{\chi^2_{a(n-1)}/[a(n-1)]} \sim F_{a-1,\,a(n-1)}$$

    $$\text{Finally, } E(MS_{\text{Treatments}}) = \sigma^2 + \frac{n\sum_{i=1}^{a}\tau_i^2}{a-1} \ \text{ and } \ E(MS_E) = \sigma^2$$

    Therefore an upper-tail test is appropriate.
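
    The two expected mean squares can be checked by simulation. This is a sketch under arbitrary assumptions ($\mu = 10$, $\sigma = 1$, and factor effects $\tau_i$ chosen to sum to zero); none of these values come from the slides.

```python
import random

random.seed(2)                           # fixed seed so the run is repeatable
a, n = 4, 5
mu, sigma = 10.0, 1.0                    # assumed overall mean and error std. dev.
tau = [-1.5, -0.5, 0.5, 1.5]             # assumed factor effects, sum to zero
reps = 4000

sum_ms_treatments = 0.0
sum_ms_error = 0.0
for _ in range(reps):
    # Generate data from the model y_ij = mu + tau_i + eps_ij.
    groups = [[mu + t + random.gauss(0.0, sigma) for _ in range(n)] for t in tau]
    means = [sum(g) / n for g in groups]
    grand_mean = sum(means) / a
    sum_ms_treatments += n * sum((m - grand_mean) ** 2 for m in means) / (a - 1)
    sum_ms_error += sum(
        (y - m) ** 2 for g, m in zip(groups, means) for y in g
    ) / (a * (n - 1))

avg_ms_treatments = sum_ms_treatments / reps
avg_ms_error = sum_ms_error / reps
expected_ms_treatments = sigma ** 2 + n * sum(t ** 2 for t in tau) / (a - 1)

print(f"average MS_E          = {avg_ms_error:.3f} (theory: {sigma ** 2:.3f})")
print(f"average MS_Treatments = {avg_ms_treatments:.3f} (theory: {expected_ms_treatments:.3f})")
```

    Because nonzero factor effects can only inflate the numerator mean square, large values of $F_0$ are the evidence against $H_0$, which is why the test is upper-tailed.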

  • Residual Analysis

    We define a residual as the difference between an observation and the corresponding factor-level average:

    $$e_{ij} = y_{ij} - \bar{y}_{i.}$$

    The normality assumption can be checked by plotting the residuals on normal probability paper.

    Residual Plot

    Normal Probability Plot
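
    A short sketch of the residual calculation, again assuming the textbook etch-rate observations because the slide's data table was lost. Within each factor level the residuals sum to zero by construction.

```python
# Assumed etch-rate observations (textbook values); key = RF power in watts.
etch_rate = {
    160: [575, 542, 530, 539, 570],
    180: [565, 593, 590, 579, 610],
    200: [600, 651, 610, 637, 629],
    220: [725, 700, 715, 685, 710],
}

# e_ij = y_ij - ybar_i. for every observation, grouped by factor level.
residuals = {}
for power, obs in etch_rate.items():
    level_mean = sum(obs) / len(obs)
    residuals[power] = [y - level_mean for y in obs]

for power, e in residuals.items():
    print(power, [round(x, 1) for x in e])
```

    Plotting each list against its power setting gives the residual-versus-factor-level plot; plotting all values in run order would need the randomized run sequence, which is not available here.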

  • Residual Analysis


    Plasma etching process

  • Residual Plot


    Plasma etching process

  • Residual Plot

    To check the assumption of zero mean and equal variances at each factor level, plot the residuals against the factor levels and compare the spread in the residuals.

    The independence assumption can be checked by plotting the residuals against the run order in which the experiment was performed.

    A pattern in this plot, such as sequences of positive and negative residuals, may indicate that the observations are not independent.

  • Normal Probability Plot


  • Normal Probability Plot

    A normal probability plot is a graph of the cumulative distribution of the residuals on normal probability paper, that is, graph paper with the ordinate scaled so that the cumulative normal distribution plots as a straight line.

    To construct a normal probability plot, arrange the residuals in increasing order and plot the kth of these ordered residuals versus the cumulative probability point

    $$P_k = (k - 0.5)/N$$

    on normal probability paper. If the underlying error distribution is normal, this plot will resemble a straight line.
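
    The plotting positions can be computed without special graph paper by converting each $P_k$ to a standard normal quantile, which plays the role of the scaled ordinate. A sketch using Python's standard library, with the assumed textbook etch-rate observations (the slide's data table was lost):

```python
from statistics import NormalDist

# Assumed etch-rate observations (textbook values); key = RF power in watts.
etch_rate = {
    160: [575, 542, 530, 539, 570],
    180: [565, 593, 590, 579, 610],
    200: [600, 651, 610, 637, 629],
    220: [725, 700, 715, 685, 710],
}

# Residuals e_ij = y_ij - ybar_i., pooled over all levels and sorted.
ordered = sorted(
    y - sum(obs) / len(obs) for obs in etch_rate.values() for y in obs
)
N = len(ordered)

points = []
for k, e in enumerate(ordered, start=1):
    p_k = (k - 0.5) / N                  # cumulative probability point
    z_k = NormalDist().inv_cdf(p_k)      # matching standard normal quantile
    points.append((e, p_k, z_k))

for e, p_k, z_k in points:
    print(f"residual = {e:7.1f}   P_k = {p_k:.3f}   z = {z_k:6.3f}")
```

    Plotting the residuals against the `z` column and looking for a roughly straight line is equivalent to using normal probability paper.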

  • Practical Interpretation of Results


    Regression Model

    Contrasts

  • Regression Model


    Plasma etching process

  • Contrasts

    We might suspect at the outset of the experiment that 200 W and 220 W produce the same etch rate, implying that we would like to test the hypothesis:

    $$H_0: \mu_3 = \mu_4, \qquad H_1: \mu_3 \neq \mu_4 \qquad \text{or} \qquad \mu_3 - \mu_4 = 0$$

    This hypothesis could be tested by investigating an appropriate linear combination of factor-level averages, say

    $$\bar{y}_{3.} - \bar{y}_{4.} = 0$$

  • Contrasts

    If we had suspected that the average of 160 W and 180 W did not differ from the average of 200 W and 220 W, then the hypothesis would have been

    $$H_0: \mu_1 + \mu_2 = \mu_3 + \mu_4, \qquad H_1: \mu_1 + \mu_2 \neq \mu_3 + \mu_4 \qquad \text{or} \qquad \mu_1 + \mu_2 - \mu_3 - \mu_4 = 0$$

    which implies the linear combination

    $$\bar{y}_{1.} + \bar{y}_{2.} - \bar{y}_{3.} - \bar{y}_{4.} = 0$$

  • Contrasts

    In general, the comparison of factor-level means of interest will imply a linear combination of factor-level averages such as

    $$C = \sum_{i=1}^{a} c_i \bar{y}_{i.}$$

    with the restriction that

    $$\sum_{i=1}^{a} c_i = 0.$$

    Such linear combinations are called contrasts. The sum of squares of any contrast is

    $$SS_C = \frac{\left(\sum_{i=1}^{a} c_i \bar{y}_{i.}\right)^2}{\frac{1}{n}\sum_{i=1}^{a} c_i^2}$$
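
    The contrast formula translates directly into code. In this sketch the function name `contrast_sum_of_squares` is hypothetical; the factor-level averages are the ones that appear in the worked plasma etching numbers, and the example contrast is the one for $H_0: \mu_3 = \mu_4$.

```python
def contrast_sum_of_squares(coeffs, level_means, n):
    """Return (C, SS_C) for a contrast on factor-level averages with n replicates each."""
    if len(coeffs) != len(level_means):
        raise ValueError("need one coefficient per factor level")
    if abs(sum(coeffs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    c_value = sum(c * m for c, m in zip(coeffs, level_means))
    ss_c = c_value ** 2 / (sum(c * c for c in coeffs) / n)
    return c_value, ss_c


# Factor-level averages for 160, 180, 200 and 220 W, as used in the worked example.
level_means = [551.2, 587.4, 625.4, 707.0]

# Contrast for H0: mu_3 = mu_4, i.e. coefficients (0, 0, +1, -1).
c3, ss_c3 = contrast_sum_of_squares([0, 0, 1, -1], level_means, n=5)
print(f"C = {c3:.1f}, SS_C = {ss_c3:.2f}")
```

    Coefficients that do not sum to zero are rejected, since such a combination is not a contrast.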

  • Contrasts

    Plasma etching problem

    Hypothesis                                      Contrast

    $H_0: \mu_1 = \mu_2$                            $C_1 = \bar{y}_{1.} - \bar{y}_{2.}$

    $H_0: \mu_1 + \mu_2 = \mu_3 + \mu_4$            $C_2 = \bar{y}_{1.} + \bar{y}_{2.} - \bar{y}_{3.} - \bar{y}_{4.}$

    $H_0: \mu_3 = \mu_4$                            $C_3 = \bar{y}_{3.} - \bar{y}_{4.}$

  • Contrasts

    Plasma etching problem

    $$C_1 = +1(551.2) - 1(587.4) = -36.2, \qquad SS_{C_1} = \frac{(-36.2)^2}{\frac{1}{5}(2)} = 3276.10$$

    $$C_2 = +1(551.2) + 1(587.4) - 1(625.4) - 1(707.0) = -193.8, \qquad SS_{C_2} = \frac{(-193.8)^2}{\frac{1}{5}(4)} = 46{,}948.05$$

    $$C_3 = +1(625.4) - 1(707.0) = -81.6, \qquad SS_{C_3} = \frac{(-81.6)^2}{\frac{1}{5}(2)} = 16{,}646.40$$

  • Contrasts

    Plasma etching problem

    A contrast is tested by comparing its sum of squares to the error mean square. The resulting statistic is distributed as F with 1 and a(n-1) degrees of freedom.

    A contrast always has a single degree of freedom, because it compares two quantities (two means or two groups of means) through a single equation.
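
    The three contrast tests can be sketched as follows. The contrast sums of squares are the values worked out on the earlier slide; the error mean square 333.70 is an assumption (it is what the textbook etch-rate data give, since the slide's ANOVA table did not survive), and 4.49 is the tabulated $F_{0.05,\,1,\,16}$.

```python
MS_ERROR = 333.70        # assumed MS_E with a(n-1) = 16 degrees of freedom
F_CRIT_1_16 = 4.49       # tabulated F_{0.05, 1, 16}

# Contrast sums of squares from the worked plasma etching example.
contrast_ss = {
    "C1 (mu1 = mu2)": 3276.10,
    "C2 (mu1 + mu2 = mu3 + mu4)": 46948.05,
    "C3 (mu3 = mu4)": 16646.40,
}

# Each contrast has 1 degree of freedom, so its mean square equals its sum of squares.
f_values = {name: ss / MS_ERROR for name, ss in contrast_ss.items()}
significant = {name: f0 > F_CRIT_1_16 for name, f0 in f_values.items()}

for name, f0 in f_values.items():
    print(f"{name}: F0 = {f0:.2f}, reject H0 at alpha = 0.05: {significant[name]}")
```

    Under these assumptions all three statistics exceed the critical value, so each of the three hypotheses would be rejected at the 5% level.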