Response Surface Method, Principal Component Analysis


Upload: lorand

Post on 16-Feb-2016


Page 1: Response Surface Method, Principal Component Analysis

Daniel Baur / Numerical Methods for Chemical Engineers / RSM & PCA

$x_{\mathrm{opt}} = \arg\max_{x} S(x)$

Daniel Baur
ETH Zurich, Institut für Chemie- und Bioingenieurwissenschaften
ETH Hönggerberg / HCI F128 – Zürich
E-Mail: [email protected]
http://www.morbidelli-group.ethz.ch/education/index

Page 2: Definitions

The response surface method is a tool to investigate the response of a variable to changes in a set of design or explanatory variables, and to find the optimal conditions for the response.

Example: Consider a chemical process where the yield is an (unknown) function of temperature and pressure, and you want to maximize the yield:

$Y = Y(T, P)$

Page 3: COVT Approach

COVT stands for «Change One Variable per Time». Often, experimentation starts in a region far from the optimum.

Example: We do not know the response surface for Y(T,P), but we start investigating it by first changing T, then P.

This approach makes a fundamental assumption: the effect of changing one variable is independent of the effects of changes in the others.

This is usually not true!

Page 4: COVT Approach (Example)

[Figure: contour curves for the yield (Y) at levels 50, 60, 70 and 80 in the P-T plane. Labels: starting point, design of experiments, "Optimum ???" and "Optimum !!!".]

Page 5: 2^k Factorial Design

[Figure: contour curves for the yield (Y) at levels 50, 60, 70 and 80 in the P-T plane, with the design of experiments at the coded levels -1 and +1 and the optimum marked.]

P     T     Y
-1    -1    40
-1    +1    78
+1    -1    59
+1    +1    58

Initial investigation starts with a first order approximation of the response surface.
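As a small illustration of the first order approximation, the four responses of the table above can be fitted with a plane. This is a minimal Python sketch, not the course's own code; the function name `fit_first_order` and the coefficient names are assumptions made for the example.

```python
# Runs of the 2^2 design as (p, t, Y) in coded units, copied from the table.
runs = [(-1, -1, 40.0), (-1, +1, 78.0), (+1, -1, 59.0), (+1, +1, 58.0)]


def fit_first_order(runs):
    """Least-squares fit of Y = b0 + b_p*p + b_t*t for a +/-1 factorial design.

    The coded columns are orthogonal and contain only +/-1, so each
    coefficient is the dot product of its column with Y divided by the
    number of runs. This shortcut does not hold for other designs.
    """
    n = len(runs)
    b0 = sum(y for _, _, y in runs) / n
    b_p = sum(p * y for p, _, y in runs) / n
    b_t = sum(t * y for _, t, y in runs) / n
    return b0, b_p, b_t


b0, b_p, b_t = fit_first_order(runs)
print(b0, b_p, b_t)  # 58.75 -0.25 9.25
```

For these four runs the plane rises almost entirely along T, which already suggests in which direction to look for better yields.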

Page 6: Example: Plastic Wrap

The strength of a plastic wrap (Y) is a function of the sealing temperature (T) and the percentage of polyethylene additive (P). A process engineer tries to make the wrap as strong as possible (maximize Y).

The response function (unknown to the engineer!) reads:

$Y = -20 + 0.85\,T + 1.5\,P - 0.0025\,T^2 - 0.375\,P^2 + 0.025\,T P$

Starting conditions: T = 140 °C, P = 4.0%

Optimal conditions (analytical): T = 216 °C, P = 9.2%
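The analytical optimum can be checked by setting both partial derivatives of the response function to zero. The sketch below assumes the signs -20, -0.0025 T^2, -0.375 P^2 and +0.025 T P, which are not legible in the transcript but are the combination that reproduces the stated optimum; the function names are made up for the example.

```python
def true_response(T, P):
    """Plastic-wrap strength; the signs of the terms are an assumption."""
    return -20 + 0.85 * T + 1.5 * P - 0.0025 * T**2 - 0.375 * P**2 + 0.025 * T * P


def analytical_optimum():
    """Solve grad Y = 0, a 2x2 linear system, with Cramer's rule.

    dY/dT = 0.85 - 0.005*T + 0.025*P = 0
    dY/dP = 1.5  + 0.025*T - 0.75*P  = 0
    """
    a11, a12, r1 = -0.005, 0.025, -0.85
    a21, a22, r2 = 0.025, -0.75, -1.5
    det = a11 * a22 - a12 * a21
    T = (r1 * a22 - a12 * r2) / det
    P = (a11 * r2 - a21 * r1) / det
    return T, P


T_opt, P_opt = analytical_optimum()
print(T_opt, P_opt, true_response(T_opt, P_opt))
```

Under these assumptions the solution is T = 216 and P = 9.2 with a strength of about 78.7, in line with the highest contour level of 78 in the contour plots.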

Page 7: True Response Surface

[Figure: contour plot of the true response surface, contour levels from 20 to 78. x-axis: PE Additive (%), 0 to 15; y-axis: Temperature (°C), 100 to 300. The starting point and the optimum are marked.]

Page 8: 2^k Factorial Design

T      P     t (coded)   p (coded)
120    2     -1          -1
120    6     -1          +1
160    2     +1          -1
160    6     +1          +1

$t = \dfrac{T - 140}{20}, \qquad p = \dfrac{P - 4}{2}$

Initial regression model:

$Y = b_0 + b_1\,p + b_2\,t$

[Figure: the four design points at the corners (-1, +1) of the coded square.]
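The conversion between natural and coded variables is a plain shift and scale. A minimal sketch, assuming the center (140 °C, 4 %) and half ranges (20 °C, 2 %) of the design table; the helper names `to_coded` and `to_natural` are hypothetical.

```python
CENTER = {"T": 140.0, "P": 4.0}      # center of the design (starting conditions)
HALF_RANGE = {"T": 20.0, "P": 2.0}   # distance from the center to the +/-1 levels


def to_coded(T, P):
    """Map natural units (deg C, %) to coded units (t, p)."""
    return (T - CENTER["T"]) / HALF_RANGE["T"], (P - CENTER["P"]) / HALF_RANGE["P"]


def to_natural(t, p):
    """Map coded units (t, p) back to natural units (T, P)."""
    return CENTER["T"] + HALF_RANGE["T"] * t, CENTER["P"] + HALF_RANGE["P"] * p


# The four corner points of the 2^2 design, in natural units.
design = [to_natural(t, p) for t in (-1, 1) for p in (-1, 1)]
print(design)  # [(120.0, 2.0), (120.0, 6.0), (160.0, 2.0), (160.0, 6.0)]
```

Working in coded units keeps the regression coefficients comparable, because every variable then spans the same range from -1 to +1.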

Page 9: 2^2 Factorial Design

[Figure: 3D plot of the true response surface Y (axis 45 to 75) over the coded variables p and t (-1 to 1), with the contour curves of Y.]

Page 10: First Order Regression

[Figure: 3D plot of the regressed response Y (axis 40 to 80) over the coded variables p and t.]

Page 11: 2^k Factorial Design with Center Point

T      P     t (coded)   p (coded)
120    2     -1          -1
120    6     -1          +1
160    2     +1          -1
160    6     +1          +1
140    4      0           0

$t = \dfrac{T - 140}{20}, \qquad p = \dfrac{P - 4}{2}$

Initial regression model:

$Y = b_0 + b_1\,p + b_2\,t$

[Figure: the four design points at the corners (-1, +1) of the coded square, plus the center point.]

The center point does not influence the regression of the slope.
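Why the center point leaves the slopes untouched can be seen numerically: its coded coordinates are zero, so it contributes nothing to the sums that determine b1 and b2. The sketch below uses illustrative, noise-free responses computed from the plastic-wrap function with the signs assumed as -20, -0.0025 T^2, -0.375 P^2 and +0.025 T P; these numbers are an assumption, not data from the slides.

```python
def fit_plane(runs):
    """Least-squares fit of Y = b0 + b1*p + b2*t from (p, t, Y) runs.

    Valid when the coded columns sum to zero and are mutually orthogonal,
    which holds for a 2^2 design with or without center points.
    """
    n = len(runs)
    b0 = sum(y for _, _, y in runs) / n
    b1 = sum(p * y for p, _, y in runs) / sum(p * p for p, _, _ in runs)
    b2 = sum(t * y for _, t, y in runs) / sum(t * t for _, t, _ in runs)
    return b0, b1, b2


# Illustrative responses at the corners (assumed, see lead-in).
corners = [(-1, -1, 53.5), (+1, -1, 59.5), (-1, +1, 61.5), (+1, +1, 71.5)]
center = [(0, 0, 64.0)]

print(fit_plane(corners))           # (61.5, 4.0, 5.0)
print(fit_plane(corners + center))  # (62.0, 4.0, 5.0)
```

Only the intercept moves when the center point is added; the difference between the two intercepts is what the curvature check builds on.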

Page 12: 2^2 Factorial Design with Center Point

[Figure: 3D plot of the true response surface Y (axis 40 to 80) over the coded variables p and t (-1.5 to 1.5), with the contour curves of Y and the experimental responses.]

Page 13: First Order Regression

[Figure: 3D plot of the regressed response Y (axis 40 to 80) over the coded variables p and t (-1.5 to 1.5).]

Page 14: Curvature

The center point can give us an indication about the curvature of the surface and its statistical significance

If there is no curvature and the linear model is appropriate in the region of interest, then the average value of the experimental responses in the center point(s) and in all the corners is roughly equal (within the standard deviation)

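$s_{\mathrm{curv}} = t_{\alpha/2,\; n_{\mathrm{center}}-1}\,\sqrt{\operatorname{var}(Y_{\mathrm{center}})\left(\dfrac{1}{n_{\mathrm{center}}} + \dfrac{1}{2^{k}}\right)}$

$C_{\pm} = E[Y_{\mathrm{center}}] - E[Y_{\mathrm{corner}}] \pm s_{\mathrm{curv}}$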
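A curvature check along these lines can be sketched as follows. The sketch assumes the common textbook form of the test, in which the variance is estimated from replicated center points and the half-width is a t quantile times the standard error of the difference of the two means; the exact formula on the slide is not fully legible, so treat this form as an assumption. The t quantile is passed in by hand (2.776 is the two-sided 95 % value for 4 degrees of freedom), and all response values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def curvature_interval(y_center, y_corner, t_crit):
    """Return (C_minus, C_plus) for C = mean(Y_center) - mean(Y_corner).

    Assumed half-width:
    t_crit * sqrt(var(Y_center) * (1/n_center + 1/n_corner)).
    """
    diff = mean(y_center) - mean(y_corner)
    half_width = t_crit * sqrt(
        variance(y_center) * (1 / len(y_center) + 1 / len(y_corner))
    )
    return diff - half_width, diff + half_width


def curvature_is_significant(y_center, y_corner, t_crit):
    """Curvature is called significant if the interval excludes zero."""
    c_minus, c_plus = curvature_interval(y_center, y_corner, t_crit)
    return not (c_minus <= 0.0 <= c_plus)


# Hypothetical data: five replicated center runs and four corner runs.
y_center = [64.3, 63.8, 64.1, 63.6, 64.2]
y_corner = [53.5, 59.5, 61.5, 71.5]
print(curvature_interval(y_center, y_corner, t_crit=2.776))
print(curvature_is_significant(y_center, y_corner, t_crit=2.776))
```

If the interval contains zero, the planar model is adequate in the region and the steepest ascent search can continue; otherwise a second order design is the next step.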

Page 15: Tukey-Anscombe Plot

[Figure: Tukey-Anscombe plot. x-axis: regressed Y (50 to 75); y-axis: residuals (-4 to 3).]

Page 16: Steepest Ascent Direction

[Figure: contour lines of the regressed first order surface in the coded p-t plane (-1.5 to 1.5), with the experimental points and the steepest ascent direction.]

Page 17: Steepest Ascent Direction

[Figure: 3D plot of Y (axis 45 to 80) over the coded variables p (-1.5 to 1.5) and t (-1 to 1).]

Page 18: Monodimensional Search

[Figure: contour plot of Y (contour levels from 20 to 78) in the P-T plane, P from 0 to 15 and T from 100 to 300, with the steepest ascent direction and the monodimensional search along it.]

Page 19: Monodimensional Search

[Figure: Y (64 to 80) versus step number (0 to 9), showing the experimental points and the true response along the steepest ascent direction.]

Page 20: 2^2 Factorial Design with Center Points

[Figure: coded p-t plane (-1.5 to 1.5) showing the new 2^k factorial design, the maximum from the monodimensional search and the maximum of the response surface (unknown).]

Page 21: 2^2 Factorial Design with Center Points

[Figure: 3D plot of the true response surface (axis 70 to 80) over the coded variables p and t (-1.5 to 1.5), with the experimental points.]

Page 22: First Order Regression

[Figure: 3D plot of the regressed response (axis 70 to 80) over the coded variables p and t (-1.5 to 1.5).]

Page 23: Central Composite Design

[Figure: coded p-t plane (-1.5 to 1.5) showing the points of the 2^k factorial design and the additional points of the central composite design at distance $r = 2^{1/2}$.]

At least three different levels are needed to estimate a second order function.
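The point set of such a design is easy to generate. The sketch below is an assumption-laden illustration: the function name `central_composite_design` is hypothetical, the axial distance defaults to the value r = 2^(1/2) shown for two variables, and other choices of r or of the number of center points are equally possible.

```python
from itertools import product
from math import sqrt


def central_composite_design(k=2, r=sqrt(2.0), n_center=1):
    """Coded points of a central composite design in k variables.

    corners: the 2^k factorial points at +/-1
    axial:   2*k points at distance r from the center, one axis at a time
    center:  n_center replicates of the origin
    """
    corners = [tuple(float(v) for v in c) for c in product((-1, 1), repeat=k)]
    axial = []
    for axis in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[axis] = sign * r
            axial.append(tuple(point))
    center = [tuple([0.0] * k)] * n_center
    return corners + axial + center


points = central_composite_design()
for point in points:
    print(point)
```

Each variable now takes five levels (-r, -1, 0, +1, +r), which is more than the three needed for a second order fit.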

Page 24: Central Composite Design

[Figure: 3D plot of Y (axis 70 to 80) over the coded variables p (-1.5 to 1.5) and t (-1 to 1).]

$Y = \beta_0 + \beta_1\,p + \beta_2\,t + \beta_3\,p^2 + \beta_4\,t^2 + \beta_5\,p\,t$
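Once the six coefficients of the second order model are known, its stationary point and the nature of that point follow from a 2x2 linear system and the eigenvalues of the Hessian. A minimal sketch for two variables, assuming the coefficient order (beta0, ..., beta5) of the model above; the example coefficients are illustrative and correspond to the plastic-wrap function written in coded variables around (140 °C, 4 %) under assumed signs.

```python
from math import sqrt


def stationary_point(beta):
    """Solve grad Y = 0 for Y = b0 + b1*p + b2*t + b3*p^2 + b4*t^2 + b5*p*t.

    The gradient condition is the linear system
    [[2*b3, b5], [b5, 2*b4]] @ [p, t] = [-b1, -b2], solved by Cramer's rule.
    """
    _, b1, b2, b3, b4, b5 = beta
    det = 4.0 * b3 * b4 - b5 * b5
    p = (-2.0 * b4 * b1 + b5 * b2) / det
    t = (-2.0 * b3 * b2 + b5 * b1) / det
    return p, t


def hessian_eigenvalues(beta):
    """Eigenvalues of the symmetric Hessian [[2*b3, b5], [b5, 2*b4]]."""
    _, _, _, b3, b4, b5 = beta
    mid = b3 + b4
    rad = sqrt((b3 - b4) ** 2 + b5 * b5)
    return mid - rad, mid + rad


beta = (64.0, 4.0, 5.0, -1.5, -1.0, 1.0)  # illustrative coefficients
p_s, t_s = stationary_point(beta)
print(p_s, t_s)                   # coded coordinates of the stationary point
print(hessian_eigenvalues(beta))  # both negative means a maximum
```

With these coefficients the stationary point lies at p = 2.6, t = 3.8, which maps back to P = 9.2 % and T = 216 °C, and both eigenvalues are negative.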

Page 25: Central Composite Design

[Figure: Tukey-Anscombe plot. x-axis: regressed Y (73 to 79); y-axis: residuals (-2.5 to 2.5).]

Page 26: Response Surface Method Algorithm

1. Use a 2^k factorial design to generate linearization points around a starting point x(0), where k is the number of variables

2. Fit a linear regression model

$Y = b_0 x_0 + b_1 x_1 + \dots + b_{k-1} x_{k-1} + b_k$

3. Check if the curvature is large. If so, jump to point 7. If you think you are far from the maximum, you can try smaller steps.

4. Find the steepest ascent direction

$d = \dfrac{1}{\sqrt{\sum_{i=0}^{k-1} b_i^2}}\,\left(b_0, b_1, \dots, b_{k-1}\right)^T$

Page 27: Response Surface Method Algorithm (Continued)

5. Conduct experiments at points along the steepest ascent direction

$x^{(k)} = x^{(0)} + k\,d, \qquad k = 1, 2, \dots$

6. When a maximum in the response variable occurs, set x(0) = x(k) and go back to point 1.

7. Perform a central composite design around the current point. Fit a second order linear regression.

8. Find the extremum of the regression surface by setting its gradient equal to zero and solving the resulting linear system

9. Check that the Hessian is negative definite (all eigenvalues < 0) to ensure a maximum in the function
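Steps 4 to 6 of the algorithm can be sketched in a few lines. In this illustration the function `response` stands in for running an experiment; it is a hypothetical noise-free quadratic in coded variables, and the slopes (4, 5) are the assumed result of a first order fit, so none of the numbers are course data.

```python
from math import sqrt


def steepest_ascent_direction(slopes):
    """Normalize the slope coefficients of the linear model to unit length."""
    norm = sqrt(sum(b * b for b in slopes))
    return [b / norm for b in slopes]


def line_search(response, x0, d, max_steps=20):
    """Evaluate x(k) = x(0) + k*d for k = 1, 2, ... until the response drops.

    Returns the best point found and its response value.
    """
    best_x, best_y = list(x0), response(x0)
    for k in range(1, max_steps + 1):
        x = [xi + k * di for xi, di in zip(x0, d)]
        y = response(x)
        if y <= best_y:
            break  # passed the maximum along this direction
        best_x, best_y = x, y
    return best_x, best_y


def response(x):
    """Hypothetical stand-in for an experiment, in coded variables (p, t)."""
    p, t = x
    return 64 + 4 * p + 5 * t - 1.5 * p**2 - t**2 + p * t


d = steepest_ascent_direction([4.0, 5.0])
best_x, best_y = line_search(response, [0.0, 0.0], d)
print(d, best_x, best_y)
```

With unit steps the search stops after the fifth step at a response of about 78.3; the best point then becomes the center of the next factorial design.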

Page 28: Principal Component Analysis (PCA)

Consider a large set of data (e.g., many spectra (n) of a chemical reaction as a function of the wavelength (p)).

Objective: data reduction. Find a smaller set of (k) derived (composite) variables that retain as much information as possible.

[Figure: the n × p data matrix A is reduced to an n × k matrix X.]

Page 29: PCA

PCA takes a data matrix of n objects by p variables, which may be correlated, and summarizes it by uncorrelated axes (principal components or principal axes) that are linear combinations of the original p variables.

New axes = new coordinate system.

Construct the covariance matrix of the data (which need to be centered), and find its eigenvalues and eigenvectors.
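The covariance route can be followed by hand for the smallest interesting case of two variables, where the eigenvalues and eigenvectors of the 2x2 covariance matrix have a closed form. This is a sketch under that restriction, with a hypothetical function name and made-up data; for real spectra with many wavelengths a numerical eigenvalue solver or the SVD would be used instead.

```python
from math import atan2, cos, sin, sqrt


def pca_2d(data):
    """PCA of n observations of two variables via the covariance matrix.

    Returns (eigenvalues, principal axes, scores), largest eigenvalue first.
    """
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    centered = [(x - mean_x, y - mean_y) for x, y in data]

    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)

    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mid = (sxx + syy) / 2.0
    rad = sqrt(((sxx - syy) / 2.0) ** 2 + sxy**2)
    eigenvalues = (mid + rad, mid - rad)

    # Orientation of the first principal axis, then the orthogonal second one.
    theta = 0.5 * atan2(2.0 * sxy, sxx - syy)
    pc1 = (cos(theta), sin(theta))
    pc2 = (-sin(theta), cos(theta))

    # Scores: coordinates of the centered data in the new axes.
    scores = [(x * pc1[0] + y * pc1[1], x * pc2[0] + y * pc2[1]) for x, y in centered]
    return eigenvalues, (pc1, pc2), scores


# Made-up data lying exactly on the line y = 2x.
eigenvalues, axes, scores = pca_2d([(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
print(eigenvalues)
print(axes[0])
```

Because the example data lie on a straight line, the second eigenvalue is zero and a single derived variable carries all the information.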

Page 30: PCA in Matlab

There are two possibilities to perform PCA with Matlab:

1) Use singular value decomposition: [U,S,V]=svd(data); where U contains the scores (scaled to unit length; the scores in the units of the data are U*S) and V contains the eigenvectors of the covariance matrix, or loading vectors. The data must be centered first. SVD does not require the Statistics Toolbox.

2) [COEFF,Scores]=princomp(data); is a specialized command to perform principal component analysis. It requires the Statistics Toolbox.
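The link between the SVD route and the covariance matrix can be written out in a few lines. The symbols are assumptions made for this note: D is the centered n × p data matrix, C its covariance matrix, Z the matrix of scores and s_i the singular values.

```latex
D = U S V^{T}
\quad\Longrightarrow\quad
C = \frac{1}{n-1}\, D^{T} D = V \,\frac{S^{T} S}{n-1}\, V^{T},
\qquad
\lambda_i = \frac{s_i^{2}}{n-1},
\qquad
Z = D V = U S
```

The columns of V are therefore the eigenvectors of C (the loading vectors), the eigenvalues are the squared singular values divided by n - 1, and the columns of U are the scores rescaled to unit length.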

Page 31: Exercise 1

A chemical engineer tries to optimize a reaction by maximizing the yield. There are two variables which influence the yield: the reaction time and the reaction temperature. Currently, the reaction is carried out for 35 minutes at 155 °F, resulting in a yield of about 40%.

Three sets of experiments were conducted, given in the data files reactionYield-1 through 3. The datasets are structured identically: the first two columns are time and temperature, the third and fourth columns are the same variables in coded units (-1, +1, etc.), and the last column is the yield y.

Page 32: Assignment 1

1. The first data set is near the current operating point. Fit a first order (planar) surface to the data. What is the direction of the steepest ascent? Plot the operating conditions, experimental design points and the direction you found in the parameter plane Time vs. Temperature.

2. The second data set contains more experiments in the direction found in part 1. Plot the data (for example as Yield vs. Temperature) and find out where the yield reaches a maximum along this direction.

Page 33: Assignment 1 (Continued)

3. The maximum in 2. is used for another first order design; this data is found in the third data set. Show that the curvature of the response surface is significantly different from zero.

4. The data from 3. is now extended to a central composite design. Fit a second order (quadratic) response surface to the data and calculate the maximum analytically. If you are using LinearModel, you can specify second order terms in the modelspec by using the * and ^ operators, for example 'y ~ a*b' will incorporate a, b and a*b, and 'y ~ a^2' will use the quadratic term. So for two variables a and b, the modelspec string for a second order linear regression will read 'y ~ a^2 + a*b + b^2'
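What a second order linear regression does under the hood can be sketched without any toolbox: build one column per model term and solve the normal equations. The code below is an illustration in Python rather than a LinearModel call; the function names are hypothetical, the column order (1, a, b, a^2, a*b, b^2) is a choice made here, and the data are synthetic values generated from a known quadratic on a central composite design.

```python
from math import sqrt


def solve_linear_system(A, b):
    """Gaussian elimination with partial pivoting for a square system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def fit_quadratic(points, y):
    """Least-squares fit of y = c0 + c1*a + c2*b + c3*a^2 + c4*a*b + c5*b^2."""
    X = [[1.0, a, b, a * a, a * b, b * b] for a, b in points]
    m = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(m)] for i in range(m)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(m)]
    return solve_linear_system(XtX, Xty)


# Synthetic check: central composite design points and a known quadratic.
r = sqrt(2.0)
points = [(-1, -1), (1, -1), (-1, 1), (1, 1), (-r, 0), (r, 0), (0, -r), (0, r), (0, 0)]
known = [64.0, 4.0, 5.0, -1.5, 1.0, -1.0]
y = [known[0] + known[1] * a + known[2] * b + known[3] * a * a
     + known[4] * a * b + known[5] * b * b for a, b in points]
coefficients = fit_quadratic(points, y)
print([round(c, 6) for c in coefficients])
```

The fit recovers the coefficients used to generate the data, which is a convenient sanity check before applying the same steps to measured yields.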

Page 34: Assignment 2

The dataset d_react contains data of IR spectra measured during a chemical reaction (122 x 700). The first row contains the wavelength, all other rows the spectra.

1. Create a matrix centeredData, obtained by centering the data, i.e. subtracting the column mean from each column. What can you observe when looking at the centered spectra? What distinguishes the different observations (spectra) regarding the different variables (wavelengths)?

2. Perform singular value decomposition on the centered data. The U matrix of this decomposition contains the «scores» in terms of PCA. Use [U,S,V] = svd(centeredData);

Page 35: Assignment 2 (Continued)

3. Plot the first 3 scores in a scatterplot matrix using the plotmatrix function.

4. Plot the first three loading vectors (columns of V) versus the wavelength. What can you observe? Compare with what you have seen in point 2.
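One detail of the data layout is easy to get wrong: the first row holds the wavelengths and must be split off before the column means are computed. A minimal sketch of the centering step, assuming the table is already available as a list of rows (reading the file is not shown) and using a hypothetical function name and a toy table.

```python
def split_and_center(table):
    """Separate the wavelength row and center the spectra column by column.

    table[0] holds the wavelengths; every further row is one spectrum.
    """
    wavelengths = table[0]
    spectra = table[1:]
    n = len(spectra)
    column_means = [sum(column) / n for column in zip(*spectra)]
    centered = [[v - m for v, m in zip(row, column_means)] for row in spectra]
    return wavelengths, centered


# Toy table: one wavelength row followed by two "spectra".
toy = [[400.0, 500.0],
       [1.0, 2.0],
       [3.0, 6.0]]
wavelengths, centered_data = split_and_center(toy)
print(wavelengths)    # [400.0, 500.0]
print(centered_data)  # [[-1.0, -2.0], [1.0, 2.0]]
```

After centering, every column sums to zero, so the scores and loadings describe the variation between spectra rather than the mean spectrum.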