data mining for knowledge extraction in data overloaded

Data Mining for Knowledge Extraction in Data Overloaded Process Environments, Part 2. Sirish Shah, Professor and NSERC-Matrikon-ASRA Industrial Research Chair, University of Alberta, Canada. Credits: D. Chang, V. Kumar, H. Raghavan, S. Choudhury, S. Lakshminarayanan, H. Fujii

Post on 07-Dec-2021

Page 1: Data Mining for Knowledge Extraction in Data Overloaded

Data Mining for Knowledge Extraction in Data Overloaded Process Environments
Part 2

Sirish Shah
Professor and NSERC-Matrikon-ASRA Industrial Research Chair
University of Alberta, Canada

Credits: D. Chang, V. Kumar, H. Raghavan, S. Choudhury, S. Lakshminarayanan, H. Fujii

Page 2: Data Mining for Knowledge Extraction in Data Overloaded

Copyright: S. L. Shah, U. of Alberta; 2003

Examples and Case Studies of Process Monitoring

Page 3: Data Mining for Knowledge Extraction in Data Overloaded

Cluster Analysis

Hundreds of variables may be measured for a particular process.

An important but difficult task is the selection of useful variables for analysis.

Cluster analysis groups process variables according to their correlation structure.
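A minimal sketch of the idea, assuming NumPy is available: variables are ordered by the magnitude of their correlation, and the correlation matrix is rearranged so that correlated variables sit next to each other. The greedy ordering below is an illustrative stand-in and not necessarily the optimal ordering used for these slides; the function name and the synthetic data are hypothetical.

```python
import numpy as np

def greedy_correlation_order(corr):
    """Order variables so each one follows the variable it is most correlated with."""
    strength = np.abs(corr).astype(float)
    np.fill_diagonal(strength, -1.0)       # never pick a variable as its own neighbour
    order = [0]                            # start arbitrarily from the first variable
    remaining = set(range(1, corr.shape[0]))
    while remaining:
        last = order[-1]
        nxt = max(remaining, key=lambda j: strength[last, j])
        order.append(nxt)
        remaining.remove(nxt)
    return order

rng = np.random.default_rng(0)
a, b = rng.normal(size=(2, 500))           # two independent hidden drivers
noise = 0.1 * rng.normal(size=(500, 6))
# Columns 0, 2, 4 follow driver a; columns 1, 3, 5 follow driver b (interleaved on purpose).
X = np.column_stack([a, b, a, b, a, b]) + noise

corr = np.corrcoef(X, rowvar=False)
order = greedy_correlation_order(corr)
reordered = corr[np.ix_(order, order)]     # correlated variables are now adjacent
print(order)
```

After reordering, the two groups of variables appear as two bright blocks on the diagonal of the matrix, which is far easier to read than the raw numbers.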

Page 4: Data Mining for Knowledge Extraction in Data Overloaded

Numerical values in a correlation matrix!

1 -0.185294206 0.027055773 -0.171819975 0.093276222 -0.027994229 0.098787646 -0.020583535 0.032817213 -0.132101895 0.007443381-0.185294206 1 0.046731844 0.192725409 0.06221338 -0.121835155 0.007001147 0.072787318 0.256501727 -0.332697412 0.3011214280.027055773 0.046731844 1 0.497561061 0.097669495 -0.042977998 0.073150009 0.01752308 0.446311963 -0.065613281 0.577804913

-0.171819975 0.192725409 0.497561061 1 0.098720947 0.04428365 0.109814696 0.207062089 0.277424246 0.017018503 0.3740579680.093276222 0.06221338 0.097669495 0.098720947 1 -0.16665222 0.088320804 0.017838289 0.083761143 0.0118807 0.047680363

-0.027994229 -0.121835155 -0.042977998 0.04428365 -0.16665222 1 -0.083250958 -0.113249872 -0.066049714 0.083771862 -0.0246369920.098787646 0.007001147 0.073150009 0.109814696 0.088320804 -0.083250958 1 0.11444586 0.18562112 -0.115407265 0.14030574

-0.020583535 0.072787318 0.01752308 0.207062089 0.017838289 -0.113249872 0.11444586 1 0.008346999 -0.031577329 -0.0298117910.032817213 0.256501727 0.446311963 0.277424246 0.083761143 -0.066049714 0.18562112 0.008346999 1 -0.548843663 0.897175324

-0.132101895 -0.332697412 -0.065613281 0.017018503 0.0118807 0.083771862 -0.115407265 -0.031577329 -0.548843663 1 -0.4109610640.007443381 0.301121428 0.577804913 0.374057968 0.047680363 -0.024636992 0.14030574 -0.029811791 0.897175324 -0.410961064 10.074563952 -0.290999002 -0.166431901 -0.191828797 0.137354964 -0.318379029 0.361032714 0.140153123 0.041112456 0.020956227 -0.121890887

-0.113935147 0.096219583 -0.05286439 -0.015289902 0.019324048 0.016093852 0.079420974 0.30604406 0.012369794 -0.118651352 -0.024855722-0.004367924 0.415302265 0.212895416 0.422025428 0.121021716 -0.096013616 0.208845673 0.062699756 0.305190719 -0.192434667 0.383145433-0.041892367 0.292542753 0.088478765 0.242138917 0.091224942 -0.044085648 0.030151736 -0.048883722 -0.018817843 0.001127689 0.090586488-0.012796946 0.428193732 0.148027299 0.374229362 -0.036415302 0.031485232 0.014195414 0.093444237 0.111307503 -0.142990907 0.198198421-0.072987889 0.639653323 0.240224974 0.430130045 0.025582625 0.028934191 0.074669894 -0.003315806 0.264468969 -0.263129122 0.384595716-0.081425541 0.650446804 0.264165028 0.526494862 0.024959155 0.016578929 0.086833576 0.043588107 0.234276435 -0.236008471 0.3583759990.220771143 -0.298196834 -0.095225139 -0.369718105 0.018743973 0.082060135 -0.102865842 0.113548775 -0.131073669 0.077167167 -0.117265396

-0.024869432 -0.190880521 0.216974667 0.238447454 0.043845472 0.02495333 -0.005302967 -0.052698509 0.117642296 0.030820275 0.12686478-0.030647599 0.099397397 0.060271104 -0.149448693 0.101629373 0.107726218 0.138653219 -0.135938599 0.29944496 -0.143383865 0.324332399-0.105195337 -0.1838672 -0.148673467 -0.213141108 0.059956448 0.018061058 -0.017384883 -0.104230537 0.034619576 0.092101398 -0.0319182910.017414435 -0.037744275 0.109437444 -0.048190722 -0.019413361 0.085313857 -0.03284987 -0.090511989 0.148464685 -0.022750366 0.1854388330.038764349 -0.713947863 -0.052827235 -0.199718753 -0.077278511 0.145219414 -0.072139066 -0.148773823 -0.170502021 0.300067528 -0.221413011

-0.045966454 -0.478970547 0.019338226 0.217745669 -0.040564805 0.097708972 0.112071592 0.174626221 -0.127211683 0.192457187 -0.1987993940.004199633 -0.557289909 -0.28471802 -0.50961326 -0.10351578 0.076308499 -0.029991343 -0.086972975 -0.114870901 -0.040561871 -0.281984840.08599719 0.007025338 0.154179893 0.232546946 0.007835329 -0.086667223 0.156685077 0.023192903 -0.02996409 0.102102098 0.0147997420.07875747 -0.429241921 -0.344287431 -0.244148911 -0.023116012 -0.027914037 -0.035809918 0.05492866 -0.371758851 0.202342995 -0.500472826

-0.058448605 0.611130166 -0.119253918 0.057550975 0.117777025 -0.183408655 0.082432127 0.10213768 -0.008270005 -0.326478145 -0.006977244-0.02580405 -0.150816152 0.120793341 0.066957772 0.025269938 0.006378438 -0.075330271 0.07187571 -0.106953812 0.090117193 -0.0678767740.159960053 -0.46153707 -0.061782981 -0.275932869 0.000719008 0.025290936 0.007663131 -0.053698037 -0.101831356 0.058117208 -0.1733652060.026409868 -0.563862298 -0.386080423 -0.546002219 -0.101953064 0.031863056 -0.01116078 -0.088594064 -0.187499891 0.05573025 -0.3700930340.215815744 -0.782048799 -0.133757845 -0.413764596 -0.063567993 0.057507301 0.005673666 -0.045465041 -0.198905102 0.157678121 -0.2927683720.210107163 -0.762753731 -0.113514688 -0.377067501 -0.080202766 0.058553599 -0.013312535 -0.049311774 -0.178286335 0.116737358 -0.262262234

-0.160786539 0.822759906 0.082111296 0.319369724 0.073604799 -0.119269237 0.02934483 -0.005308015 0.179202768 -0.192898469 0.2390704920.17932223 -0.707919187 -0.147742155 -0.468437394 -0.086347239 0.046364378 0.023420083 -0.056932385 -0.199912109 0.13527489 -0.320831812

0.149319082 -0.752830887 -0.06056419 -0.348884293 -0.090379995 0.156053854 0.00402923 -0.087617203 -0.010696796 0.026324582 -0.10811970.010059676 0.616117603 0.346256652 0.323705183 0.037419038 -0.036121112 0.01731208 0.015883625 0.443358533 -0.280423399 0.559601776

Imagine looking at thousands of numerical values of data!

Page 5: Data Mining for Knowledge Extraction in Data Overloaded

Cluster Analysis

Correlation matrix before the optimal ordering of variables (>150 variables!).

Page 6: Data Mining for Knowledge Extraction in Data Overloaded

Cluster Analysis

Correlation matrix after the optimal ordering of variables.

Page 7: Data Mining for Knowledge Extraction in Data Overloaded

Process Monitoring using Principal Components Analysis

Page 8: Data Mining for Knowledge Extraction in Data Overloaded

Yet another motivational argument for Multivariate Statistics

[Figure: fuel flow plotted against steam demand, with the abnormal data marked.]

Page 9: Data Mining for Knowledge Extraction in Data Overloaded

A different view of boiler data

Rather than monitoring fuel flow and steam demand separately (which would give misleading results anyway), it makes sense to monitor a variable which is a linear combination of the two variables.

[Figure: fuel flow plotted against steam demand, with the abnormal data marked.]

Page 10: Data Mining for Knowledge Extraction in Data Overloaded

Principal Components Analysis (PCA)

[Figure: 3-D data cloud with the principal component axes PC1 and PC2.]

PCA is concerned with the coordinate transformation of data so that they can be represented in a reduced dimensional plane (e.g. 3 → 2 dimensions).

The premise is that there is usually a simpler, inherent underlying structure to the process, and therefore to the data that originate from it. PCs often have physical meaning.
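As a rough illustration of this premise, the sketch below builds three measured variables from two hidden factors and recovers the two-dimensional structure with PCA computed through a singular value decomposition. It assumes NumPy; the data, the mixing matrix and the noise level are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples = 300
# Hypothetical data: two hidden factors drive three measured variables, plus small noise.
factors = rng.normal(size=(n_samples, 2)) * np.array([3.0, 1.0])
mixing = np.array([[1.0, 0.5, 1.5],
                   [0.0, 1.0, -1.0]])
X = factors @ mixing + 0.05 * rng.normal(size=(n_samples, 3))

Xc = X - X.mean(axis=0)                       # mean-centre each variable
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)               # fraction of variance per component

P = Vt[:2].T                                  # loadings: columns are PC1 and PC2
print(np.round(explained, 4))
```

Almost all of the variance falls on the first two components, so the three-variable data can be analysed in the PC1-PC2 plane with little loss.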

Page 11: Data Mining for Knowledge Extraction in Data Overloaded

Principal Components Analysis

Given: measurements of process variables, X
Measurements are possibly correlated
Number of samples: ns
Number of variables: nx

Sample #   X1   X2   X3   ...   Xnx
1
2
3
4
5
...
ns

Page 12: Data Mining for Knowledge Extraction in Data Overloaded

[Figure: data shown in the 3D Euclidean basis space, and the same data in the principal subspace spanned by PC1 and PC2.]

Analyze data in this reduced 2-dimensional plane.

Page 13: Data Mining for Knowledge Extraction in Data Overloaded

Process Monitoring using Principal Components Analysis

A simple example to illustrate the application of PCA

Page 14: Data Mining for Knowledge Extraction in Data Overloaded

Description of the process

[Process diagram: cold water (F1) and hot water (F2) streams, with mixed water 1 (F3) and mixed water 2 (F4).]

Page 15: Data Mining for Knowledge Extraction in Data Overloaded

[Figure: 3D visualization of flow data, with axes f1, f2 and f3.]

Page 16: Data Mining for Knowledge Extraction in Data Overloaded

[Figure: 3D visualization of flow data, with axes f1, f2 and f3.]

Page 17: Data Mining for Knowledge Extraction in Data Overloaded

[Figure: data with the principal component axes PC1 and PC2.]

P defines the new coordinate system spanned by PC1 and PC2.
The scores are the projection of the data onto PC1 and PC2.
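The two statements above can be sketched numerically as follows, assuming NumPy. The flow values are invented: the third column is an exact sum of the first two, so the data lie on a plane and two principal components describe them completely.

```python
import numpy as np

rng = np.random.default_rng(2)
f1 = rng.uniform(5, 11, size=200)
f2 = rng.uniform(14, 22, size=200)
X = np.column_stack([f1, f2, f1 + f2])    # third column is an exact sum, so data lie on a plane

mean = X.mean(axis=0)
Xc = X - mean
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
P = Vt[:2].T                               # loadings P: 3 x 2, columns span PC1 and PC2

T = Xc @ P                                 # scores: coordinates of each sample in the PC1-PC2 plane
X_hat = T @ P.T + mean                     # map the scores back to the original variables

print(T.shape)                             # (200, 2)
print(np.allclose(X, X_hat))               # True, since the data lie exactly on a plane
```

With real, noisy measurements the reconstruction is no longer exact, and the leftover part is what the prediction error monitors.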

Page 18: Data Mining for Knowledge Extraction in Data Overloaded

[Figure: 3D visualization of data, with axes f1, f2 and f3, showing the prediction error of points that lie off the PCA plane.]

Normal range of operation (ellipse boundaries).

Large prediction error, i.e. the correlation structure breaks down. (This point will flare up on the SPE plot.)

Correlation structure holds, but operation is outside the normal range. These points show up on the T² plots.
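A small sketch of the two situations, assuming NumPy and invented flow data in which the third variable is roughly the sum of the first two. SPE is taken as the squared distance of a sample from the PCA plane, and T² as the sum of squared scores scaled by their variances; the helper name `spe_and_t2` and the two test points are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
f1 = rng.normal(8.0, 1.0, size=500)
f2 = rng.normal(18.0, 1.5, size=500)
noise = 0.05 * rng.normal(size=(500, 3))
X = np.column_stack([f1, f2, f1 + f2]) + noise   # hypothetical normal operating data

mean = X.mean(axis=0)
Xc = X - mean
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)
P = Vt[:2].T                                     # retain two principal components
score_var = s[:2] ** 2 / (len(X) - 1)            # variance of each retained score

def spe_and_t2(x):
    """Return (SPE, T2) for one sample x under the PCA model above."""
    xc = x - mean
    t = xc @ P                                   # scores
    residual = xc - t @ P.T                      # part of x not explained by the model
    return float(residual @ residual), float(np.sum(t**2 / score_var))

on_plane_far = np.array([14.0, 27.0, 41.0])      # balance holds, but far from the normal range
off_plane = np.array([8.0, 18.0, 30.0])          # typical f1 and f2, but the balance is broken

print(spe_and_t2(on_plane_far))                  # small SPE, large T2
print(spe_and_t2(off_plane))                     # large SPE, modest T2
```

The first point keeps the correlation structure and is flagged only by T²; the second breaks the structure and is flagged by SPE.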

Page 19: Data Mining for Knowledge Extraction in Data Overloaded

Normal Operating Data

•Since the two mixed water flow rates are almost identical, only one of the flow rates (mixed water 1) is shown in this plot

•We take this flow rate data as normal operating data, and build a PCA model based on this data.

Page 20: Data Mining for Knowledge Extraction in Data Overloaded

Description of the process

[Process diagram: cold water (F1) and hot water (F2) streams, with mixed water 1 (F3) and mixed water 2 (F4).]

Page 21: Data Mining for Knowledge Extraction in Data Overloaded

Squared Prediction Error (SPE) Plot

•If any of the measurements is wrong, i.e. if the model does not hold, then the SPE plot will flare up.

•The SPE plot shown above is based on normal operating data, and therefore one should expect almost all of the data to lie within the 99% confidence limit.

Page 22: Data Mining for Knowledge Extraction in Data Overloaded

[Figure: Hotelling's T² chart against sample number (0 to 500), with 95% and 99% control limits.]

•The T² chart shows that all measurements are within the 99% control limits.

Page 23: Data Mining for Knowledge Extraction in Data Overloaded

Since we have built a PCA model based on the normal operation data, we can now use this model to check:

•The shift of operating point of the plant

•Possible sensor fault in the online measurements

•The presence of any disturbance to the plant

Page 24: Data Mining for Knowledge Extraction in Data Overloaded

•To test whether the PCA model can detect sensor faults, a bias or drift is introduced in the sensor that measures the total water flow rate, between samples 501 and 700.
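The experiment can be imitated on synthetic data, as in the sketch below. It assumes NumPy, assumes that both mixed-water sensors measure the sum of the cold and hot flows, and assumes a constant bias of 2 units on the mixed water 1 sensor; none of these numbers come from the slides. The 99% limit is an empirical quantile of the training SPE, not a theoretical limit.

```python
import numpy as np

rng = np.random.default_rng(4)

def simulate(n):
    """Hypothetical mixing process: F3 and F4 both measure F1 + F2 (assumed balance)."""
    f1 = rng.normal(8.0, 1.0, size=n)             # cold water
    f2 = rng.normal(18.0, 1.5, size=n)            # hot water
    total = f1 + f2
    X = np.column_stack([f1, f2, total, total])
    return X + 0.1 * rng.normal(size=X.shape)     # sensor noise

X_train = simulate(500)                           # samples 1-500: normal operation
X_test = simulate(200)                            # samples 501-700
X_test[:, 2] += 2.0                               # assumed bias on the mixed water 1 (F3) sensor

mean = X_train.mean(axis=0)
_, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
P = Vt[:2].T                                      # two PCs: two independent flows

def spe(X):
    """Squared prediction error of each row of X under the PCA model."""
    Xc = X - mean
    residual = Xc - Xc @ P @ P.T
    return np.sum(residual**2, axis=1)

limit_99 = np.quantile(spe(X_train), 0.99)        # empirical limit, not a theoretical one
alarm_rate = np.mean(spe(X_test) > limit_99)
print(f"fraction of faulty samples above the 99% SPE limit: {alarm_rate:.2f}")
```

Because the biased reading no longer agrees with the other three flows, nearly every faulty sample exceeds the limit.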

Page 25: Data Mining for Knowledge Extraction in Data Overloaded

Many scores are out of the 99% control limits, indicating a possibly faulty sensor or an out of nominal zone excursion.

Page 26: Data Mining for Knowledge Extraction in Data Overloaded

The T² chart also indicates an out-of-control status beginning at sample 501.

Page 27: Data Mining for Knowledge Extraction in Data Overloaded

•SPE increases significantly, since the correlations among the variables no longer hold.

•The SPE chart is more sensitive than the T² chart for detecting a possible sensor failure, while the T² chart is more suitable for detecting a change in the operating state of the process (assuming that the correlations among the variables still hold).

Page 28: Data Mining for Knowledge Extraction in Data Overloaded

Process Monitoring

Data is often correlated in ways we don’t easily understand.
Univariate analysis does not provide an overall picture of the process.
Multivariate techniques help us predict abnormal process operation before it becomes visible to operators.

Page 29: Data Mining for Knowledge Extraction in Data Overloaded

Industrial Case Study: PCA-based Process Monitoring

Fault Detection and Diagnosis Decomposition in a Polymer Reactor

Page 30: Data Mining for Knowledge Extraction in Data Overloaded

High Pressure Process

[Process flow diagram. Labels: Ethylene, Co-monomer, Purge, Modifier, Primary Compressor, Booster, Secondary Compressor, Initiator Pumps, Reactor, Heat Exch (two units), Separator, Extruder, Hopper, Silos, Packaging. Annotations: 300ats, 20%, 100%, 74%, 6%.]

Page 31: Data Mining for Knowledge Extraction in Data Overloaded

High Density Plot

[Figure: high density plot, annotated with "Cooler cook onset" and "Fault".]

Page 32: Data Mining for Knowledge Extraction in Data Overloaded

Modeling: Pre-filtered data

[Figure: SPE chart, DECOMP model development, against sample number, with 95% and 99% limits.]

[Figure: 2D PCA scores plot, DECOMP model development, PC-1 scores vs. PC-5 scores.]

Page 33: Data Mining for Knowledge Extraction in Data Overloaded

EWMA data filtering
Off-line analysis of reactor decomposition

[Figure: two SPE charts (DECOMP) against sample number, each with 95% and 99% limits. The second chart covers samples 10836 to 11266.]

Prediction time: 9 minutes
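For reference, an exponentially weighted moving average (EWMA) filter can be sketched in a few lines of plain Python. The smoothing factor `alpha = 0.2` is an arbitrary choice for illustration; the slides do not state the value used in the case study.

```python
def ewma(values, alpha=0.2):
    """Exponentially weighted moving average: y[k] = alpha*x[k] + (1 - alpha)*y[k-1]."""
    filtered = []
    previous = None
    for x in values:
        # The first sample initialises the filter; later samples are blended in.
        previous = x if previous is None else alpha * x + (1 - alpha) * previous
        filtered.append(previous)
    return filtered

raw = [1.0, 1.0, 5.0, 1.0, 1.0]        # a single spike in otherwise steady data
smooth = ewma(raw, alpha=0.2)
print([round(v, 3) for v in smooth])   # [1.0, 1.0, 1.8, 1.64, 1.512]
```

A smaller `alpha` smooths more strongly but also delays the response to a genuine change, which matters when the goal is early warning.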

Page 34: Data Mining for Knowledge Extraction in Data Overloaded

Contribution Plots for root cause diagnosis

[Figure: bar chart of percent contribution per tag. Title: SPE Contributions - (11195). Highest contributors: TIC52027D.PV; PIC52021.PV; PI51135/64.PV.]

Tag labels on the axis: AI51001C.PV, PI51135/64.PV, PI52054.PV, RFG Pr, PIC52021.OUT, TI51015.PV, TI51158.PV, 1*disch, TI52024.PV, TI52031A.PV, TIC52021D.PV, TIC52027D.OUT, TIC52028D.PV

% Cntr.: 1.664 7.765 3.660 3.075 0.293 0.174 0.453 1.535 1.047 3.560 3.661
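One common way to compute such contributions is to split the SPE of a sample into the squared residual of each variable. The sketch below assumes NumPy and reuses an invented four-flow example with hypothetical tag names F1 to F4; it is not the reactor data, and the slides do not state which contribution formula was used.

```python
import numpy as np

rng = np.random.default_rng(5)
f1 = rng.normal(8.0, 1.0, size=500)
f2 = rng.normal(18.0, 1.5, size=500)
tags = ["F1", "F2", "F3", "F4"]                    # hypothetical tag names
X = np.column_stack([f1, f2, f1 + f2, f1 + f2])
X = X + 0.1 * rng.normal(size=X.shape)

mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
P = Vt[:2].T                                       # PCA model with two components

def spe_contributions(x):
    """Percent contribution of each variable to the SPE of one sample."""
    xc = x - mean
    residual = xc - xc @ P @ P.T                   # part of the sample the model cannot explain
    squared = residual**2
    return 100.0 * squared / squared.sum()

faulty = mean + np.array([0.0, 0.0, 2.0, 0.0])     # assumed bias on the third sensor
contrib = spe_contributions(faulty)
for tag, c in sorted(zip(tags, contrib), key=lambda item: -item[1]):
    print(f"{tag}: {c:.1f}%")
```

The faulty sensor receives the largest share, but the fault also smears onto correlated variables, so a contribution plot points to candidates for diagnosis and does not prove the cause.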

Page 35: Data Mining for Knowledge Extraction in Data Overloaded

MRI Images

3 different MRI images of the same slice of the brain

Same slice with modified “abnormality” in some of the images

Page 36: Data Mining for Knowledge Extraction in Data Overloaded

MRIA using PCA

[Flow diagram:
Input images X (256×256×3): Axial T1 weighted, Axial FLAIR, Axial T2 weighted
Neudecker product: X (256×256×3) is unfolded into X (256²×3), built from 256×1 columns
PCA: P (loadings), 3×3; T (scores), 256²×3
Inverse Neudecker product: the scores are folded back into score images PC1, PC2, PC3 (256×256×3)
Score plots (1024×1024): PC1 vs. PC2 (or other combination)]
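The unfold, PCA and refold steps can be sketched with array reshaping, assuming NumPy. Random numbers stand in for the three co-registered images, and a plain `reshape` is used as a stand-in for the unfolding operation named on the slide.

```python
import numpy as np

rng = np.random.default_rng(6)
# Stand-in for three co-registered 256 x 256 images (T1, FLAIR, T2); random values, not MRI data.
images = rng.random(size=(256, 256, 3))

X = images.reshape(-1, 3)                   # unfold: one row per pixel, one column per image
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
P = Vt.T                                    # loadings, 3 x 3
T = Xc @ P                                  # scores, 256^2 x 3

score_images = T.reshape(256, 256, 3)       # fold back: one score image per PC
print(X.shape, P.shape, score_images.shape)
```

Plotting the first score column against the second gives the PC1 vs. PC2 scores plot, in which each point is one pixel.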

Page 37: Data Mining for Knowledge Extraction in Data Overloaded

Score images

Score images of original images

Score images with modified “abnormality”

Page 38: Data Mining for Knowledge Extraction in Data Overloaded

Scores plot (PC1 vs. PC2)

Scores plot of normal images

Scores plot of image with modified “abnormality”

Page 39: Data Mining for Knowledge Extraction in Data Overloaded

Segmentation using scores plot

Page 40: Data Mining for Knowledge Extraction in Data Overloaded

Building Softsensors via Partial Least Squares (PLS)

Page 41: Data Mining for Knowledge Extraction in Data Overloaded

PLS Modeling

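[Figure: OUTER MODEL. X and Y are each projected onto latent scores (t1 for X, u1 for Y) by covariance maximization.]

PLS explains variation in both X and Y and simultaneously maximizes the covariance between X and Y.

[Figure: INNER MODEL. Plot of u1 against t1.]

Can fit a line or curve through this cluster of points.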
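For a single quality variable, the first PLS component can be written out directly, as in this sketch. It assumes NumPy and invented data; with one y variable the weight vector that maximizes the covariance is proportional to Xᵀy, and the inner model here is a straight line (the slide also allows a curve).

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200
X = rng.normal(size=(n, 4))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.normal(size=n)   # hypothetical quality variable

Xc = X - X.mean(axis=0)
yc = y - y.mean()

# Outer model: the unit weight vector w maximises the covariance between t1 = Xc w and y.
w = Xc.T @ yc
w /= np.linalg.norm(w)
t1 = Xc @ w                    # X-scores
u1 = yc                        # with a single y variable, the Y-scores are y itself

# Inner model: a straight line u1 ~ b * t1 fitted by least squares.
b = (t1 @ u1) / (t1 @ t1)
y_hat = b * t1 + y.mean()

r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"R^2 of the one-component PLS model: {r2:.3f}")
```

Further components would be extracted the same way after removing (deflating) the part of X and y already explained by t1.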

Page 42: Data Mining for Knowledge Extraction in Data Overloaded

Softsensor development
Industrial Case Study-1

Credits: Mitsubishi Chemicals, Japan

Page 43: Data Mining for Knowledge Extraction in Data Overloaded

PLS application to an industrial column: Modeling Results

[Figure: two panels over days 0 to 10, concentration in ppm (400 to 1200), each comparing GC output with model output. Panel titles: Original model prediction; PLS model prediction.]

Data are from part of the validation period (Sep. 95).

Page 44: Data Mining for Knowledge Extraction in Data Overloaded

Inferential control using PLS model

[Control schematic of a distillation column. Labels: Feed Streams, Condenser, Reflux (R), Distillate (D), (R/D), (D) sp, Reboiler, Steam, Bottoms, FC (five flow controllers), LC, Comp. Control, PLS Model.]

Page 45: Data Mining for Knowledge Extraction in Data Overloaded

Control Results

[Figure: two panels over days 0 to 10, impurity in ppm (0 to 1500), each comparing GC output with model output. Panel titles: Impurity before control; Impurity after control. Annotations: On Control, Set Point Up, Specification Upper Limit.]

Page 46: Data Mining for Knowledge Extraction in Data Overloaded

Controller Performance Assessment

Page 47: Data Mining for Knowledge Extraction in Data Overloaded

Performance assessment

Performance of univariate or multivariate controllers?
Simplistically ask: How “healthy” is your controller?

[Block diagram: feedback loop of Controller and Process, with signals r, u, d and y.]

Main benefit: Develop a tool that would help towards low maintenance and optimal process performance.

Page 48: Data Mining for Knowledge Extraction in Data Overloaded

What is Performance Assessment?

Is your controller doing a satisfactory job? Can you get a measure of the ‘state of health’ of a closed loop system from routine operating data?

For diagnosis, look at system objects such as actuators, constraints, disturbances, models, etc.

Page 49: Data Mining for Knowledge Extraction in Data Overloaded

Quality and Variance

Remember that:

$$\text{Quality and Profit} \propto \frac{1}{\text{Variance (controlled variables)}}$$

Page 50: Data Mining for Knowledge Extraction in Data Overloaded

Example of Variance Reduction

[Figure: MOIC25276: trend plots before tuning and after tuning, values 92 to 100 against sample no. 0 to 500.]

Page 51: Data Mining for Knowledge Extraction in Data Overloaded

Look at a real industrial process

Schematic Diagram of an Industrial Process

Loop to be evaluated

Page 52: Data Mining for Knowledge Extraction in Data Overloaded

Quality and Variance

❖ How good a job are we doing in regulating this temperature?

❖ Can we do any better?

[Figure: temperature (280 to 310) against time (0 to 16 hrs).]

Can this variance be reduced by retuning this loop?

What is the lowest possible variance that we can achieve for this loop?
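One widely used answer is the minimum variance benchmark: the part of the output variance that no feedback controller can remove, given the process time delay, is estimated from routine closed-loop data and compared with the actual variance. The sketch below is one possible implementation, assuming NumPy, an autoregressive model of order 10 and a known delay in samples; the function name, model order and test signals are hypothetical and are not taken from the slides.

```python
import numpy as np

def performance_index(y, delay, ar_order=10):
    """Estimate (minimum achievable variance) / (actual variance) from closed-loop data."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    n = len(y)
    # Fit y[k] = a1*y[k-1] + ... + ap*y[k-p] + e[k] by least squares.
    lagged = np.column_stack([y[ar_order - i : n - i] for i in range(1, ar_order + 1)])
    target = y[ar_order:]
    a, *_ = np.linalg.lstsq(lagged, target, rcond=None)
    noise_var = np.var(target - lagged @ a)
    # Impulse response of the fitted model; its first `delay` terms cannot be changed by feedback.
    psi = np.zeros(delay)
    psi[0] = 1.0
    for k in range(1, delay):
        psi[k] = sum(a[i] * psi[k - 1 - i] for i in range(min(k, ar_order)))
    mv_var = noise_var * np.sum(psi**2)
    return mv_var / np.var(y)

rng = np.random.default_rng(8)
e = rng.normal(size=5000)                  # white noise: already at minimum variance
sluggish = np.zeros(5000)
for k in range(1, 5000):
    sluggish[k] = 0.9 * sluggish[k - 1] + e[k]   # strongly autocorrelated output

print(round(performance_index(e, delay=1), 2))          # close to 1: little left to gain
print(round(performance_index(sluggish, delay=1), 2))   # well below 1: room for retuning
```

An index near 1 means the loop is already close to the benchmark; an index near 0 means that most of the variance could in principle be removed. The estimate grows with the assumed delay, which is why the index is often reported over a range of delays.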

Page 53: Data Mining for Knowledge Extraction in Data Overloaded

Performance Assessment: SISO System

[Figure: temperature (280 to 310) and performance index (0 to 0.6) against time (0 to 16 hrs), before tuning and after tuning.]

Tighter temperature regulation resulted in a 22% increase in catalyst life.

Page 54: Data Mining for Knowledge Extraction in Data Overloaded

Flow control loop 2 (corroded valve)

April data analysis

[Figure panels for %LO-FC-0045:
Trend plot of PV, SP and OP against sample number
Error signal to controller against time, with a histogram of the error signal
pv against op
PI(var) vs. delay
Autocorrelation (ACF) against lag, with 95% confidence bounds and minimum and maximum delay marked
Impulse response against lag
Closed-loop vs. minimum variance output response, magnitude (dB) against normalized frequency
Bicoherence estimated via the direct (FFT) method, over f1 and f2
Power of the signal against frequency (Hz)]

Delay     1      2      3      4      5
PI(var)   0.012  0.036  0.063  0.092  0.125

Comments:
The pv-op plot shows distinct loops, which are indicative of valve problems.
The PI plot against time delay, along with the ACF and IR plots, shows that the performance is not satisfactory. The IR plot indicates oscillations.
The bicoherence plot clearly indicates the presence of significant nonlinearities.

Page 55: Data Mining for Knowledge Extraction in Data Overloaded

Flow control loop 2: Corroded valve (cont’d)

July data analysis

[Figure panels for LO-FC-0045:
Trend plot of PV, SP and OP against sample number
Error signal to controller against time, with a histogram of the error signal
Process output (pv) against controller output (op)
PI(var) vs. delay
Autocorrelation (ACF) against lag, with 95% confidence bounds and minimum and maximum delay marked
Impulse response against lag
Closed-loop vs. minimum variance output response, magnitude (dB) against normalized frequency
Bicoherence estimated via the direct (FFT) method, over f1 and f2
Power of the signal against frequency (Hz)]

Delay     1      2      3      4      5
PI(var)   0.222  0.485  0.667  0.791  0.879

Comments:
Performance has been improved significantly (see the PI, ACF, and IR plots).
The pv-op plot indicates there may still be some nonlinearities in this loop.
The bicoherence plots indicate that the nonlinearity has been decreased substantially.

Page 56: Data Mining for Knowledge Extraction in Data Overloaded

Turning data into knowledge: A New Paradigm

Data → Information → Knowledge

Process and Performance Monitoring requires turning raw data into a value added resource.

Develop virtual process variables that can give a proper insight into the workings of a process:
scores
loads
models
performance indices
others

Page 57: Data Mining for Knowledge Extraction in Data Overloaded

Concluding Remarks

The real world is multivariate and NOT univariate. So analyze data in a multivariate framework.
Modeling, control, fault detection and diagnostics, etc. are possible via multivariate statistical analysis of process data. The process is monitored intelligently.
Process monitoring in a predictive mode can avert serious plant upsets.

Page 58: Data Mining for Knowledge Extraction in Data Overloaded

Acknowledgments

NSERC, Matrikon Inc., ASRA and the University of Alberta
AT Plastics and Mitsubishi Chemicals
Computer Process Control Group @ UAlberta