Post on 22-Dec-2015
![Page 1: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/1.jpg)
Multiway Data Analysis
Johan Westerhuis
Biosystems Data Analysis
Swammerdam Institute for Life Sciences
Universiteit van Amsterdam
![Page 2: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/2.jpg)
The “future” science faculty of the Universiteit van Amsterdam
![Page 3: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/3.jpg)
The Biosystems Data Analysis group officially started in 2004 as a follow-up of the process analysis group at the Universiteit van Amsterdam. Its aim is the development and validation of new data analysis methods for summarizing and visualizing complex structured biological data (metabolomics / proteomics).
![Page 4: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/4.jpg)
Three-way Data
Three-way Models
Three-way Applications
![Page 5: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/5.jpg)
Three-way Data
![Page 6: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/6.jpg)
Three-way data
Three-way data is a set of two-way matrices of the same objects and variables.
IR, Raman, NMR spectra of the same samples will not give a three-way data set, but a multi-block data set.
IR Raman NMR
![Page 7: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/7.jpg)
Examples of three-way data
Batch process: batches x time x process variables
Fluorescence: samples x emission x excitation
Sensory analysis: products x judges x attributes
Chromatography: samples x UV x chromatogram
Image analysis: image x image x RGB
![Page 8: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/8.jpg)
From no-way to multi-way
Scalar: 1 x 1
1-way: I x 1
2-way: I x J
3-way: I x J x K
4-way: I x J x K x L
5-way: I x J x K x L x M
![Page 9: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/9.jpg)
Slabs and tubes
Vertical slab
Horizontal slab
Vertical tube
Horizontal tube
Lateral tube
Frontal slab
![Page 10: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/10.jpg)
Three slabs of fluorescence data
5 samples x 60 excitation x 200 emission
[Figure: fluorescence data cube with axes samples, emission and excitation]
![Page 11: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/11.jpg)
Three-way batch process data
‘Engineering’ process data, e.g. temperature, pressure, flow rate
Spectroscopic process data, e.g. NIR, Raman, UV-Vis
One batch: X (J x K)
A series of batches: X (I x J x K)
[Figure: one batch as a process variable x time matrix; a series of batches as a batch x process variable x time array]
![Page 12: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/12.jpg)
SBR batch process data
Engineering variables
[Figure: time trajectories (time axis 0-200) of Flow S, Flow B, Temp Feed, T React, T Cool, T Jacket, Density, Conversion and E Release]
![Page 13: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/13.jpg)
Spectroscopic three-way batch data
Two batch runs of a reaction followed with UV-Vis spectroscopy for 45 minutes
![Page 14: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/14.jpg)
Batch fermentation in two steps: three-way multiblock
Inoculum: batches x variables x time
Fermentation: batches x variables x time
API
![Page 15: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/15.jpg)
Four-way data in combinatorial catalysis
[Figure: two composition x conditions arrays, one labelled "What we want" and one labelled "What we measure"]
![Page 16: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/16.jpg)
Multiway data from the Omics age
Gene expression x experiments x time
Metabolites x experiments x time
![Page 17: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/17.jpg)
Three-way Models
![Page 18: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/18.jpg)
M.C. Escher:
Some history
Small problem with orthogonality
![Page 19: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/19.jpg)
More history
Psychometrics (1944-1980)
Cattell 1944: Parallel proportional profiles (common factors fitted simultaneously to many data matrices).
Tucker 1964: Tucker models
Carroll & Chang 1970: Canonical Decomposition (CANDECOMP)
Harshman 1970: Parallel Factor Analysis (PARAFAC)
Chemistry
Ho 1978: Rank annihilation (close to Parafac) on fluorescence data.
End of the 80's, beginning of the 90's: three-way methods to resolve LC-UV data.
![Page 20: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/20.jpg)
Multiway PCA: Unfolding of three-way data
Wold: X (I x J x K) unfolded to (IK x J)
MacGregor: X (I x J x K) unfolded to (I x JK)
![Page 21: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/21.jpg)
Two ways of unfolding
Different assumptions in MSPC
Wold: nonlinear behavior in the data; batch trajectories are monitored; on-line monitoring.
MacGregor: nonlinearities removed; whole batch is considered a measurement; off-line monitoring.
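The two unfoldings can be sketched with NumPy reshapes. This is a minimal illustration, assuming the array is stored as batches x variables x time (I x J x K); the dimensions and the names `X_batch` and `X_var` are made up for the example.

```python
import numpy as np

# Hypothetical three-way array: I batches x J variables x K time points.
I, J, K = 4, 3, 5
X = np.arange(I * J * K, dtype=float).reshape(I, J, K)

# Batch-wise unfolding (I x JK): each row holds one complete batch.
X_batch = X.reshape(I, J * K)

# Variable-wise unfolding (IK x J): each row holds one time point of one batch.
X_var = X.transpose(0, 2, 1).reshape(I * K, J)

print(X_batch.shape, X_var.shape)
```

With this storage order, element (i, j, k) ends up in column j*K + k of the batch-wise matrix and in row i*K + k of the variable-wise matrix.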
![Page 22: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/22.jpg)
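Extension of SVD to Parafac
SVD: $\mathbf{X} = \mathbf{U}\mathbf{S}\mathbf{V}^T = s_1\mathbf{u}_1\mathbf{v}_1^T + s_2\mathbf{u}_2\mathbf{v}_2^T$
Parafac: $\underline{\mathbf{X}} = \mathbf{a}_1 \circ \mathbf{b}_1 \circ \mathbf{c}_1 + \mathbf{a}_2 \circ \mathbf{b}_2 \circ \mathbf{c}_2$ (loadings A, B and C with a superdiagonal core G)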
![Page 23: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/23.jpg)
Parafac / Candecomp
Parafac is not sequential: the whole model must be re-estimated when more components are calculated [no deflation].
Parafac solution is unique: no rotational freedom; changing parameters will reduce the fit.
NB! A PCA model is not unique: $X = TP^T + E = TRR^{-1}P^T + E = CS^T + E$
Unique ≠ true
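The rotational freedom of a bilinear model can be checked numerically. The sketch below uses random scores and loadings and an arbitrary invertible matrix R (all hypothetical values) to show that rotated factors reproduce exactly the same X.

```python
import numpy as np

rng = np.random.default_rng(0)
T = rng.normal(size=(6, 2))   # scores
P = rng.normal(size=(4, 2))   # loadings
X = T @ P.T                   # noise-free bilinear model

R = np.array([[2.0, 1.0], [0.5, 3.0]])  # any invertible matrix
C = T @ R                     # rotated scores
S = P @ np.linalg.inv(R).T    # chosen so that S.T equals R^-1 P^T

same_fit = np.allclose(X, C @ S.T)
print(same_fit)
```

The pairs (T, P) and (C, S) differ but fit X equally well, which is why a PCA solution needs extra constraints such as orthogonality to be pinned down.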
![Page 24: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/24.jpg)
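Extension of Two Mode Component Analysis (TMCA)
TMCA: $\mathbf{X} = \mathbf{A}\mathbf{G}\mathbf{C}^T$, where A has P columns, C has R columns and the core matrix G is P x R
Tucker III: $x_{ijk} = \sum_{p=1}^{P}\sum_{q=1}^{Q}\sum_{r=1}^{R} a_{ip}\, b_{jq}\, c_{kr}\, g_{pqr}$, with loadings A, B, C and core array G (P x Q x R)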
![Page 25: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/25.jpg)
Tucker models
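Tucker I: X modelled with loadings A and core G. Equals MPCA.
Tucker II: X modelled with loadings A and C and core G.
Tucker III: X modelled with loadings A, B and C and core G.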
=
![Page 26: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/26.jpg)
Tucker models
Core array can be fully filled: P x Q x R triads (1,1,1 / 1,1,2 / 1,2,1 etc.)
Not unique: rotational freedom. Components can be rotated towards orthogonality.
Not sequential.
Restricted Tucker models can be developed when using prior chemical knowledge.
![Page 27: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/27.jpg)
Number of parameters
X (I x J x K), example: I = 50, J = 9, K = 100, P = Q = R = 3
Parafac: R x (I + J + K) = 477
Tucker3: P x I + Q x J + R x K + P x Q x R = 504
MPCA: R x (I + JK) = 2850
Fit: MPCA > Parafac (Overfit?)
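The parameter counts follow directly from the formulas on the slide; the short check below reproduces them for the example dimensions (variable names are chosen for this illustration only).

```python
# Example dimensions from the slide.
I, J, K = 50, 9, 100
P = Q = R = 3

n_parafac = R * (I + J + K)                  # one loading vector per mode and component
n_tucker3 = P * I + Q * J + R * K + P * Q * R  # three loading matrices plus the core
n_mpca = R * (I + J * K)                     # scores plus unfolded loadings

print(n_parafac, n_tucker3, n_mpca)  # 477 504 2850
```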
![Page 28: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/28.jpg)
Soft models vs hard models
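Two-way bilinear model:
PCA: $x_{ij} = t_{i1}p_{j1} + t_{i2}p_{j2} + e_{ij}$ (orthogonal constraints)
Beer's law: $A_{i,\lambda} = \varepsilon_{1,\lambda}c_{i,1} + \varepsilon_{2,\lambda}c_{i,2} + e_{i,\lambda}$ (no orthogonal constraints)
Trilinear model:
Parafac, fluorescence: $x_{ijk} = a_{i1}b_{j1}c_{k1} + a_{i2}b_{j2}c_{k2} + e_{ijk}$ (no orthogonal constraints)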
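To make the trilinear structure concrete, the sketch below builds a noise-free two-component array from random loadings (hypothetical values) and checks that every frontal slab is a bilinear model whose component weights come from the third mode.

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, K, R = 5, 4, 3, 2
A = rng.random((I, R))
B = rng.random((J, R))
C = rng.random((K, R))

# x_ijk = sum_r a_ir * b_jr * c_kr
X = np.einsum('ir,jr,kr->ijk', A, B, C)

# Frontal slab k is A * diag(c_k) * B^T: same A and B in every slab,
# only the weights c_k change.
k = 1
slab = A @ np.diag(C[k]) @ B.T
slab_matches = np.allclose(X[:, :, k], slab)
print(slab_matches)
```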
![Page 29: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/29.jpg)
Multiway Regression I
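Two-step approach:
Step 1, decomposition of X into A and model $\tilde{P}$: $X = A\tilde{P} + E$, with $\min_{A,\tilde{P}} \|X - A\tilde{P}\|^2$
Step 2, regression of y on A: $y = Ab + f$, with $\min_{b} \|y - Ab\|^2$
$\tilde{P}$ can be Parafac, Tucker, MPCA etc.
No information of y is used in the decomposition. Similar to the PCR method.
[Figure: X is decomposed into A and $\tilde{P}$; y is regressed on A]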
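A minimal sketch of the two-step approach, assuming unfold PCA (MPCA) as the decomposition and random placeholder data; a Parafac or Tucker decomposition could take the place of the SVD step.

```python
import numpy as np

rng = np.random.default_rng(2)
I, J, K, R = 20, 4, 6, 2
X = rng.normal(size=(I, J, K))   # hypothetical three-way data
y = rng.normal(size=I)           # hypothetical response

# Step 1: decompose X without looking at y (here: unfold PCA via SVD).
Xu = X.reshape(I, J * K)
Xu = Xu - Xu.mean(axis=0)
U, s, Vt = np.linalg.svd(Xu, full_matrices=False)
A = U[:, :R] * s[:R]             # scores A (I x R)
P = Vt[:R]                       # model part (R x JK)

# Step 2: regress the centered y on the scores by least squares.
yc = y - y.mean()
b, *_ = np.linalg.lstsq(A, yc, rcond=None)
f = yc - A @ b                   # regression residual
```

Because the scores are computed before y is used, the decomposition is the same whatever response is modelled afterwards.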
![Page 30: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/30.jpg)
Multiway Regression II
Direct approach:
$\min_{A,\tilde{P},b} \left( \|X - A\tilde{P}\|^2 + \|y - Ab\|^2 \right)$
with $X = A\tilde{P} + E$ and $y = Ab + f$
Now X is decomposed with y in mind. This leads to a non-optimal decomposition of X but an improved fit of y.
![Page 31: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/31.jpg)
When data are not exactly 3-way
[Figure: batch x time x process variable array, shown again with the time axis replaced by an indicator variable; axis labels: time, indicator variable, time / variable]
![Page 32: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/32.jpg)
Alignment problems
Peak shifts in LC-MS / GC-MS
Warping methods to align the peaks:
Dynamic Time Warping
Correlation Optimized Warping
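As an illustration of the first warping method, here is a textbook dynamic time warping sketch for two 1-D signals. The function name `dtw` and the toy signals are hypothetical; real chromatographic alignment usually adds slope or window constraints that are left out here.

```python
import numpy as np

def dtw(x, y):
    """Return the DTW distance and warping path between 1-D sequences x and y."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    # Accumulated cost: local distance plus the cheapest allowed predecessor.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # Backtrack from the end to recover which points are matched.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        move = int(np.argmin((D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])))
        if move == 0:
            i, j = i - 1, j - 1
        elif move == 1:
            i -= 1
        else:
            j -= 1
    return float(D[n, m]), path[::-1]

# Toy example: the same peak, shifted by one sampling point.
a = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
b = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
dist, path = dtw(a, b)
print(dist, path)
```

The warping path lists index pairs (position in a, position in b) that are matched, which is the information needed to resample one signal onto the time axis of the other.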
![Page 33: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/33.jpg)
Three-way Applications
![Page 34: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/34.jpg)
Fluorescence data
5 samples with varying concentration of tyrosine, tryptophan and phenylalanine dissolved in phosphate buffered water.
Excitation wavelength: 240 – 300 nm
Emission wavelength: 250 – 450 nm
![Page 35: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/35.jpg)
Unfold PCA model of Fluorescence data
99.97% explained with 3 PC’s
Loadings refolded into Excitation / Emission form
Overfit of data:
Loading 2 has negative parts. This is not in accordance with fluorescence theory.
[Figure: plot with horizontal axis 1-5 and vertical axis -1.5 to 2.5 x 10^4]
![Page 36: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/36.jpg)
Parafac model of Fluorescence data
99.93% explained variation: Good Fit
Loadings are very well interpretable.
Intensity in A mode can be related to concentration
A mode
B and C mode
![Page 37: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/37.jpg)
Fluorescence data
$I_{Em,Ex,k} = a_{Em,1}b_{Ex,1}c_{k,1} + a_{Em,2}b_{Ex,2}c_{k,2} + a_{Em,3}b_{Ex,3}c_{k,3} + e_{ijk}$
Fluorescence data perfectly fit the trilinear model that is applied by Parafac.
Due to the uniqueness property of Parafac, the loadings found will closely resemble the emission spectra and excitation spectra of the three compounds in the mixtures.
This is a nice example of mathematical chromatography.
![Page 38: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/38.jpg)
Batch reaction monitoring
Pseudo-first-order reaction: A + B → C → D + E
UV-Vis spectrum (300-500 nm) measured every 10 seconds.
Obeys Lambert-Beer law
35 NOC batches. X (35 x 201 x 271)
In addition, some disturbed batches were measured: pH disturbance during the reaction, temperature change, impurity.
[Figure: concentration (MMol) versus time (s), 0-300, for reactant, intermediate and product]
[Figure: absorbance (units) versus wavelength (nm), 300-500, for reactant, intermediate and product]
![Page 39: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/39.jpg)
Aims and goals of research I
Data modelling: improve understanding of the process by interpretation of model parameters.
Analysis of historical batches: are the current process measurements able to distinguish between ‘good’ and ‘bad’ batches?
On-line monitoring: rapid fault detection; easier fault diagnosis (what is the cause of the fault?); prediction of batch duration.
![Page 40: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/40.jpg)
Which batch is different ?
Aims and goals of research II
![Page 41: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/41.jpg)
Unfold PCA model
$X = TP^T + E$
$x_{i,jk} = \sum_r t_{i,r}\, p_{jk,r} + e_{i,jk}$
Unfold keeping the batch direction (I x JK)
![Page 42: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/42.jpg)
Unfold PCA model
Many parameters estimated, likely to overfit the data
![Page 43: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/43.jpg)
Unrestricted Parafac model
The simplest three-way model is the PARAFAC model:
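$x_{ijk} = \sum_r a_{ir} b_{jr} c_{kr} + e_{ijk}$
[Figure: X (batch x time x wavelengths) = loadings A, B and C with superidentity core I, plus residuals E]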
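A PARAFAC model is commonly fitted by alternating least squares (ALS). The sketch below is a bare-bones version with random initialisation, no constraints and no convergence test; the function names `khatri_rao` and `parafac_als` are made up for this illustration and do not come from the slides.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product: (m x R) and (n x R) give (m*n x R)."""
    return np.einsum('ir,jr->ijr', U, V).reshape(-1, U.shape[1])

def parafac_als(X, R, n_iter=200, seed=0):
    """Fit x_ijk ~ sum_r a_ir b_jr c_kr by alternating least squares."""
    I, J, K = X.shape
    rng = np.random.default_rng(seed)
    A = rng.random((I, R))
    B = rng.random((J, R))
    C = rng.random((K, R))
    X1 = X.reshape(I, J * K)                     # columns ordered as (j, k)
    X2 = X.transpose(1, 0, 2).reshape(J, I * K)  # columns ordered as (i, k)
    X3 = X.transpose(2, 0, 1).reshape(K, I * J)  # columns ordered as (i, j)
    sse = []
    for _ in range(n_iter):
        # Each update is an ordinary least-squares fit with the other two modes fixed.
        A = X1 @ np.linalg.pinv(khatri_rao(B, C).T)
        B = X2 @ np.linalg.pinv(khatri_rao(A, C).T)
        C = X3 @ np.linalg.pinv(khatri_rao(A, B).T)
        resid = X - np.einsum('ir,jr,kr->ijk', A, B, C)
        sse.append(float(np.sum(resid ** 2)))
    return A, B, C, sse

# Usage on a small noise-free two-component array (hypothetical data).
rng = np.random.default_rng(7)
A0, B0, C0 = rng.random((6, 2)), rng.random((5, 2)), rng.random((4, 2))
X_true = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
A, B, C, sse = parafac_als(X_true, R=2, n_iter=50)
```

All components are re-estimated together in every sweep, so adding a component means refitting the whole model. The residual sum of squares cannot increase from one sweep to the next, but ALS can be slow when the loadings are highly correlated.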
![Page 44: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/44.jpg)
Unrestricted Parafac model
Loadings are highly correlated - solution may be unstable.
Model is difficult to interpret.
99.4% fit
Can external knowledge of the process be used to improve the model?
[Figure: loadings 1-3 in the batch mode (versus batch number), wavelength mode (300-500) and time mode (0-45)]
![Page 45: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/45.jpg)
Grey Modelling of batch data
‘Black-box’ or ‘soft’ models are empirical models which aim to fit the data as well as possible, e.g. PCA, neural networks.
‘White’ or ‘hard’ models use known external knowledge of the process, e.g. physicochemical model, mass-energy balances.
Difficult to interpret
Good fit
Easy to interpret
Not always available
Good fit
+
‘Grey’ or ‘hybrid’ models combine the two.
![Page 46: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/46.jpg)
Modelling batch data
X = white part + black part + E
Total variation = systematic variation due to known causes + systematic variation due to unknown causes + unsystematic variation
![Page 47: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/47.jpg)
External information
Incorporating external information can increase model interpretability and increase model stability.
Pure spectra
[Figure: absorbance (units) versus wavelength (nm), 300-500, for reactant, intermediate and product]
Reaction kinetics
$[A]_t = [A]_0 e^{-k_1 t}$
$[C]_t = [A]_0 \frac{k_1}{k_2 - k_1}\left(e^{-k_1 t} - e^{-k_2 t}\right)$
$[D]_t = [A]_0 - [A]_t - [C]_t$
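The kinetic profiles and the Lambert-Beer law together give the expected spectra of one batch. The sketch below assumes the standard solution for two consecutive first-order steps (reactant A, intermediate C, product D); the rate constants, the initial concentration and the random "pure spectra" are placeholders, not values from the study.

```python
import numpy as np

def consecutive_first_order(t, a0, k1, k2):
    """Concentrations of A, C and D for A -> C -> D with first-order steps (k1 != k2)."""
    a = a0 * np.exp(-k1 * t)
    c = a0 * k1 / (k2 - k1) * (np.exp(-k1 * t) - np.exp(-k2 * t))
    d = a0 - a - c                      # mass balance
    return np.column_stack([a, c, d])

# 45 minutes sampled every 10 s gives 271 time points.
t = np.arange(0, 2701, 10.0)
conc = consecutive_first_order(t, a0=1.0, k1=0.002, k2=0.001)  # hypothetical values

# Lambert-Beer: measured spectra = concentration profiles x pure spectra.
rng = np.random.default_rng(3)
S = rng.random((201, 3))                # placeholder pure spectra, 201 wavelengths
X_one_batch = conc @ S.T                # 271 time points x 201 wavelengths
```

In a restricted model the time-mode loadings would be forced to follow such profiles, leaving only a few rate constants to estimate.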
![Page 48: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/48.jpg)
Restricted ‘white’ model
External information is introduced in the form of parameter restrictions:
[Figure: X (batch x time x wavelengths) = loadings A, B and C with core G, plus residuals E. Restrictions shown: reaction kinetics, known spectra, Lambert-Beer law]
![Page 49: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/49.jpg)
Restricted Tucker model
Model is stable.
97.6% fit - lower than for black model.
Some systematic variation in the data is left unexplained by this model.
[Figure: loadings in the batch mode (versus batch number), wavelength mode (300-500) and time mode (0-45)]
![Page 50: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/50.jpg)
Grey model
White components: describe known effects, can be interpreted
Black components
99.8% fit (corresponds well with estimated level of spectral noise of 0.13%)
[Figure: loadings of the white and black components in the batch mode (versus batch number), wavelength mode (300-500) and time mode (0-45)]
![Page 51: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/51.jpg)
Core array of restricted Tucker model
G: 3 x 5 x 5 core array
Only combinations:
g111: a1, b1, c1
g122: a1, b2, c2
g133: a1, b3, c3
g244: a2, b4, c4
g355: a3, b5, c5
Unfolded core (3 x 25), all other elements are zero:
g111 0 0 0 0 0 g122 0 0 0 0 0 g133 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 g244 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 g355
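The restricted core can be written down directly. In the sketch below the five allowed core elements are set to 1 as placeholders (in a fitted model they are estimated), and the loadings are random matrices with small, hypothetical dimensions.

```python
import numpy as np

P, Q, R = 3, 5, 5
G = np.zeros((P, Q, R))
# Allowed (p, q, r) combinations from the slide, converted to 0-based indices.
allowed = [(1, 1, 1), (1, 2, 2), (1, 3, 3), (2, 4, 4), (3, 5, 5)]
for p, q, r in allowed:
    G[p - 1, q - 1, r - 1] = 1.0    # placeholder value

rng = np.random.default_rng(4)
I, J, K = 6, 7, 8                    # hypothetical dimensions
A = rng.random((I, P))
B = rng.random((J, Q))
C = rng.random((K, R))

# Full Tucker3 reconstruction: x_ijk = sum_pqr a_ip b_jq c_kr g_pqr
X_hat = np.einsum('ip,jq,kr,pqr->ijk', A, B, C, G)

# The same array written as a sum over the five allowed triads only.
X_triads = sum(
    G[p - 1, q - 1, r - 1]
    * np.einsum('i,j,k->ijk', A[:, p - 1], B[:, q - 1], C[:, r - 1])
    for p, q, r in allowed
)
```

Fixing most core elements at zero removes the rotational freedom of an unrestricted Tucker3 model and cuts the number of interactions from 75 to 5.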
![Page 52: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/52.jpg)
Grey model residuals
[Figure: squared residuals versus batch number, versus wavelength (300-500) and versus time (0-45)]
![Page 53: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/53.jpg)
Properties of grey models
White and black model parts can be calculated:
simultaneously (via restricted core matrix), with better % fit
sequentially, with better diagnostics - allows partitioning of variance: 100% = 97.1% + 1.9% + 0.2%
simultaneously but with orthogonality restrictions, which also allow partitioning of variance
$\|X\|^2 = \|X_w\|^2 + \|X_b\|^2 + \|E\|^2$
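The variance partitioning of the sequential approach can be illustrated on a two-way example. In this sketch the "white" part is a projection on two known profiles and the "black" part is one principal component of the remainder; the data and profiles are random placeholders, not the batch data of the slides.

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(30, 12))    # hypothetical (unfolded) data
W = rng.normal(size=(30, 2))     # hypothetical known ('white') profiles

# White part: projection of X on the space spanned by the known profiles.
X_w = W @ np.linalg.lstsq(W, X, rcond=None)[0]

# Black part: first principal component of what the white part leaves over.
U, s, Vt = np.linalg.svd(X - X_w, full_matrices=False)
X_b = s[0] * np.outer(U[:, 0], Vt[0])

E = X - X_w - X_b

def ss(M):
    """Sum of squares of all elements."""
    return float(np.sum(M ** 2))

# Percentages of the total sum of squares: white, black, residual.
parts = np.array([ss(X_w), ss(X_b), ss(E)]) / ss(X) * 100
print(parts.round(1), parts.sum().round(6))
```

The three parts are mutually orthogonal by construction, so their sums of squares add up to the total; without such orthogonality the percentages would not be additive.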
![Page 54: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/54.jpg)
Off-line batch monitoring
NOC: # 1-32
Validation: # 33-35
pH disturbed: # 36
Temp. problem: # 37
Impurity: # 38
[Figure: ln(Q-statistic) versus batch number; batches 8, 11, 13, 36, 37 and 38 are labelled]
Off-line monitoring: Q-statistic with 95% and 99% confidence limits
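How the two monitoring statistics are computed can be sketched for a plain PCA model of unfolded batches. The data are random placeholders, `monitor` is a hypothetical helper, and the confidence limits are deliberately left out because their calculation depends on distributional assumptions not given in the slides.

```python
import numpy as np

rng = np.random.default_rng(6)
X_noc = rng.normal(size=(32, 40))     # hypothetical unfolded NOC batches
mean = X_noc.mean(axis=0)
U, s, Vt = np.linalg.svd(X_noc - mean, full_matrices=False)
R = 3
P = Vt[:R].T                           # loadings (variables x R)
score_var = s[:R] ** 2 / (X_noc.shape[0] - 1)

def monitor(x_new):
    """Return (D, Q) for one new unfolded batch."""
    xc = x_new - mean
    t = xc @ P                                    # scores of the new batch
    d_stat = float(np.sum(t ** 2 / score_var))    # distance inside the model plane
    resid = xc - t @ P.T
    q_stat = float(resid @ resid)                 # squared distance to the model plane
    return d_stat, q_stat

d0, q0 = monitor(X_noc[0])
# A disturbance in a direction the model does not describe changes Q but not D.
d1, q1 = monitor(X_noc[0] + 50 * Vt[R])
```

This mirrors the behaviour reported for the disturbed batches: a new kind of variation shows up in the residual statistic (Q or SPE) while the D-statistic changes little.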
![Page 55: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/55.jpg)
On-line monitoring of a validation batch
[Figure: ln(D-statistic) versus time (0-45)] On-line monitoring of batch 33: D-statistic with 95% and 99% confidence limits
[Figure: ln(SPE) versus time (0-45)] On-line monitoring of batch 33: SPE with 95% and 99% confidence limits
![Page 56: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/56.jpg)
On-line monitoring of the pH disturbed batch
[Figure: ln(D-statistic) versus time (0-45)] On-line monitoring of batch 36: D-statistic with 95% and 99% confidence limits
[Figure: ln(SPE) versus time (0-45)] On-line monitoring of batch 36: SPE with 95% and 99% confidence limits
pH was disturbed after 21 minutes
After 23 minutes SPE goes outside control limits
Only small change in D-statistic
![Page 57: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/57.jpg)
On-line monitoring of the temperature disturbed batch
[Figure: ln(D-statistic) versus time (0-45)] On-line monitoring of batch 37: D-statistic with 95% and 99% confidence limits
[Figure: ln(SPE) versus time (0-45)] On-line monitoring of batch 37: SPE with 95% and 99% confidence limits
Temperature slowly decreasing from start of reaction
Rate constant k1 lower than usual.
Contribution plot shows difference spectrum between reactant (too high) and intermediate (too low)
![Page 58: Multiway Data Analysis Johan Westerhuis Biosystems Data Analysis Swammerdam Institute for Life Sciences Universiteit van Amsterdam](https://reader033.vdocuments.mx/reader033/viewer/2022042504/56649d775503460f94a589e6/html5/thumbnails/58.jpg)
Want to know more?
Look at Rasmus Bro’s website