Lecture 3: Estimating Parameters and Assessing Normality
Måns Thulin
Department of Mathematics, Uppsala University
Multivariate Methods, 1/4 2011
Homeworks

- To pass the course (grade 3), all mandatory problems must be solved, you must hold an oral presentation of a clustering method and you must pass the exam.
- For grade 4, you must present satisfactory solutions to at least 4 bonus problems (at least one from each homework).
- For grade 5, you must present satisfactory solutions to at least 8 bonus problems (at least two from each homework).
- Bonus problems on the exam can be counted as belonging to the corresponding homework.
Outline

- Sample moments
  - Unbiasedness
  - Asymptotics
- Estimation for the multivariate normal distribution
  - Maximum likelihood estimation
  - Distributions of estimators
- Assessing normality
  - How to investigate the validity of the assumption of normality
- Outliers
- Transformations to normality
Multivariate data

We study a p-dimensional data set consisting of n observations. The data is stored in an n × p matrix:

X = [ x11 x12 ... x1p ]
    [ x21 x22 ... x2p ]
    [  ⋮   ⋮   ⋱   ⋮  ]
    [ xn1 xn2 ... xnp ]

Row j contains the p measurements for subject j: xjk = measurement k for subject j.
Sample moments

Sample mean:

X̄ = (x̄1, x̄2, ..., x̄p)',  where x̄k = (1/n) ∑_{j=1}^n xjk.

Sample covariance matrix:

S = [ s11 s12 ... s1p ]
    [ s12 s22 ... s2p ]
    [  ⋮   ⋮   ⋱   ⋮  ]
    [ s1p s2p ... spp ]

where skℓ = sℓk = (1/(n−1)) ∑_{j=1}^n (xjk − x̄k)(xjℓ − x̄ℓ),  k, ℓ = 1, 2, ..., p.
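As a small sketch (not from the slides), the sample mean vector and sample covariance matrix above can be computed directly; note that numpy's `np.cov` with `ddof=1` uses exactly the 1/(n−1) scaling in the definition.

```python
import numpy as np

# A small 4 x 2 data matrix: n = 4 subjects, p = 2 measurements each.
X = np.array([[1.0, 2.0],
              [3.0, 5.0],
              [2.0, 4.0],
              [6.0, 1.0]])

xbar = X.mean(axis=0)                  # sample mean vector
S = np.cov(X, rowvar=False, ddof=1)    # sample covariance, 1/(n-1) scaling

# The same quantities computed straight from the definitions on the slide:
n = X.shape[0]
centered = X - xbar
S_manual = centered.T @ centered / (n - 1)

print(xbar)                       # [3. 3.]
print(np.allclose(S, S_manual))   # True
```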
Sample moments: Unbiasedness

Theorem. Let X1, ..., Xn be an i.i.d. sample from a joint distribution with mean µ and covariance matrix Σ. Then

E(X̄) = µ  and  Cov(X̄) = (1/n) Σ.

Furthermore, E(S) = Σ.

X̄ is an unbiased estimator of µ and S is an unbiased estimator of Σ.
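The unbiasedness of X̄ and S can be illustrated by Monte Carlo (a sketch under an arbitrary choice of µ and Σ, not taken from the slides): averaging the estimates over many simulated samples should recover the true parameters even for small n.

```python
import numpy as np

# Monte Carlo check of unbiasedness: average X-bar and S over many
# simulated samples of size n = 10 and compare to mu and Sigma.
rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
n, reps = 10, 20000

means = np.empty((reps, 2))
covs = np.empty((reps, 2, 2))
for r in range(reps):
    X = rng.multivariate_normal(mu, Sigma, size=n)
    means[r] = X.mean(axis=0)
    covs[r] = np.cov(X, rowvar=False, ddof=1)  # the unbiased estimator S

print(means.mean(axis=0))  # close to mu
print(covs.mean(axis=0))   # close to Sigma, even though n is small
```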
Sample moments: Asymptotics

Let X1, ..., Xn be i.i.d. observations with mean µ. Then we have

the multivariate law of large numbers:

X̄ →p µ as n → ∞,

that is, P(|X̄ − µ| > ε) → 0 as n → ∞ for all ε > 0.

If we further assume that the observations have a finite covariance matrix Σ, then we also have

the multivariate central limit theorem:

√n (X̄ − µ) →d N(0, Σ) as n → ∞,

that is, the distribution function of √n (X̄ − µ) converges to the distribution function of the N(0, Σ) distribution.
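A simulation sketch of the multivariate CLT (my own illustration, assuming bivariate observations with i.i.d. unit-rate exponential components, which are clearly non-normal): for large n, √n(X̄ − µ) should behave like N(0, Σ), here with Σ equal to the identity.

```python
import numpy as np

# CLT sketch: unit-rate exponentials have mean 1 and variance 1, so for
# independent components Sigma = I. Collect sqrt(n)(X-bar - mu) over many
# replications and inspect its covariance.
rng = np.random.default_rng(1)
n, reps, p = 200, 5000, 2
mu = np.ones(p)

Z = np.empty((reps, p))
for r in range(reps):
    X = rng.exponential(scale=1.0, size=(n, p))
    Z[r] = np.sqrt(n) * (X.mean(axis=0) - mu)

print(Z.mean(axis=0))            # close to the zero vector
print(np.cov(Z, rowvar=False))   # close to the identity matrix
```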
Estimation

We would like to be able to estimate the parameters µ and Σ of the multivariate normal distribution.

- X̄ and S seem like natural (unbiased!) estimators. What are their properties?
- We'll find the maximum likelihood estimators of µ and Σ and study their distributions.
Estimation: ML principle

Let X1, ..., Xn be observations with densities f_{Xi,θ} with an unknown parameter θ. The maximum likelihood estimate of θ is the value of θ that maximizes the likelihood function

L(θ) = ∏_{i=1}^n f_{Xi}(xi).

In general, maximum likelihood estimators have desirable properties, such as consistency and asymptotic efficiency.
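The ML principle can be made concrete with a univariate sketch (my own example, assuming a normal model with known σ = 1): the log-likelihood is maximized at the sample mean, which a simple grid search confirms.

```python
import math

# ML principle sketch for N(mu, 1): the log-likelihood
# sum_i log f(x_i; mu) is maximized at mu = x-bar.
x = [2.1, 1.9, 2.5, 2.3, 1.7]
xbar = sum(x) / len(x)

def log_likelihood(mu):
    # log of prod_i (2*pi)^(-1/2) * exp(-(x_i - mu)^2 / 2)
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (xi - mu) ** 2 for xi in x)

# Evaluate on a fine grid; the grid maximizer should sit at (or next to) x-bar.
grid = [i / 1000 for i in range(1000, 3001)]  # mu in [1.0, 3.0]
mu_hat = max(grid, key=log_likelihood)
print(mu_hat, xbar)  # both approximately 2.1
```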
Estimation: MLE for MVN

For i.i.d. X1, ..., Xn from Np(µ, Σ), the likelihood function is

L(µ, Σ) = (2π)^{−np/2} |Σ|^{−n/2} exp( −(1/2) ∑_{i=1}^n (xi − µ)' Σ^{−1} (xi − µ) ).

- Taking µ = X̄ maximizes L(µ, Σ) with respect to µ.
- Taking Σ = ((n − 1)/n) S maximizes L(µ, Σ) with respect to Σ.
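A sketch of the MVN maximum likelihood estimates on simulated data (my own example): Σ̂ = ((n−1)/n)S is the MLE, and evaluating the log-likelihood shows it is at least as large at Σ̂ as at the unbiased S.

```python
import numpy as np

# ML estimates for N_p(mu, Sigma) and a log-likelihood comparison.
rng = np.random.default_rng(2)
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.3], [0.3, 2.0]], size=50)
n, p = X.shape

mu_hat = X.mean(axis=0)
S = np.cov(X, rowvar=False, ddof=1)   # unbiased estimator
Sigma_hat = (n - 1) / n * S           # maximum likelihood estimator

def log_lik(mu, Sigma):
    diff = X - mu
    _, logdet = np.linalg.slogdet(Sigma)
    quad = np.sum(diff @ np.linalg.inv(Sigma) * diff)  # sum of quadratic forms
    return -0.5 * (n * p * np.log(2 * np.pi) + n * logdet + quad)

print(log_lik(mu_hat, Sigma_hat) >= log_lik(mu_hat, S))  # True
```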
Estimation: MLE for MVN

Some further remarks:

- Functions of parameters: if θ̂ is the ML estimator of θ, then h(θ̂) is the ML estimator of h(θ).
- For the multivariate normal distribution:
  - The ML estimator of µ'Σ^{−1}µ is µ̂'Σ̂^{−1}µ̂.
  - The ML estimator of √σii is √σ̂ii.
- For the multivariate normal distribution, X̄ and Sn are sufficient statistics.
- Thus all the information about µ and Σ in the data matrix X is contained in X̄ and Sn.
Estimation: Distribution of X̄

Theorem. Let X1, ..., Xn be i.i.d. observations from Np(µ, Σ). Then

X̄ ∼ Np(µ, (1/n) Σ).
Estimation: Distribution of Sn

In the univariate setting,

(n − 1)s² = ∑_{i=1}^n (Xi − X̄)² ∼ σ² · χ²(n − 1).

By the definition of the χ²-distribution, this means that ∑_{i=1}^n (Xi − X̄)² is distributed as

σ²(Z1² + ... + Z²_{n−1}) = (σZ1)² + ... + (σZ_{n−1})²,

where the Zi are i.i.d. N(0, 1), so that σZi ∼ N(0, σ²).

Returning to the multivariate setting, let Z1, ..., Zm be i.i.d. Np(0, Σ). The distribution of the matrix ∑_{i=1}^m Zi Zi' is called the Wishart distribution, denoted Wm(Σ), where m is called the degrees of freedom and Σ is called the scale matrix.
Estimation: Distribution of Sn

Theorem. Let X1, ..., Xn be i.i.d. observations from Np(µ, Σ). Then

(n − 1)S ∼ W_{n−1}(Σ).

Properties of the Wishart distribution:

- If A1 ∼ W_{m1}(Σ), A2 ∼ W_{m2}(Σ) and A1 and A2 are independent, then A1 + A2 ∼ W_{m1+m2}(Σ), i.e. their sum is Wishart distributed with m1 + m2 degrees of freedom.
- If A ∼ W_{m1}(Σ) then CAC' ∼ W_{m1}(CΣC').
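The Wishart construction from the definition can be simulated directly (a sketch with an arbitrary Σ, not from the slides); a useful Monte Carlo check is the known mean E[W] = mΣ of a Wm(Σ) matrix.

```python
import numpy as np

# Draw W = sum_i Z_i Z_i' with Z_1, ..., Z_m i.i.d. N_p(0, Sigma),
# and check E[W] = m * Sigma by averaging over many replications.
rng = np.random.default_rng(3)
Sigma = np.array([[1.0, 0.4],
                  [0.4, 1.5]])
m, reps = 5, 20000

W_sum = np.zeros((2, 2))
for _ in range(reps):
    Z = rng.multivariate_normal([0.0, 0.0], Sigma, size=m)  # m draws, rows Z_i'
    W_sum += Z.T @ Z                                        # sum_i Z_i Z_i'

print(W_sum / reps)  # close to m * Sigma
```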
Assessing normality

- The assumption of normality is fundamental for many methods of multivariate statistics.
- Due to the multivariate central limit theorem, methods based on the normal distribution can often be used as approximations for large n, but it is often better to use other (perhaps non-parametric) methods if the data is non-normal or if n is small.
- For multivariate data, the ways in which distributions can deviate from normality are many and varied.
- Using univariate normality tests on the marginal distributions may miss departures in multivariate combinations of variables.
- Using multivariate tests may dilute the effects of a single non-normal variable.
Assessing normality: Graphical methods

Graphical presentations of the data can be very useful for detecting deviations from normality. Useful methods include:

- Scatter plots
- Q-Q plots of marginal distributions
- χ²-plots (β-plots)
Assessing normality: Scatter plots

Here all variables are normal. The histograms resemble the normal density and the point clouds are elliptic.

[Figure: scatter plot matrix of x1, x2, x3, x4; the plot itself is not recoverable from the transcript.]
Assessing normality: Scatter plots

Here X3 and X4 are non-normal. Their histograms are far from the normal density and the clouds are not elliptic.

[Figure: scatter plot matrix of x1, x2, x3, x4; the plot itself is not recoverable from the transcript.]
Assessing normality: Q-Q plots

Normal samples. First row n = 15, second row n = 40 and third row n = 100.

[Figure: 3 × 3 grid of normal Q-Q plots; the plots themselves are not recoverable from the transcript.]
Assessing normality: Q-Q plots

First column: uniform distribution. Second column: exponential distribution. Third column: β(1/2, 1/2) distribution (bimodal).

[Figure: 3 × 3 grid of normal Q-Q plots; the plots themselves are not recoverable from the transcript.]
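The mechanics behind a normal Q-Q plot can be sketched without any plotting library (my own illustration; the probability points (i − 0.5)/n are one common convention among several): pair the sorted sample with standard normal quantiles and, for near-normal data, the pairs fall close to a straight line, which can be summarized by their correlation.

```python
from statistics import NormalDist

# Q-Q plot sketch: theoretical quantiles at probability points (i - 0.5)/n
# paired with the order statistics of the sample.
def qq_pairs(sample):
    n = len(sample)
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return theoretical, sorted(sample)

def correlation(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# A symmetric, roughly normal-shaped sample: the Q-Q correlation is near 1.
sample = [-1.8, -1.1, -0.6, -0.3, -0.1, 0.1, 0.3, 0.6, 1.1, 1.8]
theo, obs = qq_pairs(sample)
print(round(correlation(theo, obs), 3))  # close to 1
```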
Assessing normality: χ²-plots (β-plots)

For normal data, (Xj − X̄)'Σ^{−1}(Xj − X̄) ∼ χ²_p.

Idea: for large n,

d²j = (xj − x̄)'S^{−1}(xj − x̄)

should be approximately χ²_p-distributed. We could thus do a Q-Q plot of the d²j against the χ²_p quantiles. This reduces the p-dimensional data to just one dimension, giving a single Q-Q plot instead of p plots.

Problem: convergence to χ²_p turns out to be slow. However, Gnanadesikan and Kettenring (1972) showed that

n d²j / (n − 1)² ∼ β(p/2, (n − p − 1)/2),

and thus quantiles from the beta distribution are more appropriate to use.

Gnanadesikan & Kettenring (1972), Robust Estimates, Residuals, and Outlier Detection with Multiresponse Data, Biometrics, 28, pp. 81-124.
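Computing the squared distances d²j is a one-liner with numpy (a sketch on simulated data, not from the slides). A handy exact check: the d²j always sum to (n − 1)p, since ∑j d²j = trace(S^{−1} · (n − 1)S) = (n − 1)p.

```python
import numpy as np

# Squared generalized distances d_j^2 = (x_j - x-bar)' S^{-1} (x_j - x-bar)
# used in the chi^2-plot / beta-plot.
rng = np.random.default_rng(4)
X = rng.multivariate_normal([0.0, 0.0, 0.0], np.eye(3), size=30)
n, p = X.shape

xbar = X.mean(axis=0)
S_inv = np.linalg.inv(np.cov(X, rowvar=False, ddof=1))
diff = X - xbar
d2 = np.einsum('ij,jk,ik->i', diff, S_inv, diff)  # one d_j^2 per observation

print(np.isclose(d2.sum(), (n - 1) * p))  # True, by the trace identity
# For the beta-plot, n*d2/(n-1)^2 would be compared to
# beta(p/2, (n-p-1)/2) quantiles instead of chi^2_p quantiles.
```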
Assessing normality: Formal tests

- Univariate data:
  - The Shapiro-Wilk test.
  - Tests based on skewness and kurtosis.
  - Univariate tests can be used in multivariate analysis by looking at the marginal distributions one by one.
- Mardia's tests:
  - These tests are generalizations of the tests based on skewness and kurtosis.
Assessing normality: Shapiro-Wilk

For a univariate sample, consider the order statistics x(1) ≤ x(2) ≤ ... ≤ x(n). The Shapiro-Wilk test is based on the statistic

W = (∑_{i=1}^n ai x(i))² / ∑_{i=1}^n (x(i) − x̄)²,

where the coefficients ai are derived from the means and covariance matrix of the order statistics of a standard normal sample.

- Published in 1965.
- Scale and location invariant.
- A formalization of Q-Q plots.
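In practice the test is available off the shelf; a sketch (my own example) of applying `scipy.stats.shapiro` to each marginal of a data matrix, the marginal-by-marginal strategy mentioned above:

```python
import numpy as np
from scipy import stats

# Shapiro-Wilk on each marginal: stats.shapiro returns the statistic W
# (at most 1) and a p-value; small p-values indicate non-normality.
rng = np.random.default_rng(5)
X = np.column_stack([
    rng.normal(size=100),        # a normal marginal
    rng.exponential(size=100),   # a clearly skewed, non-normal marginal
])

results = [stats.shapiro(X[:, k]) for k in range(X.shape[1])]
for k, (W, pval) in enumerate(results):
    print(f"variable {k}: W = {W:.3f}, p = {pval:.4f}")
```

The exponential marginal is strongly rejected, while the normal one typically is not; remember that per-marginal testing can still miss joint departures from multivariate normality.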
Assessing normality: Skewness and kurtosis

For a univariate random variable X, the skewness is

γ = E(X − µ)³ / σ³

and the kurtosis is

κ = E(X − µ)⁴ / σ⁴ − 3.

Both these quantities are 0 for the normal distribution, but non-zero for many other distributions.

In particular, all symmetric distributions have γ = 0. κ is related to how heavy the tails of the distribution are, and to some extent to bimodality.

To use skewness and kurtosis to test for normality, compute the sample versions

γ̂ = [(1/n) ∑_{i=1}^n (xi − x̄)³] / [(1/n) ∑_{i=1}^n (xi − x̄)²]^{3/2}

and

κ̂ = [(1/n) ∑_{i=1}^n (xi − x̄)⁴] / [(1/n) ∑_{i=1}^n (xi − x̄)²]² − 3,

and reject the hypothesis of normality if the statistics are too far from 0.
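The sample versions are direct to compute; a sketch in NumPy (the critical values of the tests, which depend on n, are not shown):

```python
import numpy as np

def sample_skewness(x):
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    m2 = np.mean(d ** 2)                     # (1/n) * sum of squared deviations
    return np.mean(d ** 3) / m2 ** 1.5

def sample_kurtosis(x):
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    m2 = np.mean(d ** 2)
    return np.mean(d ** 4) / m2 ** 2 - 3.0   # excess kurtosis: 0 under normality

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
print(sample_skewness(x), sample_kurtosis(x))       # both close to 0
print(sample_skewness(rng.exponential(size=1000)))  # clearly positive
```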
Assessing normality: Univariate tests

I The skewness test based on γ is sensitive against asymmetric distributions, but not against kurtotic distributions.
I The kurtosis test based on κ is sensitive against kurtotic distributions, but not against asymmetric distributions.
I Should we use the skewness test or the kurtosis test? Rule of thumb: for inference about μ we should worry more about asymmetric distributions, while for inference about σ² deviations in kurtosis are more dangerous.
I The Shapiro-Wilk test is usually less sensitive than γ and κ against asymmetric and kurtotic alternatives, respectively, but has high average power against all classes of alternatives.

25/36
Assessing normality: Univariate tests

I A necessary, but not sufficient, condition for a distribution to be multivariate normal is that all marginal distributions are normal.
I A univariate normality test can thus be applied to each of the marginal variables in the p-variate sample.
I However, the variables may be dependent. If so, the outcomes of the p tests will also be dependent!
I What is the joint significance level of the normality tests? How can we control this level?
I One way of handling this problem is to use Bonferroni's inequality. (We'll discuss this in the next lecture.)
I Some authors suggest reducing the dimension of the problem by performing a univariate normality test on e_1' x_j, where e_1 is the eigenvector corresponding to the largest eigenvalue of S. (More on this when we discuss PCA.)

26/36
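A sketch of the per-marginal approach with a Bonferroni correction (assuming SciPy; testing each of the p marginals at level α/p keeps the joint significance level at most α):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
X = rng.multivariate_normal(mean=[0, 0, 0], cov=np.eye(3), size=100)  # n x p sample

alpha = 0.05
n, p = X.shape
rejected = []
for k in range(p):
    W, pval = stats.shapiro(X[:, k])
    # Bonferroni: reject marginal normality only if pval < alpha / p,
    # so that P(at least one false rejection) <= alpha.
    reject = bool(pval < alpha / p)
    rejected.append(reject)
    print(f"variable {k}: p-value = {pval:.3f}, reject = {reject}")
```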
Assessing normality: Mardia's multivariate tests

Mardia's tests for multivariate normality are based on the statistics

d_{ij} = (x_i - \bar{x})' S^{-1} (x_j - \bar{x}).

The test statistics

\gamma_p^2 = \frac{1}{n^2} \sum_{i,j=1}^n d_{ij}^3 \quad \text{and} \quad \kappa_p = \frac{1}{n} \sum_{i=1}^n d_{ii}^2

are generalizations of γ² and κ.

I Published 1970.
I Scale and location invariant.
I Extends the notions of skewness and kurtosis to the multivariate setting.
I Various generalizations exist, in which the d_{ij} are used in slightly different ways.

27/36
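Both statistics are straightforward to compute from the matrix of d_ij values; a sketch in NumPy (here S is estimated with divisor n, one common convention):

```python
import numpy as np

def mardia_statistics(X):
    """Mardia's multivariate skewness and kurtosis statistics."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n                     # covariance estimate (divisor n)
    D = Xc @ np.linalg.inv(S) @ Xc.T      # D[i, j] = d_ij
    skew = (D ** 3).sum() / n ** 2        # gamma_p^2
    kurt = (np.diag(D) ** 2).sum() / n    # kappa_p
    return skew, kurt

rng = np.random.default_rng(3)
X = rng.multivariate_normal([0, 0], [[1.0, 0.5], [0.5, 1.0]], size=500)
g2, k = mardia_statistics(X)
print(g2, k)  # under normality: g2 near 0, k near p(p + 2) = 8
```

Note that, unlike the univariate κ, the value of κ_p under normality is close to p(p + 2) rather than 0.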
Assessing normality: Recommendations

When assessing multivariate normality, it is often a good idea to use more than one method, in order to account for different possible deviations from normality.

I Inspect scatter plots and univariate Q-Q-plots.
I Perform univariate tests of normality on each variable.
I Use methods based on dimension reduction: test some linear combination and look at the β plot.
I Use a test for multivariate normality.

Care must be taken to make sure that the joint significance level of the tests is reasonable!

28/36
Outliers

I Graphical methods:
  I Scatter plots
  I Chernoff faces
  I Stars
  I Andrews curves
I Examine the standardized observations z_{jk} = (x_{jk} - \bar{x}_k) / \sqrt{s_{kk}} and look for unusually large or small values.
I Examine d_j^2 = (x_j - \bar{x})' S^{-1} (x_j - \bar{x}) and look for unusually large values.
I Wilks' test

29/36
Outliers: Wilks' test

Wilks' test for outliers in a multivariate normal sample is based on the statistic

\Lambda = \min_j \Lambda_j \quad \text{where} \quad \Lambda_j = 1 - \frac{n \, d_j^2}{(n-1)^2};

recall that d_j^2 = (x_j - \bar{x})' S^{-1} (x_j - \bar{x}). If Λ is small, there are likely outliers in the sample.

I Published in 1963.
I A formalization of β plots.
I Equivalent to looking at max_j d_j^2.
I Related to the hat matrix in linear regression.
I A multitude of extensions and variations of the test exist.

30/36
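A sketch of computing the d_j² values and Λ_j in NumPy, with one planted outlier (the critical value of the test, which depends on n and p, is not shown):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.multivariate_normal([0, 0], np.eye(2), size=50)
X[0] = [6.0, 6.0]                     # plant an obvious outlier

n, p = X.shape
xbar = X.mean(axis=0)
Sinv = np.linalg.inv(np.cov(X, rowvar=False))  # S with divisor n - 1

# Squared Mahalanobis distances d_j^2 = (x_j - xbar)' S^{-1} (x_j - xbar)
d2 = np.einsum("ij,jk,ik->i", X - xbar, Sinv, X - xbar)

Lam = 1.0 - n * d2 / (n - 1) ** 2     # Wilks' statistics Lambda_j
j = int(np.argmin(Lam))               # minimizing Lambda_j = maximizing d_j^2
print(j, round(float(d2[j]), 2))      # picks out the planted outlier (index 0)
```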
Transformations to normality

If the data are found to be non-normal, it may still be possible to transform them to normality.

A useful family of power transformations was described by Box and Cox in 1964, along with a method for determining which transformation to use.

31/36
Transformations: Box–Cox transformation

Assumption: x_{ik} > 0. Box and Cox (1964):

x_{ik}^{(\lambda)} = \begin{cases} \dfrac{x_{ik}^{\lambda} - 1}{\lambda} & \text{when } \lambda \neq 0 \\ \ln(x_{ik}) & \text{when } \lambda = 0 \end{cases}

where i = 1, …, n and k is fixed.

Note that, by L'Hospital's rule, \lim_{\lambda \to 0} (x_{ik}^{\lambda} - 1)/\lambda = \ln(x_{ik}), so the transformation is continuous in λ.

32/36
Transformations: Box–Cox transformation

Assumption: x_{ik} > 0. Box and Cox (1964):

x_{ik}^{(\lambda)} = \begin{cases} \dfrac{x_{ik}^{\lambda} - 1}{\lambda} & \text{when } \lambda \neq 0 \\ \ln(x_{ik}) & \text{when } \lambda = 0 \end{cases}

where i = 1, …, n and k is fixed.

Which λ should we choose? A maximum likelihood approach is to find the λ that maximizes g(\lambda) = -\frac{n}{2} \ln(s^2(\lambda)). Rewritten:

g(\lambda) = (\lambda - 1) \sum_{i=1}^n \ln(x_{ik}) - \frac{n}{2} \ln[\hat{\sigma}^2(\lambda)]

where

\hat{\sigma}^2(\lambda) = \frac{1}{n} y^{(\lambda)\prime} (I - H) y^{(\lambda)}

and H is the hat matrix of the model under consideration.

R function: boxcox (in library MASS)

33/36
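As an alternative to R's boxcox, a sketch using SciPy (scipy.stats.boxcox maximizes the profile log-likelihood for a single sample, which corresponds to an intercept-only model):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
x = rng.lognormal(mean=0.0, sigma=0.7, size=200)  # positive, right-skewed data

# boxcox returns the transformed data and the ML estimate of lambda.
y, lam = stats.boxcox(x)
print(lam)  # for lognormal data the estimate should be near 0 (log transform)
```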
Transformations: Box–Cox transformation
[Figure: profile log-likelihood g(λ) plotted against λ for −1 ≤ λ ≤ 2, with the 95% confidence interval for λ marked.]
34/36
Transformations: Box–Cox method, notes

I Box–Cox gets upset by outliers: if one finds λ = 5, that is probably the reason.
I If some x_{ik} < 0, adding a small constant to all x_{ik} can sometimes work.
I If max_i x_{ik} / min_i x_{ik} is small, Box–Cox will not do anything, since power transforms are well approximated by linear transformations over short intervals.
I Should the estimation of λ count as an extra parameter to be taken into account in the degrees of freedom? Difficult question: λ is not a linear parameter.

35/36
Summary

I Sample moments
  I Unbiasedness
  I Asymptotics
I Estimation for the multivariate normal distribution
  I Maximum likelihood estimation
  I Distributions of estimators
I Assessing normality
  I How to investigate the validity of the assumption of normality
I Outliers
I Transformations to normality

36/36