Smoothing Spline ANOVA
Nathaniel E. Helwig
Assistant Professor of Psychology and Statistics
University of Minnesota (Twin Cities)
Updated 04-Jan-2017
Nathaniel E. Helwig (U of Minnesota) Smoothing Spline ANOVA Updated 04-Jan-2017 : Slide 1
Copyright
Copyright © 2017 by Nathaniel E. Helwig
Outline of Notes
1) Introduction: Parametric regression; Nonparametric regression; Smoothing splines
2) Background Theory: Averaging operators; Hilbert spaces; Reproducing kernels
3) Estimation & Inference: Penalized least squares; Smoothing parameter selection; Bayesian confidence intervals
4) SSANOVA in Practice: One-way SSANOVA; Two-way SSANOVA (additive); Two-way SSANOVA (interactive)
For a thorough treatment see:
Gu, C. (2013). Smoothing spline ANOVA models, 2nd edition. New York: Springer-Verlag.
Introduction
Introduction
Introduction Parametric Regression
Parametric Regression Model: Scalar Form
The multiple linear regression model has the form
    y_i = Σ_{j=1}^p b_j x_{ij} + e_i

for i ∈ {1, . . . , n} where
  y_i ∈ R is the real-valued response for the i-th observation
  b_j ∈ R is the j-th predictor's regression slope
  x_{ij} ∈ R is the j-th predictor for the i-th observation
  e_i are iid N(0, σ²) Gaussian measurement errors

Implies that (y_i | x_{i1}, . . . , x_{ip}) are independent N(Σ_{j=1}^p b_j x_{ij}, σ²)
Introduction Parametric Regression
Parametric Regression Model: Matrix Form
The multiple linear regression model has the form
y = Xb + e
where
  y = (y_1, . . . , y_n)′ ∈ R^n is the n × 1 response vector
  X = [x_1, . . . , x_p] ∈ R^{n×p} is the n × p design matrix
    • x_j = (x_{1j}, . . . , x_{nj})′ ∈ R^n is the j-th predictor vector (n × 1)
  b = (b_1, . . . , b_p)′ ∈ R^p is the p × 1 vector of regression coefficients
  e = (e_1, . . . , e_n)′ ∈ R^n is the n × 1 error vector

Implies that (y|x) ∼ N(Xb, σ²I_n)
Introduction Parametric Regression
Ordinary Least Squares Solution
The ordinary least squares (OLS) problem is
    min_{b∈R^p} (1/n) ‖y − Xb‖²   ⟷   min_{b∈R^p} (1/n) Σ_{i=1}^n (y_i − ŷ_i)²

where ‖·‖ denotes the Euclidean norm and ŷ_i = Σ_{j=1}^p b_j x_{ij}.
The OLS solution has the form

    b̂ = (X′X)⁻¹X′y

and the fitted values corresponding to b̂ are given by

    ŷ = Xb̂ = Hy

where H = X(X′X)⁻¹X′ is the hat matrix.
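As a quick numerical sketch (simulated toy data, not from the notes), the OLS solution and hat matrix can be computed directly:

```r
## Toy sketch (assumed simulated data): OLS via the normal equations
## and the hat matrix H = X (X'X)^{-1} X'.
set.seed(1)
n <- 50
X <- cbind(1, rnorm(n), runif(n))            # design matrix with intercept
y <- X %*% c(2, 1, -1) + rnorm(n, sd = 0.5)  # y = Xb + e

bhat <- solve(crossprod(X), crossprod(X, y)) # (X'X)^{-1} X'y
H <- X %*% solve(crossprod(X)) %*% t(X)      # hat matrix
yhat <- H %*% y                              # fitted values (= X %*% bhat)
```

Using solve(crossprod(X), ...) avoids forming the inverse explicitly; H is a projection, so H %*% H equals H up to rounding.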
Introduction Parametric Regression
Summary of Results
Using the model assumption (y|x) ∼ N(Xb, σ²I_n), we have

    b̂ ∼ N(b, σ²(X′X)⁻¹)
    ŷ ∼ N(Xb, σ²H)
    ê ∼ N(0, σ²(I_n − H))

where ê = y − ŷ is the residual vector.

Typically σ² is unknown, so we use the MSE σ̂² = (1/(n − p)) Σ_{i=1}^n ê_i².
Introduction Nonparametric Regression
Nonparametric Regression Model
The Gaussian nonparametric regression model has the form
yi = η(xi) + ei
for i ∈ {1, . . . , n} where
  y_i ∈ R is the real-valued response for the i-th observation
  x_i ∈ R^p is the predictor vector for the i-th observation
  η : R^p → R is an unknown smooth function
  e_i are iid N(0, σ²) Gaussian measurement errors

Implies that (y_i | x_{i1}, . . . , x_{ip}) are independent N(η(x_i), σ²)
Introduction Nonparametric Regression
Additive versus Interactive Models
Suppose that xi = (xi1, xi2) with xi1 ∈ X1 and xi2 ∈ X2.
We could fit one of two possible models:
Additive : η(xi) = η0 + η1(xi1) + η2(xi2)
Interaction : η(xi) = η0 + η1(xi1) + η2(xi2) + η12(xi1, xi2)
where
  η_0 is a constant function
  η_1 is the main effect of the first predictor
  η_2 is the main effect of the second predictor
  η_12 is the interaction effect
Introduction Nonparametric Regression
Example 1: Continuous and Nominal Covariates
xi = (xi1, xi2) with xi1 ∈ [0,1] and xi2 ∈ {a,b}.
[Figure: two panels ("Additive" and "Interaction") plotting y versus x1 ∈ [0, 1], with solid lines for x2 = a and dashed lines for x2 = b.]
Introduction Nonparametric Regression
Example 1: R Code
addfun = function(x1,x2){
  funval = sin(2*pi*x1)
  idx = which(x2=="a")
  funval[idx] = funval[idx] + 2
  funval
}
intfun = function(x1,x2){
  funval = sin(2*pi*x1)
  idx = which(x2=="a")
  funval[idx] = funval[idx] + 2 + sin(4*pi*x1[idx])
  funval
}

dev.new(width=12,height=6,noRStudioGD=TRUE)
par(mfrow=c(1,2))
x1 = seq(0,1,length=200)
plot(x1,addfun(x1,rep("a",200)),type="l",ylim=c(-2,4),main="Additive",
     ylab="y",cex.axis=1.25,cex.lab=1.5,cex.main=3)
lines(x1,addfun(x1,rep("b",200)),lty=2)
legend("bottomleft",legend=c(expression(x[2]*" = "*a),expression(x[2]*" = "*b)),
       lty=1:2,bty="n",cex=1.5)
plot(x1,intfun(x1,rep("a",200)),type="l",ylim=c(-2,4),main="Interaction",
     ylab="y",cex.axis=1.25,cex.lab=1.5,cex.main=3)
lines(x1,intfun(x1,rep("b",200)),lty=2)
legend("bottomleft",legend=c(expression(x[2]*" = "*a),expression(x[2]*" = "*b)),
       lty=1:2,bty="n",cex=1.5)
Introduction Nonparametric Regression
Example 2: Two Continuous Covariates
xi = (xi1, xi2) with xi1, xi2 ∈ [0,1].
[Figure: two image plots ("Additive" and "Interaction") of the surface over (x1, x2) ∈ [0, 1]².]
Introduction Nonparametric Regression
Example 2: R Code
addfun = function(x1,x2){
  sin(2*pi*x1) + cos(4*pi*x2*(1-x2))
}
intfun = function(x1,x2){
  sin(2*pi*x1) + cos(4*pi*x2*(1-x2)) + 2*sin(pi*(x1-x2))
}

xs = seq(0,1,length=50)
xg = expand.grid(xs,xs)
dev.new(width=12,height=6,noRStudioGD=TRUE)
par(mfrow=c(1,2))
zmat = matrix(addfun(xg[,1],xg[,2]),50,50)
image(xs,xs,zmat,xlab="x1",ylab="x2",main="Additive",
      cex.axis=1.25,cex.lab=1.5,cex.main=3)
zmat = matrix(intfun(xg[,1],xg[,2]),50,50)
image(xs,xs,zmat,xlab="x1",ylab="x2",main="Interaction",
      cex.axis=1.25,cex.lab=1.5,cex.main=3)
Introduction Smoothing Splines
Smoothing Splines on {1, . . . , K}

Suppose x_i ∈ {1, . . . , K} and note that η_f is a vector of length K.
  f = (f_1, . . . , f_K)′ ∈ R^K is the vector corresponding to η_f
  η_f(1) = f_1, η_f(2) = f_2, . . . , η_f(K) = f_K
  Let η̄_f = Σ_{x=1}^K η_f(x)/K denote the mean

A nominal smoothing spline is the η_λ ∈ R^K that minimizes

    (1/n) Σ_{i=1}^n (y_i − η_f(x_i))² + λ J(η_f)

where λ ≥ 0 is the smoothing parameter and J(η_f) is the roughness penalty.
  J(η_f) = Σ_{x=1}^K (η_f(x) − η̄_f)² to shrink towards a constant
  J(η_f) = Σ_{x=1}^K η_f(x)² to shrink towards zero
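As a concrete sketch (toy data; the indicator-matrix formulation below is my own, not from the notes), the nominal smoothing spline with the shrink-to-constant penalty has a small closed-form solve:

```r
## Hypothetical sketch: nominal smoothing spline on x in {1,...,K} that
## shrinks group means toward the overall constant.  Minimizes
## (1/n) sum (y_i - f_{x_i})^2 + lambda * sum_x (f_x - fbar)^2.
nomspline <- function(x, y, K, lambda){
  Z <- outer(x, 1:K, "==") * 1          # n x K indicator (design) matrix
  P <- diag(K) - matrix(1/K, K, K)      # penalty: deviation from the mean
  n <- length(y)
  solve(crossprod(Z)/n + lambda * P, crossprod(Z, y)/n)
}

set.seed(1)
x <- sample(1:3, 90, replace = TRUE)
y <- c(-1, 0, 1)[x] + rnorm(90, sd = 0.5)
f0   <- nomspline(x, y, K = 3, lambda = 0)    # lambda = 0 gives group means
fbig <- nomspline(x, y, K = 3, lambda = 100)  # large lambda shrinks to constant
```

With λ = 0 the solution recovers the per-group sample means; as λ grows the fitted values collapse toward a single constant.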
Introduction Smoothing Splines
Polynomial Smoothing Splines on [0,1]
Suppose x_i ∈ [0, 1] and let C^(m)[0, 1] = {η : η^(m) ∈ L₂[0, 1]}.
  η^(m) = d^m η / dx^m denotes the m-th derivative of η
  L₂[0, 1] = {η : ∫₀¹ η² dx < ∞}

A polynomial smoothing spline is the η_λ ∈ C^(m)[0, 1] that minimizes

    (1/n) Σ_{i=1}^n (y_i − η(x_i))² + λ ∫₀¹ (η^(m))² dx

where λ ≥ 0 is the smoothing parameter and m is the spline order.
  Related to the natural spline in the numerical analysis literature
Introduction Smoothing Splines
Cubic Smoothing Splines
Setting m = 2 results in the classic cubic smoothing spline.
  x_1 < x_2 < · · · < x_q are the "knots" (distinct x_i values)
  η_λ is a piecewise cubic polynomial, and is linear beyond x_1 and x_q
  η_λ is twice continuously differentiable, and its (piecewise constant) 3rd derivative jumps at the knots
  As λ → 0, η_λ approaches the minimum curvature interpolant
  As λ → ∞, η_λ approaches the simple linear regression fit
Can also view the cubic smoothing spline as the solution to

    min_η (1/n) Σ_{i=1}^n (y_i − η(x_i))²   subject to   ∫₀¹ (η″)² dx ≤ ρ

for some ρ ≥ 0, which is least squares with a soft constraint on roughness.
Introduction Smoothing Splines
Example with R’s spline Function
y_i = sin(2πx_i) + e_i where x_i = i/20 for i ∈ {0, 1, 2, . . . , 20} and
  No Noise: e_i = 0 ∀i
  Some Noise: e_i are iid N(0, 0.15²)
  More Noise: e_i are iid N(0, 0.25²)
[Figure: three panels ("No Noise", "Some Noise", "More Noise") showing the data points, the true function (dashed), and the natural spline fit (solid).]
Introduction Smoothing Splines
spline Function (R code)
dev.new(width=12,height=4,noRStudioGD=TRUE)
par(mfrow=c(1,3))
x = seq(0,1,length=21)
y = sin(2*pi*x)
mysp = spline(x,y,method="natural")
plot(x,y,main="No Noise",ylim=c(-1.5,1.5))
lines(x,sin(2*pi*x),lty=2)
lines(mysp)

set.seed(1)
x = seq(0,1,length=21)
y = sin(2*pi*x) + rnorm(21,sd=0.15)
mysp = spline(x,y,method="natural")
plot(x,y,main="Some Noise",ylim=c(-1.5,1.5))
lines(x,sin(2*pi*x),lty=2)
lines(mysp)

set.seed(1)
x = seq(0,1,length=21)
y = sin(2*pi*x) + rnorm(21,sd=0.25)
mysp = spline(x,y,method="natural")
plot(x,y,main="More Noise",ylim=c(-1.5,1.5))
lines(x,sin(2*pi*x),lty=2)
lines(mysp)
Introduction Smoothing Splines
Same Example with R’s smooth.spline Function
y_i = sin(2πx_i) + e_i where x_i = i/20 for i ∈ {0, 1, 2, . . . , 20} and
  No Noise: e_i = 0 ∀i
  Some Noise: e_i are iid N(0, 0.15²)
  More Noise: e_i are iid N(0, 0.25²)
[Figure: three panels ("No Noise", "Some Noise", "More Noise") showing the data points, the true function (dashed), and the smooth.spline fit (solid).]
Introduction Smoothing Splines
smooth.spline Function (R code)
dev.new(width=12,height=4,noRStudioGD=TRUE)
par(mfrow=c(1,3))
set.seed(1)
x = seq(0,1,length=21)
y = sin(2*pi*x)
mysp = smooth.spline(x,y)
plot(x,y,main="No Noise",ylim=c(-1.5,1.5))
lines(x,sin(2*pi*x),lty=2)
lines(x,mysp$y)

set.seed(1)
x = seq(0,1,length=21)
y = sin(2*pi*x) + rnorm(21,sd=0.15)
mysp = smooth.spline(x,y)
plot(x,y,main="Some Noise",ylim=c(-1.5,1.5))
lines(x,sin(2*pi*x),lty=2)
lines(x,mysp$y)

set.seed(1)
x = seq(0,1,length=21)
y = sin(2*pi*x) + rnorm(21,sd=0.25)
mysp = smooth.spline(x,y)
plot(x,y,main="More Noise",ylim=c(-1.5,1.5))
lines(x,sin(2*pi*x),lty=2)
lines(x,mysp$y)
Background Theory
Background Theory
Background Theory Averaging Operators
One-Way ANOVA Decomposition
Consider the standard one-way ANOVA model
yij = µj + eij
for i ∈ {1, . . . ,nj} and j ∈ {1, . . . ,K}.
Typically, we want to decompose the treatment effects such as

    μ_j = μ + α_j

where μ is the overall mean and α_j is the treatment effect, such that
  α_1 = 0 if the first group is the control
  Σ_{j=1}^K α_j = 0 if using effect coding
Background Theory Averaging Operators
One-Way ANOVA and Averaging Operators
Consider the standard one-way ANOVA model using a smoothing spline on x_i ∈ {1, . . . , K}

    y_i = η(x_i) + e_i

for i ∈ {1, . . . , n} where n = Σ_{j=1}^K n_j.

The ANOVA decomposition μ_j = μ + α_j can be written as

    η = Aη + (I − A)η = η_0 + η_c

where A "averages out" η to return a constant η_0.
  α_1 = 0 corresponds to Aη = η(1)
  Σ_{j=1}^K α_j = 0 corresponds to Aη = Σ_{x=1}^K η(x)/K
Background Theory Averaging Operators
Averaging Operators on Continuous Domain
For a continuous domain X = [a, b] we can decompose η as

    η = Aη + (I − A)η = η_0 + η_c

where A "averages out" η to return a constant η_0.
  Need an averaging operator A defined such that A(Aη) = Aη = η_0
  Need an identity operator I defined such that Iη = η

Note that η_0 is the overall constant, and η_c is the treatment (contrast) effect.

For a function defined on X = [0, 1], we could define
  Aη = η(0)
  Aη = ∫₀¹ η(z) dz
Background Theory Averaging Operators
Two-Way ANOVA Decomposition
Consider the standard two-way ANOVA model
yijk = µjk + eijk
for i ∈ {1, . . . ,njk}, j ∈ {1, . . . ,a}, and k ∈ {1, . . . ,b}.
Typically, we want to decompose the treatment effects such as

    μ_jk = μ + α_j + β_k + γ_jk

where μ is the overall mean and
  α_j is the main effect of Factor A such that Σ_{j=1}^a α_j = 0
  β_k is the main effect of Factor B such that Σ_{k=1}^b β_k = 0
  γ_jk is the interaction effect such that Σ_{j=1}^a γ_jk = Σ_{k=1}^b γ_jk = 0 ∀ j, k
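The constraints above can be checked numerically. A toy sketch (the cell means below are made up purely for illustration):

```r
## Toy sketch (made-up cell means): recover the two-way ANOVA decomposition
## mu_jk = mu + alpha_j + beta_k + gamma_jk under sum-to-zero constraints.
mu_jk <- matrix(c(1, 2, 3,
                  2, 4, 6), nrow = 2, byrow = TRUE)   # a = 2 rows, b = 3 cols

mu    <- mean(mu_jk)                           # overall mean
alpha <- rowMeans(mu_jk) - mu                  # Factor A main effects
beta  <- colMeans(mu_jk) - mu                  # Factor B main effects
gamma <- mu_jk - outer(alpha, beta, "+") - mu  # interaction effects
```

Each effect sums to zero over its index, and μ + α_j + β_k + γ_jk rebuilds the cell means exactly.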
Background Theory Averaging Operators
Two-Way ANOVA and Averaging Operators
Consider the standard two-way ANOVA model using a smoothing spline on x_i = (x_{i1}, x_{i2}) ∈ X₁ × X₂ = {1, . . . , a} × {1, . . . , b}

    y_i = η(x_i) + e_i

for i ∈ {1, . . . , n} where n = Σ_{j=1}^a Σ_{k=1}^b n_jk.

The ANOVA decomposition μ_jk = μ + α_j + β_k + γ_jk can be written as

    η = [A_{X₁} + (I − A_{X₁})][A_{X₂} + (I − A_{X₂})]η
      = A_{X₁}A_{X₂}η + (I − A_{X₁})A_{X₂}η + A_{X₁}(I − A_{X₂})η + (I − A_{X₁})(I − A_{X₂})η
      =      η_0      +         η_1         +          η_2         +           η_12

where A_{X₁} and A_{X₂} are averaging operators such that
  A_{X₁}(A_{X₁}η) = A_{X₁}η is constant for all x_{i1} ∈ X₁
  A_{X₂}(A_{X₂}η) = A_{X₂}η is constant for all x_{i2} ∈ X₂
Background Theory Hilbert Spaces
Linear Spaces and Functionals
Suppose that η, φ ∈ L where the set L satisfies:
  η + φ ∈ L
  aη ∈ L for any scalar a

If these two conditions are met, we say that L is a linear space.

A functional L in L operates on η ∈ L and returns a real number.
  Linear functional: L(η + φ) = Lη + Lφ and L(aη) = aLη
  Bilinear functional: linear in each of its two arguments
    - J(aη + bφ, ψ) = aJ(η, ψ) + bJ(φ, ψ)
    - J(η, aφ + bψ) = aJ(η, φ) + bJ(η, ψ)
  Symmetry: J(η, φ) = J(φ, η) for all η, φ ∈ L
  Positive definite: J(η) = J(η, η) > 0 for all nonzero η ∈ L
  Non-negative definite: J(η) = J(η, η) ≥ 0 for all η ∈ L
  Quadratic functional: bilinear, symmetric, and non-negative definite
Background Theory Hilbert Spaces
Inner Products and Norms
In a linear space L, an inner product is a positive definite bilinear form.
  We will use the notation 〈·, ·〉 to denote an inner product.

The inner product defines a norm on L, which provides a metric to measure the distance between two objects η, φ ∈ L.
  We will use the notation ‖η‖ = √〈η, η〉 to denote the norm of η.
  We will use the notation D[η, φ] = ‖η − φ‖ to denote the distance between η and φ in L.

In any inner-product space L we have the following two inequalities:
  Cauchy–Schwarz: |〈η, φ〉| ≤ ‖η‖ ‖φ‖
  Triangle: ‖η + φ‖ ≤ ‖η‖ + ‖φ‖
Background Theory Hilbert Spaces
Null Spaces, Semi-Inner Products, and Semi-Norms
The null space of a non-negative definite bilinear form J in a linear space L is defined as N_J = {η : J(η, η) = 0, η ∈ L}, and note that
  N_J = {0} if J is positive definite
  N_J contains 0 and nonzero elements otherwise

A non-negative definite bilinear form J in a linear space L defines a semi-inner-product in L.
  It induces a semi-norm √J(η) = √J(η, η) in L
  Similar to a norm, but J(η) = 0 does not imply η = 0
Background Theory Hilbert Spaces
Hilbert Spaces and Projections
A Hilbert space is a complete inner-product linear space.
  A sequence with lim_{m,n→∞} ‖η_m − η_n‖ = 0 is a Cauchy sequence
  A linear space L is complete if every Cauchy sequence in L converges to some element in L

Any closed linear subspace of H (denoted G ⊂ H) is a Hilbert space.
  The distance between η ∈ H and G is D[η, G] = inf_{φ∈G} ‖η − φ‖
  There exists η_G ∈ G such that D[η, G] = ‖η − η_G‖
  η_G is the unique projection of η onto the space G
Background Theory Hilbert Spaces
Tensor Sum Decompositions
Given η ∈ H and G ⊂ H, we have that 〈η − η_G, φ〉 = 0 for all φ ∈ G.
  G^c = {η : 〈η, φ〉 = 0, ∀φ ∈ G} is the orthogonal complement of G
  Tensor sum decomposition: H = G ⊕ G^c and η = η_G + η_{G^c}

If H_n and H_c are Hilbert spaces with inner products 〈·, ·〉_n and 〈·, ·〉_c, and if H_n ∩ H_c = {0}, then H = H_n ⊕ H_c is a Hilbert space with inner product 〈·, ·〉 = 〈·, ·〉_n + 〈·, ·〉_c

Consider a null space N_J corresponding to a semi-inner-product J in the space H, and define a second bilinear form J̃(·, ·) such that
  1. J̃(·, ·) defines a full inner product in the space N_J
  2. (∀η ∈ H)(∃φ ∈ N_J) such that J(η − φ) = 0
Then (J + J̃)(η, φ) defines a full inner product in H.
Background Theory Hilbert Spaces
Hilbert Space Example: RK
Note that a Hilbert space is a generalization of Euclidean space RK .
For any vectors x, y ∈ R^K, the inner product is 〈x, y〉 = x′y = Σ_{i=1}^K x_i y_i

    〈x, y〉 = 〈x, y〉_n + 〈x, y〉_c = x′[ (1/K)1_K 1_K′ + (I_K − (1/K)1_K 1_K′) ]y

  H_n = {η : η(1) = · · · = η(K)} and H_c = {η : Σ_{x=1}^K η(x) = 0}

This corresponds to the classic one-way ANOVA decomposition

    μ_j = μ + α_j

with the constraint Σ_j α_j = 0
Background Theory Reproducing Kernels
Riesz Representation Theorem
For every φ in a Hilbert space H, the functional Lφη = 〈φ, η〉 defines acontinuous linear functional Lφ.
L is continuous if limn→∞ Lηn = Lη whenever limn→∞ ηn = η
Every continuous linear functional L in H has a representationLη = 〈φL, η〉 for some φL ∈ H, which is called the representer of L.
Theorem. For every continuous linear functional L in a Hilbert space H, there exists a unique φ_L ∈ H such that Lη = 〈φ_L, η〉 for all η ∈ H.
Background Theory Reproducing Kernels
Reproducing Kernel Hilbert Spaces
To estimate an SSANOVA, we need to evaluate η at different x ∈ X.
  Need continuity of the evaluation functional: [x]η = η(x)

Consider a Hilbert space H of functions on the domain X.
  If the evaluation functional [x]η = η(x) is continuous in H for all x ∈ X, then we say that H is a reproducing kernel Hilbert space (RKHS)
  By the Riesz representation theorem, there exists ρ_x ∈ H, which is the representer of the evaluation functional [x]η = η(x)
  The symmetric bivariate function ρ(x, y) = ρ_x(y) = 〈ρ_x, ρ_y〉 has the reproducing property 〈ρ(x, ·), η(·)〉 = η(x)
  Consequently, ρ is called the reproducing kernel of the space H
Background Theory Reproducing Kernels
Examples of Reproducing Kernel Hilbert Spaces
Consider the Euclidean space R^K, which is a RKHS
  The inner product is defined as 〈x, y〉 = Σ_{i=1}^K x_i y_i
  The RK is defined as ρ(x, y) = I_{x=y}, the indicator function

Consider the space L₂[0, 1] = {η : ∫₀¹ η² dx < ∞}
  Elements of L₂[0, 1] are defined via equivalence classes (not via individual functions)
  NOT a RKHS because the evaluation functional is not well-defined

Consider the space C^(m)[0, 1] = {η : η^(m) ∈ L₂[0, 1]}
  Elements of C^(m)[0, 1] are defined via individual functions
  The evaluation functional is continuous, so we have a RKHS
Background Theory Reproducing Kernels
Tensor Sum Decompositions of RKHS
Given the tensor sum decomposition H = Hn ⊕Hc, we have
ρ = ρn + ρc
where ρ is the RK of H, ρn is the RK of Hn, and ρc is the RK of Hc.
Furthermore, if ρ is the RK of H and if ρ = ρ_n + ρ_c where
  ρ_n, ρ_c ∈ H are non-negative definite for all x ∈ X
  〈ρ_n(x, ·), ρ_c(y, ·)〉 = 0 for all x, y ∈ X
then the spaces H_n and H_c form a tensor sum decomposition of H.
Background Theory Reproducing Kernels
Reproducing Kernel for Nominal Smoothing Splines
Suppose that xi ∈ X = {1, . . . ,K} and η ∈ H = RK .
For any elements η, φ ∈ H, we have that
  〈η, φ〉 = η′φ = Σ_{x=1}^K η(x)φ(x)
  ρ(x, y) = I_{x=y} where I_{·} is the indicator function

Using the averaging operator Aη = Σ_{x=1}^K η(x)/K
  〈η, φ〉 = 〈η, φ〉_n + 〈η, φ〉_c = η′[ (1/K)1_K 1_K′ + (I_K − (1/K)1_K 1_K′) ]φ
  ρ(x, y) = ρ_n(x, y) + ρ_c(x, y) = 1/K + (I_{x=y} − 1/K)
  H_n = {η : η(1) = · · · = η(K)} and H_c = {η : Σ_{x=1}^K η(x) = 0}
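These kernels are easy to verify numerically. A small sketch (K = 4 chosen arbitrarily):

```r
## Numerical check of the nominal reproducing kernels above.
K <- 4
rho   <- diag(K)                  # rho(x, y) = I{x = y}
rho_n <- matrix(1/K, K, K)        # rho_n(x, y) = 1/K
rho_c <- rho - rho_n              # rho_c(x, y) = I{x = y} - 1/K

eta <- c(2, -1, 0, -1)            # sums to zero, so eta lies in H_c
rho_c %*% eta                     # reproduces eta: <rho_c(x, .), eta> = eta(x)
rowSums(rho_c)                    # each row of rho_c sums to zero
```

For any η with Σ_x η(x) = 0, applying ρ_c returns η itself, which is exactly the reproducing property on H_c.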
Background Theory Reproducing Kernels
Reproducing Kernel for Polynomial Smoothing Splines
Suppose that x_i ∈ X = [0, 1] and η ∈ H = C^(m)[0, 1].

Using the averaging operator Aη = ∫₀¹ η dx:
  〈η, φ〉 = 〈η, φ〉_n + 〈η, φ〉_c = Σ_{ν=0}^{m−1} (∫₀¹ η^(ν) dx)(∫₀¹ φ^(ν) dx) + ∫₀¹ η^(m) φ^(m) dx
  ρ(x, y) = ρ_n(x, y) + ρ_c(x, y) = Σ_{ν=0}^{m−1} k_ν(x) k_ν(y) + (−1)^{m−1} k_{2m}(|x − y|), where k_ν(x) is a scaled Bernoulli polynomial
  H_n = {η : η^(m) = 0} and H_c = {η : ∫₀¹ η^(ν) dx = 0, ν = 0, . . . , m − 1, η^(m) ∈ L₂[0, 1]}

Using the averaging operator Aη = η(0):
  〈η, φ〉 = 〈η, φ〉_n + 〈η, φ〉_c = Σ_{ν=0}^{m−1} η^(ν)(0) φ^(ν)(0) + ∫₀¹ η^(m) φ^(m) dx
  ρ(x, y) = ρ_n(x, y) + ρ_c(x, y) = Σ_{ν=0}^{m−1} (x^ν/ν!)(y^ν/ν!) + ∫₀¹ [(x − u)₊^{m−1}/(m − 1)!] [(y − u)₊^{m−1}/(m − 1)!] du
  H_n = {η : η^(m) = 0} and H_c = {η : η^(ν)(0) = 0, ν = 0, . . . , m − 1, η^(m) ∈ L₂[0, 1]}
Background Theory Reproducing Kernels
Tensor Product RKHS
Suppose that x_i ∈ X where X = X₁ × · · · × X_p is a product domain.
  Suppose H_{X_j} is a RKHS of functions with RK ρ_{X_j} for each x_j ∈ X_j
  Note that the marginal RKs have the form ρ_{X_j} = ρ_{n_j} + ρ_{c_j}

We can define ρ_X = Π_{j=1}^p ρ_{X_j} = Π_{j=1}^p (ρ_{n_j} + ρ_{c_j})
  ρ_X is non-negative definite for all x ∈ X
  ρ_X is the RK of the tensor product RKHS H = H_{X₁} ⊗ · · · ⊗ H_{X_p}

We can form functional spaces for any number of covariates
  Can constrain and/or remove subspaces to fit different models
Background Theory Reproducing Kernels
Need for Additional Smoothing Parameters
Given a tensor product RKHS H = H_{X₁} ⊗ · · · ⊗ H_{X_p} we have that . . .
  H = ⊗_{j=1}^p (H_{n_j} ⊕ H_{c_j}) = ⊕_{k=1}^s H_k is a tensor sum decomposition
  Each subspace H_k has inner product 〈·, ·〉_k and RK ρ_k
  Inner products and RKs in different subspaces carry different metrics

Can introduce additional smoothing parameters into the inner product:

    〈·, ·〉 = Σ_{k=1}^s θ_k⁻¹ 〈·, ·〉_k

which corresponds to the tensor product RK

    ρ = Σ_{k=1}^s θ_k ρ_k
Estimation and Inference
Estimation and Inference
Estimation and Inference Penalized Least Squares
Tensor Product Smoothing Spline
Given x_i ∈ X = X₁ × · · · × X_p, a tensor product smoothing spline is the η_λ ∈ H = H_{X₁} ⊗ · · · ⊗ H_{X_p} that minimizes

    (1/n) Σ_{i=1}^n (y_i − η(x_i))² + λ J(η)

where
  λ ≥ 0 is the overall (global) smoothing parameter
  J is a quadratic functional quantifying the roughness of η
  additional smoothing parameters θ = (θ₁, . . . , θ_s) are embedded in J
Estimation and Inference Penalized Least Squares
Representation of η
Let H = H_n ⊕ H_c denote the tensor sum decomposition of the tensor product RKHS H = ⊗_{j=1}^p H_{X_j}
  Note that H has RK ρ = ρ_n + ρ_c where ρ_c = Σ_{k=1}^s θ_k ρ_k

Given fixed smoothing parameters θ, the η ∈ H that minimizes the penalized least-squares functional can be written as

    η(x) = Σ_{v=1}^m d_v φ_v(x) + Σ_{i=1}^n c_i ρ_c(x_i, x)    (1)

where {φ_v}_{v=1}^m is a set of known functions spanning H_n, ρ_c is the reproducing kernel (RK) of H_c, and d ≡ {d_v}_{m×1} and c ≡ {c_i}_{n×1} are the (unknown) basis function coefficient vectors
Estimation and Inference Penalized Least Squares
Penalty of η
Given the tensor sum decomposition H = H_n ⊕ H_c for some tensor product RKHS H = ⊗_{j=1}^p H_{X_j}, we define the penalty functional

    J(η) = 〈η, η〉_c

which is a semi-inner-product with null space H_n.

Using the representation of η(x) on the previous slide, we have

    〈η, η〉_c = 〈Σ_{v=1}^m d_v φ_v(x) + Σ_{i=1}^n c_i ρ_c(x_i, x), Σ_{v=1}^m d_v φ_v(x) + Σ_{i=1}^n c_i ρ_c(x_i, x)〉_c
             = 〈Σ_{i=1}^n c_i ρ_c(x_i, x), Σ_{i=1}^n c_i ρ_c(x_i, x)〉_c
             = Σ_{i=1}^n Σ_{j=1}^n c_i c_j 〈ρ_c(x_i, x), ρ_c(x_j, x)〉_c
             = Σ_{i=1}^n Σ_{j=1}^n c_i c_j ρ_c(x_i, x_j)
Estimation and Inference Penalized Least Squares
Penalized Least Squares Problem
Using {x*_u}_{u=1}^q ⊂ {x_i}_{i=1}^n as knots, the penalized least-squares functional can be approximated as

    ‖y − Kd − J_θc‖² + λn c′Q_θc

where
  y = (y_1, . . . , y_n)′ is the response vector
  K = {φ_v(x_i)}_{n×m} is the null space basis function matrix
  J_θ = {ρ_c(x_i, x*_u)}_{n×q} is the contrast space basis function matrix
    Note: J_θ = Σ_{k=1}^s θ_k J_k where J_k = {ρ_k(x_i, x*_u)}_{n×q}
  Q_θ = {ρ_c(x*_t, x*_u)}_{q×q} is the penalty matrix
    Note: Q_θ = Σ_{k=1}^s θ_k Q_k where Q_k = {ρ_k(x*_t, x*_u)}_{q×q}
  d = (d_1, . . . , d_m)′ and c = (c_1, . . . , c_q)′ are the unknown coefficient vectors
Estimation and Inference Penalized Least Squares
Coefficients and Smoothing Matrix
The coefficients minimizing the penalized least-squares function are

    ( d̂ )     ( K′K      K′J_θ           )†  ( K′   )
    (    )  =  (                          )   (      )  y
    ( ĉ )     ( J_θ′K    J_θ′J_θ + λnQ_θ )   ( J_θ′ )

where (·)† denotes the Moore–Penrose pseudoinverse.

The fitted values are given by ŷ = Kd̂ + J_θĉ = S_λy where

    S_λ = ( K   J_θ ) ( K′K      K′J_θ           )†  ( K′   )
                      ( J_θ′K    J_θ′J_θ + λnQ_θ )   ( J_θ′ )

is the smoothing matrix, which depends on λ = (λ/θ₁, . . . , λ/θ_s).
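A generic sketch of this solve with assumed toy bases (MASS::ginv supplies the pseudoinverse; the helper fit_pls is hypothetical, and the demo uses the m = 1 contrast kernel ρ_c(x, y) = min(x, y) from the earlier RK slide):

```r
## Sketch of the penalized least-squares solve above.  The basis and
## penalty matrices are assumed given; fit_pls is a hypothetical helper.
library(MASS)  # for ginv(), the Moore-Penrose pseudoinverse

fit_pls <- function(y, K, J, Q, lambda){
  n <- length(y)
  B <- cbind(K, J)                          # combined basis (K, J_theta)
  P <- matrix(0, ncol(B), ncol(B))          # block-diagonal penalty:
  P[ncol(K) + 1:ncol(Q), ncol(K) + 1:ncol(Q)] <- Q   # only c is penalized
  M <- ginv(crossprod(B) + lambda * n * P)  # pseudoinverse of the block system
  coefs <- M %*% crossprod(B, y)
  S <- B %*% M %*% t(B)                     # smoothing matrix S_lambda
  list(d = coefs[1:ncol(K)], c = coefs[-(1:ncol(K))],
       S = S, fitted = as.numeric(S %*% y))
}

## toy demo: m = 1, so rho_c(x, y) = min(x, y) and the null space is {1}
set.seed(1)
x <- seq(0, 1, length = 40)
y <- sin(2*pi*x) + rnorm(40, sd = 0.2)
knots <- seq(0.1, 0.9, length = 8)
K <- matrix(1, length(x), 1)                # null space basis (constants)
J <- outer(x, knots, pmin)                  # {rho_c(x_i, x*_u)}
Q <- outer(knots, knots, pmin)              # {rho_c(x*_t, x*_u)}
fit <- fit_pls(y, K, J, Q, lambda = 1e-4)
```

Note that crossprod(B) + λn·P is exactly the 2×2 block matrix on this slide, with the Q_θ block in the lower-right corner.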
Estimation and Inference Smoothing Parameter Selection
Smoothing Parameter Goldilocks Phenomenon
The selection of λ is the crucial step when fitting an SSANOVA model!

If λ_k ≡ λ/θ_k is too large, the penalty corresponding to H_k will be too severe, making it difficult to estimate η_k.
  Oversmooths the k-th contrast space

If λ_k ≡ λ/θ_k is too small, the penalty corresponding to H_k will be too lenient, making it difficult to estimate η_k (assuming noisy data).
  Undersmooths the k-th contrast space
Estimation and Inference Smoothing Parameter Selection
Cross-Validation
If σ² is unknown, a reasonable loss function for selecting λ is the cross-validated loss function

    CV(λ|y, X, w) = (1/n) Σ_{i=1}^n w_i (y_i − η_λ^[i](x_i))²

where w_i > 0 is some weight, and η_λ^[i] is the function φ ∈ H that minimizes the delete-the-i-th-observation functional:

    (1/n) Σ_{j≠i} (y_j − φ(x_j))² + λ J(φ)
Estimation and Inference Smoothing Parameter Selection
Cross-Validation (continued)
The form of the CV loss function might suggest that it is necessary to fit n different models (to obtain η_λ^[i] for i ∈ {1, . . . , n}).

However, the CV function can be rewritten as

    CV(λ|y, X, w) = (1/n) Σ_{i=1}^n w_i (y_i − η_λ(x_i))² / (1 − s_ii(λ))²

where s_ii(λ) is the i-th diagonal element of S_λ, which implies that the CV function can be minimized using the results of the full model.
Estimation and Inference Smoothing Parameter Selection
Generalized Cross-Validation
Defining w_i ≡ (1 − s_ii(λ))² / [n⁻¹ tr(I_n − S_λ)]² replaces each s_ii(λ) with its average value, producing the generalized cross-validation (GCV) criterion of Craven and Wahba (1979):

    GCV(λ|y, X) = (1/n) Σ_{i=1}^n (y_i − η_λ(x_i))² / [n⁻¹ tr(I_n − S_λ)]²
                = (1/n) ‖(I_n − S_λ)y‖² / [1 − tr(S_λ)/n]²    (2)
The λ that minimizes the GCV score produces good estimates of η.
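Equation (2) can be computed for any linear smoother with a known smoothing matrix. A sketch (the simple-linear-regression hat matrix stands in for S_λ here, purely for illustration):

```r
## Sketch of the GCV criterion in equation (2): works for any linear
## smoother y-hat = S y with smoothing matrix S.
gcv_score <- function(y, S){
  n <- length(y)
  resid <- y - S %*% y                      # (I_n - S) y
  (sum(resid^2) / n) / (1 - sum(diag(S)) / n)^2
}

set.seed(1)
x <- seq(0, 1, length = 30)
y <- sin(2*pi*x) + rnorm(30, sd = 0.2)
X <- cbind(1, x)
S <- X %*% solve(crossprod(X)) %*% t(X)     # a linear smoother with tr(S) = 2
gcv_score(y, S)
```

In practice one would evaluate gcv_score over a grid of λ values (each giving a different S_λ) and keep the minimizer.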
Estimation and Inference Bayesian Confidence Intervals
Gaussian Process Definition
A Gaussian process is a stochastic process {η(x) : x ∈ X} such that η(x) ∼ N(μ_x, σ²_x) for all x ∈ X, where
  μ_x = E(η(x)) is the mean function
  γ_{x,x′} = Cov(η(x), η(x′)) is the covariance function
  σ²_x = Cov(η(x), η(x)) is the variance function

Note that η(x) is a random variable that is normally distributed for all x ∈ X
  Use the notation η(x) ∼ N(μ_x, σ²_x) for all x ∈ X
  Mean and variance differ for each x ∈ X
Estimation and Inference Bayesian Confidence Intervals
Bayesian Interpretation of Smoothing Spline
Let η = η_n + η_c denote the null and contrast space functions, and assume the following prior distributions:
  η_n has a diffuse (vague) prior with mean zero
  η_c is a zero-mean Gaussian process with covariance function proportional to ρ_c

Using these prior assumptions . . .
  η̂ can be interpreted as the posterior mean of η given the data y
  we can derive the posterior variance Var(η|y)
Estimation and Inference Bayesian Confidence Intervals
Bayesian Confidence Intervals
Using the Bayesian interpretation, we can form confidence intervals

    η̂(x) ± Z_{α/2} √Var(η(x)|y)

where Z_{α/2} is the critical value from the standard normal distribution.

Bayesian CIs have approximate "across-the-function coverage" when the smoothing parameters are selected according to GCV.
  On average they contain 100(1 − α)% of the true function's realizations
SSANOVA in Practice
SSANOVA in Practice
SSANOVA in Practice One-Way SSANOVA
Unidimensional Smoothing Splines in R
Many options for unidimensional smoothing splines in R:
  smooth.spline function (in the stats package)
  bigspline function (in the bigsplines package)
  bigssa function (in the bigsplines package)
  ssanova function (in the gss package)
  gam function (in the mgcv package)

For unidimensional smoothing, we will focus on the smooth.spline and bigspline functions, which have simple syntax.
SSANOVA in Practice One-Way SSANOVA
smooth.spline: Overview
> set.seed(1)
> x = seq(0,1,length=100)
> eta = 2 + x + sin(2*pi*x)
> y = eta + rnorm(100)
> plot(x,y)
> smsp = smooth.spline(x,y)
> lines(x,smsp$y)
> lines(x,eta,lty=2)
[Figure: scatterplot of (x, y) with the smooth.spline fit (solid) and the true function η (dashed).]
SSANOVA in Practice One-Way SSANOVA
smooth.spline: Changing Smoothing Parameter
[Figure: three panels showing smooth.spline fits with spar=0.25, spar=0.75, and spar=1, each with the data, fit (solid), and true function (dashed).]
R code for leftmost plot:
> smsp = smooth.spline(x,y,spar=0.25)
> plot(x,y,main="spar=0.25")
> lines(x,smsp$y)
> lines(x,eta,lty=2)
SSANOVA in Practice One-Way SSANOVA
smooth.spline: Changing Number of Knots
[Figure: three panels showing smooth.spline fits with spar=0.5 and nknots=10, 20, and 30.]
R code for leftmost plot:
> smsp = smooth.spline(x,y,spar=0.5,nknots=10)
> plot(x,y,main="spar=0.5, nknots=10")
> lines(x,smsp$y)
> lines(x,eta,lty=2)
smooth.spline: CV versus GCV
[Figure: two scatter plots of y versus x with smooth.spline fits for nknots=20 using cv=TRUE (ordinary CV) and cv=FALSE (GCV); true η dashed]
R code for leftmost plot:
> smsp = smooth.spline(x,y,nknots=20,cv=TRUE)
> plot(x,y,main="nknots=20, cv=TRUE")
> lines(x,smsp$y)
> lines(x,eta,lty=2)
smooth.spline: Number of Knots (revisited)
[Figure: three scatter plots of y versus x with GCV-selected smooth.spline fits (cv=FALSE) for nknots=10, 20, 30; true η dashed]
R code for leftmost plot:
> smsp = smooth.spline(x,y,nknots=10)
> plot(x,y,main="cv=FALSE, nknots=10")
> lines(x,smsp$y)
> lines(x,eta,lty=2)
smooth.spline: Predicting for New Data
Given the estimated η̂, we can predict at a new sequence of data points:
> set.seed(1)
> x = seq(0,1,length=100)
> eta = 2 + x + sin(2*pi*x)
> y = eta + rnorm(100)
> plot(x,y,main="Prediction")
> smsp = smooth.spline(x,y)
> newdata = seq(0,1,length=200)
> yhat = predict(smsp,newdata)
> lines(yhat)
> lines(x,eta,lty=2)
[Figure: scatter plot titled "Prediction" with smooth.spline predictions at 200 new points (solid) and the true η (dashed)]
bigspline: Overview
For smoothing large samples...

> set.seed(1)
> x = seq(0,1,length=100)
> eta = 2 + x + sin(2*pi*x)
> y = eta + rnorm(100)
> plot(x,y)
> bigsp = bigspline(x,y)
> lines(x,bigsp$fitted)
> lines(x,eta,lty=2)
[Figure: scatter plot of y versus x with the bigspline fit (solid) and the true η (dashed)]
bigspline: Changing Smoothing Parameter
[Figure: three scatter plots of y versus x with bigspline fits for lambdas=10^-9, lambdas=10^-5, and lambdas=1; true η dashed]
R code for leftmost plot:
> bigsp = bigspline(x,y,lambdas=10^-9)
> plot(x,y,main="lambdas=10^-9")
> lines(x,bigsp$fitted)
> lines(x,eta,lty=2)
bigspline: Changing Number of Knots
[Figure: three scatter plots of y versus x with bigspline fits for lambdas=10^-5 and nknots=10, 20, 30; true η dashed]
R code for leftmost plot:
> bigsp = bigspline(x,y,lambdas=10^-5,nknots=10)
> plot(x,y,main="lambdas=10^-5, nknots=10")
> lines(x,bigsp$fitted)
> lines(x,eta,lty=2)
bigspline: Number of Knots (revisited)
[Figure: three scatter plots of y versus x with GCV-tuned bigspline fits for nknots=10, 20, 30; true η dashed]
R code for leftmost plot:
> bigsp = bigspline(x,y,nknots=10)
> plot(x,y,main="GCV, nknots=10")
> lines(x,bigsp$fitted)
> lines(x,eta,lty=2)
bigspline: Predicting for New Data
Given the estimated η̂, we can predict at a new sequence of data points:
> set.seed(1)
> x = seq(0,1,length=100)
> eta = 2 + x + sin(2*pi*x)
> y = eta + rnorm(100)
> plot(x,y,main="Prediction")
> bigsp = bigspline(x,y)
> newdata = seq(0,1,length=200)
> yhat = predict(bigsp,newdata)
> lines(newdata,yhat)
> lines(x,eta,lty=2)
[Figure: scatter plot titled "Prediction" with bigspline predictions at 200 new points (solid) and the true η (dashed)]
bigspline: Predicting Linear and Non-Linear Effects
[Figure: three panels — Full Prediction (y versus x with the fitted curve), Linear Effect (2 + x), and Non-Linear Effect (sin(2πx)); predicted effects solid, true effects dashed]
R code for center and rightmost plot:
> newdata = seq(0,1,length=200)
> plot(x,2+x,main="Linear Effect",type="l",lty=2)
> yhat = predict(bigsp,newdata,effect="0") + predict(bigsp,newdata,effect="lin")
> lines(newdata,yhat)
> plot(x,sin(2*pi*x),main="Non-Linear Effect",type="l",lty=2)
> yhat = predict(bigsp,newdata,effect="non")
> lines(newdata,yhat)
bigspline: Bayesian Confidence Intervals
> dev.new(width=6,height=6,noRStudioGD=TRUE)
> set.seed(1)
> x = seq(0,1,length=100)
> eta = 2 + x + sin(2*pi*x)
> y = eta + rnorm(100)
> bigsp = bigspline(x,y,se.fit=TRUE)
> cilo = bigsp$fit - qnorm(0.975)*bigsp$se
> cihi = bigsp$fit + qnorm(0.975)*bigsp$se
> plot(x,y)
> lines(x,eta)
> lines(bigsp$xunique,cilo,lty=2)
> lines(bigsp$xunique,cihi,lty=2)
> sum(eta>=cilo & eta<=cihi)/length(x)
[1] 1
[Figure: scatter plot of y versus x with the true η (solid) and 95% Bayesian confidence interval bounds (dashed)]
Comparing smooth.spline and bigspline
Consider the function yᵢ = 2 + xᵢ + sin(2πxᵢ) + eᵢ where the eᵢ are iid N(0, σ²).

Suppose that xᵢ = i/n for i ∈ {0, ..., n} and σ² = 1, so that eᵢ ~ iid N(0, 1).

Median true MSE = (1/n) Σᵢ (η̂(xᵢ) − η(xᵢ))² using q = 20 knots:

                    n = 100   1000      10000    1e+05   1e+06
    smooth.spline   0.13836   0.00504   0.00113  1e-04   2e-05
    bigspline       0.14030   0.00497   0.00110  1e-04   2e-05

Median runtimes (seconds) using q = 20 knots:

                    n = 100   1000      10000    1e+05   1e+06
    smooth.spline   0.001     0.002     0.021    0.1965  2.233
    bigspline       0.009     0.009     0.011    0.0120  0.094
R Code for Simulation (on previous slide)
nsamp = 10^c(2:6)
simresults = NULL
xnew = seq(0,1,length=200)
set.seed(1)
for(j in 1:5){
  for(k in 1:10){
    x = seq(0,1,length=nsamp[j])
    eta = 2 + x + sin(2*pi*x)
    y = eta + rnorm(nsamp[j])
    tic = proc.time()
    ssmod = smooth.spline(x,y,nknots=20)
    toc = proc.time() - tic
    tmse = sum( (ssmod$y - eta)^2 ) / nsamp[j]
    simsp = data.frame(method="smsp",n=nsamp[j],time=toc[3],tmse=tmse,row.names=k)
    tic = proc.time()
    ssmod = bigspline(x,y,nknots=20)
    toc = proc.time() - tic
    tmse = sum( (predict(ssmod) - eta)^2 ) / nsamp[j]
    simbig = data.frame(method="big",n=nsamp[j],time=toc[3],tmse=tmse,row.names=k+1)
    simresults = rbind(simresults,simsp,simbig)
  }
}
round(tapply(simresults$tmse,list(simresults$method,simresults$n),median),5)
round(tapply(simresults$time,list(simresults$method,simresults$n),median),5)
bigspline: Linear and Non-Linear Effects (revisited)
[Figure: eight panels showing the predicted linear effect (2 + xnew) and non-linear effect (sin(2π·xnew)) for n = 100, 1000, 10000, and 1e+05]
SSANOVA in Practice Two-Way SSANOVA (Additive)
Multidimensional Smoothing Splines in R
A few options for multidimensional smoothing splines in R:
  bigssa function (in bigsplines package)
  bigssp function (in bigsplines package)
  ssanova function (in gss package)
  gam function (in mgcv package)

We will focus on the ssanova and bigssa (or bigssp) functions, which fit tensor product smoothing splines.

Note that the gam function handles interactions in a different manner.
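To illustrate that last point, here is a minimal hedged sketch (assuming the mgcv package is installed; the data-generating step mirrors the additive example on the following slides, but the model calls are illustrative, not the approach used in these notes): gam typically encodes a smooth-by-factor interaction with a separate smooth per factor level via the 'by' argument, rather than a tensor product decomposition into main effects plus interaction.

```r
# Sketch (assumes mgcv is installed); names mirror the upcoming examples.
library(mgcv)
set.seed(55455)
n = 100
x1v = seq(0, 1, length = n)
x2v = factor(sample(letters[1:2], n, replace = TRUE))
y = sin(2*pi*x1v) + 2*(x2v == "a") + rnorm(n)

# Additive fit: one common smooth of x1v plus a factor shift
gamadd = gam(y ~ s(x1v) + x2v)

# "Interaction" fit: a separate smooth of x1v for each level of x2v
# (the 'by' mechanism), not an explicit tensor-product SSANOVA term
gamint = gam(y ~ s(x1v, by = x2v) + x2v)
```

The two fits can then be compared with, e.g., AIC(gamadd, gamint).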
Additive Function: Definition
Suppose we have the following function defined for x = (x1, x2) ∈ [0,1] × {a,b}:

addfun = function(x1,x2){
  funval = sin(2*pi*x1)
  idx = which(x2=="a")
  funval[idx] = funval[idx] + 2
  funval
}

Note that the function is
  η(x1, x2) = 2 + sin(2πx1) if x2 = a
  η(x1, x2) = sin(2πx1) if x2 ≠ a
Additive Function: Visualization
[Figure: plot of η(x1, x2) versus x1 for x2 = a (solid) and x2 = b (dashed)]
Additive Function: bigssa fitting
> n = 100
> set.seed(55455)
> x1v = seq(0,1,length=n)
> x2v = factor(sample(letters[1:2],n,replace=TRUE))
> eta = addfun(x1v,x2v)
> y = eta + rnorm(n)
> idx = binsamp(cbind(x1v,x2v),nmbin=c(20,2))
> ssint = bigssa(y~x1v*x2v, type=list(x1v="cub",x2v="nom"), nknots=idx)
> sum((ssint$fitted-eta)^2) / length(eta)
[1] 0.04605668
> ssadd = bigssa(y~x1v+x2v, type=list(x1v="cub",x2v="nom"), nknots=idx)
> sum((ssadd$fitted-eta)^2) / length(eta)
[1] 0.03305623
> fitstats = rbind(ssint$info,ssadd$info)
> rownames(fitstats) = c("int","add")
> fitstats
         gcv       rsq      aic      bic
int 1.441561 0.6258559 319.3529 344.6341
add 1.386159 0.6134636 316.0127 332.6986
Additive Function: bigssa prediction

> dev.new(width=12,height=6,noRStudioGD=TRUE)
> par(mfrow=c(1,2))
> newdata = expand.grid(x1v=seq(0,1,length=100),x2v=c("a","b"))
> yint = predict(ssint,newdata)
> yadd = predict(ssadd,newdata)
> plot(newdata[1:100,1],yint[1:100],main="Interaction",
+   type="l",ylim=c(-2,4))
> lines(newdata[101:200,1],yint[101:200],lty=2)
> plot(newdata[1:100,1],yadd[1:100],main="Additive",
+   type="l",ylim=c(-2,4))
> lines(newdata[101:200,1],yadd[101:200],lty=2)
[Figure: predicted curves from the interaction fit (left) and additive fit (right); x2 = a solid, x2 = b dashed]
Additive Function: ssanova fitting
> n = 100
> set.seed(55455)
> x1v = seq(0,1,length=n)
> x2v = factor(sample(letters[1:2],n,replace=TRUE))
> eta = addfun(x1v,x2v)
> y = eta + rnorm(n)
> idx = binsamp(cbind(x1v,x2v),nmbin=c(20,2))
> ssint = ssanova(y~x1v*x2v,type=list(x1v="cubic",x2v="nominal"),
+   id.basis=idx)
> newdata = data.frame(x1v=x1v,x2v=x2v)
> sum((predict(ssint,newdata)-eta)^2) / length(eta)
[1] 0.01449173
> ssadd = ssanova(y~x1v+x2v,type=list(x1v="cubic",x2v="nominal"),
+   id.basis=idx)
> sum((predict(ssadd,newdata)-eta)^2) / length(eta)
[1] 0.01432404
Additive Function: ssanova prediction

> dev.new(width=12,height=6,noRStudioGD=TRUE)
> par(mfrow=c(1,2))
> newdata = expand.grid(x1v=seq(0,1,length=100),x2v=c("a","b"))
> yint = predict(ssint,newdata)
> yadd = predict(ssadd,newdata)
> plot(newdata[1:100,1],yint[1:100],main="Interaction",
+   type="l",ylim=c(-2,4))
> lines(newdata[101:200,1],yint[101:200],lty=2)
> plot(newdata[1:100,1],yadd[1:100],main="Additive",
+   type="l",ylim=c(-2,4))
> lines(newdata[101:200,1],yadd[101:200],lty=2)
[Figure: predicted curves from the interaction fit (left) and additive fit (right); x2 = a solid, x2 = b dashed]
SSANOVA in Practice Two-Way SSANOVA (Interaction)
Interaction Function: Definition
Suppose we have the following function defined for x = (x1, x2) ∈ [0,1] × {a,b}:

intfun = function(x1,x2){
  funval = sin(2*pi*x1)
  idx = which(x2=="a")
  funval[idx] = funval[idx] + 2 + sin(4*pi*x1[idx])
  funval
}

Note that the function is
  η(x1, x2) = 2 + sin(2πx1) + sin(4πx1) if x2 = a
  η(x1, x2) = sin(2πx1) if x2 ≠ a
Interaction Function: Visualization
[Figure: plot of η(x1, x2) versus x1 for x2 = a (solid) and x2 = b (dashed)]
Interaction Function: bigssa fitting
> n = 100
> set.seed(55455)
> x1v = seq(0,1,length=n)
> x2v = factor(sample(letters[1:2],n,replace=TRUE))
> eta = intfun(x1v,x2v)
> y = eta + rnorm(n)
> idx = binsamp(cbind(x1v,x2v),nmbin=c(20,2))
> ssint = bigssa(y~x1v*x2v,type=list(x1v="cub",x2v="nom"),nknots=idx)
> sum((ssint$fitted-eta)^2) / length(eta)
[1] 0.1081747
> ssadd = bigssa(y~x1v+x2v,type=list(x1v="cub",x2v="nom"),nknots=idx)
> sum((ssadd$fitted-eta)^2) / length(eta)
[1] 0.1858098
> fitstats = rbind(ssint$info,ssadd$info)
> rownames(fitstats) = c("int","add")
> fitstats
         gcv       rsq      aic      bic
int 1.522061 0.6509204 324.0861 356.6680
add 1.510616 0.6097741 324.5034 343.1147
Interaction Function: bigssa prediction

> dev.new(width=12,height=6,noRStudioGD=TRUE)
> par(mfrow=c(1,2))
> newdata = expand.grid(x1v=seq(0,1,length=100),x2v=c("a","b"))
> yint = predict(ssint,newdata)
> yadd = predict(ssadd,newdata)
> plot(newdata[1:100,1],yint[1:100],main="Interaction",
+   type="l",ylim=c(-2,4))
> lines(newdata[101:200,1],yint[101:200],lty=2)
> plot(newdata[1:100,1],yadd[1:100],main="Additive",
+   type="l",ylim=c(-2,4))
> lines(newdata[101:200,1],yadd[101:200],lty=2)
[Figure: predicted curves from the interaction fit (left) and additive fit (right); x2 = a solid, x2 = b dashed]
Interaction Function: ssanova fitting
> n = 100
> set.seed(55455)
> x1v = seq(0,1,length=n)
> x2v = factor(sample(letters[1:2],n,replace=TRUE))
> eta = intfun(x1v,x2v)
> y = eta + rnorm(n)
> idx = binsamp(cbind(x1v,x2v),nmbin=c(20,2))
> ssint = ssanova(y~x1v*x2v,type=list(x1v="cubic",x2v="nominal"),
+   id.basis=idx)
> newdata = data.frame(x1v=x1v,x2v=x2v)
> sum((predict(ssint,newdata)-eta)^2) / length(eta)
[1] 0.1624814
> ssadd = ssanova(y~x1v+x2v,type=list(x1v="cubic",x2v="nominal"),
+   id.basis=idx)
> sum((predict(ssadd,newdata)-eta)^2) / length(eta)
[1] 0.1802812
Interaction Function: ssanova prediction

> dev.new(width=12,height=6,noRStudioGD=TRUE)
> par(mfrow=c(1,2))
> newdata = expand.grid(x1v=seq(0,1,length=100),x2v=c("a","b"))
> yint = predict(ssint,newdata)
> yadd = predict(ssadd,newdata)
> plot(newdata[1:100,1],yint[1:100],main="Interaction",
+   type="l",ylim=c(-2,4))
> lines(newdata[101:200,1],yint[101:200],lty=2)
> plot(newdata[1:100,1],yadd[1:100],main="Additive",
+   type="l",ylim=c(-2,4))
> lines(newdata[101:200,1],yadd[101:200],lty=2)
[Figure: predicted curves from the interaction fit (left) and additive fit (right); x2 = a solid, x2 = b dashed]
Interaction Function: Fitting with More Data
> n = 1000
> set.seed(55455)
> x1v = seq(0,1,length=n)
> x2v = factor(sample(letters[1:2],n,replace=TRUE))
> eta = intfun(x1v,x2v)
> y = eta + rnorm(n)
> idx = binsamp(cbind(x1v,x2v),nmbin=c(20,2))
> ssint = bigssa(y~x1v*x2v,type=list(x1v="cub",x2v="nom"),nknots=idx)
> sum((ssint$fitted-eta)^2) / length(eta)
[1] 0.03178251
> ssadd = bigssa(y~x1v+x2v,type=list(x1v="cub",x2v="nom"),nknots=idx)
> sum((ssadd$fitted-eta)^2) / length(eta)
[1] 0.1311356
> fitstats = rbind(ssint$info,ssadd$info)
> rownames(fitstats) = c("int","add")
> fitstats
         gcv       rsq      aic      bic
int 1.016522 0.6479397 2854.060 2923.757
add 1.081167 0.6236793 2915.779 2973.402
Interaction Function: Predicting with More Data

> dev.new(width=12,height=6,noRStudioGD=TRUE)
> par(mfrow=c(1,2))
> newdata = expand.grid(x1v=seq(0,1,length=100),x2v=c("a","b"))
> yint = predict(ssint,newdata)
> yadd = predict(ssadd,newdata)
> plot(newdata[1:100,1],yint[1:100],main="Interaction",
+   type="l",ylim=c(-2,4))
> lines(newdata[101:200,1],yint[101:200],lty=2)
> plot(newdata[1:100,1],yadd[1:100],main="Additive",
+   type="l",ylim=c(-2,4))
> lines(newdata[101:200,1],yadd[101:200],lty=2)
[Figure: predicted curves from the interaction fit (left) and additive fit (right); x2 = a solid, x2 = b dashed]