introduction to kernel smoothing
TRANSCRIPT
-
8/3/2019 Introduction to Kernel Smoothing
1/24
Introduction to Kernel Smoothing
Wilcoxon score
Density
700 800 900 1000 1100 1200 13000
.000
0.0
02
0.0
04
0.0
06
-
8/3/2019 Introduction to Kernel Smoothing
2/24
-
8/3/2019 Introduction to Kernel Smoothing
3/24
Introduction
Histogram of some pvalues
pvalues
Density
0.0 0.2 0.4 0.6 0.8 1.0
0
2
4
6
8
10
12
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 2
-
8/3/2019 Introduction to Kernel Smoothing
4/24
Introduction
Estimation of functions such as regression functions or probability
density functions.
Kernel-based methods are most popular non-parametric estimators.
Can uncover structural features in the data which a parametric
approach might not reveal.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 3
-
8/3/2019 Introduction to Kernel Smoothing
5/24
Univariate kernel density estimator
Given a random sample X1, . . . , X n with a continuous, univariate
density f. The kernel density estimator is
f(x, h) =1
nh
ni=1
K
xXi
h
with kernel K and bandwidth h. Under mild conditions (hmust decrease with increasing n) the kernel estimate converges in
probability to the true density.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 4
-
8/3/2019 Introduction to Kernel Smoothing
6/24
The kernel K
Can be a proper pdf. Usually chosen to be unimodal and symmetric
about zero.
Center of kernel is placed right over each data point.
Influence of each data point is spread about its neighborhood.
Contribution from each point is summed to overall estimate.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 5
-
8/3/2019 Introduction to Kernel Smoothing
7/24
q q q q q q q q q q
0 5 10 15
0.
0
0.
5
1.
0
1.
5
Gaussian kernel density estimate
q q q q q q q q q q
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 6
-
8/3/2019 Introduction to Kernel Smoothing
8/24
The bandwidth h
Scaling factor.
Controls how wide the probability mass is spread around a point.
Controls the smoothness or roughness of a density estimate.
Bandwidth selection bears danger of under- or oversmoothing.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 7
-
8/3/2019 Introduction to Kernel Smoothing
9/24
From over- to undersmoothing
KDE with b=0.1
pvalues
Density
0.0 0.2 0.4 0.6 0.8 1.0
0
2
4
6
8
10
12
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 8
-
8/3/2019 Introduction to Kernel Smoothing
10/24
From over- to undersmoothing
KDE with b=0.05
pvalues
Density
0.0 0.2 0.4 0.6 0.8 1.0
0
2
4
6
8
10
12
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 9
-
8/3/2019 Introduction to Kernel Smoothing
11/24
From over- to undersmoothing
KDE with b=0.02
pvalues
Density
0.0 0.2 0.4 0.6 0.8 1.0
0
2
4
6
8
10
12
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 10
-
8/3/2019 Introduction to Kernel Smoothing
12/24
From over- to undersmoothing
KDE with b=0.005
pvalues
Density
0.0 0.2 0.4 0.6 0.8 1.0
0
2
4
6
8
10
12
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 11
-
8/3/2019 Introduction to Kernel Smoothing
13/24
Some kernels
2 1 0 1 2
0.0
0.
1
0.2
0.3
0.
4
0.5
Uniform
2 1 0 1 2
0.0
0.2
0.
4
0.
6
Epanechnikov
2 1 0 1 2
0.0
0.
2
0.4
0.
6
0.8
Biweight
4 2 0 2 4
0.
0
0
.1
0.
2
0.
3
0.
4
Gauss
2 1 0 1 2
0.0
0.
2
0.4
0.
6
0.
8
1.0
Triangular
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 12
-
8/3/2019 Introduction to Kernel Smoothing
14/24
Some kernels
K(x, p) =(1 x2)p
22p+1B(p + 1, p + 1)1{|x|
-
8/3/2019 Introduction to Kernel Smoothing
15/24
Kernel efficiency
Perfomance of kernel is measured by MISE (mean integrated
squared error) or AMISE (asymptotic MISE).
Epanechnikov kernel minimizes AMISE and is therefore optimal.
Kernel efficiency is measured in comparison to Epanechnikov kernel.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 14
-
8/3/2019 Introduction to Kernel Smoothing
16/24
Kernel Efficiency
Epanechnikov 1.000
Biweight 0.994
Triangular 0.986
Normal 0.951
Uniform 0.930
Choice of kernel is not as important as choice of bandwidth.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 15
-
8/3/2019 Introduction to Kernel Smoothing
17/24
Modified KDEs
Local KDE: Bandwidth depends on x.
Variable KDE: Smooth out the influence of points in sparse regions.
Transformation KDE: If f is difficult to estimate (highly skewed,
high kurtosis), transform data to gain a pdf that is easier to
estimate.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 16
-
8/3/2019 Introduction to Kernel Smoothing
18/24
Bandwidth selection
Simple versus high-tech selection rules.
Objective function: MISE/AMISE.
R-function density offers several selection rules.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 17
-
8/3/2019 Introduction to Kernel Smoothing
19/24
bw.nrd0, bw.nrd
Normal scale rule.
Assumes f to be normal and calculates the AMISE-optimalbandwidth in this setting.
First guess but oversmoothes if f is multimodal or otherwise not
normal.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 18
-
8/3/2019 Introduction to Kernel Smoothing
20/24
bw.ucv
Unbiased (or least squares) cross-validation.
Estimates part of MISE by leave-one-out KDE and minimizes thisestimator with respect to h.
Problems: Several local minima, high variability.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 19
-
8/3/2019 Introduction to Kernel Smoothing
21/24
bw.bcv
Biased cross-validation.
Estimation is based on optimization of AMISE instead of MISE (asbw.ucv does).
Lower variance but reasonable bias.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 20
-
8/3/2019 Introduction to Kernel Smoothing
22/24
bw.SJ(method=c("ste", "dpi"))
The AMISE optimization involves the estimation of density
functionals like integrated squared density derivatives.
dpi: Direct plug-in rule. Estimates the needed functionals by KDE.
Problem: Choice of pilot bandwidth.
ste: Solve-the-equation rule. The pilot bandwidth depends on h.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 21
-
8/3/2019 Introduction to Kernel Smoothing
23/24
Comparison of bandwidth selectors
Simulation results depend on selected true densities.
Selectors with pilot bandwidths perform quite well but rely on
asymptotics less accurate for densities with sharp features
(e.g. multiple modes).
UCV has high variance but does not depend on asymptotics.
BCV performs bad in several simulations.
Authors recommendation: DPI or STE better than UCV or BCV.
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 22
-
8/3/2019 Introduction to Kernel Smoothing
24/24
KDE with Epanechnikov kernel and DPI rule
pvalues
Density
0.0 0.2 0.4 0.6 0.8 1.0
0
2
4
6
8
10
12
Stefanie Scheid - Introduction to Kernel Smoothing - January 5, 2004 23