Regression
Practical Machine Learning
Fabian Wauthier
09/10/2009
Adapted from slides by Kurt Miller and Romain Thibaux
Outline
• Ordinary Least Squares Regression
  - Online version
  - Normal equations
  - Probabilistic interpretation
• Overfitting and Regularization
• Overview of additional topics
  - L1 Regression
  - Quantile Regression
  - Generalized linear models
  - Kernel Regression and Locally Weighted Regression
Regression vs. Classification: X ⇒ Y

X can be anything:
• continuous (ℜ, ℜ^d, …)
• discrete ({0,1}, {1,…,k}, …)
• structured (tree, string, …)
• …

In classification, Y is discrete:
– {0,1}: binary
– {1,…,k}: multi-class
– tree, etc.: structured
Regression vs. Classification: X ⇒ Y

Classification methods include the Perceptron, Logistic Regression, the Support Vector Machine, Decision Trees, Random Forests, and the kernel trick.
Regression vs. Classification: X ⇒ Y

In regression, Y is continuous:
– ℜ, ℜ^d
Examples
• Voltage ⇒ Temperature
• Processes, memory ⇒ Power consumption
• Protein structure ⇒ Energy
• Robot arm controls ⇒ Torque at effector
• Location, industry, past losses ⇒ Premium
Linear regression

[Figure: 1D and 2D scatter plots of y against x]

Given examples $(x_i, y_i)$, predict $y_{n+1}$ given a new point $x_{n+1}$.
Linear regression

We wish to estimate $y$ by a linear function of our data $x$:

$$y_{n+1} = w_0 + w_1 x_{n+1,1} + w_2 x_{n+1,2} = w^\top x_{n+1}$$

where $w$ is a parameter to be estimated and we have used the standard convention of letting the first component of $x$ be 1.
Choosing the regressor

Of the many regression fits that approximate the data, which should we choose?

[Figure: observations $X_i = \begin{pmatrix} 1 \\ x_i \end{pmatrix}$ with several candidate fits]
LMS Algorithm (Least Mean Squares)

In order to clarify what we mean by a good choice of $w$, we define a cost function for how well we are doing on the training data:

$$\text{Cost} = \frac{1}{2} \sum_{i=1}^n (w^\top x_i - y_i)^2, \qquad X_i = \begin{pmatrix} 1 \\ x_i \end{pmatrix}$$

[Figure: the error or "residual" is the gap between the prediction $w^\top x_i$ and the observation $y_i$]
LMS Algorithm (Least Mean Squares)

The best choice of $w$ is the one that minimizes our cost function

$$E = \frac{1}{2} \sum_{i=1}^n (w^\top x_i - y_i)^2 = \sum_{i=1}^n E_i$$

In order to optimize this, we use standard gradient descent,

$$w_{t+1} := w_t - \alpha \frac{\partial}{\partial w} E$$

where

$$\frac{\partial}{\partial w} E = \sum_{i=1}^n \frac{\partial}{\partial w} E_i \qquad \text{and} \qquad \frac{\partial}{\partial w} E_i = \frac{1}{2} \frac{\partial}{\partial w} (w^\top x_i - y_i)^2 = (w^\top x_i - y_i)\, x_i$$
LMS Algorithm (Least Mean Squares)

The LMS algorithm is an online method that performs the following update for each new data point $(x_i, y_i)$:

$$w_{t+1} := w_t - \alpha \frac{\partial E_i}{\partial w} = w_t + \alpha\, (y_i - x_i^\top w_t)\, x_i$$
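The online update above can be sketched in a few lines of NumPy. The data, the step size, and the epoch count below are illustrative assumptions, not values from the lecture:

```python
import numpy as np

def lms_online(X, y, alpha=0.1, epochs=200):
    """Online LMS: for each point, w <- w + alpha * (y_i - x_i^T w) * x_i."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            w += alpha * (y_i - x_i @ w) * x_i
    return w

# Noise-free line y = 1 + 2x: LMS should recover w = (1, 2).
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 50)
X = np.column_stack([np.ones_like(x), x])  # first component is 1, as in the slides
y = 1.0 + 2.0 * x
w = lms_online(X, y)
```

Sweeping over the data repeatedly makes the sketch easy to check; the real algorithm would process each point once as it arrives.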
LMS, Logistic Regression, and Perceptron updates

• LMS: $w_{t+1} := w_t + \alpha\, (y_i - x_i^\top w_t)\, x_i$
• Logistic Regression: $w_{t+1} := w_t + \alpha\, (y_i - f_w(x_i))\, x_i$
• Perceptron: $w_{t+1} := w_t + \alpha\, (y_i - f_w(x_i))\, x_i$

All three updates share the same form; they differ only in the prediction $f_w(x_i)$.
Ordinary Least Squares (OLS)

$$\text{Cost} = \frac{1}{2} \sum_{i=1}^n (w^\top x_i - y_i)^2, \qquad X_i = \begin{pmatrix} 1 \\ x_i \end{pmatrix}$$

[Figure: the error or "residual" between prediction and observation]
Minimize the sum of squared errors

Stacking the $n$ examples into an $n \times d$ matrix $X$ and a vector $y$,

$$E = \frac{1}{2} \sum_{i=1}^n (w^\top x_i - y_i)^2 = \frac{1}{2} (Xw - y)^\top (Xw - y) = \frac{1}{2} \left( w^\top X^\top X w - 2 y^\top X w + y^\top y \right)$$

$$\frac{\partial}{\partial w} E = X^\top X w - X^\top y$$

Setting the derivative equal to zero gives us the Normal Equations:

$$X^\top X w = X^\top y \qquad\Rightarrow\qquad w = (X^\top X)^{-1} X^\top y$$
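In NumPy the Normal Equations can be solved directly; solving the linear system is preferable to forming the inverse explicitly. The synthetic data below are an assumption for illustration:

```python
import numpy as np

# Fit y = w0 + w1*x on synthetic data by solving X^T X w = X^T y.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
X = np.column_stack([np.ones_like(x), x])
y = 3.0 + 0.5 * x + rng.normal(0, 0.1, 100)

w = np.linalg.solve(X.T @ X, X.T @ y)  # Normal Equations
residual = X @ w - y                   # orthogonal to the columns of X
```

The residual check previews the geometric interpretation: at the minimum, $X^\top(Xw - y) = 0$.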
A geometric interpretation

We solved

$$\frac{\partial}{\partial w} E = X^\top (Xw - y) = 0$$

⇒ Residuals are orthogonal to the columns of $X$
⇒ $\hat{y} = Xw$ gives the best reconstruction of $y$ in the range of $X$
[Figure: the subspace $S$ spanned by the columns of $X$ (basis vectors $[X]_1$, $[X]_2$). $y'$ is the orthogonal projection of $y$ onto $S$; the residual vector $y - y'$ is orthogonal to $S$.]
Computing the solution

We compute

$$w = (X^\top X)^{-1} X^\top y$$

If $X^\top X$ is invertible, then $(X^\top X)^{-1} X^\top$ coincides with the pseudoinverse $X^+$ of $X$, and the solution is unique.

If $X^\top X$ is not invertible, there is no unique solution. In that case $w = X^+ y$ chooses the solution $w$ with smallest Euclidean norm.

An alternative way to deal with a non-invertible $X^\top X$ is to add a small portion of the identity matrix (= Ridge regression).
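A small NumPy check of these claims (the random data here are an arbitrary illustration): when $X^\top X$ is invertible the two formulas agree, and with a duplicated column the pseudoinverse still returns a solution with the same fitted values.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))
y = rng.normal(size=20)

# Invertible case: (X^T X)^{-1} X^T coincides with the pseudoinverse X^+.
w_normal = np.linalg.solve(X.T @ X, X.T @ y)
w_pinv = np.linalg.pinv(X) @ y  # w = X^+ y

# Duplicating a column makes X^T X singular; X^+ y still works and picks
# the minimum-Euclidean-norm w among all least-squares solutions.
X_sing = np.column_stack([X, X[:, 0]])
w_min_norm = np.linalg.pinv(X_sing) @ y
```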
Beyond lines and planes

Linear models become powerful function approximators when we consider non-linear feature transformations.

⇒ All the math is the same! Predictions are still linear in X!
Geometric interpretation

[Figure: quadratic fit $y = w_0 + w_1 x + w_2 x^2$ to the data]

[Matlab demo]
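The Matlab demo is not reproduced here, but the same quadratic fit can be sketched with made-up data: regress $y$ on the features $(1, x, x^2)$ and the solver is unchanged.

```python
import numpy as np

# Quadratic data; only the feature matrix changes, not the math.
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 100)
y = 1.0 - 3.0 * x + 0.5 * x**2 + rng.normal(0, 0.05, 100)

Phi = np.column_stack([np.ones_like(x), x, x**2])  # non-linear features
w = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)        # same Normal Equations
```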
Ordinary Least Squares [summary]

Given examples $(x_i, y_i)$, $i = 1, \ldots, n$: let $X$ be the $n \times d$ matrix whose rows are the $x_i^\top$ (for example, non-linear features of the raw inputs), and let $y$ be the vector of targets. Minimize

$$E = \frac{1}{2} (Xw - y)^\top (Xw - y)$$

by solving the Normal Equations $X^\top X w = X^\top y$. Predict $y = w^\top x$ for a new point $x$.
Probabilistic interpretation

Assume $y_i = X_i^\top w + \epsilon_i$ with Gaussian noise $\epsilon_i \sim \mathcal{N}(0, \sigma^2)$. The likelihood of the data is then

$$\prod_i p(y_i \mid X_i, w) = \prod_i \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{1}{2\sigma^2} (X_i^\top w - y_i)^2 \right\}$$

and maximizing it is equivalent to minimizing the sum of squared errors.
[Figure: conditional Gaussians $p(y|x)$ along the regression line, with means $\mu = 3, 5, 8$ at increasing $x$]
BREAK
Outline
• Ordinary Least Squares Regression
  - Online version
  - Normal equations
  - Probabilistic interpretation
• Overfitting and Regularization
• Overview of additional topics
  - L1 Regression
  - Quantile Regression
  - Generalized linear models
  - Kernel Regression and Locally Weighted Regression
Overfitting

• So the more features the better? NO!
• Carefully selected features can improve model accuracy.
• But adding too many can lead to overfitting.
• Feature selection will be discussed in a separate lecture.
Overfitting

[Figure: a degree 15 polynomial fit to the data]

[Matlab demo]
Ridge Regression (Regularization)

Minimize

$$E = \frac{1}{2} \left[ \epsilon \|w\|_2^2 + \sum_{i=1}^n (w^\top x_i - y_i)^2 \right]$$

with "small" $\epsilon$ by solving

$$(X^\top X + \epsilon I)\, w = X^\top y$$

[Figure: effect of regularization on a degree 19 polynomial fit]

[Continue Matlab demo]
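A minimal ridge sketch, on a deliberately rank-deficient design (an assumption for illustration) where plain OLS has no unique solution:

```python
import numpy as np

def ridge_fit(X, y, eps=1e-6):
    """Solve the regularized system (X^T X + eps*I) w = X^T y."""
    return np.linalg.solve(X.T @ X + eps * np.eye(X.shape[1]), X.T @ y)

# Duplicated column: X^T X is singular, so np.linalg.solve on the plain
# Normal Equations would fail, but the regularized system is well posed.
rng = np.random.default_rng(0)
a = rng.normal(size=50)
X = np.column_stack([a, a])
y = 2.0 * a
w = ridge_fit(X, y)  # by symmetry the weight is split evenly: w ≈ (1, 1)
```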
Probabilistic interpretation

Posterior ∝ Prior × Likelihood:

$$P(w \mid X, y) = \frac{P(w, x_1, \ldots, x_n, y_1, \ldots, y_n)}{P(x_1, \ldots, x_n, y_1, \ldots, y_n)} \propto P(w)\, \prod_i P(y_i \mid w, x_i)$$

$$\propto \exp\left\{ -\frac{\epsilon}{2\sigma^2} \|w\|_2^2 \right\} \prod_i \exp\left\{ -\frac{1}{2\sigma^2} (X_i^\top w - y_i)^2 \right\} = \exp\left\{ -\frac{1}{2\sigma^2} \left[ \epsilon \|w\|_2^2 + \sum_i (X_i^\top w - y_i)^2 \right] \right\}$$

Maximizing this posterior under a Gaussian prior on $w$ is exactly Ridge regression.
Outline
• Ordinary Least Squares Regression
  - Online version
  - Normal equations
  - Probabilistic interpretation
• Overfitting and Regularization
• Overview of additional topics
  - L1 Regression
  - Quantile Regression
  - Generalized linear models
  - Kernel Regression and Locally Weighted Regression
Errors in Variables (Total Least Squares)

[Figure: errors measured in both x and y, rather than only vertically]
Sensitivity to outliers

High weight is given to outliers.

[Figure: "Temperature at noon" data with a squared-error fit pulled toward an outlier; inset: the influence function]
L1 Regression

Minimizing the sum of absolute residuals can be cast as a linear program; its influence function is bounded, so outliers carry less weight.

[Figure: influence function]

[Matlab demo]
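The slides solve L1 regression as a linear program. As an alternative sketch (my assumption, not the lecture's method), iteratively reweighted least squares approximates the L1 fit by down-weighting large residuals:

```python
import numpy as np

def l1_regression_irls(X, y, iters=100, delta=1e-8):
    """Approximate L1 regression by iteratively reweighted least squares:
    each point is weighted by 1/|residual|, so outliers lose influence."""
    w = np.linalg.solve(X.T @ X, X.T @ y)  # start from the OLS fit
    for _ in range(iters):
        r = np.abs(X @ w - y)
        c = 1.0 / np.maximum(r, delta)     # bounded-influence weights
        w = np.linalg.solve(X.T @ (c[:, None] * X), X.T @ (c * y))
    return w

# Line y = x with one gross outlier: the L1 fit stays near slope 1,
# where the OLS fit would be dragged upward.
x = np.linspace(0, 10, 21)
X = np.column_stack([np.ones_like(x), x])
y = x.copy()
y[-1] = 100.0  # outlier
w_l1 = l1_regression_irls(X, y)
```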
Quantile Regression

[Figure: scatter of CPU utilization [MHz] against workload (ViewItem.php) [req/s], with two fits: the mean CPU and the 95th percentile of CPU]

Slide courtesy of Peter Bodik
Generalized Linear Models

Probabilistic interpretation of OLS: the mean of a Gaussian conditional is linear in $X_i$. OLS linearly predicts that mean.

GLM: predict the mean of some other conditional density,

$$y_i \mid x_i \sim p(f(X_i^\top w))$$

The linear prediction may need to be transformed by $f(\cdot)$ to produce a valid parameter.
Example: "Poisson regression"

Suppose the data are event counts: $y \in \mathbb{N}_0$.

The typical distribution for count data is the Poisson, with mean parameter $\lambda > 0$:

$$\text{Poisson}(y \mid \lambda) = \frac{e^{-\lambda} \lambda^y}{y!}$$

Say we predict $\lambda = f(x^\top w) = \exp\{x^\top w\}$.

GLM: $y_i \mid x_i \sim \text{Poisson}(f(X_i^\top w))$
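A quick numeric check of the pmf and the log link (the weight and input vectors below are arbitrary illustrations): $\lambda = \exp\{x^\top w\}$ is positive for any linear prediction, and the resulting pmf sums to one with mean $\lambda$.

```python
import numpy as np
from math import exp, factorial

def poisson_pmf(y, lam):
    """Poisson(y | lambda) = exp(-lambda) * lambda**y / y!"""
    return exp(-lam) * lam**y / factorial(y)

w = np.array([0.2, 0.5])    # arbitrary weights, for illustration only
x = np.array([1.0, 3.0])
lam = float(np.exp(x @ w))  # log link: always a valid (positive) mean

ys = np.arange(101)
probs = np.array([poisson_pmf(y, lam) for y in ys])
# probs sums to ~1 (the tail beyond 100 is negligible for this lam)
# and its mean is ~lam
```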
[Figure: conditional Poissons $p(y|x)$ with means $\lambda = 3, 5, 8$ at increasing $x$]
Poisson regression: learning

As for OLS: optimize $w$ by maximizing the likelihood of the data. Equivalently: maximize the log likelihood.

Likelihood:

$$L = \prod_i \text{Poisson}(y_i \mid f(X_i^\top w))$$

Log likelihood:

$$l = \sum_i \left( X_i^\top w\, y_i - \exp\{X_i^\top w\} \right) + \text{const.}$$

Batch gradient:

$$\frac{\partial l}{\partial w} = \sum_i \underbrace{\left( y_i - \exp\{X_i^\top w\} \right)}_{\text{“residual”}} X_i = \sum_i \left( y_i - f(X_i^\top w) \right) X_i$$
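The batch gradient above can be turned into a short gradient-ascent sketch; the step size, iteration count, and synthetic data are illustrative assumptions:

```python
import numpy as np

def poisson_regression(X, y, alpha=0.01, iters=5000):
    """Fit a Poisson GLM by batch gradient ascent on the log likelihood:
    dl/dw = sum_i (y_i - exp(x_i^T w)) x_i."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = X.T @ (y - np.exp(X @ w))
        w += alpha * grad / len(y)  # averaged gradient for a stable step
    return w

# Synthetic counts with true weights (0.5, 0.3):
rng = np.random.default_rng(0)
x = rng.uniform(0, 2, 500)
X = np.column_stack([np.ones_like(x), x])
y = rng.poisson(np.exp(0.5 + 0.3 * x))
w = poisson_regression(X, y)
```

At the maximum the "residuals" are uncorrelated with the inputs, so the gradient vanishes, mirroring the OLS Normal Equations.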
LMS, Logistic Regression, Perceptron, and GLM updates

• GLM (online): $w_{t+1} := w_t + \alpha\, (y_i - f_w(x_i))\, x_i$
• LMS: $w_{t+1} := w_t + \alpha\, (y_i - x_i^\top w_t)\, x_i$
• Logistic Regression: $w_{t+1} := w_t + \alpha\, (y_i - f_w(x_i))\, x_i$
• Perceptron: $w_{t+1} := w_t + \alpha\, (y_i - f_w(x_i))\, x_i$
Kernel Regression and Locally Weighted Linear Regression

• Kernel Regression: take a very, very conservative function approximator called AVERAGING, and locally weight it.
• Locally Weighted Linear Regression: take a conservative function approximator called LINEAR REGRESSION, and locally weight it.

Slide from Paul Viola, 2003
Kernel Regression

[Figure: kernel regression fit (sigma = 1)]
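Locally weighted averaging can be sketched as a Nadaraya-Watson estimator with a Gaussian kernel; the sine data and bandwidth below are illustrative assumptions, not the demo's:

```python
import numpy as np

def kernel_regression(x_train, y_train, x_query, sigma=1.0):
    """Nadaraya-Watson: predict a kernel-weighted average of y_train."""
    K = np.exp(-0.5 * ((x_query[:, None] - x_train[None, :]) / sigma) ** 2)
    return (K @ y_train) / K.sum(axis=1)

x_train = np.linspace(0, 10, 200)
y_train = np.sin(x_train)
x_query = np.array([2.0, 5.0])
y_hat = kernel_regression(x_train, y_train, x_query, sigma=0.3)
# y_hat tracks sin(x_query) up to a small smoothing bias
```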
Locally Weighted Linear Regression (LWR)

OLS cost function:

$$E = \frac{1}{2} \sum_{i=1}^n (w^\top x_i - y_i)^2$$

LWR cost function:

$$E' = \sum_{i=1}^n k(x_i - x)\, (w^\top x_i - y_i)^2$$

[Figure: LWR fit (sigma = 1)]

[Matlab demo]
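A minimal sketch of one LWR prediction (again on made-up sine data): minimize the weighted cost above by solving weighted Normal Equations at the query point $x_0$, then evaluate the local line there.

```python
import numpy as np

def lwr_predict(x_train, y_train, x0, sigma=0.3):
    """Locally weighted linear regression: minimize
    sum_i k(x_i - x0) (w^T x_i - y_i)^2 and predict w^T x0."""
    X = np.column_stack([np.ones_like(x_train), x_train])
    k = np.exp(-0.5 * ((x_train - x0) / sigma) ** 2)  # kernel weights
    w = np.linalg.solve(X.T @ (k[:, None] * X), X.T @ (k * y_train))
    return w[0] + w[1] * x0

x_train = np.linspace(0, 10, 200)
y_train = np.sin(x_train)
y_hat = lwr_predict(x_train, y_train, x0=2.0)
# close to sin(2.0)
```

Unlike OLS, a fresh local fit is solved for every query point, which is what makes the method "locally weighted".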
Heteroscedasticity

[Figure: #requests per minute (up to 5000) over Time (days, 0–2); the spread of the data varies over time]
What we covered
• Ordinary Least Squares Regression
  - Online version
  - Normal equations
  - Probabilistic interpretation
• Overfitting and Regularization
• Overview of additional topics
  - L1 Regression
  - Quantile Regression
  - Generalized linear models
  - Kernel Regression and Locally Weighted Regression