
22s:152 Applied Linear Regression
Chapter 18: Nonparametric Regression (Lowess smoothing)
————————————————————

• When the nature of the response function (or mean structure) is unknown and one wishes to investigate it, a nonparametric regression approach may be useful.

• An example of a nonparametric regression fit for Y vs. X:

[Figure: scatterplot of prestige (20–80) vs. income (0–25,000) with a lowess curve through the points]

[Figure: the same prestige vs. income scatterplot with its lowess curve, repeated]

• One method of nonparametric regression is LOWESS:

  LOcally WEighted polynomial regreSSion

• A lowess curve doesn't assume any particular form for the mean structure.

• Instead, the data points themselves suggest the form of the relationship, allowing a mean structure that changes freely as x changes.


• Though it looks like someone has simply drawn a line through the points, the fitting of the lowess curve is based on statistical modeling.

• Actually, a particular polynomial regression model will be fit many times over, but each time, a different window of the data will be used to fit the model.

• Thus, the term ‘local’ coincides with the fact that we’re only using a fraction of the data, a window, each time we do a fit.


• General idea for computing a Ŷi (point on curve)

  xi (Observed)    Yi (Observed)    Ŷi (Fitted)
  11,377           73.1             ?

– Choose a window containing xi. The points in the window are the only points that will contribute to the fit of the locally weighted polynomial at xi.

[Figure: prestige vs. income scatterplot with the window around xi = 11,377 marked; 68.34 labeled on the vertical axis]

∗ Wider windows give smoother lowess curves.


– Before fitting the local model, we determine the weights of the points in the window.

  Higher weights will be placed on the points near xi (local weighting), and lower weights will be placed on the points far from xi, near the window edges.

  One weight function used is the Tricube weight function:

T(t) = (1 − |t|³)³   for |t| < 1
T(t) = 0             for |t| ≥ 1

where t = (xij − xi)/hi for all observations xij in the window for xi, and hi is the half-width of the window.
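As an illustration (this code is ours, not the slides'), the tricube weight is one line in R; the helper name tricube is our own choice:

tricube <- function(t) ifelse(abs(t) < 1, (1 - abs(t)^3)^3, 0)

## Example: weights in a window centered at x_i = 11,377 with an
## illustrative half-width h_i = 4000 (numbers chosen for demonstration).
x.window <- c(8000, 10000, 11377, 12500, 15000)
round(tricube((x.window - 11377) / 4000), 3)
## weight is 1 at the window center and falls to 0 at the window edges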


– Use the weights (and the data) to fit a local polynomial regression. A first-order local polynomial (straight line) usually suffices (red line below shows the best-fit line).

[Figure: prestige vs. income with the locally weighted best-fit line through the window at xi = 11,377; fitted value 68.3]

– Let Ŷi from this locally fitted polynomial regression be the prediction of Yi at xi.
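A minimal sketch of this single local fit, using lm() with its weights argument, the tricube() helper above, and the Prestige data (the half-width here is an arbitrary choice for illustration):

library(car)                    ## provides the Prestige data
x <- Prestige$income
y <- Prestige$prestige

x.i <- 11377                    ## target x-value
h.i <- 5000                     ## illustrative half-width, not the slides' value
w   <- tricube((x - x.i) / h.i) ## weight 0 outside the window

local.fit <- lm(y ~ x, weights = w)               ## first-order local polynomial
predict(local.fit, newdata = data.frame(x = x.i)) ## the fitted point Ŷi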


[Figure: prestige vs. income with the single fitted point Ŷi = 68.34 at xi = 11,377 marked]

Then, we have no further use for this locally fitted polynomial.

All that work, for one fitted point Ŷi.

  xi (Observed)    Yi (Observed)    Ŷi (Fitted)
  11,377           73.1             68.34

– Connect the n fitted Ŷi values (all computed in the same manner) to form the mean structure.
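Putting the steps together, a bare-bones single-pass version of the algorithm (no robustness iterations, unlike R's lowess) might look like this; lowess.sketch and tricube are our own names:

lowess.sketch <- function(x, y, f = 2/3) {
  ord <- order(x)
  x <- x[ord]; y <- y[ord]
  n <- length(x)
  k <- ceiling(f * n)                ## number of points in each window
  y.hat <- numeric(n)
  for (i in 1:n) {
    d   <- abs(x - x[i])
    h.i <- sort(d)[k]                ## half-width: distance to k-th nearest point
    w   <- tricube(d / h.i)
    fit <- lm(y ~ x, weights = w)    ## local straight-line fit
    y.hat[i] <- predict(fit, newdata = data.frame(x = x[i]))
  }
  list(x = x, y = y.hat)             ## connect these points to draw the curve
}

Drawing lines(lowess.sketch(income, prestige, f = 1/3)) should roughly track lowess(), which adds robustness iterations on top of this idea.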


• You have some choice in how ‘smooth’ the fitted mean structure is.

  A ‘smoother’ curve using the f = 1 setting:

[Figure: prestige vs. income with a smoother lowess curve, f = 1]

    A less smooth curve using f = 1/3:

[Figure: prestige vs. income with a less smooth lowess curve, f = 1/3]


  • In R using the lowess function:

> library(car)   ## provides the Prestige data

> attach(Prestige)

    ## Lowess plot example:

    > plot(income,prestige,pch=16)

    > lowess.out=lowess(income,prestige,f=1/3)

    > lines(lowess.out,lwd=2)

where f is the smoothing parameter.

f gives the proportion of points in the plot which influence the smooth at each value.

Even if all points are included in the window (f = 1), the points still have differing weights in the polynomial fit.

Larger values of f give more smoothness.
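To see the effect of f directly, both settings from the previous page can be overlaid on one scatterplot:

> plot(income, prestige, pch=16)
> lines(lowess(income, prestige, f=1), lwd=2, lty=2)   ## smoother
> lines(lowess(income, prestige, f=1/3), lwd=2)        ## less smooth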


• You can also access all the fitted Ŷi values for the given xi. The results are returned in ascending order by the x-values:

    > lowess.out

    $x

    [1] 611 918 1656 1890 2370 2448 2594 2847 2901 ...

    $y

    [1] 19.28922 20.79308 24.32462 25.38495 27.52914 27.96910 ...

    > points(lowess.out$x,lowess.out$y,col="purple",pch=1)

[Figure: prestige vs. income with the lowess curve and the fitted (x, Ŷ) points overlaid in purple]


• There is another function that does local polynomial regression fitting called loess, which uses a slightly different weighting scheme (you can reset this if you want) but can easily be used for prediction at new x-values.
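For example (a sketch; the span and degree choices here are ours, not the slides'), loess fits by formula and predicts at new x-values with predict():

> loess.out = loess(prestige ~ income, data=Prestige, span=1/3, degree=1)
> predict(loess.out, newdata=data.frame(income=c(5000, 11377)))

Here degree=1 requests local straight lines, matching the lowess fits above; setting family="symmetric" would add lowess-style robustness iterations.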


• The author of our book has also provided a lowess function inside the scatterplot() function in the car library.

    > scatterplot(income,prestige,smooth=T,span=1/3)

[Figure: scatterplot() output for prestige vs. income showing the data, the lowess fit, and a linear regression line]

Dotted line ≡ simple linear regression fit.
Solid line ≡ lowess fit.


• ...and in the scatterplotMatrix() function.

    > scatterplotMatrix(Ginzberg)

[Figure: scatterplotMatrix(Ginzberg) output: pairwise scatterplots of simplicity, fatalism, depression, adjsimp, adjfatal, and adjdep, each panel with linear and lowess fits, and each variable's univariate distribution shown on the diagonal]


• Lab Example (frog data)

  In logistic regression, it can be useful to use the lowess fit to ‘suggest’ a model, or check on the relationship between X and Y:

[Figure: three panels of presence (jittered) vs. average low temperature with lowess fits for f = 2/3 (the default), f = 1/4, and f = 1/10]
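Plots like those above can be made along these lines; the data frame frogs and its column names here are hypothetical stand-ins for the lab data:

> ## 'frogs', 'avg.low.temp', and 'presence' are hypothetical names
> plot(frogs$avg.low.temp, jitter(frogs$presence, amount=0.04), pch=16,
+      xlab="average low temperature", ylab="presence (jittered)")
> lines(lowess(frogs$avg.low.temp, frogs$presence, f=2/3), lwd=2)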


• Text illustration:

  (a) Outside dotted lines define the window for observation i; the middle dotted line is at the x-value of the prediction (the ith obs).

  (b) Weighting scheme. Higher weights to nearby points.

  (c) Local polynomial fit (extends past the window). Ŷi falls on the fitted line.

  (d) Overall lowess curve (after repeating the process for every observation and connecting the predicted points).
