1.2 multiple linear regression

Post on 07-Jul-2018

  • 8/18/2019 1.2 Multiple Linear Regression

    1/39

    1.2 Multiple Linear Regression

    Hector Lemus

    Spring 2016

    Multiple Linear Regression

    Examine the relationship between a set of independent variables and a single continuous dependent variable.

    Uses of multiple linear regression:

    1. To assess the relationship between the dependent and the independent variables simultaneously, taking into account the intercorrelations among the independent variables.
    2. To examine the effect of one or more variables on the dependent variable after controlling (adjusting) for the effects of the other variables in the model.
    3. To assess the interaction of two or more independent variables with respect to the dependent variable.
    4. To develop a prediction equation.

    Example

    Dependent variable: Systolic blood pressure

    Independent variables:

    1. BMI
    2. Age
    3. Smoking history:
       0 = Nonsmoker
       1 = Current or Previous Smoker

    Hypothetical Example

    Multiple linear regression may be used to assess the degree of interaction and test whether the interaction is statistically significant.

    Dependent variable: Change in DBP

    Independent variables:

    1. Age
    2. Drug group (Active/Placebo)
    3. Interaction term (to be discussed)

    Multiple Linear Regression Model

    Notation: Let $Y$ be the dependent variable.
    Let $X_1, \ldots, X_k$ be the independent variables.

    Model:

    $$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + E = \beta_0 + \sum_{i=1}^{k} \beta_i X_i + E$$

    where $\beta_0, \beta_1, \ldots, \beta_k$ are regression coefficients to be estimated and $E$ is the error term, which is a random variable.

    $E$ has a distribution, and for testing hypotheses we need to make an assumption about its distribution.

    Assumptions for MLR

    1. $Y$ is a random variable with a distribution of values for each specific combination of values of the $X$'s.
    2. The observations of $Y$ are statistically independent of each other.
    3. The mean value of $Y$ for each specific combination of the $X$'s is given by $\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k$.
    4. The variance of $Y$ is the same for any fixed combination of the $X$'s.

    For hypothesis testing, we need one more assumption:

    5. $Y$ is normally distributed for each specific combination of the $X$'s.

    Estimating with Least Squares

    Basic Idea: Find estimates of the $\beta$'s which minimize the sum of the squared distances between the observed and corresponding predicted values.

    Let the predicted value be

    $$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2 + \cdots + \hat{\beta}_k X_k$$

    Find the estimated parameters $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k$ which minimize

    $$\sum_{i=1}^{n} \left(Y_i - \hat{Y}_i\right)^2 = \sum_{i=1}^{n} \left(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_{1i} - \cdots - \hat{\beta}_k X_{ki}\right)^2$$

    This quantity defines the error sum of squares, denoted by SSE. It is also called the residual sum of squares.

    The difference between the observed and predicted value of $Y$ yields an estimate for $E$: $\hat{E} = Y - \hat{Y}$. The quantity $Y - \hat{Y}$ is called the residual.
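    The slides do not cover how the estimates are computed. As a rough illustration only, the sketch below uses one standard approach, solving the normal equations $(X^T X)\hat{\beta} = X^T Y$ with plain Gaussian elimination. The function names and the small data set are hypothetical and are not taken from the course material.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def least_squares(xs, ys):
    """Return [b0, b1, ..., bk], the estimates that minimise the SSE."""
    design = [[1.0] + list(row) for row in xs]  # leading 1 is the intercept
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def sse(xs, ys, beta):
    """Error (residual) sum of squares for the given coefficients."""
    total = 0.0
    for row, y in zip(xs, ys):
        y_hat = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
        total += (y - y_hat) ** 2
    return total


# Hypothetical data: two predictors, six observations.
xs = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
ys = [6.1, 6.9, 12.2, 12.8, 18.1, 19.0]
beta_hat = least_squares(xs, ys)
print(beta_hat, sse(xs, ys, beta_hat))
```

    Any other coefficient vector gives an SSE at least as large as the one printed, which is the defining property of the least squares estimates.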

    Computing the Parameter Estimates

    Details for computing the $\hat{\beta}$'s will NOT be discussed.

    Requires matrix algebra and calculus.

    ANOVA Table for MLR

    Source        df           SS           MS        F
    Regression    k            SSY − SSE    MS_Reg    MS_Reg / MS_Res
    Residual      n − k − 1    SSE          MS_Res
    Total         n − 1        SSY

    $$SSY = \sum_{i=1}^{n} \left(Y_i - \bar{Y}\right)^2 = SS_{Reg} + SS_{Res} = (SSY - SSE) + SSE$$

    $$SSE = \sum_{i=1}^{n} \left(Y_i - \hat{Y}_i\right)^2, \qquad SSY - SSE = \sum_{i=1}^{n} \left(\hat{Y}_i - \bar{Y}\right)^2$$

    $$MS_{Reg} = \frac{SSY - SSE}{k}, \qquad MS_{Res} = \frac{SSE}{n - k - 1}, \qquad F = \frac{MS_{Reg}}{MS_{Res}}$$
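    The ANOVA table can be filled in mechanically once SSY, SSE, n and k are known. The following is a minimal sketch of that bookkeeping. The function name `anova_table` and the input values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def anova_table(ssy, sse, n, k):
    """Build the ANOVA table entries for a regression with k predictors."""
    ss_reg = ssy - sse              # regression sum of squares
    df_reg, df_res = k, n - k - 1   # degrees of freedom
    ms_reg = ss_reg / df_reg
    ms_res = sse / df_res
    return {
        "Regression": {"df": df_reg, "SS": ss_reg, "MS": ms_reg, "F": ms_reg / ms_res},
        "Residual": {"df": df_res, "SS": sse, "MS": ms_res},
        "Total": {"df": n - 1, "SS": ssy},
    }


# Hypothetical inputs (not from the slides).
table = anova_table(ssy=500.0, sse=200.0, n=24, k=3)
for source, row in table.items():
    print(source, row)
```

    The degrees of freedom and the sums of squares in the first two rows each add up to the Total row, which is a quick consistency check on any printed table.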

    Coefficient of Determination

    $$R^2 = \frac{SSY - SSE}{SSY}$$

    Proportion of variability of $Y$ that can be explained by the model.

    $0 \le R^2 \le 1$

    If $R^2 = 1$, we have a perfect fit. The model explains all of the variability.

    If $R^2 = 0$, the model explains none of the variability.
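    As a side note that is not on the slides, $R^2$ and the overall $F$ statistic from the ANOVA table carry the same information. Dividing the numerator and denominator of $F$ by SSY gives the relation below, where $k$ is the number of independent variables and $n$ the number of observations.

```latex
\begin{aligned}
R^2 &= \frac{SSY - SSE}{SSY} = 1 - \frac{SSE}{SSY}, \\[4pt]
F &= \frac{(SSY - SSE)/k}{SSE/(n-k-1)}
   = \frac{R^2 / k}{(1 - R^2)/(n-k-1)}.
\end{aligned}
```

    A larger $R^2$ therefore always corresponds to a larger $F$ for fixed $n$ and $k$.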

    Testing Hypotheses in MLR

    Three test types:

    1. Overall test: Does the set of independent variables taken together explain a significant amount of the variability in $Y$?
       Taken together, does the set of BMI, Age and Smoking History explain a significant amount of the variability of SBP?
    2. Test for addition of a single variable: Given a set of independent variables in the model, does the addition of one variable explain a significant amount of the variability of $Y$?
       Evaluate the relationship between one independent variable and $Y$ after controlling (adjusting) for the other variables in the model.
       Given that Age and Smoking History are in the model, what is the relationship between BMI and SBP?
    3. Test for the addition of a group of variables: Given a set of independent variables in the model, does the addition of another set of variables explain a significant amount of the variability of $Y$?
       Assess the relationship of a set of behavioral variables measuring stress on DBP, adjusting for known factors related to DBP such as Age and BMI.

    Nested Models: The Full

    A full model contains all of the variables of interest.

    For example:

    Suppose we test the association between BMI and SBP after adjusting for Age and Smoking History.

    $$\text{SBP} = \beta_0 + \beta_1(\text{BMI}) + \beta_2(\text{Age}) + \beta_3(\text{Smoking History}) + E$$

    $$H_0: \beta_1 = 0$$

    Nested Models: The Reduced

    If $H_0$ is true, then the most appropriate model is

    $$\text{SBP} = \beta_0 + \beta_2(\text{Age}) + \beta_3(\text{Smoking History}) + E$$

    This is the reduced model when $H_0$ is true.

    Testing $H_0$ is equivalent to testing which of the two models is most appropriate.

    Note that no new variables are introduced in the reduced model.

    The concepts of nested (full and reduced) models will apply to all of the tests that we discuss.

    Test for Overall Regression

    The full model:

    $$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + E$$

    Have $k$ independent variables.

    Three ways of stating the same null hypothesis:

    1. $H_0$: The $k$ independent variables taken together do not explain a significant amount of the variability in $Y$.
    2. $H_0$: The overall regression using the $k$ independent variables is not statistically significant.
    3. $H_0$: $\beta_1 = \beta_2 = \cdots = \beta_k = 0$

    The reduced model:

    $$Y = \beta_0 + E$$

    The Test Statistic

    Use the $F$ statistic from the ANOVA table:

    $$F = \frac{MS_{Reg}}{MS_{Res}}$$

    When $H_0$ is true, $F$ follows an $F$-distribution with $k$ and $n-k-1$ degrees of freedom.

    Reject $H_0$ for large values of $F$.

    $F_{k,\, n-k-1,\, 1-\alpha}$: the $100(1-\alpha)$ percentile from the $F$-distribution with $k$ and $n-k-1$ degrees of freedom, where $\alpha$ is our chosen level of significance.

    Decision rule: Reject $H_0$ if $F > F_{k,\, n-k-1,\, 1-\alpha}$

    The percentile is the critical value or critical point.

    Alternatively, compute the $p$-value and compare to the $\alpha$ level.
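    The slides rely on tables or software for the $p$-value. As a rough, standard-library-only sketch, the upper-tail probability of an $F$ statistic can be approximated from the identity $P(F > f) = I_z(df_2/2,\, df_1/2)$ with $z = df_2/(df_2 + df_1 f)$, where $I_z$ is the regularised incomplete beta function. The function name `f_upper_tail`, the Simpson's-rule integration and the step count are choices made for this illustration, not part of the course material. In practice a statistics package would be used instead.

```python
import math


def f_upper_tail(f_value, df1, df2, steps=2000):
    """Approximate P(F > f_value) for an F distribution with (df1, df2) df.

    Assumes f_value > 0, df2 >= 2 and an even number of steps.
    Accuracy degrades when f_value is very close to 0 and df1 < 2.
    """
    a, b = df2 / 2.0, df1 / 2.0
    z = df2 / (df2 + df1 * f_value)

    def integrand(t):
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    # Simpson's rule on [0, z] for the incomplete beta integral.
    h = z / steps
    total = integrand(0.0) + integrand(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    incomplete_beta = total * h / 3.0

    # Normalise by the complete beta function B(a, b).
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return incomplete_beta / math.exp(log_beta)


def reject_h0(f_value, df1, df2, alpha=0.05):
    """Decision rule expressed through the p-value."""
    return f_upper_tail(f_value, df1, df2) < alpha


print(f_upper_tail(4.20, 1, 28))
print(reject_h0(29.71, 3, 28))
```

    Comparing the $p$-value with $\alpha$ gives the same decision as comparing $F$ with the critical value, because the $p$-value falls below $\alpha$ exactly when $F$ exceeds the $100(1-\alpha)$ percentile.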

    SBP Example

    Determine whether BMI, age and smoking history taken together account for a significant amount of the variability of SBP.

    $Y$: SBP, $X_1$: BMI, $X_2$: Age, $X_3$: Smoking History

    $n = 32$ subjects, $k = 3$

    Full model:

    $$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + E$$

    $H_0$: $\beta_1 = \beta_2 = \beta_3 = 0$

    Reduced model:

    $$Y = \beta_0 + E$$

    Under $H_0$, $F$ follows an $F$-distribution with $k = 3$ and $n-k-1 = 28$ df.

    SAS Output

      The REG Procedure
      Model: MODEL1
      Dependent Variable: SBP Systolic Blood Pressure (mmHg)

      Number of Observations Read    32
      Number of Observations Used    32

      Analysis of Variance

                                Sum of        Mean
      Source            DF     Squares      Square   F Value   Pr > F
      Model              3  4889.82570  1629.94190     29.71   <.0001
      Error             28  1536.14305    54.86225
      Corrected Total   31  6425.96875

      Root MSE          7.40691    R-Square   0.7609
      Dependent Mean  144.53125    Adj R-Sq   0.7353
      Coeff Var         5.12478

    At $\alpha = 0.05$: $F_{3,\,28,\,0.95} = 2.95$

    At $\alpha = 0.01$: $F_{3,\,28,\,0.99} = 4.5$
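    As a quick arithmetic check of the printed output, assuming the ANOVA figures are read as Model SS = 4889.82570 on 3 df and Error SS = 1536.14305 on 28 df (the same values, rounded, that the later slides use), the overall test and $R^2$ work out as follows.

```latex
\begin{aligned}
MS_{Reg} &= \frac{4889.82570}{3} = 1629.94190, \qquad
MS_{Res} = \frac{1536.14305}{28} = 54.86225, \\[4pt]
F &= \frac{1629.94190}{54.86225} \approx 29.71 \;>\; F_{3,\,28,\,0.95} = 2.95
  \quad\Rightarrow\quad \text{reject } H_0 \text{ at } \alpha = 0.05, \\[4pt]
R^2 &= \frac{4889.82570}{4889.82570 + 1536.14305} = \frac{4889.82570}{6425.96875} \approx 0.7609 .
\end{aligned}
```

    BMI, age and smoking history taken together therefore explain about 76% of the variability of SBP in this sample.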

    The Partial F Test

    The regression sum of squares must be partitioned into components that can be used to test hypotheses about individual variables.

    One type of breakdown is sequential: variables-added-in-order.

    Called Type I in SAS.

    $X_1$: BMI, $X_2$: Age, $X_3$: Smoking History

    Source                  SS
    $X_1$                   353...
    $X_2 \mid X_1$
    $X_3 \mid X_1, X_2$

    $SS(X_1)$

    The sum of squares explained by using only $X_1$ in the model.

    This may be used to test whether BMI is linearly related to SBP without adjusting for any other variables.

    Since, technically, $X_2$ and $X_3$ are not in the model, pool their terms with the residual.

    $SS_{Res} = 1536.14 + 582.65 + \ldots$

    $SS(X_2 \mid X_1)$

    The extra sum of squares explained by adding Age to the model given BMI already in the model.

    Pooled error term:

    $SS_{Res} = 1536.14 + \ldots$

    $SS(X_3 \mid X_1, X_2)$

    The extra sum of squares explained by adding Smoking history to the model given BMI and Age already in the model.

    $H_0$: $\beta_3 = 0$ ("Smoking history is not associated with SBP after adjusting for BMI and Age.")

    Full: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + E$

    Reduced: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + E$

    General Partial F Test

    Full model:

    $$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \beta^* X^* + E$$

    $H_0$: The addition of $X^*$ to the model does not explain a significant amount of the variability of $Y$ in the presence of $X_1, X_2, \ldots, X_p$.

    $H_0$: $X^*$ is not significantly related to $Y$ controlling for $X_1, X_2, \ldots, X_p$.

    $H_0$: $\beta^* = 0$

    Reduced model:

    $$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + E$$

    Construction of the Test

    To construct the partial $F$ test, you need the extra sum of squares for $X^*$.

    Denote:

    $$SS(X^* \mid X_1, \ldots, X_p) = RegSS(X_1, \ldots, X_p, X^*) - RegSS(X_1, \ldots, X_p) = RegSS(\text{Full}) - RegSS(\text{Reduced})$$

    We also need the $MS_{Res}$ for the full model:

    $$MS_{Res}(\text{Full}) = \frac{SS_{Res}(\text{Full})}{n - p - 2}$$

    So:

    $$F(X^* \mid X_1, \ldots, X_p) = \frac{SS(X^* \mid X_1, \ldots, X_p)}{MS_{Res}(\text{Full})}$$

    The statistic follows an $F$-distribution with 1 and $n-p-2$ df.

    Reject the $H_0$ if $F(X^* \mid X_1, \ldots, X_p) > F_{1,\, n-p-2,\, 1-\alpha}$
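    The construction above reduces to a few lines of arithmetic. The sketch below is one way to write it. The function name `partial_f` is hypothetical, and the inputs are the SBP figures quoted elsewhere in these slides for adding BMI to a model that already contains Age and Smoking History (RegSS 4889.83 for the full model, 4689.69 for the reduced model, residual SS 1536.14, n = 32, p = 2).

```python
def partial_f(reg_ss_full, reg_ss_reduced, ss_res_full, n, p):
    """Partial F statistic for adding one variable X* to X1, ..., Xp.

    The full model has p + 1 predictors, so its residual df is n - p - 2.
    """
    extra_ss = reg_ss_full - reg_ss_reduced   # SS(X* | X1, ..., Xp)
    ms_res_full = ss_res_full / (n - p - 2)   # MS_Res(Full)
    return extra_ss / ms_res_full


# SBP example: BMI added last, given Age and Smoking History.
f_bmi = partial_f(4889.83, 4689.69, 1536.14, n=32, p=2)
print(round(f_bmi, 2))
```

    The result is then compared with the $F_{1,\, n-p-2,\, 1-\alpha}$ percentile, here with 1 and 28 df.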

    Example 1

    Test whether smoking history is related to SBP after controlling for Age and BMI.

    Full model:

    $$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + E$$

    $H_0$: $\beta_3 = 0$

    From the table:

    $SS(X_3 \mid X_1, X_2)$

    Example 2

    Test the relationship of BMI to SBP controlling for Age and Smoking history.

    Full: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + E$

    Reduced: $Y = \beta_0 + \beta_2 X_2 + \beta_3 X_3 + E$

    $H_0$: $\beta_1 = 0$

    We need $SS(X_1 \mid X_2, X_3)$, but it is not available in the table.

    Note that $SS(X_1 \mid X_2, X_3) = SS(X_1, X_2, X_3) - SS(X_2, X_3)$

    We know that $SS(X_1, X_2, X_3) = 4889.83$ from the SAS Output.

    However, we would have to find $SS(X_2, X_3)$ by fitting a model with only $X_2$ and $X_3$ in it.

    It turns out $SS(X_2, X_3) = 4689.69$

    Example 2 (cont.)

    $SS(X_1 \mid X_2, X_3) = 4889.83 - 4689.69 = 200.14$

    This is the marginal sum of squares; SAS can provide this information.

    $F(X_1 \mid X_2, X_3) = 200.14 / 54.86 = 3.65$

    $F_{1,\,28,\,0.90} = 2.89$

    $F_{1,\,28,\,0.95} = 4.20$, so $0.05 < p\text{-value} < 0.10$

    Fail to reject $H_0$ at $\alpha = 0.05$.

    No evidence to suggest a significant relationship between SBP and BMI adjusting for Age and Smoking history.

    A T-test Equivalent

    An equivalent test to the Partial $F$ test.

    Full model:

    $$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \beta^* X^* + E$$

    Test: $H_0$: $\beta^* = 0$

    Could use $F(X^* \mid X_1, X_2, \ldots, X_p)$ or equivalently

    $$T = \frac{\hat{\beta}^*}{s_{\hat{\beta}^*}}$$

    where $\hat{\beta}^*$ is the estimated regression parameter and $s_{\hat{\beta}^*}$ is the estimated standard error.

    For a two-sided test:

    Reject $H_0$ if $|T| > t_{n-p-2,\, 1-\alpha/2}$

    Example 2 (again)

    Relationship of BMI to SBP adjusting for Age and Smoking History.

      Parameter Estimates

                                       Parameter    Standard
      Variable    Label            DF   Estimate       Error   t Value   Pr > |t|
      Intercept   Intercept         1   45.10319    10.76488      4.19     0.0003
      BMI         Body Mass Index   1    1.22225     0.63993      1.91     0.0664
      AGE         Age (years)       1    1.21271     0.32382      3.75     0.0008
      SMK         Smo...
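    For a single added variable the two tests agree because $T^2 = F$. The short check below illustrates this with the BMI row, taking the estimate as 1.22225 and the standard error as 0.63993 (an assumed reading of the printed output, chosen because it is consistent with the reported $t$ value of 1.91). The variable names are illustrative only.

```python
# BMI row of the parameter estimates (standard error is an assumed reading).
estimate = 1.22225
std_error = 0.63993

t_stat = estimate / std_error   # T = estimate / standard error
f_equivalent = t_stat ** 2      # should match the partial F for BMI

print(round(t_stat, 2), round(f_equivalent, 2))
```

    The squared $t$ statistic reproduces the partial $F$ of 3.65 computed for BMI given Age and Smoking History, so both tests lead to the same conclusion.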

    Partitioning the RegSS

    1. $SS(X_1)$
       $SS(X_2 \mid X_1)$
       $SS(X_3 \mid X_1, X_2)$

       Leads to variables-added-in-order or sequential testing.
       This is SAS Type 1 SS.
       Useful if there is an ordering to the independent variables.

    2. $SS(X_1 \mid X_2, X_3)$
       $SS(X_2 \mid X_1, X_3)$
       $SS(X_3 \mid X_1, X_2)$

       Leads to variables-added-last or marginal testing.
       Each test adjusts for all other variables in the model.
       This is SAS Type 2 SS.

    With the exception of the last test, these tests are not equivalent.

    SAS Code and Output

      Parameter Estimates

                                       Parameter    Standard
      Variable    Label            DF   Estimate       Error   t Value   Pr > |t|
      Intercept   Intercept         1   45.10319    10.76488      4.19     0.0003
      BMI         Body Mass Index   1    1.22225     0.63993      1.91     0.0664
      AGE         Age (years)       1    1.21271     0.32382      3.75     0.0008
      SMK         Smo...

      ... bmi age smk / ss1 ...

    MLR Table

    Multiple Linear Regression of Systolic Blood Pressure versus selected characteristics (n = 32)

    Characteristic         Estimated Coefficient   95% Confidence Interval   p-value
    BMI (kg/m^2)           1.2                     -0.1, 2.5                 0.066
    Age (5 yr interval)    6.1                     2...
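    The table entries can be traced back to the SAS parameter estimates. The sketch below is an illustration under stated assumptions: the critical value 2.048 is the 97.5th percentile of the $t$ distribution with 28 df taken from a standard $t$ table (it is not printed on the slides), the BMI standard error is read as 0.63993, and the names `confidence_interval` and `T_CRIT_28_DF` are hypothetical.

```python
T_CRIT_28_DF = 2.048  # assumed: t percentile (0.975, 28 df) from a standard table


def confidence_interval(estimate, std_error, t_crit=T_CRIT_28_DF):
    """Two-sided 95% confidence interval: estimate +/- t * standard error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width


# BMI: coefficient per 1 kg/m^2.
bmi_low, bmi_high = confidence_interval(1.22225, 0.63993)

# Age is reported per 5-year interval, so the per-year slope is multiplied by 5.
age_per_5_years = 5 * 1.21271

print(round(bmi_low, 1), round(bmi_high, 1), round(age_per_5_years, 1))
```

    The BMI interval includes 0, which matches the two-sided $p$-value being above 0.05. Reporting Age per 5 years only rescales the coefficient and its interval; it does not change the test.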

    Multiple Partial F Test

    Given that a set of independent variables is in the model, test for the addition of another set.

    Uses:

    1. The additional set represents a related group of variables; test a set of behavioral variables controlling for a set of demographic variables.
    2. Test a set of interactions.
    3. Assess the relationship of a categorical variable with 3 or more categories.

    Generalization of Partial F Test

    Full model:

    $$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p + \beta^*_{p+1} X^*_{p+1} + \cdots + \beta^*_k X^*_k + E$$

    $H_0$: The addition of $X^*_{p+1}, \ldots, X^*_k$ to the model does not explain a significant amount of the variability of $Y$ in the presence of $X_1, X_2, \ldots, X_p$.

    $H_0$: The set of $X^*_{p+1}, \ldots, X^*_k$ is not significantly related to $Y$ controlling for $X_1, X_2, \ldots, X_p$.

    $H_0$: $\beta^*_{p+1} = \cdots = \beta^*_k = 0$

    Reduced model:

    $$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p + E$$

    Construction of the Test

    Need the extra sum of squares from adding $X^*_{p+1}, \ldots, X^*_k$ to the model.

    Denote:

    $$SS(X^*_{p+1}, \ldots, X^*_k \mid X_1, X_2, \ldots, X_p) = RegSS(\text{Full}) - RegSS(\text{Reduced})$$

    So:

    $$F(X^*_{p+1}, \ldots, X^*_k \mid X_1, \ldots, X_p) = \frac{SS(X^*_{p+1}, \ldots, X^*_k \mid X_1, \ldots, X_p) / (k - p)}{MS_{Res}(\text{Full})}$$

    The statistic follows an $F$-distribution with $k-p$ and $n-k-1$ df.

    Reject the $H_0$ if $F(X^*_{p+1}, \ldots, X^*_k \mid X_1, \ldots, X_p) > F_{k-p,\, n-k-1,\, 1-\alpha}$
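    Because SSY is the same for both nested models, the extra sum of squares can equally be obtained as the drop in residual SS, $SS_{Res}(\text{Reduced}) - SS_{Res}(\text{Full})$. The sketch below uses that equivalent form. The function name `multiple_partial_f` is hypothetical, and the inputs are the residual sums of squares quoted in the SBP interaction example in these slides (1536.14 for the reduced model with 3 predictors, 1333.14 for the full model with 6 predictors, n = 32).

```python
def multiple_partial_f(ss_res_reduced, ss_res_full, n, p, k):
    """Multiple partial F for adding predictors p+1, ..., k to X1, ..., Xp."""
    # Drop in residual SS equals RegSS(Full) - RegSS(Reduced).
    extra_ss = ss_res_reduced - ss_res_full
    ms_res_full = ss_res_full / (n - k - 1)   # MS_Res(Full)
    return (extra_ss / (k - p)) / ms_res_full


# SBP example: three interaction terms added to BMI, Age and Smoking History.
f_interactions = multiple_partial_f(1536.14, 1333.14, n=32, p=3, k=6)
print(round(f_interactions, 2))
```

    The statistic is referred to an $F$-distribution with $k - p = 3$ and $n - k - 1 = 25$ df.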

    Example: SBP Data (cont.)

    ANOVA for the full model:

    Source        SS         df    MS
    Regression    5092.83     6    848.80
    Residual      1333.14    25     53.33

    ANOVA for the reduced model:

    Source        SS         df    MS
    Regression    4889.83     3    1629.94
    Residual      1536.14    28      54.86

    $SS(X_4, X_5, X_6 \mid X_1, X_2, X_3) = 5092.83 - 4889.83 = 203.00$

    $$F(X_4, X_5, X_6 \mid X_1, X_2, X_3) = \frac{203.00 / 3}{53.33} = 1.27$$

    Under $H_0$, $F$ follows an $F$-distribution with 3, 25 df, so $p$-value > 0.25. Fail to reject $H_0$.

    The interactions taken together do not explain a significant amount of the variability of SBP.

    Constructing Extra SS

    Suppose we have:

    $SS(X_1)$
    $SS(X_2 \mid X_1)$
    $SS(X_3 \mid X_1, X_2)$

    Suppose we have the full model:

    $$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + E$$

    and we want to test $H_0$: $\beta_2 = \beta_3 = 0$.

    So we need $SS(X_2, X_3 \mid X_1)$, which does not appear in the table.

    $SS(X_2, X_3 \mid X_1)$ is the extra sum of squares explained by adding $X_2$ and $X_3$ to the model given $X_1$ already in the model.

    $SS(X_2, X_3 \mid X_1) = SS(X_1, X_2, X_3) - SS(X_1)$

    Rewriting the Extra SS

    $SS(X_2 \mid X_1) = SS(X_1, X_2) - SS(X_1)$

    $SS(X_3 \mid X_1, X_2) = SS(X_1, X_2, X_3) - SS(X_1, X_2)$

    Therefore:

    $$\begin{aligned}
    SS(X_2 \mid X_1) + SS(X_3 \mid X_1, X_2) &= SS(X_1, X_2) - SS(X_1) + SS(X_1, X_2, X_3) - SS(X_1, X_2) \\
    &= SS(X_1, X_2, X_3) - SS(X_1) \\
    &= SS(X_2, X_3 \mid X_1)
    \end{aligned}$$
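    The additivity of sequential sums of squares can also be checked numerically by fitting the nested models directly. The sketch below does this on a small hypothetical data set (not the SBP data) using a plain normal-equations fit. The helper names `solve` and `reg_ss` are illustrative, and a statistics package would normally do this work.

```python
def solve(a, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def reg_ss(data, y, columns):
    """Regression SS (SSY - SSE) for a model with an intercept and the given columns."""
    design = [[1.0] + [row[c] for c in columns] for row in data]
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    y_bar = sum(y) / len(y)
    ssy = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return ssy - sse


# Hypothetical data: columns are X1, X2, X3 (X3 is a 0/1 indicator).
data = [(1, 4, 0), (2, 1, 1), (3, 5, 0), (4, 2, 1),
        (5, 7, 0), (6, 3, 1), (7, 8, 1), (8, 6, 0)]
y = [5.2, 4.1, 8.3, 6.0, 12.1, 8.4, 14.9, 12.2]

ss_x1 = reg_ss(data, y, [0])
ss_x2_given_x1 = reg_ss(data, y, [0, 1]) - ss_x1
ss_x3_given_x1_x2 = reg_ss(data, y, [0, 1, 2]) - reg_ss(data, y, [0, 1])
ss_x2_x3_given_x1 = reg_ss(data, y, [0, 1, 2]) - ss_x1

print(ss_x2_given_x1 + ss_x3_given_x1_x2, ss_x2_x3_given_x1)
```

    The two printed values agree up to rounding error, so the sequential pieces from a variables-added-in-order table can be summed to obtain the extra SS needed for a multiple partial $F$ test.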