Generalized Linear Model (GZLM): Overview

Upload: norma-jackson

Post on 28-Dec-2015


Page 1: Generalized Linear Model (GZLM): Overview. Dependent Variables Continuous Discrete  Dichotomous  Polychotomous  Ordinal  Count

Generalized Linear Model (GZLM): Overview


Dependent Variables

• Continuous
• Discrete
  – Dichotomous
  – Polytomous (polychotomous)
  – Ordinal
  – Count


Continuous Variables

Quantitative variables that can take on any value within the limits of the variable


Continuous Variables (cont’d)

• Distance, time, or length
  – Infinite number of possible divisions between any two values, at least theoretically
  – “Only love can be divided endlessly and still not diminish” (Anne Morrow Lindbergh)
• More than 11 ordered values
  – Scores on standardized scales such as those that measure parenting attitudes, depression, family functioning, and children’s behavioral problems


Discrete Variables

• Finite number of indivisible values; cannot take on all possible values within the limits of the variable
  – Dichotomous
  – Polytomous
  – Ordinal
  – Count


Dichotomous Variables

Two categories used to indicate whether an event has occurred or some characteristic is present

Sometimes called binary or binomial variables

“To be, or not to be, that is the question” (William Shakespeare, “Hamlet”)


Dichotomous DVs

• Placed in foster care or not
• Diagnosed with a disease or not
• Abused or not
• Pregnant or not
• Service provided or not


Polytomous Variables

• Three or more unordered categories
• Categories mutually exclusive and exhaustive
• Sometimes called multicategorical or sometimes multinomial variables
• “Inanimate objects can be classified scientifically into three major categories; those that don't work, those that break down and those that get lost” (Russell Baker)


Polytomous DVs

• Reason for leaving welfare: marriage, stable employment, move to another state, incarceration, or death
• Status of foster home application: licensed to foster, discontinued application process prior to licensure, or rejected for licensure
• Changes in living arrangements of the elderly: newly co-residing with their children, no longer co-residing, or residing in institutions


Ordinal Variables

• Three or more ordered categories
• Sometimes called ordered categorical variables or ordered polytomous variables
• “Good, better, best; never let it rest till your good is better and your better is best” (Anonymous)


Ordinal DVs

• Job satisfaction: very dissatisfied, somewhat dissatisfied, neutral, somewhat satisfied, or very satisfied
• Severity of child abuse injury: none, mild, moderate, or severe
• Willingness to foster children with emotional or behavioral problems: least acceptable, willing to discuss, or most acceptable


Count Variables

• Number of times a particular event occurs to each case, usually within a given:
  – Time period (e.g., number of hospital visits per year)
  – Population size (e.g., number of registered sex offenders per 100,000 population), or
  – Geographical area (e.g., number of divorces per county or state)
• Whole numbers that can range from 0 through $+\infty$


Count Variables (cont’d)

“Now I've got heartaches by the number, Troubles by the score, Every day you love me less, Each day I love you more” (Ray Price)


Count DVs

Number of hospital visits, outpatient visits, services used, divorces, arrests, criminal offenses, symptoms, placements, children fostered, children adopted


General Linear Model (GLM) (selected models)

Continuous DV:
• Linear regression
• ANOVA
• t-test


Generalized Linear Model (GZLM) (selected regression models)

GZLM, by type of DV:
• Continuous DV: linear regression
• Dichotomous DV: binary logistic regression
• Polytomous DV: multinomial logistic regression
• Ordinal DV: ordinal logistic regression
• Count DV: Poisson or negative binomial regression


Generalized How?

• DV continuous or discrete
• Normal or non-normal error distributions
• Constant or non-constant variance
• Provides a unifying framework for analyzing an entire class of regression models


GLM & GZLM Similarities

• IVs are combined in a linear fashion ($\alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k$);
• a slope is estimated for each IV;
• each slope has an accompanying test of statistical significance and confidence interval;
• each slope indicates the IV’s independent contribution to the explanation or prediction of the DV;


GLM & GZLM Similarities (cont’d)

• the sign of each slope indicates the direction of the relationship;
• IVs can be any level of measurement;
• the same methods are used for coding categorical IVs (e.g., dummy coding);
• IVs can be entered simultaneously, sequentially or using other methods;
• product terms can be used to test interactions;


GLM & GZLM Similarities (cont’d)

• powered terms (e.g., the square of an IV) can be used to test curvilinearity;
• overall model fit can be tested, as can incremental improvement in a model brought about by the addition or deletion of IVs (nested models); and
• residuals, leverage values, Cook’s D, and other indices are used to diagnose model problems.
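As a minimal sketch of how categorical IVs, interactions, and curvilinear terms enter the linear combination, the Python below builds dummy-coded columns, a product term, and a squared term. The variable names (`marital_status`, `stress`) and all data values are hypothetical and are not taken from the slides.

```python
def dummy_code(values, reference):
    """Dummy-code a categorical IV: one 0/1 column per non-reference category."""
    categories = sorted(set(values) - {reference})
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}


def product_term(x1, x2):
    """Product (interaction) term: element-wise product of two IVs."""
    return [a * b for a, b in zip(x1, x2)]


def squared_term(x):
    """Powered term that can be added to test curvilinearity."""
    return [a ** 2 for a in x]


# Hypothetical data for four cases (illustration only).
marital_status = ["single", "married", "divorced", "married"]
stress = [-1.0, 0.0, 1.0, 2.0]

dummies = dummy_code(marital_status, reference="single")
married_x_stress = product_term(dummies["married"], stress)
stress_squared = squared_term(stress)

print(dummies)            # two 0/1 columns for a three-category IV
print(married_x_stress)   # interaction column
print(stress_squared)     # curvilinear column
```

Each resulting column would receive its own slope in either a GLM or a GZLM; only the estimation method and the link differ.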


Common Assumptions

• Correct model specification
• Variables measured without error
• Independent errors
• No perfect multicollinearity


Correct Model Specification

• Have you included relevant IVs?
• Have you excluded irrelevant IVs?
• Do the IVs that you have included have linear or non-linear relationships with your DV (or some function of your DV, as discussed below)?
• Are one or more of your IVs moderated by other IVs (i.e., are there interaction effects)?


Variables Measured without Error

• Limitation of regression models, given that most often our variables contain some measurement error


Independent Errors

• Non-independent errors can be a result of study design, e.g.:
  – Clustered data, which occurs when data are collected from groups
  – Temporally linked data, which occurs when data are collected repeatedly over time from the same people or groups
• Non-independent errors can lead to incorrect significance tests and confidence intervals


Independent Errors (cont’d)

• Examples of when this might not be true:
  – Effect of parenting practices on behavioral problems of children, with reports of parenting practices and behavioral problems collected from both parents in two-parent families
  – Effect of parenting practices on behavioral problems of children, with information collected about behavioral problems for two or more children per family
  – Effects of leader behaviors on group cohesion in small groups, with information collected about leader behaviors and group cohesion from all members of each group


No Perfect Multicollinearity

• Perfect multicollinearity exists when an IV is predicted perfectly by a linear combination of the remaining IVs
• Multicollinearity is typically quantified by “tolerance” or the “variance inflation factor” (VIF = 1/tolerance)
• Even high levels of multicollinearity, short of perfect, may pose problems (e.g., tolerance < .20 or especially < .10, i.e., VIF > 5 or especially > 10)
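The tolerance and VIF calculation can be sketched as follows. This is a simplified illustration that assumes exactly two IVs, in which case the R² from regressing one IV on the other equals their squared correlation; with more IVs a full regression of each IV on the rest would be needed. The data values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def tolerance_two_ivs(x1, x2):
    """Tolerance = 1 - R^2; with only two IVs, R^2 is the squared correlation."""
    return 1.0 - pearson_r(x1, x2) ** 2


def vif(tolerance):
    """Variance inflation factor is the reciprocal of tolerance."""
    return 1.0 / tolerance


# Hypothetical, strongly related IVs (illustration only).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.2, 1.9, 3.2, 3.8, 5.1]

tol = tolerance_two_ivs(x1, x2)
print(f"tolerance = {tol:.3f}, VIF = {vif(tol):.1f}")
```

With these made-up values the tolerance falls well below .10, the range the slide flags as especially problematic.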


Estimating Parameters (e.g., $\alpha$, $\beta$)

• GLM
  – Ordinary Least Squares (OLS) estimation
    • Estimates minimize sum of the squared differences between observed and estimated values of the DV
    • http://www.ruf.rice.edu/~lane/stat_sim/reg_by_eye/index.html
• GZLM
  – Maximum Likelihood (ML) estimation
    • Estimates have greatest likelihood (i.e., the maximum likelihood) of generating observed sample data if model assumptions are true
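To make the contrast concrete, here is a minimal sketch of both estimation approaches for a single IV. OLS uses the closed-form least-squares solution. ML is illustrated for a Poisson model with a log link, maximized by Newton-Raphson iteration; that optimizer is an assumption chosen for illustration, and statistical packages may use a different algorithm. The data are hypothetical.

```python
import math


def ols_fit(x, y):
    """Closed-form OLS for one IV: minimizes the sum of squared residuals."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return mean_y - slope * mean_x, slope


def poisson_loglik(alpha, beta, x, y):
    """Poisson log-likelihood with a log link: ln(mu) = alpha + beta * x."""
    return sum(b * (alpha + beta * a) - math.exp(alpha + beta * a)
               - math.lgamma(b + 1) for a, b in zip(x, y))


def poisson_ml_fit(x, y, iterations=25):
    """Newton-Raphson search for the (alpha, beta) maximizing the log-likelihood."""
    alpha, beta = math.log(sum(y) / len(y)), 0.0  # start at intercept-only model
    for _ in range(iterations):
        mu = [math.exp(alpha + beta * a) for a in x]
        # Gradient (score) of the log-likelihood.
        g0 = sum(b - m for b, m in zip(y, mu))
        g1 = sum(a * (b - m) for a, b, m in zip(x, y, mu))
        # 2 x 2 information matrix (negative Hessian).
        i00 = sum(mu)
        i01 = sum(a * m for a, m in zip(x, mu))
        i11 = sum(a * a * m for a, m in zip(x, mu))
        det = i00 * i11 - i01 * i01
        alpha += (i11 * g0 - i01 * g1) / det
        beta += (i00 * g1 - i01 * g0) / det
    return alpha, beta


# Hypothetical counts for eight cases (illustration only).
x = [-2, -1, 0, 1, 2, -1, 0, 1]
y = [0, 1, 1, 2, 3, 0, 1, 1]

ols_intercept, ols_slope = ols_fit(x, y)
alpha_hat, beta_hat = poisson_ml_fit(x, y)
print(f"OLS: intercept = {ols_intercept:.3f}, slope = {ols_slope:.3f}")
print(f"ML (Poisson): alpha = {alpha_hat:.3f}, beta = {beta_hat:.3f}")
```

The OLS slope is on the scale of the DV itself, while the Poisson slope is on the scale of the log of the mean count, so the two sets of estimates are not directly comparable.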


Testing Hypotheses

• Overall and nested models ($\beta_1 = \beta_2 = \dots = \beta_k = 0$)
  – GLM
    • F
  – GZLM
    • Likelihood ratio $\chi^2$
• Individual slopes ($\beta = 0$)
  – GLM
    • t
  – GZLM
    • Wald $\chi^2$ or likelihood ratio $\chi^2$
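The two GZLM test statistics can be sketched in a few lines. The likelihood ratio statistic is twice the difference in log-likelihoods between the fuller and the reduced model, and the Wald statistic is the squared ratio of a slope to its standard error. The p-value helper below assumes 1 degree of freedom (one slope tested), and every input number is hypothetical.

```python
import math


def likelihood_ratio_chi2(loglik_reduced, loglik_full):
    """LR chi-square: twice the gain in log-likelihood from the added IVs."""
    return 2.0 * (loglik_full - loglik_reduced)


def wald_chi2(slope, standard_error):
    """Wald chi-square for a single slope: (estimate / SE) squared."""
    return (slope / standard_error) ** 2


def chi2_p_value_df1(chi2):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical log-likelihoods, slope, and standard error (illustration only).
lr = likelihood_ratio_chi2(loglik_reduced=-100.0, loglik_full=-98.0)
wald = wald_chi2(slope=0.5, standard_error=0.25)

print(f"LR chi-square = {lr:.2f}, p = {chi2_p_value_df1(lr):.4f}")
print(f"Wald chi-square = {wald:.2f}, p = {chi2_p_value_df1(wald):.4f}")
```

For tests involving more than one slope, the degrees of freedom equal the number of slopes constrained to zero, and a general chi-square distribution function would be needed in place of the 1-df shortcut.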


Estimating DV with GLM

• Three ways of expressing the same thing…
  – $\mu = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k$
  – $\mu = \eta$
    • Assumed linear relationship
• $\mu$ = Greek letter mu
  – Estimated mean value of DV
• $\eta$ = Greek letter eta
  – Linear predictor


Estimating DV with Poisson Regression

• $\ln(\mu) = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k$
• $\ln(\mu) = \eta$
  – Assumed linear relationship


Single (Quantitative) IV Example

• DV = number of foster children adopted
• IV = Perceived responsibility for parenting (scale scores transformed to z-scores)
• N = 285 foster mothers
• Do foster mothers who feel a greater responsibility to parent foster children adopt more foster children?


Poisson Model

• $\ln(\mu) = \alpha + \beta X$
• Log of estimated mean count $= .018 + (.185)(X)$
  – Log of mean number of children adopted
  – Does not have intuitive or substantive meaning


Mathematical Functions

• Function
  – $\sqrt{4} = 2$
• Inverse (reverse) function
  – $2^2 = 4$


Mathematical Functions (cont’d)

• Function
  – $\ln(\mu)$, natural logarithm of $\mu$
  – “Link function”
• Inverse (reverse) function
  – $\exp(\eta)$, exponential of $\eta$
    • $e^x$ on calculator
    • exp(x) in SPSS and Excel
  – “Inverse link function”


Link Function

• $\ln(\mu)$, log of estimated mean count
• Connects (i.e., links) mean value of DV to linear combination of IVs
• Transforms relationship between $\mu$ and $\eta$ so relationship is linear
• Different GZLM models use different links
• Does not have intuitive or substantive meaning


Inverse (Reverse) Link Function

• Three ways of expressing the same thing…
  – $\mu = \exp(\alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k)$
  – $\mu = \exp(\eta)$
  – $\mu = e^{\eta}$
• Represent values of the DV with intuitive and substantive meaning
  – e.g., mean number of children adopted


Estimated Mean DV

$.018 + (.185)(X)$

• X = 0
  – $.018 + (.185)(0) = .018$
  – $e^{.018} = 1.018$
  – M = 1.02 children adopted
• X = 1
  – $.018 + (.185)(1) = .203$
  – $e^{.203} = 1.225$
  – M = 1.23 children adopted
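The same arithmetic can be reproduced with a short Python sketch that applies the inverse link to the linear predictor. The intercept (.018) and slope (.185) come from the slides; the function names are illustrative choices.

```python
import math

ALPHA = 0.018  # intercept reported in the slides
BETA = 0.185   # slope reported in the slides


def linear_predictor(x):
    """eta = alpha + beta * X, the log of the estimated mean count."""
    return ALPHA + BETA * x


def inverse_link(eta):
    """Inverse of the log link: converts eta back to a mean count."""
    return math.exp(eta)


def link(mu):
    """Log link: converts a mean count to the linear-predictor scale."""
    return math.log(mu)


for x in (0, 1):
    eta = linear_predictor(x)
    mu = inverse_link(eta)
    print(f"X = {x}: eta = {eta:.3f}, estimated mean = {mu:.3f}")
```

Applying `link` to the output of `inverse_link` returns the original linear predictor, which is what makes the two functions inverses of each other.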


Examples of Exponentiation

• $e^{0} = 1.00$
• $e^{.50} = 1.65$
• $e^{1.00} = 2.72$


Problem

• For discrete DVs the relationship between the DV ($\mu$) and the linear predictor ($\eta$) is non-linear
  – $\eta = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k$
  – Relationship between $\mu$ and $\eta$: non-linear
• One-unit increase in an IV may be associated with a different amount of change in the mean DV, depending on the initial value of the IV


Example Non-linear Relationship

[Figure: Mean Number of Children (y-axis, 0.00 to 2.00) by Standardized Parenting Responsibility (x-axis)]

Standardized Parenting Responsibility:  -3    -2    -1    0     1     2     3
Mean Number of Children:                0.58  0.70  0.85  1.02  1.23  1.47  1.77
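The plotted means can be regenerated from the Poisson model reported earlier (intercept .018, slope .185). The sketch below also computes the change in the mean for each one-unit increase in the IV, which shows that the change grows as the IV increases.

```python
import math

ALPHA, BETA = 0.018, 0.185  # coefficients reported in the slides


def mean_count(x):
    """Estimated mean number of children at a given standardized IV value."""
    return math.exp(ALPHA + BETA * x)


xs = range(-3, 4)
means = [mean_count(x) for x in xs]
# Change in the mean for each one-unit increase in the IV.
changes = [later - earlier for earlier, later in zip(means, means[1:])]

for x, m in zip(xs, means):
    print(f"X = {x:+d}: mean = {m:.2f}")
print("change per one-unit increase:", [round(c, 2) for c in changes])
```

Because the changes are unequal, a straight line cannot describe the relationship between the IV and the mean count.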


Solution

Linear relationship between a linear combination of one or more IVs and some function of the DV


Example Linear Relationship

[Figure: ln(Mean Number of Children) (y-axis, -0.60 to 0.80) by Standardized Parenting Responsibility (x-axis)]

Standardized Parenting Responsibility:  -3     -2     -1     0     1     2     3
ln(Mean Number of Children):            -0.54  -0.35  -0.17  0.02  0.20  0.39  0.57
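As a brief check, taking the natural log of the model's estimated means recovers the plotted values and shows that each one-unit increase in the IV adds the same amount on the log scale. The coefficients (.018 and .185) are those reported earlier in the slides; the rest is an illustrative sketch.

```python
import math

ALPHA, BETA = 0.018, 0.185  # coefficients reported in the slides

xs = range(-3, 4)
means = [math.exp(ALPHA + BETA * x) for x in xs]
log_means = [math.log(m) for m in means]  # apply the log link to each mean
# Change in ln(mean) for each one-unit increase in the IV.
steps = [later - earlier for earlier, later in zip(log_means, log_means[1:])]

for x, value in zip(xs, log_means):
    print(f"X = {x:+d}: ln(mean) = {value:.2f}")
print("step per one-unit increase:", [round(s, 3) for s in steps])
```

Every step equals the slope, which is why the relationship is a straight line once the DV is expressed as ln(mean).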