Binomial Logit Regression Model - wordpress.com, Nov 12, 2019
Categorical Data Analysis - STK654 (Final Exam Material)
Dr. Kusman Sadik, M.Si
Master's Program in Applied Statistics
Department of Statistics, IPB, Odd Semester 2019/2020
IPB University, Bogor, Indonesia: Inspiring Innovation with Integrity
Binomial Logit Regression Model (Part II: Categorical Predictors)
In the case of logistic regression, the response
variable is a binary or dichotomous variable, which
means it can only take on one of two possible values.
Case: logistic regression models in which the
predictors are categorical or qualitative variables (such
as gender, location, and socioeconomic status).
All of the material on logistic regression modeling
remains the same, but the coding of the predictors
(dummy coding) and the interpretation of the regression
coefficients change because the predictors are
categorical.
The interpretation of the model parameters
(intercept, slope) discussed for continuous predictor
variables does not change fundamentally for
categorical predictor variables.
The main difference between quantitative or
continuous predictors and qualitative or
categorical predictors is that the latter need to be
coded such that (C – 1) indicator variables are
required to represent a total of C categories.
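As a minimal sketch of this coding in R: `model.matrix()` expands a factor with C categories into an intercept plus C - 1 indicator columns. The `ses` vector below is made-up illustration data, not taken from the lecture.

```r
# Hypothetical SES factor with C = 3 categories (illustrative values only)
ses <- factor(c("high", "moderate", "low", "low", "high"),
              levels = c("high", "moderate", "low"))

# R's default treatment (dummy) coding: intercept + (C - 1) indicator columns
X <- model.matrix(~ ses)
print(X)

n_categories <- nlevels(ses)   # C
n_indicators <- ncol(X) - 1    # indicator columns, excluding the intercept
```

Note that R's default treatment contrasts take the first level as the baseline, so `high` has no column of its own here.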
When dummy coding is used, the last category of
the variable is used as a reference category.
Therefore, the parameter associated with the last
category is set to zero, and each of the
remaining parameters of the model is interpreted
relative to the last category.
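R's default treatment contrasts use the first factor level as the reference, so following the last-category convention described above takes an explicit `relevel()`. A small sketch, again with a hypothetical three-level SES factor:

```r
# Hypothetical SES factor (illustrative values only)
ses <- factor(c("high", "moderate", "low", "moderate", "low", "high"),
              levels = c("high", "moderate", "low"))

# Pick the last category and make it the reference (relevel moves it to the front)
last_level   <- levels(ses)[nlevels(ses)]
ses_ref_last <- relevel(ses, ref = last_level)

levels(ses_ref_last)                      # reference level is listed first
design_cols <- colnames(model.matrix(~ ses_ref_last))
design_cols                               # no indicator column for the reference
```

The coefficient of each remaining indicator is then read as a contrast with the reference category.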
X (Gender): 0 = male, 1 = female
X (SES): high, moderate, low
Inference
Note: the G² test is the same as the deviance test.
Interaction Effects
Gender, SES, Interaction
reference category
# Logistic model for the Horseshoe Crab data (Agresti, 5.4.4)
dataku <- read.csv(file = "Data-Horseshoe.Crab-Agresti.csv")
col <- factor(dataku[, 1])         # color (renamed from `c`, which masks base c())
s   <- factor(dataku[, 2])         # spine condition
w   <- dataku[, 3]                 # width
wt  <- dataku[, 4]                 # weight
sa  <- dataku[, 5]                 # number of satellites
y   <- ifelse(sa > 0, 1, 0)        # response: 1 = at least one satellite, 0 = none
color <- relevel(col, ref = "4")   # color 4 is the reference category
width <- w
data.frame(color, s, width, wt, sa, y)
# Binomial logit model with a categorical (color) and a continuous (width) predictor
model <- glm(y ~ color + width, family = binomial(link = "logit"))
summary(model)
dugaan <- round(fitted(model), 2)  # fitted probabilities ("dugaan" = estimate)
data.frame(color, width, y, dugaan)
    color s width   wt sa y
1       2 3  28.3 3.05  8 1
2       3 3  26.0 2.60  4 1
3       3 3  25.6 2.15  0 0
4       4 2  21.0 1.85  0 0
5       2 3  29.0 3.00  1 1
6       1 2  25.0 2.30  3 1
7       4 3  26.2 1.30  0 0
8       2 3  24.9 2.10  0 0
...
171     2 3  26.5 2.75  7 1
172     3 3  26.1 2.75  3 1
173     2 2  24.5 2.00  0 0
Call:
glm(formula = y ~ color + width, family = binomial(link = logit))
Coefficients:
           Estimate Std. Error z value Pr(>|z|)
Intercept  -12.7151     2.7617  -4.604 4.14e-06 ***
color1       1.3299     0.8525   1.560   0.1188
color2       1.4023     0.5484   2.557   0.0106 *
color3       1.1061     0.5921   1.868   0.0617 .
width        0.4680     0.1055   4.434 9.26e-06 ***
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’
Null deviance: 225.76 on 172 degrees of freedom
Residual deviance: 187.46 on 168 degrees of freedom
AIC: 197.46
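One common way to read these estimates is on the odds scale, by exponentiating each coefficient. The sketch below types in the estimates from the output above rather than refitting the model; the object names are illustrative.

```r
# Estimates copied from the summary output above (reference color = 4)
est <- c(color1 = 1.3299, color2 = 1.4023, color3 = 1.1061, width = 0.4680)

# exp(beta) is the estimated odds ratio:
#  - for a color indicator: odds of y = 1 relative to color 4, at a fixed width
#  - for width: multiplicative change in the odds per 1-unit increase in width
odds_ratio <- exp(est)
print(round(odds_ratio, 2))
```

Because color 4 is the reference, each color odds ratio compares that color with color 4, not with the other colors.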
    color width y dugaan
1       2  28.3 1   0.87
2       3  26.0 1   0.64
3       3  25.6 0   0.59
4       4  21.0 0   0.05
5       2  29.0 1   0.91
6       1  25.0 1   0.58
7       4  26.2 0   0.39
8       2  24.9 0   0.58
...
171     2  26.5 1   0.75
172     3  26.1 1   0.65
173     2  24.5 0   0.54
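The `dugaan` column can be checked by hand from the printed coefficients: the linear predictor is passed through the inverse logit. A sketch for the first and fourth crabs, using the rounded estimates from the summary output (so agreement is only up to rounding); `fitted_prob` is a helper name introduced here for illustration.

```r
# Rounded estimates from the summary output; color 4 is the reference (effect 0)
b0      <- -12.7151
b_color <- c("1" = 1.3299, "2" = 1.4023, "3" = 1.1061, "4" = 0)
b_width <- 0.4680

fitted_prob <- function(color, width) {
  eta <- b0 + b_color[[as.character(color)]] + b_width * width  # linear predictor (logit)
  plogis(eta)                                                   # inverse logit
}

p1 <- fitted_prob(color = 2, width = 28.3)  # crab 1 in the table
p4 <- fitted_prob(color = 4, width = 21.0)  # crab 4 in the table
round(c(p1, p4), 2)
```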
H0 : β1 = β2 = β3 = 0
Model under H0:
Call: glm(formula = y ~ width, family = binomial(link = logit))
Null deviance: 225.76 on 172 degrees of freedom
Residual deviance: 194.45 on 171 degrees of freedom
AIC: 198.45
Model under H1:
Call: glm(formula = y ~ color + width, family = binomial(link = logit))
Null deviance: 225.76 on 172 degrees of freedom
Residual deviance: 187.46 on 168 degrees of freedom
AIC: 197.46
What is the conclusion of this deviance test?
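The comparison can be sketched directly from the two residual deviances printed above: the test statistic is their difference, referred to a chi-square distribution whose degrees of freedom equal the difference in residual degrees of freedom. Variable names are illustrative.

```r
# Residual deviances and degrees of freedom copied from the two outputs above
dev_h0 <- 194.45; df_h0 <- 171   # y ~ width
dev_h1 <- 187.46; df_h1 <- 168   # y ~ color + width

G2 <- dev_h0 - dev_h1            # deviance (likelihood-ratio) statistic
df <- df_h0 - df_h1              # number of color parameters tested
p_value <- pchisq(G2, df = df, lower.tail = FALSE)

c(G2 = G2, df = df, p_value = p_value)
```

H0 is rejected at level α when `p_value < α`. With the two fitted `glm` objects at hand, `anova()` with `test = "Chisq"` should report the same comparison.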
How do we test whether there is an interaction between
“Color” and “Width”?
What are the hypotheses H0 and H1?
How is it implemented in R?
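One possible approach, sketched under assumptions: fit the main-effects model and the model with the color-by-width interaction, then compare them with a deviance test. Here H0 states that all color-by-width interaction coefficients are zero, and H1 that at least one is not. Because the CSV file is not included here, the data below are simulated stand-ins and all object names are illustrative; with the real data, `y`, `color`, and `width` from the script above would be used instead.

```r
# Simulated stand-in data (NOT the horseshoe crab data)
set.seed(1)
n     <- 200
color <- relevel(factor(sample(1:4, n, replace = TRUE)), ref = "4")
width <- round(rnorm(n, mean = 26, sd = 2), 1)
y     <- rbinom(n, 1, plogis(-12 + 0.45 * width))

# H0 model: main effects only; H1 model: adds the color:width interaction
m_main <- glm(y ~ color + width, family = binomial(link = "logit"))
m_int  <- glm(y ~ color * width, family = binomial(link = "logit"))

# Deviance (likelihood-ratio) test of the interaction terms
lrt <- anova(m_main, m_int, test = "Chisq")
print(lrt)
```

With four color categories the interaction adds three parameters, so the test has 3 degrees of freedom.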
1. Use R for the Horseshoe Crabs Revisited data
(Agresti, Section 5.4.4).
a. Fit a logistic regression model with Width (x) and Color (c)
as predictors. Compare the R output with the SAS output in
Agresti's book. Explain the interpretation.
b. Fit a logistic regression model with Width (x), Color (c),
and Spine (s) as predictors, without interaction. Is the effect
of Spine significant? Use the deviance test at α = 0.05.
c. For the model in part (b), carry out a deviance test at
α = 0.05 to determine whether there is an interaction between
Color and Spine. Explain the interpretation.
2. Use R to solve Problem 9.5 (Azen, p. 241).
References
1. Azen, R. and Walker, C.M. (2011). Categorical Data
Analysis for the Behavioral and Social Sciences.
New York: Routledge, Taylor and Francis Group.
2. Agresti, A. (2002). Categorical Data Analysis, 2nd ed.
New York: Wiley.
3. Other relevant references.
Available for download at
kusmansadik.wordpress.com
Thank You