Creating structured and flexible models: some open problems (gelman/presentations/mittalk2.pdf)

Outline: Weakly informative priors · Interactions in before-after studies · Interactions in regressions · Conclusions

Creating structured and flexible models: some open problems

Andrew Gelman
Department of Statistics and Department of Political Science
Columbia University

8 June 2009

Themes

- Goal: models that are structured enough to be able to learn from data but not be so strong as to overwhelm the data
- Collaborators: Joe Bafumi, Valerie Chan, Samantha Cook, Jeronimo Cortina, Zaiying Huang, Aleks Jakulin, Jouni Kerman, Gary King, Iain Pardoe, David Park, Maria Grazia Pittau, Boris Shor, Matt Stevens, Yu-Sung Su, Francis Tuerlinckx, Masanao Yajima, Shouhao Zhou

Section overview: Separation in logistic regression · Bayesian solution · Prior information · Evaluation using a corpus of datasets

Logistic regression

[Figure: the inverse-logit curve y = logit⁻¹(x), plotted for x from −6 to 6 and rising from 0 to 1; its slope at the center is 1/4.]
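The 1/4 slope at the curve's center can be checked numerically. A quick sketch in Python (the talk's own examples use R):

```python
import numpy as np

def invlogit(x):
    """Inverse logit: maps the real line onto (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

# The derivative of invlogit is p*(1-p), which peaks at p = 0.5,
# giving a maximum slope of 0.5 * 0.5 = 1/4. Check with a
# central difference at x = 0:
h = 1e-6
slope_at_zero = (invlogit(h) - invlogit(-h)) / (2 * h)
print(round(slope_at_zero, 6))  # prints 0.25
```

This is the basis of the "divide by 4" shortcut for reading logistic regression coefficients as maximum changes in probability.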

A clean example

[Figure: binary data y against x from −10 to 20, with fitted curve estimated Pr(y = 1) = logit⁻¹(−1.40 + 0.33x); slope at the midpoint is 0.33/4.]

The problem of separation

[Figure: binary data in which every y = 0 has x < 0 and every y = 1 has x > 0; the fitted logistic curve steepens into a step function (slope = infinity?).]

Separation is no joke!

glm(vote ~ female + black + income, family=binomial(link="logit"))

                  1960               1968
             coef.est coef.se   coef.est coef.se
(Intercept)    -0.14   0.23       0.47   0.24
female          0.24   0.14      -0.01   0.15
black          -1.03   0.36      -3.64   0.59
income          0.03   0.06      -0.03   0.07

                  1964               1972
             coef.est coef.se   coef.est coef.se
(Intercept)    -1.15   0.22       0.67   0.18
female         -0.09   0.14      -0.25   0.12
black         -16.83 420.40      -2.63   0.27
income          0.19   0.06       0.09   0.05
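The 1964 column shows what separation does to maximum likelihood: when a predictor value perfectly predicts the outcome in the sample, the likelihood has no maximum and the estimate drifts toward infinity. A toy demonstration in Python (made-up data, not the survey above):

```python
import numpy as np

# Made-up, perfectly separated data: y = 1 exactly when x > 0.
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([0.0, 0.0, 1.0, 1.0])

def grad_loglik(b):
    """Gradient of the logistic log-likelihood, slope-only model."""
    p = 1.0 / (1.0 + np.exp(-b * x))
    return np.sum((y - p) * x)

# Gradient ascent: under separation the gradient stays positive,
# so the "maximum likelihood" slope keeps growing (roughly like
# the log of the iteration count) and never converges.
b, snapshots = 0.0, []
for t in range(1, 30001):
    b += 0.5 * grad_loglik(b)
    if t in (100, 1000, 30000):
        snapshots.append(b)
print(snapshots)  # strictly increasing, no limit
```

glm's iterative fit stops somewhere along this path, which is why the 1964 output reports an arbitrary huge coefficient with an enormous standard error.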

bayesglm()

- Bayesian logistic regression
- In the arm (Applied Regression and Multilevel modeling) package in R
- Replaces glm(); estimates are more numerically and computationally stable
- Student-t prior distributions for regression coefs
- Uses an EM-like algorithm
- We went inside glm.fit to augment the iteratively weighted least squares step
- Default choices for tuning parameters (we'll get back to this!)
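The core idea can be sketched without arm's EM-within-IWLS machinery: maximize the log-likelihood plus independent heavy-tailed log-priors. A minimal MAP sketch in Python (this is an illustration of the idea, not the package's actual algorithm; the toy data are mine):

```python
import numpy as np
from scipy.optimize import minimize

# Toy separated data: the ML slope is infinite, but the
# posterior mode under a Cauchy(0, 2.5) prior is not.
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([0.0, 0.0, 1.0, 1.0])

def neg_log_posterior(beta, scale=2.5):
    eta = beta[0] + beta[1] * x
    # Stable Bernoulli log-likelihood: y*eta - log(1 + exp(eta))
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
    # Independent Cauchy(0, scale) log-priors, up to a constant
    logprior = -np.sum(np.log1p((beta / scale) ** 2))
    return -(loglik + logprior)

fit = minimize(neg_log_posterior, np.zeros(2))
intercept, slope = fit.x  # slope comes out finite and moderate
```

Generic optimization works here; the package instead augments the iteratively weighted least squares step so the prior costs essentially nothing on top of a glm fit.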

Regularization in action!

What else is out there?

- glm (maximum likelihood): fails under separation, gives noisy answers for sparse data
- Augment with prior "successes" and "failures": doesn't work well for multiple predictors
- brlr (Jeffreys-like prior distribution): computationally unstable
- brglm (improvement on brlr): doesn't do enough smoothing
- BBR (Laplace prior distribution): OK, not quite as good as bayesglm
- Non-Bayesian machine learning algorithms: understate uncertainty in predictions

Information in prior distributions

- Informative prior dist
  - A full generative model for the data
- Noninformative prior dist
  - Let the data speak
  - Goal: valid inference for any θ
- Weakly informative prior dist
  - Purposely include less information than we actually have
  - Goal: regularization, stabilization

Weakly informative priors for logistic regression coefficients

- Separation in logistic regression
- Some prior info: logistic regression coefs are almost always between −5 and 5:
  - 5 on the logit scale takes you from 0.01 to 0.50, or from 0.50 to 0.99
  - Smoking and lung cancer
- Independent Cauchy prior dists with center 0 and scale 2.5
- Rescale each predictor to have mean 0 and sd 1/2
- Fast implementation using EM; easy adaptation of glm
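The rescaling step is a one-liner. A Python sketch (the helper name is mine, not arm's):

```python
import numpy as np

def rescale(x):
    """Shift a numeric predictor to mean 0 and rescale it to
    sd 1/2, so one prior scale (2.5) is reasonable for every
    coefficient and numeric inputs are roughly comparable to
    centered binary ones."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / (2 * x.std())

z = rescale([1, 2, 3, 4, 5])
# z now has mean 0 and standard deviation 0.5
```

With inputs on this common scale, a coefficient of 5 corresponds to a two-sd change in the predictor swinging the outcome across most of the probability range, which is what makes the "coefs are almost always between −5 and 5" prior defensible.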

Prior distributions

[Figure: candidate prior densities for θ, plotted over −10 to 10.]

Another example

Dose    #deaths/#animals
−0.86         0/5
−0.30         1/5
−0.05         3/5
 0.73         5/5

- Slope of a logistic regression of Pr(death) on dose:
  - Maximum likelihood est is 7.8 ± 4.9
  - With weakly informative prior: Bayes est is 4.4 ± 1.9
- Which is truly conservative?
- The sociology of shrinkage
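The maximum likelihood fit can be reproduced directly from the table. A Python sketch using generic optimization (the talk's analyses are in R):

```python
import numpy as np
from scipy.optimize import minimize

# Dose-response data from the table: deaths out of n = 5 per dose.
dose   = np.array([-0.86, -0.30, -0.05, 0.73])
deaths = np.array([0.0, 1.0, 3.0, 5.0])
n = 5.0

def neg_loglik(beta):
    eta = beta[0] + beta[1] * dose
    # Binomial log-likelihood in the stable form
    # y*eta - n*log(1 + exp(eta))
    return -np.sum(deaths * eta - n * np.logaddexp(0.0, eta))

fit = minimize(neg_loglik, np.zeros(2))
intercept, slope = fit.x  # slope should land near the 7.8 quoted above
```

With only 20 animals the likelihood surface is flat and heavy-tailed, so the weakly informative prior pulls the slope roughly in half while shrinking its standard error much more.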

Maximum likelihood and Bayesian estimates

[Figure: probability of death vs. dose for the data above, with fitted curves from glm and bayesglm; the glm curve is steeper.]

Conservatism of Bayesian inference

- Problems with maximum likelihood when data show separation:
  - Coefficient estimate of −∞
  - Estimated predictive probability of 0 for new cases
- Is this conservative?
- Not if evaluated by log score or predictive log-likelihood
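The log-score point is stark: a fit that assigns probability 0 to an event that then occurs scores minus infinity, so the degenerate ML fit is the opposite of conservative. A sketch in Python with made-up predictions:

```python
import numpy as np

def log_score(p_pred, y):
    """Sum of log predictive probabilities of the observed outcomes."""
    p_obs = np.where(y == 1, p_pred, 1 - p_pred)
    with np.errstate(divide="ignore"):
        return np.sum(np.log(p_obs))

y_new = np.array([1, 0, 1])

# Degenerate fit under separation: certain, and wrong once.
print(log_score(np.array([0.0, 0.0, 1.0]), y_new))  # prints -inf
# Regularized fit: hedged, never ruled anything out.
print(log_score(np.array([0.9, 0.1, 0.9]), y_new))  # finite
```

One confidently wrong prediction is enough to make the unregularized model infinitely worse under this criterion.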


Which one is conservative?

[Figure: probability of death vs. dose (0 to 20), comparing the glm and bayesglm fitted curves]


Prior as population distribution

I Consider many possible datasets

I The “true prior” is the distribution of β’s across these datasets

I Fit one dataset at a time

I A “weakly informative prior” has less information (wider variance) than the true prior

I Open question: How to formalize the tradeoffs from using different priors?
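The first bullets can be made concrete with a toy simulation (not from the talk): draw β's from a known "true prior", observe each with noise, and shrink each estimate with normal priors of varying scale, using the normal-normal posterior-mean factor s²/(s² + σ²) as a stand-in for a full Bayesian fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# "True prior": the betas underlying many hypothetical datasets
true_beta = rng.normal(0.0, 1.0, size=20_000)

# Each dataset yields a noisy estimate of its beta (sampling sd sigma)
sigma = 1.0
est = true_beta + rng.normal(0.0, sigma, size=true_beta.size)

# Posterior mean under a N(0, s^2) prior shrinks the estimate by
# s^2 / (s^2 + sigma^2); compare average error across prior scales
mse = {}
for s in [0.5, 1.0, 2.5, 10.0]:
    shrink = s**2 / (s**2 + sigma**2)
    mse[s] = np.mean((shrink * est - true_beta) ** 2)
    print(f"prior scale {s:>4}: mean squared error {mse[s]:.3f}")
```

The error is minimized when the prior scale matches the true prior (s = 1 here); wider, weakly informative scales pay an efficiency cost, which is one way to begin quantifying the tradeoff in the last bullet.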


Evaluation using a corpus of datasets

I Compare classical glm to Bayesian estimates using various prior distributions

I Evaluate using 5-fold cross-validation and average predictive error

I The optimal prior distribution for β’s is (approx) Cauchy(0, 1)

I Our Cauchy(0, 2.5) prior distribution is weakly informative!
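The evaluation scheme can be sketched as follows. This is an illustrative stand-in, not the corpus study itself: a single simulated logistic dataset, a Gaussian prior in place of the Cauchy family (so the MAP fit is just penalized IRLS), and 5-fold cross-validation of the held-out log-likelihood across prior scales.

```python
import numpy as np

rng = np.random.default_rng(1)

# One simulated logistic-regression dataset (stand-in for a corpus)
n_obs, n_pred = 200, 5
X = rng.normal(size=(n_obs, n_pred))
true_beta = rng.normal(0.0, 1.0, n_pred)
y = (rng.random(n_obs) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

def fit_map(X, y, scale, iters=25):
    """Logistic MAP estimate under independent N(0, scale^2) priors (penalized IRLS)."""
    beta = np.zeros(X.shape[1])
    lam = 1.0 / scale**2
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        hess = X.T @ (w[:, None] * X) + lam * np.eye(X.shape[1])
        grad = X.T @ (y - p) - lam * beta
        beta = beta + np.linalg.solve(hess, grad)  # Newton step
    return beta

def avg_test_loglik(scale, k=5):
    """Average held-out log-likelihood per observation, k-fold CV."""
    idx = np.arange(n_obs)
    total = 0.0
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        beta = fit_map(X[train], y[train], scale)
        p = np.clip(1.0 / (1.0 + np.exp(-X[fold] @ beta)), 1e-12, 1 - 1e-12)
        total += np.sum(y[fold] * np.log(p) + (1 - y[fold]) * np.log(1 - p))
    return total / n_obs

for scale in [0.5, 1.0, 2.5, 10.0]:
    print(f"prior scale {scale:>4}: held-out log-lik per obs {avg_test_loglik(scale):.3f}")
```

The slides' comparison is this computation run over many real datasets, with the Cauchy/t family of priors in place of the Gaussian used here.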


Expected predictive loss, averaged over a corpus of datasets

[Figure: −log test likelihood (0.29–0.33) vs. scale of prior (0–5), for t priors with df = 0.5, 1, 2, 4, 8 and for BBR(l) and BBR(g); the GLM value (1.79) lies far above the plotted range]


Other examples of weakly informative priors

I Variance parameters

I Covariance matrices

I Population variation in a physiological model

I Mixture models

I Intentional underpooling in hierarchical models


End of part 1

I “Noninformative priors” are actually weakly informative

I “Weakly informative” is a more general and useful concept
I Regularization

I Better inferences
I Stability of computation (bayesglm)

I Why use weakly informative priors rather than informative priors?

I Conformity with statistical culture (“conservatism”)
I Labor-saving device
I Robustness


Legislative redistricting · Educational experiment · Incumbency advantage · Hierarchical models for treatment interactions

No-interaction model

I Before-after data with treatment and control groups
I Default model: constant treatment effects

I Fisher’s classical null hypothesis: effect is zero for all cases
I Regression model: yi = Tiθ + Xiβ + εi

[Figure: sketch of the “after” measurement y vs. the “before” measurement x, with lines for the control and treatment groups]


Actual data show interactions

I Treatment interacts with “before” measurement

I Before-after correlation is higher for controls than for treated units

I Examples:
I An observational study of legislative redistricting
I An experiment with pre-test, post-test data
I Congressional elections with incumbents and open seats


Observational study of legislative redistricting: before-after data

[Figure: estimated partisan bias (adjusted for state) vs. estimated partisan bias in the previous election, both on a −0.05 to 0.05 scale, with annotations marking the directions favoring Democrats and Republicans; points are coded as no redistricting, bipartisan redistricting, Democratic redistricting, or Republican redistricting]


Educational experiment: correlation between pre-test and post-test data for controls and for treated units

[Figure: before-after correlation (0.8–1.0) by grade (1–4), plotted separately for controls and treated units]


Correlation between two successive Congressional elections, for incumbents running (controls) and open seats (treated)

[Figure: correlation (0.0–0.8) by election year (1900–2000), plotted separately for incumbents and open seats]


Interactions as variance components

I yi = Tiθ + Xiβ + εi + ηi

I Unit-level “error term” ηi

I For control units, ηi persists from time 1 to time 2
I For treatment units, ηi changes:

I Subtractive treatment error (ηi only at time 1)
I Additive treatment error (ηi only at time 2)
I Replacement treatment error

I Under all these models, the before-after correlation is higher for controls than for treated units
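The last bullet is easy to check by simulation. A minimal sketch using unit-variance components, "replacement treatment error", and an assumed treatment effect θ = 0.5 (the particular numbers are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
theta = 0.5  # assumed constant treatment effect (illustrative)

# Unit-level error eta, plus fresh measurement noise each period
eta = rng.normal(0.0, 1.0, n)
before = eta + rng.normal(0.0, 1.0, n)

# Controls: eta persists from time 1 to time 2
after_control = eta + rng.normal(0.0, 1.0, n)

# Treated ("replacement treatment error"): eta is redrawn at time 2
after_treated = theta + rng.normal(0.0, 1.0, n) + rng.normal(0.0, 1.0, n)

r_control = np.corrcoef(before, after_control)[0, 1]
r_treated = np.corrcoef(before, after_treated)[0, 1]
print(f"before-after correlation, controls: {r_control:.2f}")  # ≈ 0.5
print(f"before-after correlation, treated:  {r_treated:.2f}")  # ≈ 0.0
```

The subtractive and additive variants give the same qualitative ordering: any mechanism that disturbs ηi for treated units lowers their before-after correlation relative to controls.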

Andrew Gelman Creating structured and flexible models: some open problems

Weakly informative priorsInteractions in before-after studies

Interactions in regressionsConclusions

Legislative redistrictingEducational experimentIncumbency advantageHierarchical models for treatment interactions

Interactions as variance components

I yi = Tiθ + Xiβ + εi + ηi

I Unit-level “error term” ηi

I For control units, ηi persists from time 1 to time 2I For treatment units, ηi changes:

I Subtractive treatment error (ηi only at time 1)I Additive treatment error (ηi only at time 2)I Replacement treatment error

I Under all these models, the before-after correlation is higherfor controls than treated units

Andrew Gelman Creating structured and flexible models: some open problems

Weakly informative priorsInteractions in before-after studies

Interactions in regressionsConclusions

Legislative redistrictingEducational experimentIncumbency advantageHierarchical models for treatment interactions

Interactions as variance components

I yi = Tiθ + Xiβ + εi + ηi

I Unit-level “error term” ηi

I For control units, ηi persists from time 1 to time 2I For treatment units, ηi changes:

I Subtractive treatment error (ηi only at time 1)I Additive treatment error (ηi only at time 2)I Replacement treatment error

I Under all these models, the before-after correlation is higherfor controls than treated units

Andrew Gelman Creating structured and flexible models: some open problems

Weakly informative priorsInteractions in before-after studies

Interactions in regressionsConclusions

Legislative redistrictingEducational experimentIncumbency advantageHierarchical models for treatment interactions

End of part 2

- Treatment and control can be modeled asymmetrically
- Additional variance component

Federal spending | Vote preferences | Income and voting | Incentives in sample surveys

Examples of interactions in regression

- Federal spending by state, year, category (50 × 19 × 10)
- Vote preference given state and demographic variables (50 × 2 × 2 × 4 × 4)
- Rich state, poor state, red state, blue state (50 × 2 for each election)
- Meta-analysis of incentives in sample surveys (26)

General concerns

- Lots of potential interactions
- Setting high-level interactions to zero? Too extreme, especially when interactions are of substantive interest
- Simple hierarchical model for interactions is too crude
- Model: large main effects can have large interactions. In a hierarchical setting, the model should come "naturally"

Federal spending by state

- Federal spending by state, year, category (50 × 19 × 10)
- For each spending category, a 50 × 19 data structure
- y_jt = α_j + β_t + γ_jt
- Possible model: γ_jt ∼ N(0, A + B|α_j β_t|)
- Some example data
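The proposed variance model can be simulated directly. A hypothetical sketch, where the constants A and B and the main-effect scales are invented for illustration and the second argument of the normal is treated as a standard deviation:

```python
import numpy as np

rng = np.random.default_rng(1)
J, T = 50, 19                        # states × years, as in the spending data
A, B = 0.02, 1.0                     # hypothetical variance-model constants

alpha = rng.normal(0, 0.2, size=J)   # state main effects alpha_j (assumed scale)
beta = rng.normal(0, 0.2, size=T)    # year main effects beta_t (assumed scale)

# Interaction sd grows with the product of main effects: sd_jt = A + B*|alpha_j*beta_t|
sd_jt = A + B * np.abs(np.outer(alpha, beta))
gamma = rng.normal(0, sd_jt)         # gamma_jt drawn cell by cell

y = alpha[:, None] + beta[None, :] + gamma   # cell values y_jt
```

Cells with large |α_j β_t| then tend to have large interactions, which is the pattern the next plot checks against the data.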

Interactions |γ_jt| plotted vs. main effects |α_j β_t|

[Scatterplot for the "Disretire" category of federal spending (inflation-adjusted): |a_j*b_t| on the x-axis (0.00 to 0.08), |e_jt| on the y-axis (0.00 to 0.20)]

Logistic regression for pre-election polls

- Logistic regression predicting vote from demographic and geographic factors: sex, ethnicity, age, education, state
- Hierarchical model for 4 age levels, 4 education levels, 16 age × education, 50 states
- Also consider interactions such as ethnicity × state
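The structure of such a model can be sketched by simulation. The batch standard deviations below are invented for illustration; in the real model they are hyperparameters estimated from the poll data:

```python
import numpy as np

rng = np.random.default_rng(2)

def inv_logit(x):
    return 1 / (1 + np.exp(-x))

n_age, n_edu, n_state = 4, 4, 50     # batch sizes from the slide

# Each batch of coefficients is shrunk toward 0 with its own sd (assumed values)
a_age = rng.normal(0, 0.3, size=n_age)
a_edu = rng.normal(0, 0.3, size=n_edu)
a_age_edu = rng.normal(0, 0.15, size=(n_age, n_edu))  # 16 age × education terms
a_state = rng.normal(0, 0.4, size=n_state)

def p_republican(age, edu, state, intercept=0.0):
    """P(Republican vote) for one demographic cell (sketch; ethnicity, sex omitted)."""
    return inv_logit(intercept + a_age[age] + a_edu[edu]
                     + a_age_edu[age, edu] + a_state[state])

p = p_republican(age=2, edu=1, state=33)
```

The point of the hierarchical setup is that the 16 age × education interaction terms share one sd, so the data decide how much the interactions are shrunk toward zero.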

Prediction error as function of number of predictors

[Two panels: MSE in the training sample and MSE in the test sample, each plotted against the number of parameters (0 to 150); MSE axis runs from 0.22 to 0.25]

Red state, blue state, rich state, poor state

- Richer voters favor the Republicans, but
- Richer states favor the Democrats
- Hierarchical logistic regression: predict your vote given your income and your state ("varying-intercept model")
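The varying-intercept model lets each state shift a common income curve up or down; adding varying slopes lets income matter more in some states than in others. A sketch with invented coefficients (the real estimates come from the hierarchical fit), chosen so that Mississippi sits above Ohio above Connecticut, as in the figures:

```python
import numpy as np

def inv_logit(x):
    return 1 / (1 + np.exp(-x))

income = np.linspace(-2, 2, 5)       # standardized income, as on the plots

# (intercept, income slope) per state -- hypothetical values for illustration
states = {"Connecticut": (-0.5, 0.2),
          "Ohio": (0.0, 0.35),
          "Mississippi": (0.4, 0.6)}

# P(vote Republican) = inv_logit(a_state + b_state * income)
curves = {name: inv_logit(a + b * income) for name, (a, b) in states.items()}
```

Forcing a common slope b gives the varying-intercept model; letting b vary by state gives the varying-intercept, varying-slope model, and the state-level slopes are themselves interactions worth modeling.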

Varying-intercept model

[Plot: probability of voting Republican (0.25 to 0.75) vs. income (−2 to 2), with one curve per state — curves for Connecticut, Ohio, and Mississippi shifted by state intercepts but sharing a common income slope]

Varying-intercept, varying-slope model

[Plot: probability of voting Republican (0.4 to 0.8) vs. income (−2 to 2), with state-specific intercepts and slopes for Connecticut, Ohio, and Mississippi]

Interactions!

[Scatterplot "Avg Income 2000 vs. Var Slope 2000": average state income in $10k units (2.0 to 3.5) on the x-axis, estimated income slope (0.0 to 0.5) on the y-axis, all 50 states labeled by postal abbreviation]

3-way interactions!

[Six panels, one per election year (1980, 1984, 1988, 1992, 1996, 2000): probability of voting Republican (0.1 to 0.9) vs. income (−2 to 2), with curves for Connecticut, Ohio, and Mississippi in each panel]

Meta-analysis of effects of incentives on survey response rates

- 6 factors:
  - Incentive or not
  - Value of incentive
  - Form (gift or cash)
  - Timing (before or after)
  - Mode (telephone or face-to-face)
  - Burden (short or long survey)
- Models:
  - No interactions: estimates don't make sense
  - Interactions: estimates are out of control

Model without interactions

- Estimated effects on response rate (in percentage points):

                        Beta (s.e.)
  Intercept             1.4  (1.6)
  Value of incentive    0.34 (0.17)
  Prepayment            2.8  (1.8)
  Gift                 −6.9  (1.5)
  Burden                3.3  (1.3)

- Will a low-value postpaid gift really reduce response rates by 7 percentage points??
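The implausible prediction can be read straight off the coefficients. A quick check using the table's point estimates, with an assumed 0/1 coding for the indicators (gift = 1 for a gift incentive, prepay = 1 for prepayment, burden = 1 for a high-burden survey):

```python
# Point estimates from the no-interaction model (percentage points);
# the 0/1 coding of the indicators is an assumption for illustration
coef = {"intercept": 1.4, "value": 0.34, "prepay": 2.8, "gift": -6.9, "burden": 3.3}

def incentive_effect(value, prepay, gift, burden):
    """Predicted change in response rate from offering an incentive."""
    return (coef["intercept"] + coef["value"] * value + coef["prepay"] * prepay
            + coef["gift"] * gift + coef["burden"] * burden)

# A $1 postpaid gift in a low-burden survey: under this model the incentive
# *lowers* the predicted response rate by about 5 points -- not believable
effect = incentive_effect(value=1, prepay=0, gift=1, burden=0)
```

The −6.9 gift coefficient is what drives the roughly 7-point drop questioned on the slide; an additive model with no interactions has no way to avoid it.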

Models with interactions

                               Model I      Model II     Model III     Model IV
Constant                       60.7 (2.2)   60.8 (2.5)   61.0 (2.5)    60.1 (2.5)
Incentive                       5.4 (0.7)    3.7 (0.8)    2.8 (1.0)     6.1 (1.2)
Mode                           15.2 (4.7)   16.1 (5.1)   16.0 (4.9)    18.0 (4.6)
Burden                         −7.2 (4.3)   −8.9 (5.0)   −8.7 (5.0)    −9.9 (5.0)
Mode × Burden                               −7.6 (9.8)   −7.8 (9.4)    −4.9 (9.1)
Incentive × Value                           0.14 (0.03)  0.33 (0.09)   0.26 (0.09)
Incentive × Timing                           4.4 (1.3)    1.7 (1.7)    −0.2 (2.1)
Incentive × Form                             1.4 (1.3)    1.1 (1.2)    −1.2 (2.0)
Incentive × Mode                            −2.3 (1.6)   −2.0 (1.7)     7.8 (2.9)
Incentive × Burden                           4.8 (1.5)    5.4 (1.8)    −5.2 (2.7)
Incentive × Value × Timing                               0.40 (0.17)   0.58 (0.18)
Incentive × Value × Burden                              −0.06 (0.06)   1.10 (0.24)
Incentive × Timing × Burden                                            11.1 (3.9)
Incentive × Value × Form                                               0.30 (0.20)
Incentive × Value × Mode                                              −1.20 (0.24)
Incentive × Timing × Form                                               9.9 (2.7)
Incentive × Timing × Mode                                             −17.4 (4.1)
Incentive × Form × Mode                                                −0.3 (2.5)
Incentive × Form × Burden                                               5.9 (3.2)
Incentive × Mode × Burden                                              −5.8 (3.0)
Within-study sd, σ              4.2 (0.3)    3.6 (0.3)    3.6 (0.3)     2.8 (0.3)
Between-study sd, τ            18 (2)       19 (2)       18 (2)        18 (2)

Structured hierarchical models

- Need to go beyond exchangeability to shrink batches of parameters in a reasonable way
- For example, parameter matrices α_jk don't look like exchangeable vectors
- Similar problems arise in shrinking higher-order terms in neural nets, wavelets, tree models, image models, ...
- Recall the "blessing of dimensionality": as the number of factors, and the number of levels per factor, increases, more information is available to estimate the hyperparameters of the big model
- In the background: advances in Bayesian computation including parameter expansion (Meng, Liu, Liu, Rubin, van Dyk), adaptive Metropolis algorithms (Pasarica), structured computations (Kerman)

Andrew Gelman Creating structured and flexible models: some open problems

Weakly informative priorsInteractions in before-after studies

Interactions in regressionsConclusions

Federal spendingVote preferencesIncome and votingIncentives in sample surveys

Structured hierarchical models

I Need to go beyond exchangeability to shrink batches ofparameters in a reasonable way

I For example, parameter matrices αjk don’t look likeexchangeable vectors

I Similar problems arise in shrinking higher-order terms inneural nets, wavelets, tree models, image models, . . .

I Recall the “blessing of dimensionality”: as the number offactors, and the number of levels per factor, increases, moreinformation is available to estimate the hyperparameters ofthe big model

I In the background: advances in Bayesian computationincluding parameter expansion (Meng, Liu, Liu, Rubin, vanDyk), adaptive Metropolis algorithms (Pasarica), structuredcomputations (Kerman)

Andrew Gelman Creating structured and flexible models: some open problems

Weakly informative priorsInteractions in before-after studies

Interactions in regressionsConclusions

Federal spendingVote preferencesIncome and votingIncentives in sample surveys

Structured hierarchical models

I Need to go beyond exchangeability to shrink batches ofparameters in a reasonable way

I For example, parameter matrices αjk don’t look likeexchangeable vectors

I Similar problems arise in shrinking higher-order terms inneural nets, wavelets, tree models, image models, . . .

I Recall the “blessing of dimensionality”: as the number offactors, and the number of levels per factor, increases, moreinformation is available to estimate the hyperparameters ofthe big model

I In the background: advances in Bayesian computationincluding parameter expansion (Meng, Liu, Liu, Rubin, vanDyk), adaptive Metropolis algorithms (Pasarica), structuredcomputations (Kerman)


End of part 3

I Interactions are important

I Technical challenges in modeling and computation

I Blessing of dimensionality
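The “blessing of dimensionality” can be made concrete with a small simulation (not from the talk; all numbers are invented): as the number of groups grows, the group-level scale, a hyperparameter of the big model, is estimated more and more precisely.

```python
import numpy as np

# Sketch: estimating the group-level sd sigma_alpha from J observed
# group means, each measured with known noise sd sigma_y / sqrt(n).
rng = np.random.default_rng(1)
sigma_alpha, sigma_y, n = 1.0, 1.0, 10   # invented true values

def scale_mse(J, reps=500):
    """Mean squared error of a method-of-moments estimate of sigma_alpha."""
    errs = []
    for _ in range(reps):
        alpha = rng.normal(0, sigma_alpha, J)
        ybar = alpha + rng.normal(0, sigma_y / np.sqrt(n), J)
        # E[var(ybar)] = sigma_alpha^2 + sigma_y^2 / n, so subtract the
        # known noise contribution (truncated at zero)
        s2_hat = max(ybar.var(ddof=1) - sigma_y**2 / n, 0.0)
        errs.append((np.sqrt(s2_hat) - sigma_alpha) ** 2)
    return float(np.mean(errs))

mse_few, mse_many = scale_mse(5), scale_mse(100)
print(mse_many < mse_few)
```

With 5 groups the scale estimate is wild; with 100 it is tight. More factors, and more levels per factor, mean more such batches, each one pinning down its own hyperparameter.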


Another example: who supports school vouchers?


What have we learned?

I Models need structure but not too much structure

I Interactions are important

I Treatment interactions in before-after studies

I 2-way, 3-way, . . . , interactions in regression models

I Conservatism in statistics

I Weak prior information is key

