introduction to multivariate regression: voter turnout


Upload: miguel-centellas

Post on 22-Nov-2014


DESCRIPTION

A presentation I gave to my undergraduate research methods course to introduce students to multivariate regression. It uses the example of voter turnout in Europe and the Americas.

TRANSCRIPT

Page 1: Introduction to Multivariate Regression: Voter Turnout

Miguel Centellas
Croft Visiting Assistant Professor of Political Science
The University of Mississippi

UNDERSTANDING Voter Turnout around the WORLD

(A Brief Guide to Multivariate Regression)

Page 2: Introduction to Multivariate Regression: Voter Turnout

STEP 1: GET SOME DATA

Find a source of data that is relevant to your research question.

In our example, let’s look for a source that has data on voter turnout around the world, as well as other related data.

Page 3: Introduction to Multivariate Regression: Voter Turnout

Data from International IDEA

Freely available on the web, along with background information.

Page 4: Introduction to Multivariate Regression: Voter Turnout


You can download all the data. Or select specific countries, years, type of election, and other variables.

Page 5: Introduction to Multivariate Regression: Voter Turnout

STEP 2: CASE SELECTION

You may use a number of criteria for case selection, but be sure you can justify your choices.

In our case, let’s limit the cases to countries in Europe and the Americas that are rated as democracies (by Freedom House) and have not had a civil war recently (which excludes most of the Balkans). We also limit our data to the most recent legislative elections.
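Writing the selection rules down as an explicit filter makes them easier to justify and to reproduce. The sketch below is only an illustration in Python: the field names (`region`, `democracy`, `recent_civil_war`) and every row are hypothetical placeholders, not the layout or contents of the IDEA dataset. It leaves out the "most recent legislative election" rule.

```python
# Hypothetical records: field names and values are invented for illustration,
# not taken from International IDEA or Freedom House.
cases = [
    {"country": "Country A", "region": "Europe", "democracy": True, "recent_civil_war": False},
    {"country": "Country B", "region": "Europe", "democracy": True, "recent_civil_war": True},
    {"country": "Country C", "region": "Americas", "democracy": False, "recent_civil_war": False},
    {"country": "Country D", "region": "Asia", "democracy": True, "recent_civil_war": False},
    {"country": "Country E", "region": "Americas", "democracy": True, "recent_civil_war": False},
]

REGIONS = {"Europe", "Americas"}


def is_selected(case):
    """Keep democracies in Europe or the Americas with no recent civil war."""
    return (
        case["region"] in REGIONS
        and case["democracy"]
        and not case["recent_civil_war"]
    )


selected = [c["country"] for c in cases if is_selected(c)]
print(selected)
```

Keeping each criterion as its own condition means any one of them can be relaxed later to see how the sample changes.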

Page 6: Introduction to Multivariate Regression: Voter Turnout

Sample Selection: 51 Countries in Europe & the Americas

Europe
Albania • Austria • Belgium • Bulgaria • Czech Republic • Denmark • Estonia • Finland • France • Germany • Greece • Hungary • Iceland • Ireland • Italy • Latvia • Lithuania • Luxembourg • Malta • The Netherlands • Norway • Poland • Portugal • Romania • Slovenia • Spain • Sweden • Switzerland • United Kingdom

The Americas
Argentina • Bolivia • Brazil • Canada • Chile • Colombia • Costa Rica • Dominican Republic • Ecuador • El Salvador • Guatemala • Honduras • Jamaica • Mexico • Nicaragua • Panama • Peru • Suriname • United States • Uruguay • Venezuela

Page 7: Introduction to Multivariate Regression: Voter Turnout

Sample Selection: 51 Countries in Europe & the Americas

Established Democracies
Austria • Belgium • Canada • Colombia • Costa Rica • Denmark • Finland • France • Germany • Iceland • Ireland • Italy • Jamaica • Luxembourg • Malta • The Netherlands • Norway • Sweden • Switzerland • United Kingdom • United States • Venezuela

New Democracies
Albania • Argentina • Bolivia • Brazil • Bulgaria • Chile • Czech Republic • Dominican Republic • Ecuador • El Salvador • Estonia • Greece • Guatemala • Honduras • Hungary • Latvia • Lithuania • Mexico • Nicaragua • Panama • Peru • Poland • Portugal • Romania • Slovenia • Spain • Suriname • Uruguay

Page 8: Introduction to Multivariate Regression: Voter Turnout

STEP 3A: SELECT THE DEPENDENT VARIABLE

Your dependent variable is the object of your study: it is the thing you want to explain.

In our case, we want to understand what causes changes in voter turnout across our selected countries.

Page 9: Introduction to Multivariate Regression: Voter Turnout

STEP 3B: OPERATIONALIZE THE DEPENDENT VARIABLE

You need to clearly specify how you will measure your dependent variable.

In our case, we want to make sure that differences in voter turnout aren’t masked by differences in voter registration procedures. We want to know how many citizens vote. Fortunately, IDEA has a variable called Vote/VAP (percent of voting age population that voted).
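A short Python sketch shows why the choice of denominator matters. The figures are made up for a single hypothetical country, and the function name `turnout_pct` is my own. The point is only that the same number of votes yields different turnout rates depending on whether it is divided by registered voters or by the voting age population.

```python
def turnout_pct(votes_cast, base_population):
    """Return turnout as a percentage of the chosen base population."""
    if base_population <= 0:
        raise ValueError("base population must be positive")
    return 100.0 * votes_cast / base_population


# Made-up numbers for one hypothetical country.
votes_cast = 6_000_000
registered_voters = 8_000_000
voting_age_population = 10_000_000

vote_reg = turnout_pct(votes_cast, registered_voters)      # share of registered voters
vote_vap = turnout_pct(votes_cast, voting_age_population)  # share of voting age population

print(f"Vote/Registered: {vote_reg:.1f}%")
print(f"Vote/VAP: {vote_vap:.1f}%")
```

In this invented case a quarter of the registered electorate stays home, but so does everyone who never registered, which is exactly what Vote/VAP is meant to capture.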

Page 10: Introduction to Multivariate Regression: Voter Turnout

Voter Turnout in 51 Selected Countries
Vote/VAP in Legislative Elections, 2004-2008

[Chart: axis from 0 to 100; labeled cases: Venezuela, Malta]

Page 11: Introduction to Multivariate Regression: Voter Turnout

STEP 4: HYPOTHESES

Use theory to develop testable hypotheses. Each hypothesis should link at least one INDEPENDENT variable with the dependent variable.

In our case, let’s speculate about possible factors that may affect voter turnout.

Page 12: Introduction to Multivariate Regression: Voter Turnout

How Can We Explain Differences in Voter Turnout?

Hypothesis 1: Electoral System
Voter turnout is a function of electoral systems. Proportional representation should drive up voter turnout because voters are less likely to “waste” votes.

Hypothesis 2: Level of Freedom
Voter turnout is a function of civil & political liberties. Citizens will exercise their right to vote if they enjoy a wide range of civil rights and political liberties.

Hypothesis 3: Compulsory Voting Laws
Voter turnout is a function of voting laws. Where voting is compulsory, citizens are more likely to vote.

Page 13: Introduction to Multivariate Regression: Voter Turnout

STEP 5: OPERATIONALIZE THE INDEPENDENT VARIABLES

You also need to specify how you will measure your independent variables.

In our case, we need to explain how we will measure changes along our three independent variables.

Page 14: Introduction to Multivariate Regression: Voter Turnout

How Can We Explain Differences in Voter Turnout?

ELECTORAL SYSTEM
Since we’re mostly interested in seeing if proportional representation increases voter turnout over first-past-the-post, let’s use a dummy variable (1=PR; 0=FPTP).¹

LEVEL OF FREEDOM
One way to measure level of freedom is to use the Freedom House Index included in the IDEA dataset.²

COMPULSORY VOTING LAWS
The IDEA dataset doesn’t include information on compulsory voting, but the website does have a list of countries with such laws. We can create a column in our spreadsheet for a dummy variable (1=compulsory voting law; 0=none).

¹ Eight of our cases don’t use PR or FPTP electoral systems.
² I transformed the FH scores so that 7 is now the highest level of freedom, and 1 is the lowest.
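The three coding rules can be sketched in Python as follows. Several things here are assumptions rather than the author's procedure: the slides do not say how the FH scores were transformed, so `8 - score` is just one simple way to flip a 1-7 scale; they also do not say how the eight cases that are neither PR nor FPTP were handled, so the sketch codes them as missing (`None`). All rows and names are hypothetical.

```python
def electoral_system_dummy(system):
    """Return 1 for PR, 0 for FPTP, and None (missing) for any other system."""
    return {"PR": 1, "FPTP": 0}.get(system)


def reverse_fh(score):
    """Flip a 1-7 score so that 7 is the freest (assumed formula: 8 - score)."""
    if not 1 <= score <= 7:
        raise ValueError("Freedom House scores run from 1 to 7")
    return 8 - score


def compulsory_dummy(country, compulsory_countries):
    """Return 1 if the country appears in the compulsory-voting list, else 0."""
    return 1 if country in compulsory_countries else 0


# Hypothetical cases, invented for illustration only.
compulsory_countries = {"Country A"}
raw = [
    {"country": "Country A", "system": "PR", "fh": 1.5},
    {"country": "Country B", "system": "FPTP", "fh": 3.0},
    {"country": "Country C", "system": "Mixed", "fh": 2.0},
]

coded = [
    {
        "country": r["country"],
        "pr": electoral_system_dummy(r["system"]),
        "fh_reversed": reverse_fh(r["fh"]),
        "compulsory": compulsory_dummy(r["country"], compulsory_countries),
    }
    for r in raw
]

for row in coded:
    print(row)
```

Coding the other electoral systems as missing drops those cases from any model that includes the PR dummy, a trade-off worth stating when reporting results.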

Page 15: Introduction to Multivariate Regression: Voter Turnout

STEP 5: DESCRIPTIVE STATISTICS

Simply looking at the data may provide some evidence to support a hypothesis.

In our case, let’s see if it looks like voter turnout is driven by any of our three independent variables: electoral system, level of freedom, or compulsory voting laws.

Page 16: Introduction to Multivariate Regression: Voter Turnout

Voter Turnout in 51 Selected Countries
Vote/VAP in Legislative Elections, 2004-2008

[Chart: axis from 0 to 100; labeled cases: USA, Jamaica, Canada, France, UK, Chile, Ireland, Malta]

Page 17: Introduction to Multivariate Regression: Voter Turnout

Voter Turnout and Level of Freedom

Vote/VAP & FH Index

[Scatterplot: FH Index from 3.5 to 7; Vote/VAP from 0 to 100]

Page 18: Introduction to Multivariate Regression: Voter Turnout

Voter Turnout and Level of Freedom

Vote/VAP and Freedom House Index Scores

[Scatterplot: FH Index from 3.5 to 7; Vote/VAP from 0 to 100; labeled cases: Peru, Switzerland, USA, Venezuela; legend: Expected, Unexpected, Outlier]

Page 19: Introduction to Multivariate Regression: Voter Turnout

Voter Turnout and Compulsory Voting

Voter Turnout in Countries with Compulsory Voting: Average = 69% (STDEV = 13.7%)

Voter Turnout in Countries without Compulsory Voting: Average = 62% (STDEV = 16.7%)

[Two charts, each with an axis from 0 to 100]

Page 20: Introduction to Multivariate Regression: Voter Turnout

After Descriptive Statistics: What Explains Differences in Voter Turnout?

It seems PR electoral systems tend to have higher voter turnout. Although some PR countries had low voter turnout, all the FPTP countries had relatively low voter turnout.

It seems like there’s a tendency for countries with high levels of freedom to have higher voter turnout. But there are some interesting exceptions.

On average, countries with compulsory voting laws have higher voter turnout. But the range is too wide to be certain.
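The "range is too wide" caveat can be made concrete by comparing each group's mean plus or minus one standard deviation. This is a rough sketch using Python's standard `statistics` module. The turnout values are invented for illustration and are not the IDEA figures; the overlap check is an informal eyeball test, not a significance test.

```python
from statistics import mean, stdev

# Hypothetical turnout percentages, invented for illustration only.
compulsory = [55.0, 62.0, 70.0, 78.0, 85.0]
voluntary = [40.0, 52.0, 63.0, 71.0, 84.0]


def summarize(values):
    """Return the mean, sample standard deviation, and a mean +/- 1 SD band."""
    m = mean(values)
    s = stdev(values)
    return m, s, (m - s, m + s)


comp_mean, comp_sd, comp_band = summarize(compulsory)
vol_mean, vol_sd, vol_band = summarize(voluntary)

# The bands overlap when the larger lower bound sits below the smaller upper bound.
bands_overlap = max(comp_band[0], vol_band[0]) < min(comp_band[1], vol_band[1])

print(f"Compulsory: mean={comp_mean:.1f}, sd={comp_sd:.1f}")
print(f"Voluntary:  mean={vol_mean:.1f}, sd={vol_sd:.1f}")
print("Bands overlap:", bands_overlap)
```

When the bands overlap this much, a gap of a few points between the group averages is weak evidence on its own, which is why the next step turns to regression.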

Page 21: Introduction to Multivariate Regression: Voter Turnout

STEP 6: REGRESSION ANALYSIS

Regression analysis allows us to look at multiple variables simultaneously, to see if they have any effect on our dependent variable.

In our case, let’s dump our dataset into STATA, a statistical package, and run some regressions.
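To see what a statistical package does when it looks at several variables at once, here is a bare-bones ordinary least squares fit in plain Python. It is a teaching sketch, not a substitute for STATA: it returns coefficients only, with no standard errors or p values. The data are synthetic, built from coefficients I chose (30, 4, 5, 3) so that the fit can be checked against a known answer; they are not estimates from the turnout data.

```python
def ols(X, y):
    """Return OLS coefficients [intercept, b1, ..., bk] via the normal equations."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    k = len(rows[0])
    # Build X'X and X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    # Solve (X'X) b = X'y by Gauss-Jordan elimination with partial pivoting.
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]


# Synthetic predictors: (PR dummy, reversed FH score, compulsory dummy).
X = [
    (1, 7.0, 0), (0, 7.0, 0), (1, 5.0, 1), (0, 4.0, 1),
    (1, 6.0, 0), (0, 5.5, 0), (1, 4.5, 1), (0, 6.5, 1),
]
# Synthetic outcome built from known coefficients, with no noise added.
turnout = [30 + 4 * pr + 5 * fh + 3 * cv for pr, fh, cv in X]

coefs = ols(X, turnout)
print([round(c, 4) for c in coefs])  # should recover roughly 30, 4, 5, 3
```

Because the outcome here contains no noise, the fit recovers the chosen coefficients almost exactly. With real data the estimates come with uncertainty, which is what the significance tests on the following slides address.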

Page 22: Introduction to Multivariate Regression: Voter Turnout

Regression Analysis Output in STATA

[Screenshot of STATA output with these parts labeled: Command, Goodness of fit, Statistical Significance, Coefficients]

Page 23: Introduction to Multivariate Regression: Voter Turnout

All You Need to Know to Interpret A Regression Analysis Output

Goodness of Fit (Adjusted R-Squared)
How well a model fits a set of observations, or how much variation in the data is explained by the model. By itself, the R-Squared (R2) value is hard to interpret. What matters is whether a particular model has a larger adjusted R2 value than another (larger is better).

Regression Coefficient
The “slope” of the relationship between the dependent and independent variable. Each one-unit increase in the independent variable is associated with a change in the dependent variable equal to the coefficient, holding the other variables in the model constant.

Statistical Significance (p value)
Linear regression executes a t-test for each variable. The p value is the probability of seeing a coefficient at least this far from zero if the variable actually had no effect. By convention, a p value below 0.05 is treated as statistically significant.
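The goodness-of-fit figure named on this slide can be computed by hand, which helps show why the adjusted version is the one to compare across models. The sketch below uses the standard formulas for R-squared and adjusted R-squared. The observed and fitted values are invented for illustration.

```python
def r_squared(observed, fitted):
    """Share of the variation in the observed values explained by the fitted values."""
    mean_y = sum(observed) / len(observed)
    ss_total = sum((y - mean_y) ** 2 for y in observed)
    ss_resid = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    return 1.0 - ss_resid / ss_total


def adjusted_r_squared(r2, n_obs, n_predictors):
    """Penalize R-squared for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Invented values for five hypothetical cases.
observed = [50.0, 60.0, 70.0, 80.0, 90.0]
fitted = [52.0, 58.0, 71.0, 79.0, 90.0]

r2 = r_squared(observed, fitted)
adj_one = adjusted_r_squared(r2, n_obs=5, n_predictors=1)
adj_three = adjusted_r_squared(r2, n_obs=5, n_predictors=3)

print(f"R2 = {r2:.3f}")
print(f"Adjusted R2 with 1 predictor  = {adj_one:.3f}")
print(f"Adjusted R2 with 3 predictors = {adj_three:.3f}")
```

For the same raw fit, the model with more predictors gets the lower adjusted value, so adding variables only helps if they improve the fit enough to offset the penalty.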

Page 24: Introduction to Multivariate Regression: Voter Turnout

[Table: Regression Estimates for Voter Turnout]

Page 25: Introduction to Multivariate Regression: Voter Turnout

STEP 6B: CHECK FOR INTERVENING EFFECTS

This step is optional, but it never hurts to run additional models that check for intervening effects, as long as those models are guided by theory.

In our case, let’s check for regional effects, whether new democracies behave differently, and whether spoiled votes made a difference.
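The slides do not show how these additional models were specified. One common way to ask whether a variable "behaves differently" in a subgroup is to add an interaction term, which is simply the product of two variables. The Python sketch below builds such a column. The rows, the column names, and the choice of an interaction (rather than, say, fitting the model separately on new democracies) are all assumptions made for illustration.

```python
def add_interaction(rows, first, second, name):
    """Return copies of the rows with a new column equal to first * second."""
    return [{**row, name: row[first] * row[second]} for row in rows]


# Hypothetical cases covering every combination of the two dummies.
rows = [
    {"country": "Country A", "compulsory": 1, "new_democracy": 1},
    {"country": "Country B", "compulsory": 1, "new_democracy": 0},
    {"country": "Country C", "compulsory": 0, "new_democracy": 1},
    {"country": "Country D", "compulsory": 0, "new_democracy": 0},
]

with_interaction = add_interaction(rows, "compulsory", "new_democracy", "compulsory_x_new")
flags = [r["compulsory_x_new"] for r in with_interaction]
print(flags)
```

The new column equals 1 only for new democracies with compulsory voting, so its coefficient in a regression would capture any extra effect of compulsory voting in that group.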

Page 26: Introduction to Multivariate Regression: Voter Turnout

[Table: Regression Estimates for Voter Turnout]

Page 27: Introduction to Multivariate Regression: Voter Turnout

After Regression Analysis: What Explains Differences in Voter Turnout?

By itself, PR does have a positive effect on voter turnout. But in multivariate analysis, it has no statistically significant effect.

By itself, a country’s FH score does have a positive effect on voter turnout. This variable is also consistently significant in multivariate models.

By itself, compulsory voting has no significant effect on voter turnout. However, it does have a significant, positive effect on voter turnout in new democracies.

Page 28: Introduction to Multivariate Regression: Voter Turnout

WHAT DID WE LEARN?

Voters are more likely to turn out to vote when civil liberties and political rights are protected.

(None of this was evident from descriptive statistics alone.)