SurveyMonkey 2012 Presidential Election Poll: Final Models


Upload: surveymonkey

Post on 22-Nov-2014



Page 1: SurveyMonkey 2012 Presidential Election Poll: Final Models

2012 Presidential Election Poll

Philip Garland, Ph.D., VP Methodology
Dave Goldberg, CEO
Liana Epstein, Ph.D., Senior Methodologist
Annabell Suh


SURVEYMONKEY 2012 PRESIDENTIAL ELECTION POLL

SurveyMonkey surveyed roughly 1.2 million people from August 17th to November 2nd. Still, skeptics will ask, “Can an internet poll really be successful at approximating voter turnout?”

Here is the map of actual voter turnout in 2008 by county:

This is the map of respondents to SurveyMonkey’s 2012 presidential election poll:

This report contains our newest wave of data from the 600,000 people who responded to our presidential election poll from October 3rd through November 2nd. Results are displayed in two ways: first as popular vote percentages, and second as Electoral College distributions. With this data, we seek to show that internet data is as good as phone data (if not better) at assessing public opinion.


A Few Notes About Our Data

Understand the data that we’re reporting.

Why does all the data begin on 10/10 rather than 10/3?

The data reported below begins at 10/10 because we chose to use a seven-day trailing sum. This was done for three main reasons. First, other publicly available polls also report data using trailing sums, so matching their methodology facilitates comparisons between SurveyMonkey and other polling firms and provides a reality check on how well SurveyMonkey is measuring public opinion. Second, a trailing sum, rather than a daily measure, provides a statistic that is less swayed by any single day’s events. Essentially, averaging over a week’s worth of data smooths out an otherwise jagged curve. Lastly, for analyses at the state level, using more than one day of data gives us a larger sample, which increases the power and accuracy of our analyses.
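The trailing sum described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `trailing_obama_share`, the two-candidate share as the reported statistic, and the made-up daily counts are not from the report, which does not publish its computation.

```python
from collections import deque


def trailing_obama_share(daily_counts, window=7):
    """Return Obama's share of the two-candidate total over a trailing window.

    daily_counts: list of (obama_responses, romney_responses), one tuple per day.
    Days before the first full window yield None.
    (Hypothetical helper; the report does not specify its exact calculation.)
    """
    recent = deque(maxlen=window)  # holds at most the last `window` days
    shares = []
    for counts in daily_counts:
        recent.append(counts)
        if len(recent) < window:
            shares.append(None)  # not enough days accumulated yet
            continue
        obama = sum(day[0] for day in recent)
        romney = sum(day[1] for day in recent)
        shares.append(obama / (obama + romney))
    return shares


# Made-up counts: six evenly split days followed by one lopsided day.
example = trailing_obama_share([(50, 50)] * 6 + [(80, 20)])
```

In this toy input the lopsided seventh day (80% for one candidate) moves the seven-day figure to only about 54%, which is the smoothing effect the paragraph describes. No value can be reported until a full window has accumulated.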

Why is only weekday data reported?

It is also important to note that all results reported below exclude weekend data. We observed that the graphs of our raw, daily data showed spikes every weekend that departed from both the trend line and publicly available polling data. We speculate that this is due to two main problems. First, our traffic volume is much lower on weekends, sinking as low as 15% of typical weekday traffic. This lower volume makes our results more susceptible to outliers. Second, we have found in prior studies of SurveyMonkey traffic that the people who take surveys on weekends are often not representative of the general U.S. population and, consequently, are qualitatively different from those who take surveys on weekdays.


Model 1: A RAW Look

This model provides a transparent view of the data.

RATIONALE: We have included this model not because we think it will accurately predict what happens on Election Day, but because we want to be as transparent as possible about our methodology.

WEIGHTS: None. Other than excluding weekends and using a 7-day trailing sum, this is purely raw data. No corrections. No weighting.

RESULTS

As can be seen in the graph below, the raw results from our survey suggest that the two candidates’ standing in the Electoral College has flipped back and forth almost daily. This is strikingly different from all other polls, which have had Obama consistently ahead in the Electoral College throughout October. This inconsistency in the Electoral College projections was the main reason that we pursued weighted models rather than merely reporting our raw data. As of Friday (11/2), Model #1 predicts: Obama, 266; Romney, 272.

The above graph was created through a forced choice between the candidates for each state. Separating out toss-up states provides a glimpse into why SurveyMonkey’s numbers show a tighter race. RCP uses a 5% margin of error to determine whether a state is a clear win for either candidate. SurveyMonkey, on the other hand, uses a slimmer 3% margin of error. Overall, the graphs below show that SurveyMonkey has roughly half the number of toss-up states that RCP does, with more of these going to Romney than to Obama. This accounts for why this model estimates a much higher number of electoral votes for Romney than other polls do.
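A minimal sketch of the two views described above, the toss-up classification and the forced-choice tally, might look like this. The function names, the placeholder state names, and all percentages and electoral-vote counts are hypothetical; the report does not say how exact ties are handled, so this sketch simply lets them fall to Romney.

```python
def classify_state(obama_pct, romney_pct, toss_up_margin=3.0):
    """Label a state as a clear win or a toss-up.

    A state counts as a toss-up when the lead is smaller than the margin
    (3 points here, versus the 5 points the report attributes to RCP).
    """
    margin = obama_pct - romney_pct
    if abs(margin) < toss_up_margin:
        return "toss-up"
    return "Obama" if margin > 0 else "Romney"


def forced_choice_tally(states):
    """Give every state's electoral votes to whoever leads, however narrowly.

    states: dict of name -> (obama_pct, romney_pct, electoral_votes).
    Exact ties are not treated specially in this sketch (they go to Romney).
    """
    tally = {"Obama": 0, "Romney": 0}
    for obama_pct, romney_pct, votes in states.values():
        leader = "Obama" if obama_pct > romney_pct else "Romney"
        tally[leader] += votes
    return tally


# Made-up states and numbers, for illustration only.
example_states = {
    "State A": (52.0, 45.0, 10),
    "State B": (49.0, 47.0, 20),
    "State C": (46.0, 50.0, 15),
}
labels_3pt = {name: classify_state(o, r, 3.0) for name, (o, r, _) in example_states.items()}
labels_5pt = {name: classify_state(o, r, 5.0) for name, (o, r, _) in example_states.items()}
tally = forced_choice_tally(example_states)
```

With these toy numbers, the 3-point rule leaves one toss-up state while the 5-point rule leaves two, which mirrors the report's observation that a slimmer margin produces fewer toss-ups.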


Although the Electoral College decides the election, the popular vote is also of interest. Because we oversampled swing states to be able to conduct analyses at the state level, the proportions of states in our sample varied wildly relative to their representation in the population of American voters. Additionally, due to low traffic, some states were under-represented in our sample. For example, the percentage of voters from Ohio was inflated because we directed more respondents to our survey there, and the percentage of voters from North Dakota was lower because we directed less traffic there. Thus, publicly available statistics were used to adjust the weights of the state popular vote totals so that they accurately reflected the proportions of U.S. voter turnout by state in 2008. Unsurprisingly, given that SurveyMonkey’s Electoral College projection shows a less consistent margin of victory for Obama than other polls do, the SurveyMonkey popular vote total shows a lower margin for Obama than other polls show.
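The state-level adjustment described above amounts to weighting each state's result by its share of turnout rather than by its share of the sample. A rough sketch, assuming a simple turnout-proportional weight (the function name and every number below are invented for illustration, not taken from the report or from 2008 statistics):

```python
def turnout_weighted_share(state_shares, state_turnout):
    """Combine per-state candidate shares into a national share.

    state_shares: state -> candidate's share among that state's respondents (0 to 1).
    state_turnout: state -> votes cast in the reference election.
    Each state counts in proportion to its turnout, not its sample size.
    """
    total_turnout = sum(state_turnout[state] for state in state_shares)
    return sum(
        state_shares[state] * state_turnout[state] / total_turnout
        for state in state_shares
    )


# Made-up example: a large state and a small, heavily sampled state.
shares = {"Large state": 0.60, "Small state": 0.40}
turnout = {"Large state": 900, "Small state": 100}
national = turnout_weighted_share(shares, turnout)
```

Here the small state contributes only a tenth of the national figure regardless of how many respondents it supplied, so oversampling it no longer distorts the popular vote estimate.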


Model 2: The “HOW” Correction

This model corrects for sampling method.

RATIONALE: The anonymity of internet polling is a blessing and a curse. Because the person being polled has anonymity, he or she is free to respond without feeling self-conscious. This minimizes the demand characteristics of phone polls, in which respondents change their answers in response to what they think the pollster wants to hear. When people answer surveys online, as opposed to on the phone, they are “talking” to a computer instead of a real, live person. This matters because research has shown that when speaking with a real, live person, respondents are more concerned about what that person thinks of them. This makes phone respondents less willing to say “I don’t know” when asked who they would vote for, because it would suggest that they haven’t thought about the election much. Unfortunately, anonymity can also artificially inflate “don’t know” responses, making accurate predictions tougher to make. Moreover, anonymity can lead people to take the survey less seriously, randomly clicking responses or not thinking through the questions sufficiently.

WEIGHTS:

• Leaning voters: The “don’t know” response percentage in the SurveyMonkey dataset was much higher than that of the average phone poll (9% versus 5%). Consequently, we used a question that asked which candidate voters were “leaning towards” to add a small subset of otherwise undecided voters to the results.

• Volatility: Each day was compared to the previous day to compute a “volatility” index. This index was applied as a weight to the day’s average so that more consistent days were weighted more heavily. This makes our averages less susceptible to random error and to “satisficers” (people who don’t take online surveys seriously).
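The report does not give the formula for the volatility index, so the following is only one plausible sketch of the idea. It assumes, purely for illustration, that volatility is the absolute day-over-day change and that a day's weight is 1 / (1 + scale × volatility); the function name, the `scale` parameter, and the sample values are all hypothetical.

```python
def volatility_weighted_average(daily_shares, scale=1.0):
    """Average daily shares, down-weighting days that jump from the day before.

    daily_shares: one candidate's share per day, in percentage points.
    Assumed weight (not from the report): 1 / (1 + scale * |today - yesterday|).
    The first day has no predecessor, so it serves only as a baseline.
    """
    if len(daily_shares) < 2:
        raise ValueError("need at least two days to compare")
    weighted_sum = 0.0
    weight_total = 0.0
    for previous, current in zip(daily_shares, daily_shares[1:]):
        volatility = abs(current - previous)
        weight = 1.0 / (1.0 + scale * volatility)
        weighted_sum += weight * current
        weight_total += weight
    return weighted_sum / weight_total


# Made-up series: three steady days followed by a ten-point jump.
smoothed = volatility_weighted_average([50.0, 50.0, 50.0, 60.0])
plain_mean = (50.0 + 50.0 + 60.0) / 3
```

In this toy series the jumpy final day receives a small weight, so the weighted average stays near 50 while the plain mean of the same three days is pulled above 53.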

RESULTS

Although RCP and Nate Silver’s “fivethirtyeight” blog have consistently predicted an Obama victory in the Electoral College by a fairly wide margin, Model #2 shows a much tighter race. As can be seen in the graph below, SurveyMonkey results suggest that if the election had been held anytime between 10/10 and 10/18, Mitt Romney would have won. From 10/18 through Friday, however, Barack Obama has held the edge in the Electoral College. As of Friday (11/2), Model #2 predicts: Obama, 272; Romney, 266.


Again, the above graph was created through a forced choice between the candidates for each state. Separating out toss-up states provides a glimpse into why SurveyMonkey’s numbers show a tighter race. Overall, the graphs below show that Model #2 has roughly half the number of toss-up states that RCP does, with 50% of these going to Obama and 50% to Romney.

Although SurveyMonkey’s Electoral College projection shows a thinner margin of victory for Obama than other polls do, the SurveyMonkey popular vote total shows a greater margin for Obama than other polls show. Thus, while other polls indicate that Romney is ahead in the popular vote, SurveyMonkey data indicates that Obama is actually in the lead. Model #2’s estimate of the popular vote mirrors Nate Silver’s estimate more closely than RCP’s.


Model 3: The “WHO” Correction

This model corrects for sampling frame.

RATIONALE: Whether you’re reaching people through their computer or their phone, having them answer your survey does not guarantee that they are going to show up at the polls on Election Day. The people who respond to surveys (whether on the internet or on the phone) and the people who show up to vote are not exactly the same set of people.

WEIGHTS:

• Party ID: Using voter turnout statistics from 2008, we adjusted the proportions of Democrats, Republicans, and Independents in our sample. A state was coded as too “blue” or too “red,” and the votes of Republicans or Democrats, respectively, were weighted more heavily to even out the percentages. This correction was applied within a 5% margin of error, as this is the typical polling error.

• Education: Having adjusted for party identification, we then performed a mathematical correction for the representation of educational level (see Appendix for the question options) in the population of U.S. voters.

• Undecideds: Finally, we eliminated any voters who responded “don’t know” twice when asked whom they would vote for. If a voter is not leaning towards any candidate only a few days before the election, chances are low that they will vote at all, and if they do, they should be equally split between the two candidates. Eliminating these truly undecided voters from our sample allowed for a more realistic estimate of the popular vote.
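Two of the steps above, the party ID adjustment and the removal of twice-undecided respondents, can be sketched as below. This is an illustration under assumptions, not the report's procedure: the field names (`party`, `vote`, `leaning`), the helper names, and the target shares are hypothetical, the target shares are not real 2008 turnout figures, and the 5% tolerance and the education correction are omitted because the report does not specify how they were applied.

```python
from collections import Counter


def drop_double_undecided(respondents):
    """Remove respondents who answered "Don't know" to both vote questions."""
    return [
        r for r in respondents
        if not (r["vote"] == "Don't know" and r.get("leaning") == "Don't know")
    ]


def party_weights(respondents, target_shares):
    """Return a weight per party so weighted party shares match target_shares.

    weight = target share / observed share in the sample.
    """
    counts = Counter(r["party"] for r in respondents)
    total = len(respondents)
    return {
        party: target_shares[party] / (count / total)
        for party, count in counts.items()
    }


# Hypothetical sample and hypothetical target shares.
sample = (
    [{"party": "Democrat", "vote": "Barack Obama"}] * 5
    + [{"party": "Republican", "vote": "Mitt Romney"}] * 3
    + [{"party": "Independent", "vote": "Don't know", "leaning": "Barack Obama"}] * 2
    + [{"party": "Independent", "vote": "Don't know", "leaning": "Don't know"}]
)
kept = drop_double_undecided(sample)
weights = party_weights(
    kept, {"Democrat": 0.4, "Republican": 0.3, "Independent": 0.3}
)
```

In this toy sample Democrats make up half of the retained respondents but only 40% of the target, so each Democratic response is weighted down to 0.8, while the under-represented Independents are weighted up to about 1.5. The order of the two steps here is arbitrary.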

RESULTS

Model #3 predicts a consistent victory for Obama over the past month, even when he was trailing in the popular vote. Unlike Model #2, which is more conservative in its Electoral College estimates than both RCP and Nate Silver, Model #3 predicts a wider margin of victory than either. The electoral vote estimates of Model #3 mirror Nate Silver’s estimates more closely than RCP’s. Nevertheless, there is a striking difference in our graph for 10/22-10/25, which shows Romney briefly ahead in the Electoral College. As of Friday (11/2), Model #3 predicts: Obama, 305; Romney, 233.

Again, the above graph was created through a forced choice between the candidates for each state. Separating out toss-up states provides a glimpse into why SurveyMonkey’s numbers show a bigger lead for Obama. Overall, the graphs below show that SurveyMonkey has roughly half the number of toss-up states that RCP does, but the majority of these toss-up states tend to go to Obama in a forced-choice scenario, creating a wide lead for Obama.

Although SurveyMonkey’s Electoral College projection shows a thinner margin of victory for Obama than RCP polls do, the SurveyMonkey popular vote total shows a greater margin for Obama than RCP polls show. Thus, while RCP polls indicate that Romney is ahead in the popular vote, SurveyMonkey data indicates that Obama is actually in the lead.


Voting Registration.

• Are you currently a registered and eligible voter, or not?

Yes | No

Zip Code.

• What is the five-digit Zip code for the address you registered to vote from, or if you’re not registered to vote, what is the Zip code you would use?

[open-ended]

Voting Likelihood.

1. How much thought have you given to the upcoming election for president?

Quite a lot | Some | Only a little | None | Don’t know

2. Do you happen to know where people who live in your neighborhood go to vote?

Yes | No | Don’t know

3. Have you ever voted in your precinct or election district?

Yes | No | Don’t know

4. Do you, yourself, plan to vote in the election this November, or not?

Yes | No | Don’t know

5. How certain are you that you will vote?

Absolutely certain | Fairly certain | Not certain | Don’t know

6. How likely are you to vote in November’s presidential election?

Extremely likely | Very likely | Somewhat likely | Slightly likely | Not at all likely

7. Thinking back to the elections held for Congress in November 2010, did things come up that kept you from voting, or did you happen to vote?

Yes, voted | No, did not vote | Don’t know

8. How important is the presidential election to you?

Extremely important | Very important | Somewhat important | Slightly important | Not at all important

9. If the election were held tomorrow, would you know where to go to vote?

Yes | No

10. How often would you say you vote: always, nearly always, part of the time, or seldom?

Always | Nearly always | Part of the time | Seldom | Never | Don’t know

11. Thinking back to the elections held for Congress in November 2010, did you vote?

Yes, voted | No, did not vote

Voting Preference.

• Suppose the presidential election were held today. Who would you be likely to vote for?

Barack Obama | Mitt Romney | Don’t know / Other

• Which candidate are you leaning towards?

Barack Obama | Mitt Romney | Other | Don’t know

Demographics.

• Generally speaking, do you usually think of yourself as a Republican, a Democrat, an Independent or something else?

Democrat | Republican | Independent | Something else

• What is the highest level of school you have completed or the highest degree you have received?

Less than high school degree | High school degree or equivalent | Some college but no degree | Associate degree | Bachelor degree | Graduate degree

Appendix – Questionnaire