Sampling and Non Response Biases in Election Surveys : The Case of the 1998 Quebec Election
Presented at the International Conference on Survey Non response, held in Portland, Oregon, October 27-30, 1999
By Claire Durand, dept. of sociology, U of Montreal,
Andre Blais, dept. of political science, U of Montreal, Sebastien Vachon, dept. of sociology, U of Montreal
Contact first author at: Claire [email protected]. of sociology, University of Montreal, C.P. 6128, succ. Centre-ville, Montreal, Quebec, H3C 3J7
Note: We wish to thank the survey companies, and most especially Crop and Createc, for their cooperation in this research. We also wish to thank the FCAR and SSHRC for financial support.
Abstract :
During the last electoral campaign in Quebec, Canada, all the polls published in the media had a similar estimate of vote intentions. The Parti Québécois (PQ), a centre-left party dedicated to Quebec sovereignty, was clearly ahead, by an average of five points in the last six polls of the campaign. The PQ won the election, held on November 30, 1998, but with a smaller share of the vote (43%) than the contending Liberal Party (44%), a centre-right federalist party. Pollsters and many observers have contended that the discrepancy between the polls and the actual vote could be explained either by a last minute shift in favor of the Liberals or by differential turnout.
We rely on a number of sources of data in order to sort out the possible causes for such a discrepancy. A post election poll was conducted among fifteen hundred respondents who had answered one of three electoral surveys conducted during the penultimate week of the campaign by two Quebec pollsters (CROP and CREATEC). The response rates for the campaign surveys varied from 50% to 60% and the reinterview rate in the post election survey was 83%. An analysis of the data from three surveys carried out by Crop during the four-week campaign was performed in order to estimate the impact of item and survey non response. A study of voting sections with a high percentage of collective households allows us to estimate the voting behavior of residents of collective households. Two Statmedia studies conducted in 1997 and 1998 provided information on the sociodemographic characteristics of respondents from unlisted and doubly listed telephone lines. Finally, three Crop surveys carried out after the election allow us to compare the voting intentions of respondents from listed and unlisted telephone numbers.
The results of the post election survey do not support the late shift and differential turnout hypotheses. The most likely explanation for the discrepancy between vote intentions as revealed in the polls and the actual vote is to be found in sampling and non response biases. Analysis of item non response as well as survey non response shows that there is a consistent tendency for non respondents to be supporters of the Liberal Party. An analysis of sampling frame biases also shows that Liberal supporters are likely to be under sampled. Finally, adjustment weighting also tends to increase the bias against the Liberal Party.
It is pointed out that these biases are not specific to the Quebec situation and are likely to increase with demographic and technological changes.
Contents
1.0 Context of the study
2.0 Methodology
3.0 The first hypothesis : the electorate moved
4.0 Was it non response?
    4.1 Item non response
    4.2 Survey non response
        4.2.1 Hard to reach households and individuals
        4.2.2 Non cooperative households or individuals
        4.2.3 Non respondents
5.0 Sampling frame issues
    5.1 Unlisted telephone numbers
    5.2 Doubly listed phone numbers
    5.3 Collective households
    5.4 Sampling frames and weighting
6.0 Conclusion
The general purpose of this article is to trace the possible effects of non response on the
estimation of vote intentions. The case of the election held in November 1998 in Quebec,
Canada, is examined.
During the 30 day electoral campaign, 17 polls were published by the media. Only three of
these put the Quebec Liberal Party, a center-right federalist party, ahead of the Parti
Québécois, a center-left party dedicated to Quebec sovereignty. The last six polls of the
campaign gave, on average, a 5-point lead to the Parti Québécois over the Liberal Party. On
election day, it turned out, however, that the Parti Québécois had been outvoted by the Liberal
Party (44 p. cent of the vote vs. 43 p. cent).
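As a small worked illustration of the size of the miss, the figures above can be put on a common footing by comparing the lead of the Parti Québécois over the Liberal Party in the polls with its lead in the actual vote. The function name is ours, and the 5-point figure is the rounded average of the last six polls quoted above, so the result is only approximate.

```python
def lead_error(poll_lead, vote_a, vote_b):
    """Poll lead of party A over party B minus the actual lead, in points."""
    actual_lead = vote_a - vote_b
    return poll_lead - actual_lead


# Average PQ lead over the Liberal Party in the last six polls: 5 points.
# Actual vote: PQ 43 p. cent, Liberal Party 44 p. cent.
error_on_lead = lead_error(poll_lead=5, vote_a=43, vote_b=44)
print(error_on_lead)  # 6
```

On these rounded figures, the polls overstated the PQ lead by about six points.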
The first reaction of most pollsters and academics was to attribute this situation to the
electorate (late campaign swing, differential turnout). Other academics argued that the gap
between the polls and the actual outcome of the election could be due to non response and/or
to sampling frame biases. The paper assesses the plausibility of these various explanations.
1.0 Context of the study
Three main areas of research have developed in order to explain discrepancies between the
measures of voting intentions by the polls and the outcomes of the vote. The first area of
research is related to the electorate : either it changed its mind between the time when the
survey was conducted and the vote or turnout is not proportionally distributed among party
supporters. This area of research has driven pollsters to conduct surveys till the end of
electoral campaigns, in order to explore the possibility of late campaign shifts.
Hypotheses attributing discrepancies to late campaign changes in the electorate have been
examined by a number of authors. The 1992 election in Great Britain is one “modern” case
that has drawn attention. Jowell et al. (1993) have shown that moves in the electorate and
differential turnout, together with item non response, could at best explain half of the nine-point
discrepancy between the polls and the vote. A number of authors have examined
similar situations where a discrepancy appeared between polls’ estimates of vote intention and
the outcomes of the vote (Howell and Simms, 1994; Curtice, 1997; Bischoping and Schuman,
1994; Traugott and Price, 1992). The discrepancy seems always to run in the same direction,
that of an under representation of the more conservative vote in the polls. Evidence of
substantial late moves that could explain discrepancies has yet to be found, though shifts
between parties do indeed occur (Jowell et al., 1993). When the examination of all the
available surveys of a campaign shows no move from one week to the next, whether using
traditional mean difference or time series analysis, it is unlikely that a shift occurred in the last
few days (Erikson and Wlezien, 1999) unless an important event could explain such a shift.
Differential turnout is another hypothesis that has been proposed to explain discrepancies. The
reported intention to participate in the vote and the actual participation of survey respondents
are generally higher than those of the general population. Some have argued that there is a
tendency to over report having voted since voting is socially desirable. Others have proposed
that respondents to surveys do in fact participate more in the vote for a number of reasons :
they are more socially integrated or participation in a survey stimulates voting behavior
(Granberg and Holmberg, 1992; Traugott and Katosh, 1979, 1981). Though the existence of
misreporting has been documented in some situations (Traugott and Katosh, 1979, 1981), it is not
clear whether overreporting is proportionally distributed among party supporters (Marsh, 1985;
Jowell, 1993; Curtice and Sparrow, 1997) or not (Traugott and Katosh, 1979, 1981; Presser
and Traugott, 1992).
Another area of research is non response. A long-standing hypothesis proposed to explain
discrepancy between the polls and the vote is item non response, i.e. non response to the voting
intention questions. It has been hypothesized that those who refuse to answer and/or those
who indicate that they don’t know whom they will vote for, are more likely to be conservative.
The hypothesis has been confirmed at least in the U.K. (Jowell et al., 1993; Curtice, 1997;
Curtice and Sparrow, 1997).
Survey non response may also influence estimates of vote outcome. It is customary to
estimate the impact of survey non response, due to not-at-home and refusals, with data about
hard to reach respondents and those living in households where previous refusal has been
recorded. For this type of study to be carried out, it is necessary that a certain response rate
be reached, i.e. that a substantial number of attempts at reaching phone numbers in the
sampling frame and at refusal conversion be carried out. Studies have consistently found that
harder to reach respondents have specific characteristics in terms of demography (Triplett,
1998) and political attitude, i.e. that conservative voters are harder to reach (Traugott, 1987;
Lau, 1994; Curtice and Sparrow, 1997; Curtice, 1997; Bolstein, 1991). Furthermore,
respondents who come from households where a refusal has been recorded also have specific
characteristics i.e. they are more likely to be women (Triplett, 1998) and conservative (Curtice
and Sparrow, 1997). The latter authors indicate that the more unpopular the Conservative
party, the stronger the propensity of conservative respondents to refuse to indicate their vote
intention. The more substantial survey non response is, the more substantial the likely bias
against conservative vote. It is possible however that the bias is related to social desirability,
thus varying between countries and with historical periods. Since surveys that use quotas
usually have lower response rates, they are more likely to under represent the conservative
vote (Curtice and Sparrow, 1997). Lau (1994) found no relationship between the size of
samples and the quality of estimation of the vote but reported a correlation with a proxy, the
number of days in the field. Vachon, Durand and Blais (1999) found that, after controlling for
the size of samples, a relationship still existed between efforts made in order to increase the
representativeness of the samples, and therefore the response rates, and the mean error as well
as the variance in prediction of voting intentions.
Estimates based on hard to reach respondents and on aggregate survey data are not sufficient.
It is important to also examine non respondents as such, i.e. those who were never reached or
convinced to cooperate. The data indicate that they are more likely to be non voters (Bolstein,
1991; Marsh, 1985; Granberg and Holmberg, 1992) or to vote more for the Republican party
(Bolstein, 1991).
The third area of research is related to sampling frames. In theory, sampling frames should
make it possible to represent the whole population of electors. A number of issues have been raised in
this area, related to data collection modes. In North America, surveys conducted by telephone
have spread, becoming the standard way to conduct surveys of the general population. A
number of reasons explain this situation, the most obvious being the good coverage of all
households by telephone and the low density of population. Meanwhile in Europe, quotas
related to age, sex and occupation have been widely used till recently and data collection is
often conducted using personal interviews either at people’s homes or at street corners. If we
concentrate on surveys conducted by telephone, one of the first coverage problems is related to
unlisted phone numbers. Households with unlisted phone numbers have specific
characteristics : their members seem less likely to vote (Bolstein, 1991) and, in Great Britain,
more likely to be supporters of the Labour party (Curtice, 1997). Frames based on random digit
dialling (RDD) make it possible to deal with this problem, but since households with unlisted phone numbers are
considered to be less cooperative (Drew, Choudhry and Hunter, 1988; Traugott, Groves and
Lepkowski, 1987), pollsters have a tendency to rely on list based frames.
New problems have appeared in recent years with the proliferation of households with multiple
phone lines and phone numbers. These households could be over represented in the sampling
frames, particularly those using RDD based frames. Triplett (1998) notes a phenomenon that
could compensate for this higher probability of selection, namely the fact that these households
seem harder to reach.
The impact of an increase in the population residing in collective households has to be
assessed. Their absence among survey respondents has been mentioned by Converse and
Traugott (1986). Finally, pollsters will have to face the problems associated with the
increasing number of individuals who have only a mobile phone. These phone numbers are
not in the sampling frames at present and, if they were, these respondents might be very
reluctant to bear the cost of the call.
Weighting is rarely mentioned in the literature as a possible culprit for discrepancies. Traugott
(1987) reports that weighting procedures vary widely among pollsters. Jowell et al. (1993)
mention that weighting may have an impact only if weight variables are correlated with the
vote. Such a situation may occur in some instances: some minority groups (Blacks or
Hispanics in the United States, immigrants in various countries) may be harder to
reach and may vote differently.
This paper relies on multiple sources of data to test hypotheses based on the three main areas
of research that could explain the systematic discrepancy between the polls and the vote found
in the Quebec 1998 general election.
2.0 Methodology
Since the first hypothesis was that a substantial number of voters had moved during the last
days of the campaign, the first step was to conduct a post election survey among pre election
respondents in order to determine whether they had in fact voted and for whom. The
cooperation of two pollsters, Createc and Crop, allowed us to conduct a survey of fifteen
hundred pre election respondents [1], using a non proportional stratified sample in order to over
represent non disclosers and supporters of third parties. The preliminary results of this survey
have been disclosed elsewhere (Durand & Blais, 1999; Durand, Blais & Vachon, 1999) and
will be presented succinctly here.
[1] Only French-speaking respondents were interviewed because it is in that group (which represents 85% of the electorate) that movement of opinion was assumed to have taken place. Non French-speaking voters overwhelmingly support the Liberal party.
A second source of information comes from one pollster, Crop, who provided us with all the
data from the surveys conducted during the electoral campaign, including the administrative
data base of the surveys, which provides detailed information (time of call, result of call,
interviewer, etc.) on all the attempts that were made in order to reach a household and
complete an interview. Since up to 25 attempts had been made to reach a phone number and
up to two attempts to convert initial refusals into completed interviews, it is possible to
compare the vote intention of those who were harder to reach and/or who had previously
refused to answer the survey with vote intentions in the rest of the sample.
One additional source of information comes from the published polls of the campaign (17 polls
and six pollsters). We could also rely on two studies (Statmedia) conducted in Quebec in
June 1997 and June 1998 on the use and listing of telephone lines. Furthermore, Crop
provided us with the results of some of the tests on unlisted phone numbers that it carried out
in the months following the election.
Finally, we undertook a study to estimate the vote of collective households in a sample of
constituencies. For each selected constituency, we asked the MP’s personnel to identify the
collective households present in their constituency and the voting section in which they were
located. Information was gathered on these collective households (number of residents,
proportion with a private telephone line, proportion of registered voters, likely proportion of
voters) as well as on the outcome of the election in the constituencies where these households
were located.
3.0 The first hypothesis : the electorate moved
The first “easy” explanation for the discrepancy between pollsters’ estimation of the vote and
the actual outcome is that the electorate moved: Polls had not been conducted late enough in
the campaign and voters changed their minds or decided to stay home.
This interpretation can be tested in two different ways. The first is the post election poll
conducted during the week following the election. This poll has shown that there was no late
campaign swing in favor of the Quebec Liberal Party. In fact, there had been movement [2]
between all political parties during the last week of the campaign but the net effect of these
movements was slightly in favor of the Parti Québécois. In addition, supporters of the Parti
Québécois voted in greater proportion than supporters of the other two parties, the Liberal
Party and Action Démocratique du Québec (ADQ), a third party that finally got 12 p. cent of
the vote (14% among French-speaking voters). As a consequence, as shown in Table 1, the
overestimation of the Parti Québécois vote is even more substantial in the post election than in
the pre election polls.
[2] Thirteen p. cent of those who declared both their vote intention and their actual vote changed their mind between the time of the pre-election survey and election day. This figure may be compared with figures of 5%, 7% and 10% for three different surveys presented by Jowell et al. (1993).
Insert table 1
A second source of information on this issue is provided by the polls published during the
campaign. These polls permit a time-series analysis of the evolution of vote intentions. A
number of possible models were tested in order to estimate the model that best represents the
results from the various surveys and gives a good forecast of the actual vote. The model that
performs the best is one of stable vote intention throughout the four-week campaign, except
for a small increase in support for the ADQ after the televised debate held at mid-campaign.
These two tests lead to the same conclusion and confirm previous research (Jowell et al., 1993;
Curtice, 199; Erikson and Wlezien, 1999): no late campaign swing occurred.
4.0 Was it non response?
Two types of non response are examined in order to trace a possible impact on estimates of
vote intention. Item non response refers to the fact that some respondents do not indicate
which party they intend to vote for. Survey non response refers to the fact that some
individuals are not interviewed either because they are not at home (or do not answer) when
the survey firm calls or because they refuse to answer the survey.
4.1 Item non response
The first question here is whether non disclosers, i.e., those who answer surveys but either say
that they don’t know whom they will vote for or refuse to provide the information, vote
differently from disclosers.
A first source of information is the post election survey: How did non disclosers to the pre
election polls vote? The results, presented in Table 2, show that among those who had said
they did not know or had refused to tell how they would vote, an equal proportion voted for
the two main parties. Item non response must therefore be ruled out as the source of the
discrepancy between the polls and the outcome since Quebec pollsters allocate 60% of non
disclosers to the Liberal party, 30% to the Parti Québécois and 10% to ADQ. The post
election poll suggests that this allocation may be slightly too generous for the Liberal party. It
is not because of the non disclosers that the Liberal vote was underestimated in the polls.
Insert table 2
Another source of information leads to the same verdict. Since pollsters ask a leaning
question [3] of those who do not divulge their preference in the initial vote intention question, it
is interesting to examine the answers given by the leaners. The data base from Crop
comprising three regular surveys conducted during the campaign shows that the proportion of
non disclosers to the first question was stable throughout the campaign (as can be seen in
Table 3): the proportion of “don’t knows” varies from 9 to 12 p. cent, the proportion of
refusals from two to four p. cent and the proportion of those who say they will not vote or will
spoil their ballot is two to three p. cent.
[3] The leaning question was put to all those who responded to the first question on vote intention that they did not know how they would vote, that they would not vote or that they would spoil their ballot as well as to those who refused to answer. These respondents were asked which party they were leaning toward.
Table 3 shows that, among initial non disclosers, 45 to 49 p. cent maintained in the follow up
leaning question that they did not know whom they would vote for, this even in the last poll of
the campaign, that from 14 to 20 p. cent refused to declare their intention and from nine to 12
p. cent said that they would not vote or would spoil their ballot. Only 26 p. cent of non
disclosers to the first question indicated which party they were inclined to support when asked
the leaning question. These 26 p. cent turn out to be slightly more favorable to the Liberal
Party than those who revealed their vote intention in the initial question.
Insert table 3
In view of this information, it is possible to conclude that the second hypothesis, related to the
vote of non disclosers, is validated. Previous research (Jowell et al., 1993; Curtice, 1997;
Curtice and Sparrow, 1997) is confirmed: Non disclosers are slightly more inclined to vote for a
more conservative party, in this case the Quebec Liberal Party. However, this is taken into
account by Quebec pollsters who attribute 60% of the vote of non disclosers to the Quebec
Liberal party. The culprit has to be elsewhere.
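The 60/30/10 allocation rule described in this section can be sketched as follows. This is only an illustration of the arithmetic: the function name and the raw shares are hypothetical and are not taken from any published poll, and we assume the rule is applied to the whole block of respondents who name no party.

```python
# Allocation of non disclosers reported in the text for Quebec pollsters.
ALLOCATION = {"PLQ": 0.60, "PQ": 0.30, "ADQ": 0.10}


def allocate_non_disclosers(raw_shares, non_discloser_share, allocation=ALLOCATION):
    """Add each party's portion of the non disclosers to its raw share.

    raw_shares: party -> share of the whole sample that named the party
    non_discloser_share: share of the whole sample that named no party
    """
    return {
        party: raw_shares[party] + non_discloser_share * allocation[party]
        for party in raw_shares
    }


# Hypothetical raw shares (NOT taken from any published poll).
raw = {"PQ": 0.40, "PLQ": 0.35, "ADQ": 0.10}
adjusted = allocate_non_disclosers(raw, non_discloser_share=0.15)
for party, share in adjusted.items():
    print(party, round(share, 3))
```

With these made-up inputs, the PQ lead shrinks from five points to half a point, which shows how strongly the allocation already works in favor of the Liberal Party.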
4.2 Survey non response
Survey non response comprises two components: a) households or individuals who cannot be
reached and b) households or individuals who refuse to answer the survey.
4.2.1 Hard to reach households and individuals
Some pollsters and academics, in concordance with consistent findings of survey research
(Traugott, 1987; Lau, 1994; Curtice and Sparrow, 1997; Curtice, 1997; Bolstein, 1991), have
proposed the hypothesis that hard to reach individuals are more likely to support the Quebec
Liberal party. Our analysis, presented in tables 4 and 5, shows no relation between the number
of calls necessary to reach a household or to complete an interview and answers to the vote
intention question. None of the tests comes close to significance, this despite the fact that the total
number of respondents is more than three thousand and up to 25 calls had been made in order
to complete interviews.
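The text does not say which tests were used. As a minimal sketch, assuming a Pearson chi-square test of independence between a grouping of the number of calls and vote intention, the statistic could be computed as below. The counts are hypothetical (they are not the Crop data) and were chosen so that vote intention is exactly the same in every call group.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    table: list of rows, each row a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts (NOT the Crop data).
# Rows: 1-2 calls, 3-5 calls, 6 or more calls. Columns: PQ, Liberal, ADQ.
counts = [
    [400, 350, 100],
    [240, 210, 60],
    [80, 70, 20],
]
stat, dof = chi_square_independence(counts)
print(stat, dof)
```

The statistic would then be compared with the chi-square distribution for the given degrees of freedom; a value near zero, as in this constructed example, means no relation between the number of calls and vote intention.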
Insert tables 4 and 5
This result is surprising in view of previous research and given the fact that there is a
relationship between the number of calls and socio-demographic characteristics that are linked
to vote intention. In this study, the number of calls necessary to reach a phone number or to
complete an interview is higher among younger respondents, among full-time workers and
students and among residents of Montreal suburbs. The number of calls necessary to complete
an interview is also higher in the Montreal region, among the better educated and non French
speakers. It is possible that the diverse characteristics of hard to reach individuals
counterbalance each other and that the various impacts on voting intentions cancel out.
4.2.2 Non cooperative households or individuals
As with hard to reach respondents, it is possible to ascertain the potential bias associated with
refusals by examining the vote intention of less cooperative respondents. Close to 12 p. cent
of respondents in the Crop data base initially refused to give an interview or belong to a
household where an interview had been refused.
As can be seen in Table 6, respondents/households where a refusal had taken place tended to
refuse to reveal their vote intention. However, there is also a tendency for supporters of the
Liberal Party to come from households where a refusal had been recorded. Liberal voters
were not, however, more numerous among those who had personally refused to be
interviewed and among those who had refused more than once.
Insert table 6
A similar pattern emerges from the post election survey: Those who refused to indicate how
they had voted were more numerous in households where a refusal had been recorded (in the
pre election poll). Furthermore, respondents from these households are more likely to have
voted for the Liberal Party (43% vs. 28% in the whole sample) while respondents who had
personally refused to answer the survey were more likely to have voted for the Parti
Québécois. Because the former group is twice as large as the latter, the most substantial bias
is an under representation of the Liberal party. These results confirm previous research by
Curtice and Sparrow (1997).
4.2.3 Non respondents
It is also possible that non respondents (never reached or not converted refusals) differ from
respondents who were hard to reach or who initially refused to answer but finally accepted.
If we look at the relationship between language spoken at home and vote intentions, we get
further clues as to the characteristics of non respondents. It appears that the vote intentions of
the non French-speaking respondents that have been reached by Crop do not reflect the
standard estimation made by political scientists. It is generally estimated that close to 90 p.
cent of non French-speaking Quebeckers vote for the Liberal Party. Table 8 shows that, in the
actual data base comprising the three surveys, 77 p. cent of English-speaking respondents and
64 p. cent of non French/non English-speaking respondents (respectively 85% and 71% of
those who indicate their intention) intend to vote for the Liberal Party.
Insert table 7
Insert table 8
Since there is a differential response rate according to language spoken at home, non French
speaking respondents being 13 p. cent of the final sample while they are 17 p. cent of the
general population according to the Canada 1996 Census, the consequence of adjustment
weighting is to increase the bias against the Liberal party.
We may therefore conclude that survey non response may be partly responsible for
discrepancies in estimation of vote intention, non respondents being more likely to favor the
Liberal party.
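One possible reading of the weighting argument in this section can be put in numbers. The sketch below computes the weight that brings non French speakers from 13 to 17 p. cent of the sample, and the shortfall in the overall Liberal share that remains if the respondents being weighted up are less Liberal than the 90 p. cent benchmark. The function names are ours, and treating the shares among those who state an intention (85% and 71%) as comparable to the benchmark is an assumption.

```python
def poststratification_weight(population_share, sample_share):
    """Weight that brings a group's sample share to its census share."""
    return population_share / sample_share


def liberal_shortfall(population_share, observed_liberal, benchmark_liberal):
    """Overall Liberal share lost (as a proportion) because the weighted
    respondents are less Liberal than the benchmark for their group."""
    return population_share * (benchmark_liberal - observed_liberal)


# Figures quoted in the text.
NON_FRENCH_POP = 0.17     # share of the general population (1996 Census)
NON_FRENCH_SAMPLE = 0.13  # share of the final sample
BENCHMARK = 0.90          # usual estimate of Liberal support in this group

weight = poststratification_weight(NON_FRENCH_POP, NON_FRENCH_SAMPLE)
# Liberal share among those who state an intention: 85% (English-speaking)
# and 71% (non French/non English-speaking).
low = liberal_shortfall(NON_FRENCH_POP, 0.85, BENCHMARK)
high = liberal_shortfall(NON_FRENCH_POP, 0.71, BENCHMARK)
print(round(weight, 2), round(100 * low, 2), round(100 * high, 2))
```

Under these assumptions each non French-speaking respondent counts for about 1.3 respondents after weighting, and the remaining shortfall for the Liberal Party is on the order of one to three points, depending on which of the two language groups dominates.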
5.0 Sampling frame issues
Three types of bias have been identified, some specific to Quebec pollsters, others generic, at
least in North America and in most industrialized countries. Sampling frames used by most
Quebec pollsters during the electoral campaign did not include unlisted telephone numbers.
The two other biases concern households with more than one listed telephone number and
collective households. We examine the possible impact of these biases.
5.1 Unlisted telephone numbers
The Statmedia 1998 study has shown that the proportion of respondents who indicate that
their telephone number is not listed or who are not sure whether it is listed is 12 p. cent in
Quebec, 17 p. cent in the Montreal region. This information, presented in table 9, is
confirmed by four Crop surveys conducted during the months following the election. Both
sources conclude that non listed telephone numbers are more likely to belong to non French
speaking Quebeckers and to younger people (less than 25 years old). This could explain in
part the discrepancy between Census Canada and the actual estimation of the proportion of
non French speaking and young people in the samples (see Table 11).
Crop provided us with the results of the surveys it conducted in February, April and August
1999. It included unlisted phone numbers [4] in its sample in order to evaluate the possible
impact of their former exclusion. Crop also asked respondents whether their telephone number
was listed in the directory. From 16% to 22% of respondents came from “unlisted” phone
numbers and 11% indicated that their telephone number was not listed or that they did not
know if it was. In two of the three polls, “unlisted” respondents appeared more likely to intend
to vote for the Liberal Party and less likely to favor the Parti Québécois. Furthermore,
respondents who indicated that their telephone number is not listed are less likely to favor the
Parti Québécois.
[4] The RDD generated sample is originally divided into two parts, listed and unlisted. The unlisted part is composed of telephone numbers that were not found in the transcribed telephone directories. A majority of respondents coming from the “unlisted” part of the original sample indicated that their telephone number was listed. This situation may be attributed to the delay between the publication of the directory and its integration in the data base from which samples are generated.
We can thus conclude that in the Quebec case, contrary to England (Curtice, 1997), excluding
unlisted phone numbers has contributed to the bias against the Quebec Liberal party.
5.2 Doubly listed phone numbers
A recent phenomenon, even more prevalent with Internet access becoming more popular, is
that a number of households may be reached at more than one telephone number. In some
families, there is a telephone number for the kids and one for the parents. The Statmedia study
carried out in 1997 showed that, at that time, close to 11 p. cent of the 3008 respondents to a
survey on media consumption could be reached at more than one telephone number. Table 10
shows that they were 15 p. cent among the more educated, 19 p. cent among 15 to 24 year
olds as opposed to four p. cent among those 65 years old and over, 20 p. cent among those who
live in households where three or more members are 15 years old and over and 20% among
those whose principal occupation is studying. Close to one in four households with a high
income (more than 80,000$CAN) could be reached at more than one telephone number.
Insert table 9
We may therefore conclude that some bias may be due to the higher selection rate in
households with more than one telephone number, though it is not possible to specify the
impact of this bias on estimates of vote intention.
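The size of the over representation can be illustrated with a standard inverse-probability correction, in which a household reachable at k numbers receives a weight of 1/k. This is a sketch under assumptions that are ours, not the authors': that selection probability is proportional to the number of lines, and that every multi-line household has exactly two lines.

```python
def true_share_multi_line(sample_share, lines_per_multi_household=2):
    """Population share of multi-line households implied by their sample share,
    when selection probability is proportional to the number of lines."""
    weighted_multi = sample_share / lines_per_multi_household  # weight 1/k
    weighted_single = 1.0 - sample_share                       # weight 1
    return weighted_multi / (weighted_multi + weighted_single)


# Close to 11 p. cent of the 1997 Statmedia respondents could be reached
# at more than one telephone number.
share = true_share_multi_line(0.11)
print(round(100 * share, 1))  # 5.8
```

Under these assumptions, respondents reachable at two numbers would make up about 11 p. cent of a sample while representing a little under six p. cent of households, so the groups listed above (the young, students, high income households) would be over represented accordingly.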
5.3 Collective households
Changes in the demographic composition of the population may also have played a role in the
estimation of vote intention during the 1998 electoral campaign in Quebec. According to the
Census, the Quebec population aged 65 years and older has risen from 13% of the total
population over 18 in 1986 to 17% in 1996. About 10% of those over 65 live in collective
households (e.g. old age pensioners, physically disabled people, members of religious
communities); their number increased by 25 p. cent from 1986 to 1996. People aged 65 years
and older constituted 71% of the people living in collective households in 1996. These people
are included in the sampling frames to a certain degree, i.e., when they have a private
telephone line.
We have tried to estimate the vote of collective households using data from the Census and
information gathered from a sample of constituencies (one out of 10).
Insert table 10
It is estimated that about 48% of the people living in collective households have access to a
private telephone number and may therefore be reached by survey companies; the proportion
registered to vote is estimated at 70% and the overall proportion of voters at about 43%. A
conservative evaluation for 1998 gives an estimate of 52,000 voters from collective
households, 1.3% of all voters.
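As a rough consistency check, the figures quoted above can be combined in a few lines. This is only a back-of-the-envelope sketch: it assumes that the 43% voting proportion applies to all residents of collective households, and the derived quantities (variable names included) are implied values, not numbers reported in the study.

```python
# Figures quoted in the text
voters_collective = 52_000     # estimated voters living in collective households
share_of_all_voters = 0.013    # 1.3% of all voters
registered_rate = 0.70         # proportion registered to vote
voting_rate = 0.43             # overall proportion who voted

# Quantities implied by those figures (derived here, not reported in the paper)
implied_residents = voters_collective / voting_rate
implied_total_voters = voters_collective / share_of_all_voters
turnout_among_registered = voting_rate / registered_rate

print(f"implied collective-household residents: {implied_residents:,.0f}")
print(f"implied total number of voters:         {implied_total_voters:,.0f}")
print(f"implied turnout among those registered: {turnout_among_registered:.1%}")
```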
Insert Figure 1.
These voters differ in their voting behavior from other electors in the same constituencies.
Figure 1 shows that, in the polling sections where more than 40% of registered electors are
residents of collective households, the participation rate is 11 percentage points lower than in
the other polling sections. In these same sections, the proportion who vote for the Liberal
Party is 20 percentage points higher. Even though about half of the voters living in collective
households may be reached by pollsters, the adjustment weighting used by these companies does
not take this segment of the voting population into account, since it is based on Statistics
Canada's Census of private households.
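The order of magnitude of the resulting coverage bias can be sketched with the usual identity: the estimate from the covered population minus the full-population value equals the uncovered share times the difference between covered and uncovered groups. The Liberal shares used below are assumptions made for illustration (a 20-point gap borrowed from the polling-section comparison, which is not a direct measure of individual voters), and the function name is hypothetical.

```python
def coverage_bias(share_uncovered, p_covered, p_uncovered):
    """Covered-population estimate minus full-population value, as a proportion."""
    return share_uncovered * (p_covered - p_uncovered)


share_collective = 0.013       # collective-household voters among all voters (text)
share_unreachable = 1 - 0.48   # those without a private telephone line (text)
p_lib_covered = 0.44           # assumption: Liberal share among covered voters
p_lib_uncovered = 0.64         # assumption: 20 points higher among uncovered voters

bias = coverage_bias(share_collective * share_unreachable, p_lib_covered, p_lib_uncovered)
bias_points = 100 * bias

print(f"bias on the Liberal share: {bias_points:+.2f} percentage points")
```

Under these assumptions the effect is a small fraction of a percentage point. It leaves aside the separate problem that reachable residents of collective households are not reflected in the Census-based weighting targets.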
5.4 Sampling frames and weighting
All the biases related to the sampling frame tend to over represent Parti Québécois supporters
and to under represent supporters of the Liberal Party. Can these biases be corrected by
adjustment weighting on the basis of Census data?
Adjustment weighting based on the Census has two flaws. First, Census data may be
somewhat outdated. Second, the population of voters is not distributed like the general
population: for example, younger people are harder to reach but they are also less likely to
vote than older people. Therefore, weighting according to age groups may overrepresent
people who will not vote. Some immigrants included in the Census do not have the
right to vote. Members of collective households have the right to vote and do vote but they
are not included in the Census data on private households used by pollsters. Furthermore, we
have shown that respondents may differ from non respondents of the same age/language/sex
group.
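To make the mechanics of adjustment weighting concrete, the sketch below computes age-group weights as the ratio of the target share to the sample share, using the unweighted counts and the Census-adjusted percentages reported in Table 11. It is a simplification: the weighting used by pollsters combines age with other characteristics, whereas only age is used here, and the variable names are ours.

```python
# Unweighted sample counts by age group (Table 11, "Non weighted" column)
sample_counts = {
    "18-24": 282, "25-34": 541, "35-44": 769,
    "45-54": 588, "55-64": 350, "65+": 484,
}
# Target shares (Table 11, "Adjusted w. Census data" column)
census_share = {
    "18-24": 0.111, "25-34": 0.211, "35-44": 0.230,
    "45-54": 0.179, "55-64": 0.112, "65+": 0.157,
}

n = sum(sample_counts.values())

# Weight for each group: target share divided by observed sample share
weights = {g: census_share[g] / (sample_counts[g] / n) for g in sample_counts}

for group, w in weights.items():
    print(f"{group:>5}: weight = {w:.2f}")
```

The youngest group receives a weight above one, which is precisely the case discussed above: respondents who are harder to reach are weighted up even though they are also less likely to vote.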
Does weighting based on probability of selection and adjustment for differential response rates
provide a more accurate description of vote intentions? The short answer is no, as shown in
Table 11. It does not improve the estimation of vote intentions; the discrepancy is even
slightly larger than with adjustment weighting.
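The comparison can be summarized by computing the PQ lead over the Liberal Party under each weighting scheme and setting it against the actual result (PQ 43%, Liberal 44%). The sketch below uses the percentages from Table 11 as they stand. Since those percentages include undecided respondents and refusals while the election result does not, the error values are indicative only; the dictionary keys are our own labels.

```python
# PQ and Liberal percentages under each weighting scheme (Table 11)
shares = {
    "unweighted": {"PQ": 44.5, "Lib": 33.7},
    "census_adjusted": {"PQ": 43.5, "Lib": 35.5},
    "inverse_selection_response": {"PQ": 44.7, "Lib": 33.7},
}
actual = {"PQ": 43.0, "Lib": 44.0}  # election result


def lead(result):
    """PQ lead over the Liberal Party, in percentage points."""
    return round(result["PQ"] - result["Lib"], 1)


actual_lead = lead(actual)
errors = {name: round(lead(s) - actual_lead, 1) for name, s in shares.items()}

for name, err in errors.items():
    print(f"{name:<28} lead = {lead(shares[name]):+5.1f}  error on lead = {err:+5.1f}")
```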
6.0 Conclusion
In this study, we have examined the possible explanations for the discrepancy between
estimates of vote intentions and the actual outcome of the 1998 Quebec election in which a
systematic bias in favor of the Parti Québécois had appeared in the polls.
We have shown that the gap cannot be imputed to a late campaign shift in vote intentions nor to
an inadequate allocation of the vote of non disclosers. The sources of the discrepancy appear
to be survey non response and sampling frames. Most of the time the biases appear to be
relatively small but their overall effect is substantial because they are all in the same direction.
The study indicates that pollsters will have to devote greater effort to improving the quality of
their sampling frames and weighting procedures and to increasing their response rates, more
specifically among the non French-speaking population, in order to come up with more reliable
measures of party support.
Insert table 11
It should be pointed out that the problems we have identified are not specific to Quebec.
Similar situations have occurred in a number of other elections, usually with the same direction
of bias: the under representation of the more conservative vote. It is perplexing to note that the
problem occurred despite the fact that the methodology used by Quebec pollsters is quite
orthodox: random samples with call backs are used and response rates are generally around
60%.
References
Bischoping, K. and Schuman, H. (1994). Pens and Polls in Nicaragua : An analysis of the 1990
Preelection Surveys. American Journal of Political Science, 36 (2), 331-350.
Bolstein, R. (1991). Comparison of the likelihood to vote among preelection poll respondents
and nonrespondents. Public Opinion Quarterly, 55, 648-650.
Converse, Ph. E. and Traugott, M. W. (1986). Assessing the Accuracy of Polls and Surveys,
Science, 234, 1094-1098.
Curtice, J. (1997). So How Well Did They Do? The Polls in the 1997 Election. Journal of the
Market Research Society, 39 (3), 449-461.
Curtice, J. and Sparrow, N. (1997). How Accurate Are Traditional Quota Opinion Polls?
Journal of the Market Research Society, 39 (3), 433-448.
Drew, J. D., Choudhry, G. H. and Hunter, L. A. (1988). Nonresponse issues in government
telephone surveys. In Groves, R. M., Biemer, P. P., Lyberg, L. E., Massey, J. T., Nicholls II,
W. L. and Waksberg, J., Telephone Survey Methodology, New York: Wiley, 233-246.
Durand, C. and A. Blais (1999). Why did the polls go wrong in the 1998 Quebec election : the
answer from post election polls, BMS, 62, 43-47.
Durand, C., Blais, A. and S. Vachon (1999). Why did the polls go wrong in the 1998 Quebec
election? Paper presented at the 54th Annual Conference of the American Association for
Public Opinion Research (AAPOR), St. Pete Beach, Florida, May 13-16, 1999.
Erikson, R. E. and Wlezien, C. (1999). Presidential Polls as Time Series : The case of 1996,
Public Opinion Quarterly, 63(2), 163-177.
Granberg, D and Holmberg, S. (1992) The Hawthorne Effect in Election Studies : the Impact
of Survey Participation on Voting. British Journal of Political Science, 22 (2), 240-247.
Howell, S.E. and Sims, R. T. (1994). Survey research and racially charged elections : The case
of David Duke in Louisiana. Political Behavior, 16 (2), 219-236.
Jowell, R., Hedges, B., Lynn, P., Farrant, G. and Heath, A. (1993). The Polls - a Review; The
1992 British Election : the Failure of the Polls. Public Opinion Quarterly, 57, 238-263.
Lau, R.R. (1994). An Analysis of the Accuracy of "Trial Heat" Polls During The 1992
Presidential Election. Public Opinion Quarterly, 58, 2-20.
Marsh, C. (1985). Prediction of Voting Behavior from a Pre-election Survey, Political Studies,
33 (4), 642-648.
Presser, S. and Traugott, M. W. (1992). Little White Lies and Social Science Models :
Correlated Response Errors in a Panel Study of Voting, Public Opinion Quarterly, 56 (1), 77-86.
Traugott, M. W. (1987). The Importance of Persistence in Respondent Selection for
Pre-Election Surveys, Public Opinion Quarterly, 51 (1), 48-57.
Traugott, M. W., Groves, R. M. and Lepkowski, J. M. (1987). Using Dual Frame Designs to
Reduce Nonresponse in Telephone Surveys, Public Opinion Quarterly, 51 (4), 522-539.
Traugott, M. W. and Katosh, J. P. (1981). The Consequences of Validated and Self-Reported
Voting Measures, Public Opinion Quarterly, 45, 519-535.
Traugott, M. W. and Katosh, J. P. (1979). Response Validity in Surveys of Voting Behavior,
Public Opinion Quarterly, 43 (3), 359-377.
Traugott, M. W. and Price, V. (1992). The polls - a review. Exit polls in the 1989 Virginia
gubernatorial race : where did they go wrong? Public Opinion Quarterly, 56, 245-253.
Triplett, T. (1998). What is Gained from Additional Call Attempts and Refusal Conversion and
What are the Cost Implications? Survey Research Center, University of Maryland, October
1998.
Vachon, S., Durand, C. and Blais, A. (1999). Les sondages moins rigoureux sont-ils moins
fiables? Canadian Public Policy/Analyse de politiques, 25 (4), in press.
Table 1
Vote intention, reported vote and election results among French-speaking respondents

Estimates of voter intention     N      P.Q.   Lib.   ADQ    Other parties   Non disclosers   Will not vote - will
and actual vote                                              (+cancel)                        cancel / Did not vote
Pre election vote intention      1483   52%    30%    16%    1%              (11%) attrib.    (7%) withdrawn
Post election reported vote      1483   54%    31%    13%    2%              (8%) attrib.     (12%) withdrawn
Election (estimation of                 50%    35%    14%    1%                               (22%)
French-speaking voters)
TABLE 2.
Reported vote of non disclosers to the pre-election poll

                                     Reported vote
                                     P.Q.   Lib.   ADQ    Other parties   Refusals   Did not vote
                                                          + cancel
Will cancel/will not vote (N=105)    9%     7%     4%     3%              13%        65%
Undecided (N=108)                    22%    27%    11%    6%              21%        13%
Refusal (N=188)                      26%    21%    10%    3%              37%        3%
Table 3
Response to the initial vote intention question and to the leaning question (CROP pre election surveys)

                            Oct 30 - Nov 4                    Nov 6 - 11                        Nov 19 - 23
                            Vote intention   Leaning          Vote intention   Leaning          Vote intention   Leaning
                            Count  %         Count  %         Count  %         Count  %         Count  %         Count  %
ADQ                         293    5.4%      28     4.1%      277    5.2%      10     1.1%      575    10.7%     56     5.9%
Lib.                        2044   38.0%     74     10.8%     1833   34.1%     99     11.1%     1580   29.4%     90     9.5%
PQ                          2236   41.6%     75     11.0%     2302   42.8%     121    13.5%     2199   40.9%     82     8.7%
Other parties               119    2.2%      20     2.9%      74     1.4%      0      0.0%      75     1.4%      8      0.9%
Will cancel, will not vote  108    2.0%      79     11.5%     118    2.2%      80     9.0%      157    2.9%      103    10.9%
Undecided                   466    8.7%      321    46.9%     631    11.7%     448    50.1%     589    11.0%     410    43.4%
Refusal                     112    2.1%      88     12.8%     144    2.7%      135    15.2%     199    3.7%      195    20.7%
Total                       5378   100.0%    686    100.0%    5378   100.0%    893    100.0%    5374   100.0%    945    100.0%

Note: The "N"s correspond to the weighted number of individuals represented in the population divided by 1,000.
Table 4
Number of calls necessary to reach a household and vote intention (first + leaning questions combined)

                        one              two              three            four to six      seven and more
                        Count  %         Count  %         Count  %         Count  %         Count  %
ADQ                     660    7.9%      217    7.5%      108    6.1%      216    8.8%      38     5.7%
Lib.                    2966   35.7%     1088   37.5%     591    33.2%     836    33.9%     238    36.0%
P.Q.                    3635   43.7%     1268   43.7%     763    42.8%     1047   42.5%     301    45.6%
Other parties           171    2.1%      57     2.0%      33     1.9%      17     0.7%      19     2.9%
Will cancel, not vote   111    1.3%      20     0.7%      66     3.7%      54     2.2%      11     1.7%
Undecided               570    6.9%      208    7.1%      160    9.0%      201    8.2%      39     5.9%
Refusal                 204    2.5%      46     1.6%      61     3.4%      93     3.8%      14     2.2%
Total                   8318   100.0%    2904   100.0%    1782   100.0%    2464   100.0%    662    100.0%

Note: The N are weighted according to the 1996 Census and then divided by 1,000. Since three surveys are combined the numbers correspond to three times the population (divided by 1,000).
Table 5
Number of calls necessary to complete an interview and vote intention (first + leaning questions combined)

                        one              two              three            four or five     six to nine      ten and more
                        Count  %         Count  %         Count  %         Count  %         Count  %         Count  %
ADQ                     394    8.5%      263    7.8%      138    5.2%      294    9.7%      125    6.1%      24     6.1%
Lib.                    1665   35.8%     1190   35.5%     950    36.1%     994    32.7%     748    36.4%     173    44.1%
PQ                      2098   45.1%     1492   44.5%     1110   42.1%     1320   43.5%     859    41.8%     135    34.5%
Other parties           52     1.1%      72     2.2%      69     2.6%      41     1.4%      50     2.4%      12     3.1%
Will cancel, not vote   64     1.4%      53     1.6%      44     1.7%      45     1.5%      48     2.3%      9      2.2%
Undecided               303    6.5%      220    6.5%      252    9.6%      233    7.7%      142    6.9%      29     7.3%
Refusal                 79     1.7%      64     1.9%      72     2.7%      110    3.6%      84     4.1%      11     2.7%
Total                   4654   100.0%    3354   100.0%    2635   100.0%    3038   100.0%    2056   100.0%    393    100.0%

Note: The N are weighted according to the 1996 Census and then divided by 1,000. Since three surveys are combined the numbers correspond to three times the population (divided by 1,000).
Table 6
Initial refusal to answer the survey and vote intention (first + leaning questions)

                        No refusal            One household         One selected          Two refusals
                                              refusal converted     respondent refusal    (household or
                                                                    converted             respondent) converted
                        Count   %             Count   %             Count   %             Count   %
ADQ                     1120    7.8%          74      7.9%          36      8.2%          9       2.1%
Lib.                    5034    35.2%         428     45.6%         117     26.9%         140     31.4%
PQ                      6311    44.1%         290     30.9%         190     43.8%         224     50.1%
Other parties           260     1.8%          21      2.2%          8       1.7%          8       1.9%
Will cancel, not vote   243     1.7%          15      1.6%          4       1.0%          0       0.0%
Undecided               1030    7.2%          67      7.1%          32      7.4%          50      11.1%
Refusal                 312     2.2%          44      4.7%          48      11.0%         15      3.4%
Total                   14309   100.0%        939     100.0%        435     100.0%        447     100.0%

Note: The N are weighted according to the 1996 Census and then divided by 1,000. Since three surveys are combined the numbers correspond to three times the population (divided by 1,000).
Table 7
Initial refusal and reported vote (sub sample from post election survey)

                        No refusal            One household         One selected          Two successive
                                              refusal converted     respondent refusal    refusals converted
                                                                    converted
                        Count   %             Count   %             Count   %             Count   %
ADQ                     303     14.0%         4       4.0%          0       0.0%          8       26.3%
Lib.                    534     24.7%         36      35.4%         12      18.6%         4       13.6%
PQ                      850     39.2%         13      13.1%         21      33.0%         9       30.0%
Other parties, cancel   39      1.8%          4       4.1%          0       0.0%          4       13.6%
Refusal                 152     7.0%          36      35.5%         10      16.1%         5       16.5%
Did not vote            288     13.3%         8       7.9%          21      32.3%         0       0.0%
Total                   2167    100.0%        102     100.0%        65      100.0%        31      100.0%

Note: The weighting is based on pre-election weights (Census divided by 1,000).
Table 8
Vote intention and language spoken at home

                        French-speaking       English-speaking      Other language
                        Count   %             Count   %             Count   %
ADQ                     1189    9.0%          17      0.8%          32      4.2%
Lib.                    3602    27.2%         1629    77.3%         489     64.3%
PQ                      6701    50.5%         180     8.5%          134     17.6%
Other parties           177     1.3%          99      4.7%          20      2.7%
Will cancel, not vote   240     1.8%          15      0.7%          8       1.0%
Undecided               999     7.5%          132     6.3%          47      6.2%
Refusal                 354     2.7%          35      1.7%          30      3.9%
Total                   13263   100.0%        2107    100.0%        760     100.0%

Note: The N are weighted according to the 1996 Census and then divided by 1,000. Since three surveys are combined the numbers correspond to three times the population (divided by 1,000).
Table 9
Characteristics of respondents who declare that their telephone number is unlisted or who don't know whether it is listed (Statmedia, 1998)

Montreal urban area     17%
Montreal suburbs        15%
English-speaking        14%
Other language          18%
15-24 years old         18%
Total                   12%
Table 10
Characteristics of respondents living in households with more than one telephone number (Statmedia, 1997)

                                                    More than one phone number
Total                                               11%
College educated                                    15%
15-24 years old                                     19%
65 years old and over                               4%
Three or more 15 years old and over in household    20%
Household income (CAN$80,000 and more)              25%
Individual income (CAN$60,000 and more)             23%
Table 11
Characteristics of the Crop sample, unweighted, adjusted and weighted according to inverse of selection and response rate

                          Non weighted                Adjusted w. Census          Weighted by inverse of
                                                      data (,000 X3)              selection/resp. rates
                                                                                  (,000 X3)
                          N       %                   N       %                   N       %
Vote intention:
  ADQ                     238     7.9                 1239    7.7                 1073    8.0
  PLQ                     1016    33.7                5720    35.5                4549    33.7
  PQ                      1340    44.5                7015    43.5                6033    44.7
  Other parties           52      1.7                 296     1.9                 237     1.8
  Will cancel, not vote   52      1.7                 262     1.6                 214     1.6
  Undecided               232     7.7                 1179    7.3                 1033    7.7
  Refuse to say           84      2.8                 418     2.6                 349     2.6
  Total                   3014    100.0               16130   100.0               13488   100.0
Age groups
  18-24 years             282     9.4                 1794    11.1                1703    12.6
  25-34 years             541     17.9                3405    21.1                2435    18.1
  35-44 years             769     25.5                3711    23.0                3355    24.9
  45-54 years             588     19.5                2879    17.9                2776    20.6
  55-64 years             350     11.6                1811    11.2                1429    10.6
  65 years +              484     16.1                2529    15.7                1789    13.3
Sex
  Men                     1340    44.5                7808    48.4                6211    46.0
  Women                   1674    55.5                8322    51.6                7277    54.0
Language spoken at home
  French                  2628    87.2                13263   82.2                11739   87.0
  English                 288     9.6                 2107    13.1                1233    9.1
  Other language          98      3.3                 760     4.7                 515     3.8