selection and mode effects in risk preference elicitation …ftp.iza.org/dp3321.pdf · 2008. 1....
TRANSCRIPT
IZA DP No. 3321
Selection and Mode Effects in Risk PreferenceElicitation Experiments
Hans-Martin von GaudeckerArthur van SoestErik Wengström
DI
SC
US
SI
ON
PA
PE
R S
ER
IE
S
Forschungsinstitutzur Zukunft der ArbeitInstitute for the Studyof Labor
January 2008
Selection and Mode Effects in
Risk Preference Elicitation Experiments
Hans-Martin von Gaudecker Free University Amsterdam and Netspar
Arthur van Soest
Tilburg University, Netspar and IZA
Erik Wengström University of Copenhagen
Discussion Paper No. 3321 January 2008
IZA
P.O. Box 7240 53072 Bonn
Germany
Phone: +49-228-3894-0 Fax: +49-228-3894-180
E-mail: [email protected]
Any opinions expressed here are those of the author(s) and not those of IZA. Research published in this series may include views on policy, but the institute itself takes no institutional policy positions. The Institute for the Study of Labor (IZA) in Bonn is a local and virtual international research center and a place of communication between science, politics and business. IZA is an independent nonprofit organization supported by Deutsche Post World Net. The center is associated with the University of Bonn and offers a stimulating research environment through its international network, workshops and conferences, data service, project support, research visits and doctoral program. IZA engages in (i) original and internationally competitive research in all fields of labor economics, (ii) development of policy concepts, and (iii) dissemination of research results and concepts to the interested public. IZA Discussion Papers often represent preliminary work and are circulated to encourage discussion. Citation of such a paper should account for its provisional character. A revised version may be available directly from the author.
IZA Discussion Paper No. 3321 January 2008
ABSTRACT
Selection and Mode Effects in Risk Preference Elicitation Experiments*
We combine data from a risk preference elicitation experiment conducted on a representative sample via the Internet with laboratory data on students for the same experiment to investigate effects of implementation mode and of subject pool selection. We find that the frequency of errors in the lab experiment is drastically below that of the representative sample in the Internet experiment, and average risk aversion is also lower. Considering the student-like subsample of the Internet subjects and comparing a traditional lab design with an Internet-like design in the lab gives two ways to decompose these differences into differences due to subject pool selection and differences due to implementation mode. Both lead to the conclusion that the differences are due to selection and not implementation mode. An analysis of the various steps leading to participation or non-participation in the Internet survey leads shows that these processes are selective in selecting subjects who make fewer errors, but do not lead to biased conclusions on risk preferences. These findings point at the usefulness of the Internet survey as an alternative to a student pool in the laboratory if the ambition is to use the experiments to draw inference on a broad population. JEL Classification: C90, D81 Keywords: risk aversion, internet surveys, laboratory experiments Corresponding author: Erik Wengström Department of Economics University of Copenhagen Studiestræde 6 DK-1455 Copenhagen Denmark E-mail: [email protected]
* Financial support from the Dutch National Science Foundation (NWO), the Swedish Institute for Banking Research (Bankforskningsinstitutet), and from the European Union under grant HPRN-CT-2002-00235 (RTNAGE) is gratefully acknowledged. We thank the team of CentERdata, especially Marika Puumala, for their support with the experiments, as well as Morten Lau and Joachim Winter for very helpful comments on the experimental design. The analysis benefitted from comments received at presentations in Mannheim, Copenhagen, Gothenburg, the XIIth FUR conference at LUISS in Rome and the ESA meetings at Nottingham and Tucson. Daniel Kemptner provided able research assistance.
1 Introduction
Recently, there has been an increased interest in eliciting economically important preference
parameters by means of experimental methods (Harrison, Lau, and Williams (2002), Ble-
ichrodt, Pinto, and Wakker (2001), among many others). In this context, researchers are
generally interested in parameters that are valid for the general population and carry over to
situations outside the laboratory setting. There are several reasons why it may be difficult
or impossible to recover such parameters from standard experiments. First, the experimental
design could differ too widely from real-world situations in terms of context, stakes, or sim-
ilar features. The literature investigating this type of effect is reviewed in Harrison and List
(2004) and Levitt and List (2007). Second, the subjects taking part in the experiment may
not resemble the population of interest. There has been a growing concern that the standard
recruitment procedure – an experimenter inviting college students via emails or posters – may
restrict socio-demographic variation too severely as to allow meaningful inference on the broad
population of interest. Spurred by Harrison, Lau, and Williams (2002), this issue has been
addressed in several recent field studies. However, there is a second type of selection effect
that also applies if recruitment is broader than among students and is largely out of control
of the experimenter: Participation in experiments is voluntary and may be selective, so that
the participating subjects may differ from the sampling population in relevant dimensions.
In this paper, we address both types of selection effects.
Recent years have witnessed different approaches to enhance demographic variation in
experimental situations. One rather laborious possibility is to take the laboratory from the
university to the population of interest; for one of many examples along these lines see Har-
rison, List, and Towe (2007). Sample sizes usually do not exceed those typically encountered
in the laboratory which may pose a problem in accounting for demographic heterogeneity.
A similar strategy that has become available recently is to integrate experiments into ex-
isting household surveys; see for example the pioneering work by Fehr et al. (2003) and
Dohmen et al. (2005). Major advantages of this approach are that careful sampling frames
are employed and that a lot of background information on participants is available. Until
now, capacity constraints in the survey instruments and the relatively high costs of personal
interviews have hindered a more widespread use of this method. The two cited studies were
able to use moderately-sized subsamples from the much larger German Socio-Economic Panel
(N=429 and N=450, respectively). Third, experimenters have used convenience samples of
Internet respondents recruited by means of newspaper advertising or email-invitations. See,
e.g., Lucking-Reiley (1999) and Guth, Schmidt, and Sutter (2007). While this approach fa-
cilitates conducting experiments with very large numbers of participants, there is essentially
no control over the recruitment process in most cases. Experimenter-induced selection may
2
arise from reading a particular newspaper, the necessity of having access to the Internet, or
subscribing to a specific electronic mailing list.
We combine the advantages of the last two approaches in conducting an experiment
with a large sample (2,299 subjects) of respondents from a Dutch household survey, the
CentERpanel. This is carried out over the Internet and avoids non-coverage of those without
Internet access by providing them with a set-top box for their TV. In order to investigate
the traditional experimenter-induced subject pool bias, we compare the Internet outcomes to
those of parallel laboratory experiments with 178 students. However, replacing the laboratory
by the Internet also changes the environment, unlike the case of comparisons based on a “mo-
bile laboratory” approach (Andersen et al. 2005). Potential differences due to demographic
variation can therefore be confounded with implementation mode effects. We address this
issue from two angles. First, we introduce a treatment in the laboratory that replicates the
Internet setting as closely as possible. In particular, no experimenter is present while subjects
complete the experiment and there is no restriction to wait for the last person to finish before
leaving the room. Second, our Internet sample is sufficiently large to analyse a subsample
that resembles the student population in terms of age and education. If environmental factors
play a role, they should be revealed when comparing results from this subsample to results
for the laboratory experiment.
Section 2 contains an extensive description of our data and the experimental setup. The
issue of experimenter-induced selection effects is taken up in section 3. We find that when
moving from student samples to the general population, the most dramatic difference is a
drastic rise in the number of violations of the most basic economic principles, namely choosing
dominated options and non-monotonic behaviour. Risk aversion also turns out to be higher
in the overall population. We cannot detect any differences arising from the environmental
treatment for the young and educated, which leads us to conclude that the differences are
driven by subject pool effects.
Selection effects stemming from voluntary participation have not received much attention
until recently, see Harrison, Lau, and Rutstrom (2007) for a primer. The main reason for this
is probably that there is little control over the recruitment process in most cases. Experi-
menters typically collect some demographic information about participating subjects, but the
corresponding values of nonparticipants are unobserved. A crucial feature of our setup is that
we have access to rich background information for participants as well as nonparticipants.
This allows us to estimate a model of selection into the experiment (Section 4). We find
higher participation rates if incentives are provided. Participation is also larger for the more
educated, the non-elderly, males, and for those household members with most interest and ex-
pertise in financial matters. This induces some association of participation with inconsistent
3
behaviour – participants typically have observed characteristics that make them less prone
to making mistakes. But estimates of average risk preferences do not seem to be seriously
affected by selective participation.
2 Data and Experimental Setup
This section provides a detailed description of our experimental design and subject pools. The
starting point of the experiments is the multiple price list format, a well-established method-
ology for preference elicitation, which we modify in two ways. First, to help respondents
understand their tasks, we include pie-chart representations of the probabilities in addition
to the representations in numbers. Second, the experiment is designed to elicit not only risk
aversion, but also two additional preference parameters - reflecting loss aversion (Kahneman
and Tversky 1979) and uncertainty resolution preferences (Kreps and Porteus 1978). In this
paper, however, we focus on risk aversion.
We first describe the multiple price list format and how we implement it. We then point
out the aspects of the experiment that are specific to the Internet and laboratory settings,
respectively. In particular, we highlight the features of our design aimed at disentangling
subject pool (”selection”) effects and implementation method (”mode”) effects. The most
important of these is the introduction of two environmental treatments in the laboratory. One
replicates traditional experiments, the other mimics the Internet setting as much as possible.
We term them “Lab-Lab” and “Lab-Internet” to avoid confusion with the CentERpanel
experiment (also denoted as “Internet experiment”). The full experimental instructions,
samples of choice problems, help screens, final questions, and the debriefing part are available
at http://www.econ.ku.dk/wengstrom/.
2.1 The Multiple Price List Format
The experiments were conducted using an adapted version of the multiple price list format,
introduced in economics by Binswanger (1980) and recently employed in the context of risk
preferences by Holt and Laury (2002). An extensive description can be found in Andersen
et al. (2006). In principle, multiple price lists work as follows: Each subject is presented a
series of lotteries with identical payoffs but varying probabilities such as the one presented
in Figure 1. In each of the four cases, the participant can choose between Option ‘A’ and
Option ‘B’. The table is designed such that the expected payoff of lottery ‘A’ starts out higher
but moves up slower than the expected payoff of lottery ‘B’. A participant with monotone
preferences switches at some point from the safer Option ‘A’ to the riskier Option ‘B’, or
4
chooses ‘B’ throughout. This is because the last row amounts to a choice between two certain
payoffs, with that of Option ‘B’ higher than that of Option ‘A’.
A modification compared to previous studies is that we include pie-charts describing the
probabilities of the outcomes, in addition to the verbal descriptions of the decision tasks.
Pilot experiments showed that this was appreciated by subjects who were not familiar with
probability judgements. Using a design with low cognitive demands seems important when
moving outside the traditional laboratory environment. Because of the extra screen space
needed for the graphical probability representations, we reduced the number of decision tasks
per screen from the usual ten to four, avoiding that respondents need to scroll. To obtain
precise responses despite this, subjects who are consistent in the sense that they do not switch
back and forth between ‘A’ and ‘B’ and choose the higher certain payoff in the final question
are routed to a second screen containing lotteries with the same payoffs but a refined probabil-
ity grid. The probability grid of the second screen involves 10%-steps located approximately
between their highest choice of ‘A’ and their lowest choice of ‘B’ on the previous screen. This
is a version of the iterative multiple price list format described in Andersen et al. (2006).
Each subject faced seven payoff configurations, described in Table 1. For each configu-
ration subjects make either four or eight decisions, depending on their answers on the first
screen. Some of the riskier option ‘B’ lotteries involved negative payoffs, while payoffs from
the safer option ‘A’ were all strictly positive. The actual payments were always made three
months after the experiment and subjects were informed about this in the introduction. At
the top of each screen, we indicated whether the outcome of the lottery was revealed imme-
diately or in three months’ time (see Figure 1).
Subjects were randomly assigned to one of three groups with different payoff treatments:
groups with hypothetical and real lotteries with the amounts shown in Table 1, and one group
with real payoffs but amounts divided by three. We refer to these as hypothetical, high, and
low incentive treatments. All subjects in the high and low incentive groups received an upfront
payment of 15 or 5 Euros, respectively. No payment at all was made to the hypothetical
group of the CentERpanel experiment. The laboratory subjects in the hypothetical group
received a participation fee of 5 Euros for recruitment reasons. In the incentives treatments,
everyone received the participation fee, but only one in ten subjects got paid in addition for
one of the lotteries. The lottery to be paid out was selected at random to ensure incentive
compatibility. In order to avoid negative payoffs, the highest possible loss did not exceed the
fixed participation fee. We randomised the order in which the seven payoff configurations
were presented. In an effort to remain close to earlier work, the first payoff configuration in
the low incentive treatment is a scaled version of the payoff configuration in Table 1 of Holt
and Laury (2002), multiplied by six and rounded to the next lowest integer. The other payoff
5
configurations are derived in a similar way from those used by Holt and Laury.
2.2 The CentERpanel Experiment
The subjects in the Internet experiment are respondents in the CentERpanel,1 aged 16 and
older. The CentERpanel is managed by CentERdata, a survey research institute affiliated
with Tilburg University. The panel contains more than 2,000 households and covers the
complete Dutch population, excluding the institutionalised. Questionnaires and experiments
are fielded over the Internet. To avoid selection bias, households without Internet access
are provided with a set-top box for their TV (and with a TV if they do not have that
either). Panel members get questions every weekend. They are reimbursed for their costs of
participation (fees of dial-up Internet connections etc.) on a regular basis. We conducted our
experiments in November and December of 2005. Our payments were included in one of the
regular transactions three months after the experiments.
The welcome screen contained a brief introduction to the experiment followed by a non-
participation option. See Figure 2 for the introductory screens of all treatments. For the
treatments with real incentives, subjects were told the amount of the participation fee and
that they had the chance to win substantially more or lose (part of) this money again. It
was made clear that no payment would be made upon nonparticipation. In the hypothetical
treatment, subjects were informed that the questionnaire consisted of choices under uncer-
tainty in a hypothetical setting. In all treatments, subjects then had to indicate whether they
wanted to participate or not. Respondents who opted for participation first went through two
pages of online instructions before facing the seven price list configurations. The instructions
and specially designed help screens could be accessed throughout the experiment. They were
included to improve comparability with similar laboratory experiments, compensating for the
absence of an experimenter.
In total, 2,299 persons logged into the system. About 12.7% opted for nonparticipation,
leaving 2,008 respondents who started the experiment. 80 subjects dropped out before com-
pleting the questionnaire. Moreover, 138 respondents went through the experiment extremely
rapidly. Those who took less than 5:20 minutes are treated as dropouts in the analysis be-
low (see Section 4 for more details about the choice of cut-off point). Our final sample thus
consists of 1,790 subjects who made 91,808 choices.
The first three columns of Table 2 list descriptive statistics for the participants who1For related papers using data collected through the CentERpanel see Donkers, Melenberg, and van Soest
(2001) who analysed risk preferences using hypothetical questions, and Bellemare and Kroger (2007) for ev-
idence from a trust game with real payoffs. More information about the CentERpanel can be found at
http://www.uvt.nl/centerdata/.
6
completed the experiment (”final sample”), those who opted for nonparticipation, and those
who dropped out in the course of the experiment or sped through it. As expected, the
three groups differ in many respects. The variables in Table 2 can be broadly classified into
six groups: Incentive treatment; education; sex and age; employment status and residential
area; financial literacy and experience; income. Some of the questions, particularly those
on assets and financial literacy and experience, are drawn from the DNB household survey
(DHS), a survey focusing on financial issues held among CentERpanel respondents once a
year. Not everybody in our sample took part in the DHS and sample sizes fall if we include
the corresponding variables in the analysis.
2.3 The Laboratory Experiment
In order to compare the answers in the Internet survey to those in the environment of a
controlled laboratory experiment, we performed the same experiment in the economics labo-
ratory at Tilburg University. In total, 178 students participated (8 sessions in September 2005
and 8 sessions in May 2006). The same treatments were carried out as in the Internet survey.
The only difference was the above-mentioned payment of a show-up fee in the hypothetical
treatment. The payment procedure for the incentives treatments was as in the CentERpanel
experiment: The participation fee was transferred to participants’ bank accounts three months
after the experiment. One in ten subjects received the sum of the participation fee and the
(possibly negative) payment from one (randomly drawn) lottery.
To distinguish effects due to different subject pools from effects stemming from replac-
ing the controlled laboratory setting by the Internet environment, we also replicated this
latter change as good as possible in the lab. The first environmental treatment, labelled
the “Lab-Lab” treatment, was designed to replicate the traditional setup used in laboratory
experiments. In particular, an experimenter was present in the room to help the subjects
and answer questions. In contrast to the CentERpanel experiment, no links to the instruc-
tions and to the help screens were shown in the core part of the experiment. Otherwise, the
screens resembled the one shown in Figure 1. Participants also had to wait until everyone
else in the session had finished before they could leave the room. In the second environmental
treatment – termed the “Lab-Internet” treatment – the experimenter was not present. In-
stead subjects had access to the same help screens (including the introductory screens) as in
the CentERpanel experiment. Moreover, subjects could leave directly after completing the
experiment without having to wait for everyone else.
The last column of Table 2 contains the available demographic characteristics of the
laboratory subjects. Much less information is available than for the CentERpanel experiment,
and there is less variation among the basic demographic characteristics. Specifically, in terms
7
of age and education, the laboratory population represents less than five percent of the sample
in the first three columns.
3 Traditional Subject Pool Bias in the Laboratory
This section addresses “selection” or “subject pool bias”: the concern that the results of
standard laboratory experiments are not representative for a broader, heterogeneous, popu-
lation, since samples of students do not cover the population at large. Our design enables
analysing subject pool bias in laboratory experiments controlling for implementation mode
effects. With regard to elicitation and modelling of preferences, two issues are of special inter-
est. The first is the structure and frequency of errors and violations of fundamental principles
of choice. Several recent studies have highlighted the importance of accounting for errors in
the decision-making process when modelling risky choice (cf. Hey (1995), Loomes, Moffatt,
and Sugden (2002), and Schmidt and Neugebauer (2007)). The second concerns the extent
to which the distribution of preferences depends on the composition of the subject pool.
3.1 Errors and Inconsistencies
We classify three choice patterns as inconsistent. First, a dominance violation occurs if
somebody chooses option ‘B’ when the probability for the high outcome is zero or option ‘A’
when this probability is one. The second type of inconsistency emerges when subjects switch
back and forth from choosing ‘A’ and ‘B’ on the same screen. The third category of violations
consists of inconsistent choices on the initial and follow-up screens with the same payoffs and
probabilities. As explained above, we use an iterative version of the multiple price list format,
where after four choices on the first screen of each payoff configuration, subjects get a second
screen with the same payoffs but with a finer grid of probabilities. There was some overlap
of probabilities on the two screens, so subjects could make a choice on the second that was
inconsistent with their choice on the first screen.
We consider the average number of violations as a summary statistic for error frequencies.
Only one violation per subject is counted for each payoff configuration, limiting the maximum
amount of mistakes to seven. In Figure 3, the average number of inconsistent choices is
presented by sample. The error frequency is much higher in the Internet experiment than
in the Lab experiment: 2.43 (first bar) versus 1.21 (second bar). The null hypothesis that
underlying population frequencies of the number of inconsistencies are the same is rejected
using a Mann-Whitney (MW) or Kolmogorov-Smirnov (KS) nonparametric tests (two-sided
p-values< 0.01).
8
On the other hand, it is evident from Figure 3 that average error frequencies are very
similar in the two Laboratory treatments: 1.29 in the “Lab-Internet” treatment (fourth bar)
and 1.14 in the “Lab-Lab” treatment (fifth bar). We cannot reject the null hypothesis of
identical underlying distributions using the MW or KS test (two-sided p-values> 0.3). This
suggests that the disparity between the lab and the Internet does not stem from the different
environments under which the experiments were conducted. The higher frequency of viola-
tions observed in the Internet treatment is not driven by the presence of an experimenter or
the ability to leave immediately after finishing the experiment.
While this already suggests that differences between Internet and Lab inconsistencies are
driven by subject pool bias rather than implementation mode effects, one may still doubt
whether the CentERpanel setting has been perfectly replicated in the “Lab-Internet” treat-
ment. Hence, we also compare the laboratory sample with a sub-sample of the Internet sample
that has similar characteristics as the laboratory sample, thus at least partially controlling for
subject pool selection. We label this sub-sample “Internet-Uni”. Its 96 observations comprise
respondents between 18 and 34 years of age who hold a university degree or study at a univer-
sity. Behaviour of the Internet participants in this sub-sample resembles that of the student
sample: The average number of violations in the “Internet-Uni” sub-sample is 1.28 (third
bar in Figure 3), close to laboratory means of 1.29 and 1.14 (the differences are insignificant
according to KS and MW tests). This confirms that the higher frequency of errors in the
Internet treatment is driven by the different composition of the subject pool.
Table 3 displays the frequencies of the different types of errors as a percentage of the
number of possible violations. The pattern found in the laboratory is similar to results
reported by Loomes, Moffatt, and Sugden (2002) on a different risky choice design: very
few dominance violations but many more inconsistencies when faced twice with the same
decision problem. Inconsistencies between screens constitute more than 70% of all consistency
violations in both the Lab-Lab and the Lab-Internet treatment. Our results indicate that this
changes dramatically when the general population is considered, where dominance violations
play a much larger role (more than 38% of all consistency violations), though the numbers
of within and between screens inconsistencies are also larger than in the lab. As above,
the figures suggest that the difference between the laboratory and the internet is mainly
driven by subject pool effects. The error frequencies of the young and well educated in
the “Internet-Uni” group resemble those of the laboratory samples. The only discrepancy
concerns dominance violations, which appear to be slightly more common in the “Internet-
Uni” sample than in the laboratory samples. However, using individual-level frequencies of
dominance violations, we cannot reject the null hypothesis of identical underlying distributions
(MW two-sided p-values> 0.05; KS two-sided p-values> 0.6). Taken together, the frequencies
in Table 3 confirm that findings for student samples in the lab cannot always be generalised
9
to the general population.
Finally, we checked whether providing monetary incentives makes subjects take more care
in answering the questions and make fewer errors. We find no evidence of this – differences
between incentive treatments are not significant according to the MW and KS tests.
3.2 Preferences
We first consider the fraction of subjects choosing the safe option for a given probability of
the high outcome. In order to get comparable data across probabilities we restrict attention
to choices on the first screen. Subjects in the low incentive treatment are excluded, since they
faced a different payoff scale and cannot be directly compared with the other treatments. We
aggregate the data over the seven decision problems (looking at subgroups does not lead to
additional insights). Comparing the answers between the laboratory and the Internet, it is
evident that there are considerable differences – see Figure 4. Except for the case of a 0.25
probability of the high outcome, the fractions of risky choices are higher in the laboratory than
in the Internet experiment. The figure also suggests that the decisions in the lab are more
sensitive to the probability of the high outcome than in the Internet experiment. There are
two explanations for the smaller slope in the CentERpanel experiment: More heterogeneity
in preferences or higher error rates. When deleting payoff configurations with dominance
violations the pattern persists, suggesting that more dominance violations are not the only
explanation. Still, disentangling the two explanations calls for a structural model of choice
at the level of the individual, which is beyond the scope of this paper.
We can check whether the differences are due to subject pool or implementation mode
effects. First, Figure 5 shows that there are hardly any differences between the choice fre-
quencies in the “Lab-Lab” and “Lab-Internet” treatments. Second, Figure 6 shows that the
pattern in the “Internet-Uni” sub-sample of CentERpanel is similar to the pattern in the
laboratory experiment. Both figures indicate that the main driving force for the difference
between lab and Internet is the subject pool composition rather than the environment in
which the experiments were conducted.
To obtain a simple measure of individual preferences we consider at which probabilities
subjects switched from (the safer) option ‘A’ to (the riskier) option ‘B’ in each payoff config-
uration. Similar measures have been used in earlier studies, cf., e.g., Holt and Laury (2002).
We can only compute bounds that will at best be a 5%-interval (e.g. between 75% as the
highest ‘A’-choice on the first screen and 80% as the lowest choice of ‘B’ on the second screen).
In many cases, the bounds are substantially wider because of the inconsistencies discussed in
section 3.1. We computed the bounds as follows: the lowest possible switch point is defined
10
as the highest probability corresponding to an ‘A’ choice that is still lower than the minimum
probability with a ‘B’ choice; the upper bound is the minimum probability with a ‘B’ choice
that is still higher than the maximum probability where option ‘A’ was chosen. If only choice
‘A’ (‘B’) was observed, both upper and lower bound were set to one (zero).
For each individual, we averaged the upper bounds and the lower bounds across the seven
payoff configurations. This leaves us with two preference measures per individual – the higher
the measure, the more averse the subject is to more risky choices. To save space, we just
report results using the midpoint of the two bounds. All results remain qualitatively the same
if we use the upper or lower bounds or if we discard all payoff configurations with inconsistent
choices.
The average switch point of 70.4 in the Internet experiment is considerably higher than the
corresponding figure of 61.5 for the laboratory experiment. Moreover, the difference between
the two samples is found for all seven decision problems. Using the MW and KS tests we find
that the differences between the laboratory and Internet samples are highly significant, both
comparing averages across all questions or looking at each question separately (two-sided
p-values < 0.01). This is in contrast with Andersen et al. (2005), who find no significant
difference between average risk aversion in a laboratory and a field experiment.
To disentangle subject pool bias and implementation mode effects as explanations for the
observed differences in risk preferences, we also compared the average switch points in the
“Lab-Lab” treatment and the “Lab-Internet” treatment. The MW and KS tests (using mean
switch points across all payoff configurations or each payoff configuration separately), do not
reject the null hypothesis that there is no difference (two-sided p-values > 0.10). The switch
points are actually slightly higher in the “Lab-Lab” treatment, suggesting that the observed
difference between the laboratory and the Internet experiments is not due to characteristics
of the laboratory setting – in that case we would expect the difference to go in the other
direction.
A similar picture emerges when we compare the average switch points in the “Internet-
Uni” subsample to those of the student sample in the lab. The average switch point of 65.1 of
the “Internet-Uni” sample is much closer to the laboratory mean of 61.5 than the overall In-
ternet mean, and we cannot reject the null hypothesis of equality (two-sided p-values > 0.15).
For each payoff configuration separately, MW and KS tests show no significant differences
except for payoff configurations 3 and 5. Controlling for the composition of subject pools
hence eliminates most of the differences between the Internet and laboratory findings. The
disparity found between both errors and preferences in the Internet sample and the laboratory
experiments is mainly driven by the fact that the behaviour of the student population differs
from the behaviour of the general population.
11
Providing monetary incentives or not does not seem to affect the behaviour in any sys-
tematic way: the choices in the hypothetical and high incentive treatments are very similar.
This is confirmed by the MW and KS tests on differences in average switch points.
4 Self-Selection Bias in the CentERpanel Experiment
Conducting the experiment via an existing survey allows us to observe more features of the
recruitment process than usual, since we know many characteristics of all persons eligible
for participation, regardless of whether they actually take part in the experiment or not. In
order to get reliable population-wide estimates, sampling from a representative sub-population
suffices only if non-response is perfectly random. Since people self-select into the experiment
this condition may not hold. Indeed, the descriptive statistics in Table 2 suggest that there
are some important differences between the three groups of Internet-participants: those who
completed the experiment without rushing through it, those who chose not to participate,
and those who started but rushed through or did not complete. In addition to this, Harrison,
Lau, and Rutstrom (2007) have shown that self-selection effects may be important for the
estimation of risk preferences. We look at the same issue, but we can exploit much more
information about nonparticipants.
We first analyse the determinants of self-selection and then investigate their impact on
observed choices. In order to structure the analysis, it is useful to divide the sampling process
in the Internet experiment into three stages:
1. Dutch individuals are contacted at random and participate in the CentERpanel or not.
2. A random subsample of CentERpanel respondents is asked to take part in our experi-
ment. After learning about the nature of the experiment, they decide to participate or
decline participation.
3. Some of the subjects drop out during the experiment or click through it extremely
rapidly;
Steps 2 and 3 are especially interesting for experimental economics because they replicate the
recruitment process for laboratory experiments to some extent.
To see how Step 2 relates to typical recruitment procedures, note that some information
on payoffs and the type of experiment is usually conveyed before subjects come to the lab.
This is the typical form of communication in recruitment emails or on posters announcing the
experiments. Such information is provided on our welcome screen (Figure 2). Subjects learn
12
about the nature of the experiment and the possible payoffs, and then choose to participate
or not.
Step 3 seems typical for the Internet environment – it usually plays no role in the labo-
ratory. One may argue, however, that the participation decisions for laboratory experiments
combine features of steps 2 and 3. Part of the nonparticipation in laboratory experiments
may be similar to dropping out of the CentERpanel experiment, because of the negligible
fixed costs of participation in the latter. Showing up at the laboratory at a specific time
and date entails a significant cost – and subjects can be expected to have made the trade-off
between the costs and benefits of participation beforehand. This is probably not the case
in the Internet setting, where the experiment can be accessed within seconds of notification.
Hence the cost-benefit analysis may well be postponed and carried out during the experiment.
This may explain why subjects hardly ever leave the economics laboratory prematurely (and
nobody left our laboratory sessions), whereas 4% of subjects did not finish the CentERpanel
experiment. Similarly, rushing through the experiment can be considered as a form of nonpar-
ticipation, since there is a lower bound on the time needed to digest the instructions and to
give serious answers. This minimum time certainly seems higher than the 1:43 minutes which
is the minimum time observed in the Internet experiment. We checked several cut-off points
between 3 and 7 minutes and finally chose the minimum duration observed in the laboratory
(5:20 minutes). Results were robust to the precise value chosen. With this threshold, about
seven percent of the Internet subjects fall into the category of “speeders.”2
An alternative explanation why Step 3 features prominently in the Internet experiment and
not in laboratory experiments is the interaction with the experimenter and the typical rules
in the laboratory. One difference is the possibility to ask questions. Internet participant who
do not understand a task and cannot ask questions may more easily opt for randomly ticking
options or drop out entirely. Another difference is that in typical laboratory experiments
everybody is expected to stay until the last subject has finished, so that there is no point
in rapid completion. We can analyse the consequences of these differences by comparing
the “Lab-Lab” and “Lab-Internet” treatments (see Section 2.3). The distributions of the
completion times look rather similar, with mean durations of about 12.5 minutes in both
cases. Surprisingly, in the traditional “Lab-Lab” treatment the dispersion is higher and the
left tail of the distribution has more mass. If rapid completion were due to the two factors
mentioned above, we would expect the opposite. Completion times in the “Internet-Uni”-2The combined response rate for steps 2 and 3 in our Internet experiment is 78%. This seems to compare
favourably to Harrison, Lau, and Rutstrom (2007), who employed more standard recruitment procedures in
mailing out a letter to a random subsample of the Danish population and achieved a response rate of 38% (253
of 664 subjects), but it should be noted that our response rate is within a preselected sample that has shown
a general inclination to fill out survey questionnaires.
13
subgroup of the Internet subjects are lower than those of the laboratory subjects. This
is consistent with our preferred interpretation of step 3 of the selection process, since the
“Internet-Uni” group will contain more respondents who rush through the experiment.
4.1 The Determinants of Self-Selection
To analyse the factors that drive participation in the Internet experiment, we estimated a
multinomial logit model with three possible outcomes: non-participation, rushing through or
dropping out, and regular (“full”) participation. Results are presented in Table 5, with full
participation as the baseline category. Columns 1 and 2 contain the coefficients and standard
errors of our basic specification, for nonparticipation and dropouts/speeders, respectively.
Only basic covariates that are available for almost everybody are included – dummies for the
incentive treatments, education, gender, age, occupational status, and residential area. In
the extended specification (columns 3 and 4) the number of observations is lower since we
use covariates from the DNB Household Survey questionnaires which were not filled out by
all subjects. This specification adds household income (in four categories) and variables mea-
suring financial expertise and preferences: Whether the respondent manages the household’s
finances, whether the employer offers a save-as-you-earn deduction arrangement,3 whether
the respondent holds such a plan, whether the household’s financial portfolio includes savings
accounts, and whether it includes risky assets like mutual funds or stocks.4
For the variables included in both specifications, results for the two specifications are
very similar. Nonparticipation is significantly less likely in the incentive treatments than in
the hypothetical treatment (the benchmark). Translated to marginal effects, the coefficients
indicate response rates that are almost ten percentage points higher than in the hypothetical
treatment (all marginal effects are evaluated at a baseline with all dummy variables set to
zero). The point estimates on not finishing the experiment are much smaller in magnitude
and not significant. While incentives increase participation, they do not seem to attract a
different subject pool – we estimated models that included a wide variety of interaction effects
and none of them was significant. The coefficients on the high and low incentive treatments
are not significantly different from each other, so with the relatively small stakes we consider
here, the size of the incentives does not seem to have an impact.3This is an employer provided savings plan that is heavily subsidised by the state through tax deductions,
see Alessie, Hochgurtel, and van Soest (2006).These subsidies make net returns much higher than on any other
safe asset. While it is easy to sign up for these plans and the employer does most of the paperwork, the default
is not to participate. This may explain why employees with little financial knowledge or interest often do not
sign up, cf., e.g., the work of Madrian and Shea (2001) on non-take up of 401(k) plans.4Other specifications of the selection model did not yield additional insights; they also included subjective
income measures, wealth, and interactions between the covariates.
14
Persons in the top two education categories are both significantly more likely to participate
in the experiment and to finish it. Women’s nonparticipation rates are four to five percentage
points higher than those of men. Women also are slightly more prone to quit during the
experiment or to finish it very rapidly. Age effects start to matter at age 45, beyond which
participation rates decline. Those beyond 65 years of age are only half as likely to start
the experiment as those younger than 35. At the same time, however, non-completion rates
decrease significantly with age. This is mainly due to the fact that older participants are
less likely to rush through the experiment.5 Part of this result may be due to the fact that
the elderly are slower in working with computers, but the effects are already visible for ages
around forty. The combined effects of age on full participation are small and insignificant in
almost all cases.
Working respondents have higher participation rates than non-workers according to the
parsimonious specification, but the effect becomes insignificant in the richer specification.
Labour market status does not affect quitting or speeding. Living in an urban area has no
significant impact at all.
Point estimates of the effects of income and wealth variables on participation are generally
small and insignificant, and a joint test confirms that they play no role. The other financial
variables are proxies for preferences and financial knowledge. Being the financial administrator
of the household for example may reflect a preference for spending one’s time with risky
choice problems. It significantly increases the propensity to participate and also to finish
the experiment in more than 5:20 minutes. Whether the employer offers a savings plan is
just a control variable necessary to avoid confounding the effects of holding such a plan
with employment type. The variable of interest, taking part in a save-as-you-earn savings
arrangement, is associated with a (significant) six percentage points higher propensity to
begin the experiment. This supports the interpretation that the financial variables reflect an
interest in contemplating financial questions. This interpretation is strengthened by the other
two portfolio variables – on the one hand, having an ordinary savings account does not have
any predictive power for taking part in the experiment. Saving accounts are commonly known
and do not require much expertise or effort. On the other hand, the ownership of mutual
funds, stocks, etc. is significantly associated with higher participation in the experiment.
These are much more sophisticated products and investing in them requires more financial
knowledge. The results thus point towards a type of preference-based selection process in
which interested and knowledgeable individuals have a higher probability of participating.5We estimated models that treated the two components of step 3 (speeding through and dropping out) as
separate outcomes. This is the only case where the distinction mattered, so we report only the results from
the more parsimonious specification.
15
4.2 The Impact of Self-Selection on Outcomes
To test whether selection effects matter for the outcomes considered in Section 3, we compare
the observed sample distribution of full participants with a weighted distribution that corrects
for the various steps of the selection process.
For Step 1, CentERdata provides standard survey weights based upon comparing with a
much larger household survey drawn by Statistics Netherlands. We will assume that selection
into the CentERpanel is independent of the variables of interest, conditional on the basic
background variables used to construct these weights (age, sex, education, home ownership,
region). This is a missing at random (MAR) assumption, see e.g. Little and Rubin (2002)).
It implies that the weights can be used to correct for the selection in Step 1.
We make similar MAR assumptions for the other two steps, but then conditioning on the
much larger set of background variables used in the previous subsection. We construct weights
from a probit model that jointly explains the selection in Steps 2 and 3; each weight is the
inverse of the predicted probability of being in the final sample. We multiply these weights
with the weights for Step 1 to get weights that correct for all steps of the selection process.
Due to sample size considerations we opt for the parsimonious specification in the probit
regression. We then test whether weighted sample statistics on the outcomes are significantly
different from unweighted statistics. This can be seen as a test whether the selection process
is selective, under the maintained assumption that selection in each step is MAR given the
covariates used to construct the weights.
In Table 4 the average number of inconsistencies and average mean switch points for
the weighted and unweighted data are presented together with p-values of t-tests comparing
mean values.6 Taking the full selection process (steps 1, 2 and 3) into account the estimated
population average number of inconsistencies is 2.60, compared to 2.43 for the unweighted
sample. The difference is statistically significant (two-sided p-value=0.0001). This result
is consistent with the findings in Section 4.1. The only variable for which we had clear
priors was the education variable. We expect better educated people to make fewer errors.
Indeed, there are fewer inconsistencies in the raw estimates, where the better educated are
overrepresented. Controlling only for the selection at steps 2 and 3, the average number
of inconsistencies is 2.55, which is significantly different from the unweighted number 2.43
(two-sided p-value:=0.001) but not from the 2.60, which also corrects for Step 1 (two-sided6The test works as follows: Let y denote our variable of interest (i.e. the average number of violations
or mean switch point) and w(x) the weight. Our null hypothesis of no difference between the weighted and
unweighted observations can then be stated as E{w(x)y} = E{y} or E{z} = 0, with z = (w(x) − 1)y. Since
we have a large sample size and few explanatory variables, we neglect the estimation error in w. The null
hypothesis can then be tested with a standard t-test on whether the mean of z is zero or not.
16
p-value=0.1). Hence, the selection bias reported for the full process originates mostly from
steps 2 and 3, while step 1 has less of an impact.
For the estimates of risk preferences, we find a small underestimation of the mean switch
points when using the unweighted sample compared to both weighted samples, but the dif-
ference is insignificant (irrespective of which switch points we use). Thus, although the
participation decision is correlated with demographics, selection based on observables does
not fundamentally alter the results on this simple summary measure of risk preferences.
Harrison, Lau, and Rutstrom (2007) find that not controlling for selection effects leads to
an underestimation of average risk aversion in the population. They attribute this finding to
the use of fixed show-up fees attracting less risk averse participants. To investigate this in
our sample, we tested for selection effects for each incentive treatment group separately. In
contrast to Harrison, Lau, and Rutstrom (2007) we do not observe any selection effects on
the aggregated measures of preferences when only considering the treatments with fixed show
up fees. Combining this with the similarities we found in behaviour of the high incentive and
hypothetical groups (Section 3.2), we conclude that our data do not support the explanation
of selection effects suggested by Harrison, Lau, and Rutstrom (2007).
5 Conclusions
We have analysed different aspects of the representativeness of preference elicitation experi-
ments. First, we disentangled environmental effects (laboratory vs. Internet) from traditional
subject pool bias. On the one hand, we showed that the implementation mode does not mat-
ter for either error frequencies or preference estimates among the young and educated. On the
other hand, we found dramatic differences in the number of dominance and monotonicity vi-
olations when moving from a student sample to a sample drawn from the general population.
Students also exhibited a higher degree of risk tolerance than the general population.
Since this implies that selection of subjects is important, we then looked at selection effects
that may arise from voluntary participation in the Internet experiment. We first considered
the relation between self-selection and observed characteristics and showed that higher ed-
ucation, being male, as well as interest and expertise in financial matters are predictors of
participating in the experiment. Consistent with the education and financial literacy vari-
ables, we showed that this type of selection leads to a sample with less prevalence of violations
of monotonicity and dominance. On the other hand, we found no evidence of selection bias
on risk preferences for the Internet experiment.
Our main messages are the following. First, in line with existing evidence, we find large
17
differences between experimental choices of students and the general population, reflecting
not only different tendencies to make errors but also differences in risk preferences. This
confirms that findings based upon laboratory experiments with student samples cannot be
simply extrapolated to the general population. Second, we conclude that using a represen-
tative Internet survey offers a feasible solution to this problem. We find hardly any effects
of implementation mode (help screens replacing an experimenter, etc.), with similar choice
distributions of students in the traditional lab setting, a lab setting mimicking the Internet
experiment, and the Internet experiment. We also find that the selection processes leading to
participation in the Internet experiment leads to a subject pool making fewer errors than the
general population, but with the same distribution of risk preferences, irrespective of whether
or not we provide real monetary incentives. Thus the Internet experiment seems an appro-
priate way to estimate risk preferences for the general population, which is substantially less
inclined to make risky choices than the subpopulation of students. The conclusion that im-
plementation mode does not matter may of course be specific to the nature of the experiment
– whether it remains the same in more complicated experiments or experiments measuring
different sorts of preferences remains to be seen.
A Tables
Table 1: Characteristics of the Seven Payoff Configurations
Payoff Uncertainty Payoff Payoff Uncertainty Payoff Payoff
Configuration Resolution Low, A High, A Resolution Low, B High, B
1 early 27 33 early 0 69
2 early 39 48 early 9 87
3 early 12 15 early -15 48
4 early 33 36 late 6 69
5 early 18 21 late -9 54
6 early 24 27 early -3 60
7 late 15 18 late -12 51
Note: These values were shown in the high incentive and hypothetical treatments. For the low incentive
treatment they were divided by three.
18
Table 2: Selected Characteristics of Participants
CentERpanel Laboratory
Final Non- Dropouts Final
Variable Sample Participants Speeders Sample
Hypothetical treatment 0.31 0.50 0.37 0.37
Low incentive treatment 0.37 0.23 0.35 0.27
High incentive treatment 0.32 0.27 0.28 0.37
Primary / lower sec. education 0.31 0.44 0.34 .
Higher sec. / interm. voc. train. 0.33 0.30 0.41 .
Higher vocational training 0.25 0.20 0.16 .
University degree / univ. student 0.12 0.06 0.09 1.00
Female 0.45 0.54 0.56 0.46
Age 16-34 years 0.24 0.14 0.46 1.00
Age 35-44 years 0.19 0.13 0.21 .
Age 45-54 years 0.23 0.22 0.14 .
Age 55-64 years 0.18 0.18 0.09 .
Age 65+ years 0.16 0.33 0.10 .
Working 0.56 0.36 0.55 .
Unemployed, Looking for Job 0.02 0.03 0.03 .
Student, Pensioner, Housework 0.42 0.62 0.42 1.00
Lives in Urban Area 0.60 0.63 0.58 .
HH financial administrator 0.66 0.56 0.48 .
Employer offers Savings Plan 0.44 0.25 0.32 .
Has Sav. Plan via Employer 0.36 0.17 0.26 .
Has Sav. Acc. or similar 0.87 0.85 0.90 .
Holds Stocks, or similar 0.31 0.25 0.29 .
HH income below 22k Euros 0.34 0.35 0.33 .
HH income ∈ [22k, 40k Euros) 0.49 0.51 0.49 .
HH income at least 40k Euros 0.17 0.14 0.18 .
Max. Number of Observations 1790 291 218 178
Note: The numbers shown indicate fractions in the final sample. Some households did not
complete the questionnaires of the DHS from which some of the variables are drawn. Hence
the number of observations is lower for some of the variables in question. This is particularly
true for the last two sections of the table.
19
Table 3: Frequency of inconsistencies by type of error and subsample
Dominance Within Between
Internet 11.0% 3.9% 21.1%
Laboratory 1.5% 1.8% 14.4%
Internet - Uni 3.9% 1.5% 13.1%
Lab - Internet 1.4% 1.8% 15.1%
Lab - Lab 1.5% 1.7% 13.7%
Note: The figures represent frequencies of the different
types of errors as a percentage of the number of possible
violations. The fractions of violations for the dominance
category were obtained by dividing the total number of
dominance violations in each category by the total num-
ber of screens shown to subjects on which dominance
violations could be made. The numbers for the within
category are calculated as the number of within viola-
tions, divided by the total number of screens shown to
subjects in each group. The figures of the last column
were obtained by dividing the number of between errors
by the number of times the second screen was displayed
to subjects.
Table 4: Weighted data
Weight Average # p-value p-value Meanswitch p-value p-value
inconsistencies steps 2,3 no weight steps 2,3 no weight
Steps 1,2,3 2.6 0.1 0.0001 71.55 0.2 0.1
(0.05) (0.50)
Steps 2,3 2.55 . 0.001 71.14 . 0.1
(0.05) (0.49)
None 2.43 . . 70.36 . .
(0.05) (0.53)
Note: Variables in category Steps 1,2,3 use weights for step 1,2 and 3; variables in category Steps
2,3 use weights for step 2 and 3. Average mean switch points are calculated using data from
the hypothetical and high incentive treatments only. Standard errors are given in parenthesis.
P-values comes from t-tests, described in footnote 6 in Section 4.2, with the null hypotheses of
equal means.
20
Table 5: Self-Selection into the CentERpanel Experiment
NP DO/SP NP DO/SPSpecification (1) (2) (3) (4)
Low incentive treatment -1.047∗∗∗ -.220 -1.053∗∗∗ -.247(.163) (.174) (.183) (.207)
High incentive treatment -.699∗∗∗ -.277 -.881∗∗∗ -.342(.156) (.183) (.188) (.226)
Higher sec. / interm. voc. train. -.285∗ -.120 -.180 -.290(.160) (.177) (.185) (.215)
Higher vocational training -.413∗∗ -.677∗∗∗ -.329 -.801∗∗∗
(.181) (.228) (.217) (.282)
University degree / univ. student -.881∗∗∗ -.447 -.840∗∗ -.622∗
(.281) (.279) (.356) (.356)
Female .383∗∗∗ .275∗ .357∗∗ .253(.137) (.152) (.158) (.184)
Age 35-44 years .336 -.450∗∗ .385 -.471∗
(.242) (.196) (.301) (.253)
Age 45-54 years .579∗∗∗ -1.076∗∗∗ .761∗∗∗ -1.044∗∗∗
(.221) (.222) (.268) (.275)
Age 55-64 years .489∗∗ -1.326∗∗∗ .781∗∗∗ -1.214∗∗∗
(.231) (.262) (.274) (.301)
Age 65+ years 1.120∗∗∗ -1.171∗∗∗ 1.349∗∗∗ -1.297∗∗∗
(.233) (.278) (.284) (.341)
Working -.402∗∗ -.104 -.185 -.054(.175) (.179) (.214) (.229)
Unemployed, Looking for Job .059 .198 -.011 .263(.416) (.439) (.515) (.519)
Lives in Urban Area .165 -.059 .235 -.051(.137) (.151) (.159) (.183)
HH financial administrator -.379∗∗ -.328∗
(.164) (.193)
Employer offers Savings Plan .180 -.369(.281) (.365)
Has Sav. Plan via Employer -.845∗∗∗ -.045(.310) (.379)
Has Sav. Acc. or similar .095 .228(.225) (.292)
Holds Stocks, or similar -.368∗∗ .058(.177) (.204)
HH income ∈ [22k, 40k Euros) .153 .191(.176) (.208)
HH income at least 40k Euros .040 .539∗
(.259) (.280)
Constant -1.693∗∗∗ -1.147∗∗∗ -1.735∗∗∗ -1.244∗∗∗
(.253) (.240) (.371) (.401)
No. of Observations 2296 2296 1802 1802
Note: Coefficient estimates and corresponding standard errors of multinomial logit regression.
Columns indicate categories of the dependent variable by regression type. The reference category
contains those respondents who completed the experiment in more than 5:20 minutes. Columns
(1) and (3) list estimates for opting for nonparticipation on the first screen (NP); columns (2)
and (4) those for dropping out before completion (DO) or finishing the experiment in less than
5:20 minutes (SP). Left-out categories of relevant variables are hypothetical treatment; primary
and lower secondary education; ages 16-34; other type of occupation; and household income less
than 22,000 Euro. Asterisks indicate significance at the 10%, 5%, and 1%-level.
21
B Figures
Figure 1: Screenshot of Payoff Configuration 5, First Screen
Progress: 70% Instructions Help
Please, make a choice between A and B for each of the decision problems below.
Option A -outcome IMMEDIATELY revealed
Option B -outcome revealed in THREE MONTHS
Choice
A B
€ 21 with probability 25%€ 18 with probability 75%
€ 54 with probability 25%€ -9 with probability 75%
€ 21 with probability 50%€ 18 with probability 50%
€ 54 with probability 50%€ -9 with probability 50%
€ 21 with probability 75%€ 18 with probability 25%
€ 54 with probability 75%€ -9 with probability 25%
€ 21 with probability100%€ 18 with probability 0%
€ 54 with probability100%€ -9 with probability 0%
Continue
22
Figure 2: Translations of the Welcome Screens in the CentERpanel Experiment
High (Low) Incentive Treatment
Welcome to this economic experiment carriedout by researchers of Tilburg University. Theexperiment is about making choices under un-certainty. Please read the instructions care-fully in order to understand how the experimentworks.
If you have questions after the beginning of theexperiment, you can return to the instructionsby clicking on a link at the top of the screen.If you have questions on the specific screen, youcan click on ‘Help’ at the top right corner of thescreen.
You will receive 15 (5) Euros for participating.Then you can, depending on the choices youmake and on chance, earn more or lose part ofthe 15 (5) Euros. If completing the total exper-iment, you receive the reward for participating,possibly increased by your gain (or reduced byyour loss) in one of the choices you have made.Whether the latter occurs and which choice thendetermines your payoff, will be determined bychance. Your total reward will be addedto your CentERpoints.
The questions are not designed to test you.Answers are therefore not correct or incorrect;please give the answers that reflect your ownpreferences. Assume in each choice problemthat this choice determines your actual payoff.
This questionnaire is about making choices, andyour payoff depends on your choices and onchance. If you do not want to participate out ofprinciple, you can indicate this below. In thatcase you will not continue with the question-naire.
O Yes, I proceed with the questionnaire
O No, I do not want to complete this question-naire
Hypothetical Treatment
Welcome to this economic experiment carriedout by researchers of Tilburg University. Theexperiment is about making choices under un-certainty. Please read the instructions care-fully in order to understand how the experimentworks.
If you have questions after the beginning of theexperiment, you can return to the instructionsby clicking on a link at the top of the screen.If you have questions on the specific screen, youcan click on ‘Help’ at the top right corner of thescreen.
The questions are not designed to test you.Answers are therefore not correct or incorrect;please give the answers that reflect your ownpreferences.
This questionnaire is about making choices be-tween several situations in which you can (hy-pothetically) gain or lose money. Your revenuedepends on the choices you make and on chance.What matters is what you would do in hy-pothetical situations, in reality, there isnothing at stake for you. If you neverthelessdo not want to participate out of principle, youcan indicate this below. In that case you willnot continue with the questionnaire.
O Yes, I proceed with the questionnaire
O No, I do not want to complete this question-naire
23
Figure 3: Average Number of Answers that Violate Monotonicity or Dominance by Sample.
0.5
11.
52
2.5
3M
ean
num
ber
of in
cons
iste
ncie
s
Internet Laboratory Internet−Uni Lab−Internet Lab−Lab
Sample
Note: Values shown are means over all screens per subject (mini-mum 0, maximum 7). “Internet” consists of unweighted numbers fromCentERpanel respondents. “Lab” are averages for all laboratory subjectsand “Internet-Uni” mean values for those respondents of the CentERpanelthat are less than 35 years of age and hold a university degree or studyto obtain one. “Lab-Lab” and “Lab-Internet” are averages for laboratorysubjects in the“Lab-Lab” and “Lab-Internet” treatments respectively.
Figure 4: Risky Choices, Internet and Lab Subsamples
0.1
.2.3
.4.5
.6.7
.8.9
1
Fra
ctio
n C
hoos
ing
Ris
ky O
ptio
n B
25 50 75 100
Probability of the High Outcome
Internet Lab
Note: Solid lines depict means over all first screen choices. Dashed linesdisplay corresponding confidence intervals. Data from the low incen-tive treatment is excluded. “Internet” consists of the raw numbers fromCentERpanel respondents. “Lab” are averages for laboratory subjects.
24
Figure 5: Risky Choices, Lab-Internet and Lab-Lab Subsamples
0.1
.2.3
.4.5
.6.7
.8.9
1
Fra
ctio
n C
hoos
ing
Ris
ky O
ptio
n B
25 50 75 100
Probability of the High Outcome
Lab−Lab Lab−Internet
Note: Solid lines depict means over all first screen choices. Dashed linesdisplay corresponding confidence intervals. Data from the low incentivetreatment is excluded. “Lab-Lab” are averages for laboratory subjects inthe “Lab-Lab” treatment and “Lab-Internet” mean values for subjects ofthe “Lab-Internet” treatment.
Figure 6: Risky Choices, Internet-Uni and Lab Subsamples
0.1
.2.3
.4.5
.6.7
.8.9
1
Fra
ctio
n C
hoos
ing
Ris
ky O
ptio
n B
25 50 75 100
Probability of the High Outcome
Internet−Uni Lab
Note: Solid lines depict means over all first screen choices. Dashed linesdisplay corresponding confidence intervals. Data from the low incentivetreatment is excluded. “Lab” are averages for laboratory subjects and“Internet-Uni” mean values for those respondents of the CentERpanelthat are less than 35 years of age and hold a university degree or study toobtain one.
25
References
Alessie, R., S. Hochgurtel, and A. van Soest (2006): “Non-take-up of tax-favoredsavings plans: evidence from Dutch employees,” Journal of Economic Psychology, 27, 483–501.
Andersen, S., G. W. Harrison, M. I. Lau, and E. E. Rutstrom (2005): “Preferenceheterogeneity in experiments: comparing the field and lab,” Centre for Economic andBusiness Research Discussion Paper No. 2005-03, Copenhagen.
(2006): “Elicitation using multiple price list formats,” Experimental Economics,9(4), 383–405.
Bellemare, C., and S. Kroger (2007): “On representative social capital,” EuropeanEconomic Review, 21, 183–202.
Binswanger, H. (1980): “Attitudes toward risk: experimental measurement in rural India,”American Journal of Agricultural Economics, 62, 395–407.
Bleichrodt, H., J. L. Pinto, and P. P. Wakker (2001): “Making descriptive use ofprospect theory to improve the prescriptive use of expected utility,” Management Science,47(11), 1498–1514.
Dohmen, T., A. Falk, D. Huffman, U. Sunde, J. Schupp, and G. G. Wagner (2005):“Individual risk attitudes: new evidence from a large, representative, experimentally-validated survey,” IZA Discussion Paper No 1730.
Donkers, B., B. Melenberg, and A. van Soest (2001): “Estimating risk attitudes usinglotteries; a large sample approach,” Journal of Risk and Uncertainty, 22(2), 165–195.
Fehr, E., U. Fischbacher, J. Schupp, B. von Rosenbladt, and G. G. Wagner (2003):“A nationwide laboratory examining trust and trustworthiness by integrating behaviouralexperiments into representative surveys,” CEPR Discussion Papers 3858, C.E.P.R. Discus-sion Papers.
Guth, W., C. Schmidt, and M. Sutter (2007): “Bargaining outside the lab - a newspaperexperiment of a three-person ultimatum game,” Economic Journal, 117(518), 449–469.
Harrison, G., M. Lau, and E. Rutstrom (2007): “Risk attitudes, randomization totreatment, and self-selection into experiments,” UCF Economics Working Paper.
Harrison, G. W., M. I. Lau, and M. B. Williams (2002): “Estimating individual dis-count rates in Denmark: a field experiment,” American Economic Review, 92(5), 1606–1617.
Harrison, G. W., and J. A. List (2004): “Field experiments,” Journal of EconomicLiterature, 42, 1013–1059.
Harrison, G. W., J. A. List, and C. Towe (2007): “Naturally occurring preferences andexogenous laboratory experiments: a case study of risk aversion,” Econometrica, 75(2),433–458.
26
Hey, J. (1995): “Experimental investigations of errors in decision making under risk,” Eu-ropean Economic Review, 39, 633–641.
Holt, C. A., and S. K. Laury (2002): “Risk aversion and incentive effects,” AmericanEconomic Review, 92, 1644–1655.
Kahneman, D., and A. Tversky (1979): “Prospect theory: an analysis of decision underrisk,” Econometrica, 47, 263–291.
Kreps, D. M., and E. L. Porteus (1978): “Temporal resolution of uncertainty and dy-namic choice theory,” Econometrica, 46, 185–200.
Levitt, S., and J. List (2007): “What do laboratory experiments measuring social prefer-ences reveal about the real world?,” The Journal of Economic Perspectives, 21(2), 153–174.
Little, R. J., and D. B. Rubin (2002): Statistical Analysis with Missing Data. John Wiley& Sons Inc., New York, 2nd edn.
Loomes, G., P. Moffatt, and R. Sugden (2002): “A microeconometric test of alternativestochastic theories of risky Choice,” Journal of Risk and Uncertainty, 24, 103–130.
Lucking-Reiley, D. (1999): “Using field experiments to test equivalence between auctionformats: magic on the internet.,” American Economic Review, 89, 1063–1080.
Madrian, B. C., and D. F. Shea (2001): “The power of suggestion: inertia in 401(K)participation and savings behavior,” Quarterly Journal of Economics, 116(4), 1149–1187.
Schmidt, U., and T. Neugebauer (2007): “Testing expected utility in the presence oferrors,” Economic Journal, 117, 470–485.
27