comparing gps and prompted recall data records€¦ · comparing gps and prompted recall data...
TRANSCRIPT
Comparing GPS and Prompted Recall Data Records STOPHER, Peter, PRASAD, Christine, and ZHANG, Jun
12th WCTR, July 11-15, 2010 – Lisbon, Portugal
1
COMPARING GPS AND PROMPTED RECALL DATA RECORDS
Peter Stopher, Institute of Transport and Logistics Studies, The University of Sydney
Christine Prasad, Institute of Transport and Logistics Studies, The University of Sydney
Jun Zhang, Institute of Transport and Logistics Studies, The University of Sydney
ABSTRACT
Global Position System (GPS) devices as a substitute for conventional travel diaries have
been considered for some time in the past. However, only in the past year or so has a
serious effort been made to trial replacement of travel diaries with GPS. Apart from the
refinements in the technology to obtain sufficiently accurate measurements of position and to
have battery power that will last sufficiently long, the major barrier to the widespread
adoption of GPS as a means of undertaking household travel surveys is the information
required for travel demand modelling that is not collected by the GPS device. GPS devices
are, of course, very accurate at recording the time and positional characteristics of travel.
However, the GPS device cannot record modes of travel used, trip purposes, or number of
occupants in private vehicles, all of which are important attributes normally acquired in a
household travel survey. Some researchers have developed interactive GPS devices, in
which respondents are asked to either respond to questions asked on a PDA screen about
these non-GPS characteristics of travel or, in a simpler process for the respondent, to press
certain buttons on a GPS device that record the mode and purpose. Such procedures are
possibly useful, but have to rely on the memory of the respondent to enter such data at the
time of travel, somewhat undermining the low burden and non-reliance on respondent
memory of the passive GPS device.
The alternative approach, and the one that is the focus of this paper, is to develop software
processes that are able to deduce, with sufficient accuracy, the missing data from a
combination of the GPS records, other data collected from respondents, and data available
in GIS records. The authors of this paper have developed software that is largely able to
deduce these characteristics, but the level of accuracy has largely been unchecked in the
past. As part of an ongoing GPS-only survey in the Greater Cincinnati region, the consultant
team is undertaking a prompted recall survey on a sample of households. In this paper, we
report on comparisons between the results from the processing software and the prompted
Comparing GPS and Prompted Recall Data Records STOPHER, Peter, PRASAD, Christine, and ZHANG, Jun
12th WCTR, July 11-15, 2010 – Lisbon, Portugal
2
recall web survey, with respect to identifying trips, mode of travel, purpose of travel, and
number of household members present in a privately-owned vehicle. These results, when
collected from the entire survey (due to be completed in August 2010) will be used to
develop refinements to the processing software, either through refinements to the present
rules used or through the addition of Artificial Intelligence (AI) procedures that are able to
learn from the prompted recall results. In the latter case, it would be expected that the future
of GPS-only surveys would necessarily include a small subsample prompted recall survey to
provide the needed data to teach the AI procedure for each new GPS survey undertaken.
However, if satisfactory results can be obtained from a refinement of rules, it may be possible
that future GPS surveys will be able to be done without a prompted recall component.
INTRODUCTION
In the mid-1990s, GPS was first trialled (Wagner, 1997) as a means to measure people‟s
travel, as a direct outcome of a Conference held in Irvine, California on Household Travel
Surveys (TRB, 1996). At that time, selective availability was still in place, GPS technology
was in its infancy, and devices were cumbersome and required an external power source.
Over a little more than a decade, selective availability has been turned off and the technology
has improved enormously, as summarised by Wolf (2009). Since the outset of GPS use, the
idea in the minds of the profession has been that one day GPS might replace the
conventional interview or self-administered household travel survey (Wolf et al., 2001).
However, in the early years of the development of GPS surveys, it was clear that neither the
technology nor the processing software was yet ready for such a replacement to take place.
Rather, most of the use of GPS was to validate travel surveys (Wolf et al., 2003; Stopher,
2009) and for evaluation of travel behaviour changes aimed at reducing daily vehicle
kilometres of travel (VKT) (Stopher, 2009). However, these uses of GPS served both to
provide the opportunity to improve and change the design of the devices and also to develop
increasingly sophisticated processing software (Stopher et al., 2008).
With these developments in mind, the Ohio Department of Transportation (ODOT)
commissioned the first GPS-only Household Travel Study, which is taking place at the time
of writing this paper in the Greater Cincinnati Area of the Ohio-Kentucky-Indiana (OKI)
metropolitan planning organisation region. This study was commissioned in early 2009, with
a pilot survey to be conducted in April of that year and with the expectation of the main
survey being conducted over a 12-month period from about August 2009 to August 2010. In
the original plan for this study, it was expected that about 4,000 households would eventually
be included in the sample, with all households using GPS devices for a period of two to three
days. GPS devices were to be given to all members of a household aged 12 years and over,
and brief diary surveys were to be used for those members of the household under the age
of 12. In addition, a subsample of about 1,200 households would be asked to undertake a
Prompted Recall survey (details of which are provided in the next section of this paper). The
survey has two purposes. The first purpose is to provide an opportunity to upgrade and
improve the processing software, so that more complete and more accurate data can be
obtained from the GPS records. The second purpose is to provide a database that can
Comparing GPS and Prompted Recall Data Records STOPHER, Peter, PRASAD, Christine, and ZHANG, Jun
12th WCTR, July 11-15, 2010 – Lisbon, Portugal
3
support the continued updating and improvement of travel demand models for the OKI
region, and to do this with processed GPS data for the first time.
In the pilot survey (which is the main focus of this paper), it was intended to recruit 250
households that would use the GPS devices, and to recruit 100 of these households to
undertake the prompted recall survey. As a result of lower response rates than expected, the
pilot survey resulted in a sample of 120 households and 228 persons who provided GPS
data. More details of the pilot survey are provided in Stopher and Wargelin (2010). It was
decided to recruit all pilot survey households to undertake the prompted recall survey. This
resulted in 35 households providing usable prompted recall data from a total of 46
individuals. The prompted recall survey provided data for one day of travel for each person
who completed it. The focus of this paper is a comparison of the prompted recall data with
the GPS data, with a particular focus on how well the processing software functions and
identification of where improvements seem possible to make.
THE PROMPTED RECALL SURVEY
The prompted recall survey first appeared very early in the development of GPS applications
in transport (Bachu et al., 2001) and was then developed further in a number of subsequent
studies. Stopher et al. (2002) used a small pilot study to investigate the concept further,
subsequently transitioning the survey from a paper prompted recall to an Internet version
(Stopher and Collins, 2005). This was followed by a number of further developments in
Internet-based prompted recall surveys in the next few years (Lee-Gosselin et al., 2006; Li
and Shalaby, 2008, Auld et al., 2009). Auld et al. (2009) provide a more detailed history of
the development of prompted recall surveys over the past eight years or so.
The concept of the prompted recall survey is that respondents who have earlier carried a
GPS device with them for a day or more are subsequently sent information that allows
display of the travel recorded on the GPS device. They are then asked to provide additional
information about the travel, such as the mode of travel, the purpose of travel and the size of
their travel party, as well as to indicate if there are any errors in the processed GPS data.
The maps that display the travel recorded by the GPS device therefore act as a memory
prompt to the individual, then allowing the individual to respond to questions about the travel.
In the earliest form of the prompted recall survey, maps of each day of travel undertaken by
the respondent were printed and incorporated within a paper survey that then asked for
further information about the travel and also offered the respondent the opportunity to
indicate if there were errors in the processing or if there were gaps in the GPS record. In
general, however, the paper survey was rather clumsy, in that the respondent could
generally indicate only limited information about the displayed trips and correction of the
processing. Indicating that a mapped stop was not a stop, or that a stop had been omitted at
a certain location, or that entire travel had been omitted was generally difficult to
accommodate in a paper format.
Thus, the transition of the prompted recall survey from paper to the Internet, providing an
interactive environment in which respondents could indicate corrections to the GPS
Comparing GPS and Prompted Recall Data Records STOPHER, Peter, PRASAD, Christine, and ZHANG, Jun
12th WCTR, July 11-15, 2010 – Lisbon, Portugal
4
processed record, was extremely important to the continuing use of the prompted recall
survey. There remain two problems associated with the prompted recall survey, however.
The first is that the survey requires that respondents are familiar with maps and map reading,
and have the ability to understand the implications of a series of trips shown on a map. Ability
to read a map may require a higher level of literacy than is often required for standard paper
and pencil surveys. Second, the survey requires access to and familiarity with the Internet.
This necessarily reduces the proportion of households and household members who could
respond to a prompted recall survey administered over the Internet.
At the outset of the Greater Cincinnati Area Household Travel Survey (GCAHTS), it was
proposed to conduct prompted recall surveys by both paper and pencil and the Internet.
However, as the specification of the survey was developed in an Internet environment, it
rapidly became clear that a comparable survey by paper and pencil could not be developed
within reasonable resources. The decision was made, therefore, that the prompted recall
survey would be conducted only by Internet. While this could be considered to generate
some bias in the responses, there is, in fact, no need for the prompted recall survey to be
undertaken by a representative sample of the population, because the purposes of the
survey are not to expand the prompted recall results to the entire population of the region.
In the case of the GCAHTS, the purposes of the prompted recall survey are to provide
“ground truth” about the travel undertaken by a subsample of people, against which to check
the results of the processing of GPS data, and also to provide a data source for potentially
improving the processing software. Neither of these uses demands a representative sample.
There is no question that a representative sample would be nice to have, but it is not a
requirement for the use of the data.
Whilst the decision was made to limit the prompted recall survey to an Internet version, it was
also decided to develop the prompted recall survey by using Google® Maps, so that
respondents would be likely to find some similarity between the prompted recall survey and
maps that they may possibly be familiar with using in their own use of the Internet. It is
necessary for processing to be undertaken on the GPS data, prior to creating the maps for
the prompted recall survey. The procedures for data processing and analysis and creation of
the prompted recall survey are described in the next section of this paper.
DATA PROCESSING AND ANALYSIS AND GENERATION OF THE PROMPTED RECALL SURVEY
In this survey, the GPS devices that were used for the pilot and are being used in the main
survey are Atmel BTT-08 devices with various modifications to the hardware and the
firmware, as specified by the Institute of Transport and Logistics Studies (ITLS). The device
is shown in Figure 1. Each person 12 years of age or older in each sampled household is
asked to carry one of these devices (identified to that person for the duration of the
household‟s use of the device) for about three days. The device is set to record position
every second and are equipped with a vibration sensor. If no vibration is sensed for 3
Comparing GPS and Prompted Recall Data Records STOPHER, Peter, PRASAD, Christine, and ZHANG, Jun
12th WCTR, July 11-15, 2010 – Lisbon, Portugal
5
minutes, the device turns itself off. As soon as the device is vibrated again, it turns on and
seeks a position. If the time it has been off is less than about an hour, then the position is
usually acquired within a matter of 10 to 15 seconds. However, if the device has been off for
more than an hour, position acquisition may take from 10 or 15 seconds up to about a minute
or so, depending on the speed of movement and location of the device.
When the survey period is completed by a household, the devices (which are logging
devices) are returned to the survey team who download the data. What is obtained is a
modified stream of data from the GPS device, giving the second-by-second position of the
device for the two or three days during which the sampled respondent had the device. These
data are then processed by a series of software programs developed at ITLS. The first of
these programs uses a number of rules to delete spurious data (generally the data collected
whilst the device is at rest at the end of a trip, or possibly in the middle of a trip when there is
a lengthy delay in movement, such as may occur at a traffic signal) and to split the data
stream up into what are assumed to be individual trips. At the completion of this process,
maps are generated by the software, along with a summary file showing the assumed start
and end locations of each trip, the time (to the nearest second) when the trip started and
ended, and some of the other characteristics of the trip (distance, elapsed time, average
speed, etc.).
Because it is not possible to craft rules that will work 100 percent of the time, the next step in
the process is what ITLS calls „map editing‟. In this process, trained staff at ITLS review the
maps for each day using TransCAD software and look for possible spurious data that may
not have been deleted in the initial processing, for possible stops in a trip that the software
identified as a single trip, and trips that might be split into two or that may be missing due to
loss of GPS signal. Trips may not be split correctly by the software due to a rule that dictates
that an identifiable stop (after removing spurious data) must last for at least 120 seconds to
define the end of a trip. Because there are a number of activities that will take less than 120
seconds to accomplish that should also define the end of a trip (such as picking up or
dropping off a person), and also because the deletion of spurious data is also done
conservatively by the software, it is necessary to inspect the map and make some edits to
the list of trips provided by the software processing. An example of this is shown in Figure 2.
This figure shows part of what the processing software identified as a single trip. However, it
can be seen that there are some clusters of points at three locations and another location
Figure 1: Atmel BTT-08 GPS
Device
Comparing GPS and Prompted Recall Data Records STOPHER, Peter, PRASAD, Christine, and ZHANG, Jun
12th WCTR, July 11-15, 2010 – Lisbon, Portugal
6
where the respondent appears to have travelled to a point and then returned, without a stop
of any noticeable duration. These all suggest the possibility that the trip should be broken
into a number of different trips.
Figure 2: Part of a 'Single' Trip Identified by the Software
The map editing process, in this case, would have removed the agglomeration of data points
at each of the three locations on the left and bottom of the map, would have inserted stops at
those locations, and would also have created a stop at the end of the trip near the top of the
map. Edits of this type are needed so that the respondent is not expected to understand how
such agglomerations of data points occur and to provide a map that is more clearly
representative of the travel undertaken. All editing of trips are done to a trip list in text format
so the original visual map that was generated will remain unchanged.
After map editing is complete, the data are then run through several processes prior to
producing the data for the web survey. One of the processes applies the changes from the
amended trip list file to the original trip database to remove data points, and split or join trips.
Another process compares the address information collected from respondents to the
locations of trip ends in the modified trip list and records any matches for input into the web
survey, so that home, work, educational establishment, or grocery shopping locations can be
shown on the map and the possible purpose of the trip can be shown.
Figure 3 shows an example of the web survey as it was used in the pilot survey, displaying at
the outset to the respondent the full day‟s worth of travel that was recorded by the GPS. This
is intended to orientate them and to show the overall task that the respondent is to undertake
as part of the web survey. The survey then proceeds by displaying to the respondent one trip
Probable trip ends
Another possible
trip end
Comparing GPS and Prompted Recall Data Records STOPHER, Peter, PRASAD, Christine, and ZHANG, Jun
12th WCTR, July 11-15, 2010 – Lisbon, Portugal
7
at a time as shown in Figure 4. With each trip displayed, the respondent is given a number of
options. In the pilot survey, respondents were allowed to amend the time that a trip started,
the time it ended, the trip‟s distance and speed, as well as to fill in the purpose of the trip and
the mode of travel used. Respondents could also indicate if a stop had not occurred where it
was shown (i.e., combining two or more trips into one), or if a stop had been made that was
not shown on the map (i.e., splitting a trip into two or more separate trips).
Figure 3: Overview of Travel from the Pilot Version of the Prompted Recall Survey
Figure 4: Details of One Trip on the Pilot Version of the Prompted Recall Survey
Comparing GPS and Prompted Recall Data Records STOPHER, Peter, PRASAD, Christine, and ZHANG, Jun
12th WCTR, July 11-15, 2010 – Lisbon, Portugal
8
In the questions shown on Figure 4, when the respondent clicks on a red question mark, a
window pops up with a list of available responses to the questions about the activity, means
of travel, and accompanying household members. Not all of the steps to create the URLs for
the web survey are explained here, because some require a detailed knowledge of other
aspects of the survey and would be excessive for this paper. It is sufficient to understand the
broad process used to generate the web survey, as described here.
RESULTS OF THE PILOT SURVEY
In this section, the results of the pilot survey are described. As noted in the introduction, out
of 120 households and 228 persons who carried GPS devices in the pilot survey, a total of
46 people in 35 households responded to the prompted recall survey with information
sufficient for analysis. This represents a response rate of 29 percent for households and 20
percent for persons. No incentives were offered for completing the prompted recall survey
and, in the pilot survey, respondents were not forewarned about the prompted recall survey
at the time of recruitment to undertake the GPS survey. (Both of these were changed for the
main survey, with incentives offered for completing the prompted recall and a clear indication
given in the recruitment for the GPS survey that respondents would be asked to complete a
prompted recall survey subsequently.) The response rates for the pilot prompted recall,
whilst low, are neither surprising nor damaging. Assuming that Internet penetration is on the
order of about 74 percent in the US, it should be expected that probably only about 88 of the
households that did the GPS would have Internet available. There are no statistics readily
available about cartographic literacy in the world or in the US, so it is not known what
proportion of those households with Internet would also be able to read maps. However, one
can speculate that possibly no more than 70 percent of the households with the Internet
would also have a household member who is cartographically literate, which would further
reduce the potential for response to the prompted recall to about 62 households. At that
assumed level of Internet penetration and cartographic literacy, the actual response rate for
the prompted recall survey may be closer to 56 percent than 29 percent. Furthermore, within
the homes of those respondents, by no means all family members will be familiar with the
Internet nor able to read maps, which is likely to reduce further the number of persons who
could respond. Overall, for a prompted recall survey conducted without prior warning and
without incentive in the United States, the response rates seem reasonable.
The first step in the analysis of the data was to prepare a file of the GPS trips and the
prompted recall trips placed side-by-side so that once could see the correspondence or lack
of it between the prompted recall editing done by respondents and the map-edited GPS trips.
There were a total of 301 trips identified either by the respondents or the GPS or both. An
initial detailed review of the results provided the following information.
Comparing GPS and Prompted Recall Data Records STOPHER, Peter, PRASAD, Christine, and ZHANG, Jun
12th WCTR, July 11-15, 2010 – Lisbon, Portugal
9
Trip Analysis
Out of 301 trips, there were 9 GPS trips that, on further examination, appeared to be
spurious and should have been deleted in the map editing process. There were an additional
6 trips that were added in map editing that respondents did not agree with and which should
probably not have been added. Thus 15 out of 301 trips were clear map editing errors. This
is an error rate of less than 5 percent and can be reduced further. A total of 208 trips showed
a match between the GPS and the Prompted Recall survey, although further comment is
made on these subsequently. There were 56 trips that the GPS records showed that
respondents did not identify as trips they had made. After further scrutiny of the travel that
respondents admitted to making and further checks on the nature of the trips shown by the
GPS, these 56 trips were all categorised as being genuine trips. These are also discussed
further shortly. There were also 22 trips that were identified by respondents to the prompted
recall survey which did not appear in the GPS record. These are also discussed further.
Overall, the GPS results, following map editing, claimed that respondents to the prompted
recall survey had made 279 trips. On the other hand, respondents claimed to have made 230
trips. After removing those trips that were erroneously added by map editing, the GPS
devices showed 264 trips, where respondents claimed to have made 230 trips. This
represents about 14 percent underreporting even with a prompted recall survey available,
which is quite surprising, even though validation surveys of trip diaries have shown
underreporting of the order of 20-30 percent of trips (Wolf, 2006). It is surprising because, in
the case of validation studies, respondents simply fill out the diaries although they are also
carrying GPS devices which are later downloaded to tell what actual travel was performed. In
this survey, however, respondents are provided with the evidence of what they did first and
then asked to edit it. Because the simplest strategy would be to agree that the GPS was
correct, which would leave the GPS and prompted recall reporting the same number of trips,
it is surprising that respondents took the trouble to delete trips that they do not believe that
they made. It is therefore worthwhile to look in more detail at what happened.
One of the first conclusions that can be drawn is that some respondents do not understand
the definition of a trip, as used in the transport planning profession, a hardly surprising result.
For example, one respondent apparently drove a child to school and then returned home.
The GPS record showed this as two trips (one to school and one back home). However the
respondent disagreed and edited the trip to be a single trip from home back to home, via
school. Later in the day, this same respondent made a multi-stop trip, where the respondent
spent about a minute in one stop (obviously a stop by reason of the travel into and then back
out of that location), and then spent 1 minute and 5 seconds at another location (similarly
obvious by the travel into and out of the location), but joined the three trips together to
represent a single trip. There were eight respondents that had this problem and who insisted
that a multi-stop tour was actually a single trip, where the GPS processing either prior to or
following map editing had split the tour into its constituent trips. This accounted for 23 of the
trips that the GPS reported and that respondents did not report as separate trips. For each of
these cases, one match between GPS and prompted recall was counted, since the combined
trip was reported by the respondent, and the split trips were counted as being GPS only trips.
Comparing GPS and Prompted Recall Data Records STOPHER, Peter, PRASAD, Christine, and ZHANG, Jun
12th WCTR, July 11-15, 2010 – Lisbon, Portugal
10
The remaining 33 GPS trips all appear to be genuine travel that respondents did not agree
that they had taken. In some cases, the trips are necessary for the respondent to return
home, where they clearly left from home on the first trip of the following day. In other cases,
there are clear trips along the highway network that respondents deleted. Two cases were
walking trips made probably from a parked car to a destination or from an origin to a parked
car, where the respondent simply deleted the trip. In neither case did the respondent suggest
that the trip should have been added into and combined with the car trip.
Among the 22 trips that respondents reported that were not in the prompted recall survey, it
was found that 11 of these trips were also map editing errors, in that the trips appeared in the
original GPS record but had been edited out by the map editing process. Therefore, these
must be considered as additional map editing errors and bring the total number of such
errors up to 26, or about 8.6 percent. Of the remaining 11 trips that respondents inserted,
where no trip was shown on the prompted recall survey, one trip appears to be a trip that
was missed by the GPS at the beginning of the day and one appears to be a trip missed at
the end of the day. A further six trips occur in two records where the respondent claimed to
have undertaken travel that did not match in any possible way the travel recorded by the
GPS and presented to them in the prompted recall survey. The remaining 3 trips were
reported by respondents as occurring somewhere in the middle of a day in which there were
matching trips before and after, but these trips were omitted. Thus, out of 22 trips that
respondents claimed to have been made that did not appear in the web survey, 50 percent or
11 actually were recorded by the GPS devices, but were incorrectly deleted in map editing. A
further six trips are in the records of two respondents who completely disagreed with the
GPS record, while two are trips that occurred at the beginning or end of the day and were not
picked up by the GPS devices. These two, plus three other trips appear to be the only
genuine trips that were possibly not picked up by the GPS. Therefore, there are probably no
more than 5 trips in total that were not picked up by the GPS devices, for a total level of error
of about 1.8 percent.
It is also interesting to note that respondents changed the times of starting and ending of a
number of trips, sometimes by very large amounts. Some prompted recall times were
changed by as much or more than an hour, while others were changed by 20 to 50 minutes.
These changes suggested that, in the main survey, respondents should not be given the
option to change the times, because apparently many people do not remember times
accurately and editing of times is not appropriate, given that the GPS device cannot lie about
the times at which people travelled. There were also two cases, as noted earlier, where there
was no possible match between what the respondent provided on the prompted recall and
what the GPS showed for any of the three days that the respondent had the GPS device.
This type of total mismatch has also been found in validation surveys and, so far, lacks
explanation.
Finally, it is worth noting that the average trip rate from the GPS survey computes to 5.74
trips per person per day, and even the results from the prompted recall give a person trip rate
of 5.00 trips per person per day. These rates are noticeably higher than those normally found
in diary surveys, but are comparable to those reported in most GPS surveys. With the
Comparing GPS and Prompted Recall Data Records STOPHER, Peter, PRASAD, Christine, and ZHANG, Jun
12th WCTR, July 11-15, 2010 – Lisbon, Portugal
11
addition to the GPS records of the trips that were deleted in map editing that should not have
been, the rate would increase slightly to 5.98 trips per person per day. This figure suggests
that the GPS survey is working well. Given that almost all trips were by car, there would be
little or no difference between the linked and unlinked trip rate in this case.
Mode and Purpose Analysis
Although respondents were asked to fill in the purpose of their travel (by means of identifying
the nature of the activity at each place to which they travelled) and also to choose their mode
of travel from a drop down list of modes, 18 respondents refused to supply either or both the
purpose and mode of travel. The software that ITLS has developed classifies travel mode
into car, public transport, walk, bicycle, and other, while purpose is classified into home,
work, education, shopping, social-recreational, and other, with the first four being identified
generally from the address information collected at the time of the survey.
Respondents were asked to identify mode of travel from a considerably more detailed list,
including distinguishing between car driver and car passenger, and between school bus and
regular bus. Following the pilot survey, attempts have been made to add into the software
the ability to identify school bus trips, but such was not the case in the pilot survey. However,
the reported modes of travel in the pilot survey were only walk, car driver and car passenger.
In total, respondents provided mode for 123 of the 229 trips that they reported in the
prompted recall, with 106 missing any entry for mode. Of these, five trips were walking trips,
108 were car driver, and 10 were car passenger. No other modes were reported by
respondents in the prompted recall from the pilot survey. From the processing of the GPS
data, mode was identified for 255 of the 279 GPS trips, with 24 being categorised as
unknown. Of these 255 trips with a processed mode, 31 were identified as walk trips, 6 as
bicycle, and 218 as car (either driver or passenger).
Cross-tabulating the mode as identified by GPS processing with that identified by
respondents in the prompted recall survey shows that there are 95 cases where both the
software and the respondent identified the mode as car. Of these, 88 were car drivers and 7
were car passengers. Interestingly, only 2 of the unknown modes from the software were
identified as car driver by the respondents and no other unknown modes from the processing
were identified by respondents. Two trips were identified by both the GPS processing and
the prompted recall as walk, while 9 of the walk trips identified by the software were reported
as car trips by respondents to the prompted recall. One bicycle trip from the processing was
identified by the prompted recall respondent as a car trip and one as a car passenger trip.
Overall, on mode, the processing did a remarkably accurate job, with 97 out of 110 trips
showing both respondents and the software agreed on the mode. This is 88 percent
accuracy from the software.
For purpose, there are four variables that are recorded in the data. There is an origin type,
origin activity, destination type, and destination activity for each trip. The difference between
the type and the activity is that the type identifies the nature of what is at the origin or
destination (e.g., home, primary workplace, secondary workplace, school, retail, etc.), while
Comparing GPS and Prompted Recall Data Records STOPHER, Peter, PRASAD, Christine, and ZHANG, Jun
12th WCTR, July 11-15, 2010 – Lisbon, Portugal
12
the activity defines what the person does there. There are ten types and 15 activities. At
present, the software for trip purpose identification can identify home, work, education,
grocery shopping, and some social-recreational trips. All other purposes and origin or
destination types are defined as „other‟. Table 1 shows the frequencies of occurrence of each
of the origin and destination types from the software and also the same information from the
prompted recall.
Table 1: Frequencies of Origin and Destination Type from Software and Prompted Recall
Type GPS Software Prompted Recall
Origin Destination Origin Destination
Number Percent Number Percent Number Percent Number Percent
Home 82 29.6 76 27.3 71 31.0 54 23.6
Primary Workplace 23 8.3 23 8.3 30 13.1 32 14.0
Secondary Workplace 0 0 0 0 0 0 0 0
Volunteer Job 0 0 0 0 0 0 0 0
School (Daycare, K-12) 2 0.7 2 0.7 2 0.9 2 0.9
School (College, Vocl.) 2 0.7 2 0.7 3 1.3 3 1.3
Retail 17 6.1 17 6.1 22 9.6 24 10.5
Other habitual address 0 0 0 0 0 0 0 0
New place 151 54.5 158 56.8 101 44.1 114 49.8
Out of Study Area 0 0 0 0 0 0 0 0
TOTAL 277 100.0 278 100.0 229 100.0 229 100.0
From Table 1, it appears that the software falls slightly below the prompted recall in correctly
identifying the origin and destination types, with primary workplace and retail both falling
significantly below the proportions for the prompted recall. By the same token, the
percentage that are indicated as a new place are higher with the software than with the
prompted recall. It is useful, then, to compare the results between the software and the
prompted recall, to see with what frequency the software gets the purpose correct. The
results are shown in Table 2. From this table, it can be seen that the GPS processing
software has been quite successful in identifying the origin type, with agreement between the
prompted recall and the software on 174 out of 206 comparable records, or over 84 percent.
Table 2: Cross-Tabulation of GPS Software and Prompted Recall Origin Types
GPS Software Prompted Recall
Home Primary
Workplace
School
(Daycare, K-12)
School (College,
Vocational)
Retail New Total
Home 61 3 0 0 0 2 66
Primary Workplace 0 13 0 0 1 4 18
School (Daycare, K-12) 0 0 2 0 0 0 2
School (College,
Vocational)
0 0 0 2 0 0 2
Retail 0 1 0 0 10 2 13
New 3 7 0 0 9 86 105
TOTAL 64 24 2 2 20 94 206
Comparing GPS and Prompted Recall Data Records STOPHER, Peter, PRASAD, Christine, and ZHANG, Jun
12th WCTR, July 11-15, 2010 – Lisbon, Portugal
13
Table 3 shows the same type of cross-tabulation for the origin activity. In this case, where the
software is currently only able to distinguish the six options of home, work, school, social-
recreational, shop, and other, 65 of the 129 comparisons are exactly correct. However, given
that the software is currently not able to identify volunteer work, pick-up and drop-off,
personal business, eat meal, go for a drive, work-related and school-related, all of these
should have been identified as „other‟ by the software, with the possibility of work-related
being classified as work, school-related as school, and volunteer work as work, then another
22 origin activities can be considered as being correct, bringing the total to 87, or 67 percent.
Table 3: Cross-Tabulation of GPS Software and Prompted Recall Origin Activity
GPS Software Prompted Recall
Home Paid
Work
Vol.
Work
Pick
up/
Drop
Off
Social,
Rec.,
Relig.
Shop Pers.
Bus.
Eat
Meal
Go
for a
Drive
Work
Related
School
Related
Other Total
Home 38 1 1 1 0 0 2 0 0 2 1 0 46
Paid Work 0 11 0 1 0 0 0 0 0 0 0 1 13
School 0 0 1 1 0 0 0 0 0 0 2 0 4
Social, Rec., Rel. 2 2 0 1 1 3 1 1 1 1 0 0 13
Shop 0 0 0 0 0 9 0 0 0 0 0 0 9
Other 2 11 0 2 2 7 5 5 0 4 0 6 44
TOTAL 42 25 2 6 3 19 8 6 1 7 3 7 129
The same comparisons can be made for destination type and destination activity. These are
shown in Tables 4 and 5. In the case of the destination type, a similar result occurred to that
for the origin type, with the software computing the correct destination type on 161 out of 207
comparable trips, or almost 78 percent of cases being correctly identified. For the destination
activity, the results were also similar to the origin activity with the software giving a correct
result for 77 of 129 destinations, or almost 60 percent correct. These results indicate that the
software is doing a reasonably good job, although it would be desirable to try to improve it
further, which is one of the tasks to be undertaken from the main survey.
Table 4: Cross-Tabulation of GPS Software and Prompted Recall Destination Types
GPS Software Prompted Recall
Home Primary
Workplace
School
(Daycare, K-12)
School (College,
Vocational)
Retail New Total
Home 42 0 0 0 1 7 50
Primary Workplace 0 15 0 0 1 5 21
School (Daycare, K-
12)
0 0 2 0 0 0 2
School (College,
Vocational)
0 0 0 2 0 0 2
Retail 0 2 0 0 8 4 14
New 5 9 0 0 12 92 118
TOTAL 47 26 2 2 22 108 207
Comparing GPS and Prompted Recall Data Records STOPHER, Peter, PRASAD, Christine, and ZHANG, Jun
12th WCTR, July 11-15, 2010 – Lisbon, Portugal
14
Table 5: Cross-Tabulation of GPS Software and Prompted Recall Destination Activity
GPS Software Prompted Recall
Home Paid
Work
Vol.
Work
Pick
up/
Drop
Off
Social,
Rec.,
Relig.
Shop Pers.
Bus.
Eat
Meal
Work
Related
School
Related
Other Total
Home 29 0 1 1 0 1 0 0 0 0 0 32
Paid Work 0 13 0 1 0 0 1 0 0 0 1 16
School 0 0 1 1 0 0 0 0 0 2 0 4
Social, Rec., Rel. 3 2 0 0 1 2 1 1 1 1 0 12
Shop 1 1 0 0 0 8 0 0 1 0 0 11
Other 7 12 0 4 2 9 6 5 4 0 5 54
TOTAL 40 28 2 7 3 20 8 6 6 3 6 129
One aspect of the prompted recall that has not been investigated so far is the estimation of
car occupancy and party size for non-car travel. It is planned that this will be developed
further in the main survey. In the pilot survey, there were 87 prompted recall responses that
showed only one person on the trip, with 23 having 2 persons and 12 having three persons.
In the main survey, one change that has been made is to ask for identification of the
members of the household that are accompanying each person. It is expected that software
will be developed to match people together within a household, where the travel recorded by
the GPS appears to be almost identical, and derive occupancy from this. However, this is a
future development at this time.
CONCLUSIONS
Based on the results of the pilot study, it was decided to emphasise to respondents at the
time of the recruitment that there was likely to be a prompted recall follow-up survey. In
addition, it was decided to offer an incentive for completion of the prompted recall survey.
Although such incentives are not considered to be the most effective, because of budgetary
restrictions, it was decided to offer a draw for prizes for completion of the prompted recall
survey. These strategies appear to be having a good effect, because, at the time of writing,
the response rate in the main survey is running at nearly 40 percent for the prompted recall.
This represents a major improvement on the 29 percent response rate of the pilot survey.
Changes have also been made in the training of map editors, so that the errors from map
editing of the pilot survey are unlikely to occur in the main survey. This training has included
instructions to not add trips at the beginning or end of the day, to look more carefully for
spurious GPS data points that do not represent travel, and to be particularly careful to not
delete GPS data that could potentially be a real trip. It is likely that this additional training will
reduce significantly the 8 percent error rate of the pilot map editing.
The GPS devices appear to be functioning extremely well, given an error level of about 1.8
percent in identifying travel. Compared to the underreporting of conventional travel surveys
Comparing GPS and Prompted Recall Data Records STOPHER, Peter, PRASAD, Christine, and ZHANG, Jun
12th WCTR, July 11-15, 2010 – Lisbon, Portugal
15
of 20 to 30 percent, this represents a much higher level of accuracy than has been achieved
in previous travel surveys. In terms of the functioning of the software, it appears that the trip
identification is working very well. It is not possible to determine the exact accuracy of the trip
identification software, because of the map editing process that changes and corrects some
of the resulting trip identification. However, by comparing the results of the software with the
map edited results, it is hoped to be able to propose some further improvements in the
software that will reduce the amount of effort required for the map editing itself. Together,
however, it appears that trip identification is working well.
Mode identification is currently considered to be working relatively well, given an 88 percent
accuracy level. Again, by studying the results where the prompted recall and the GPS
software differed, it is expected that some improvements may be able to be made to the
software. The goal would be to achieve between 90 and 95 percent accuracy.
The results of purpose identification, which, in this application did not make use of any land
use information, are considered very reasonable, although it is hoped that it can be improved
beyond present levels. The results of this exercise showed that the type of origin and
destination were identified to around 80 percent, which is lower than would be desired, but
still quite reasonably high. Improvements in this will be sought. Several changes have
already been made to the purpose identification software which should show improvements
in this area. The match on activities was less favourable, with around 60-67 percent
matching. Again, the programming improvements that have already been made for the main
survey are likely to show some improvement here. Further study of the situations where the
software fails to identify the activity correctly may lead to further improvements in the
software.
Overall, this pilot study shows the value of the prompted recall in validating and helping to
improve the software, and also shows that even a relatively small sample of prompted recall
surveys, not randomly selected, can provide considerable assistance to improving the GPS-
only survey. The results also show the potential value of a GPS-based household travel
survey with the much higher trip rates measured and the detail that is available on the routes
and the geography and timing of travel. Once again, as in all of the GPS validation surveys of
conventional household travel surveys, this study shows that people still have difficulty in
reporting the travel that they undertake. People do not recall the times at which travel takes
place in many cases. Some people have little idea of their travel, as shown by those who
reported in the prompted recall something that did not resemble what the GPS recorded in
any way at all. Such records occur in conventional household surveys. In this case, the
prompted recall results can simply be ignored, while when they occur in a conventional
survey, they will result in the introduction of significant error.
REFERENCES
Auld, J.; Williams, C.; Mohammadian, A.; Nelson, P. (2009). An Automated GPS-Based
Prompted Recall Survey with Learning Algorithms. Transportation Letter, 1 (1), 59-79.
Comparing GPS and Prompted Recall Data Records STOPHER, Peter, PRASAD, Christine, and ZHANG, Jun
12th WCTR, July 11-15, 2010 – Lisbon, Portugal
16
Bachu, P.K.; Dudala, T.; Kothuri, S.M. (2001). Prompted Recall in Global Positioning System
Survey: Proof-of-Concept Study. Transportation Research Record 1768, pp. 106-113.
Lee-Gosselin, M.E.; Doherty, S.T.; Papinski, D. (2006). An Internet-Based Prompted Recall
Diary with Automated GPS Activity-trip Detection: System Design. Proceedings of the
85th Annual Meeting of the Transportation Research Board, January 2006,
Washington, DC.
Li, Z.J.; Shalaby, A.S. (2008). Web-Based GIS System for Prompted Recall of GPS-Assisted
Personal Travel Surveys: System Development and Experimental Study.
Proceedings of the 87th Annual Meeting of the Transportation Research Board,
January 2008, Washington, DC.
Stopher, P.R.; Bullock, P.; Horst, F. (2002). Exploring the Use of Passive GPS Devices to
Measure Travel. Proceedings of the 7th International Conference on Applications of
Advanced Technologies to Transportation, edited by K.C.P. Wang, S. Medanat, S.
Nambisan, and G. Spring, ASCE, Reston VA, pp. 959-967.
Stopher, P.R.; Collins, A. (2005). Conducting a GPS Prompted Recall Survey over the
Internet. Proceedings of the 84th Annual Meeting of the Transportation Research
Board, January 2005, Washington, DC.
Stopher, P.; FitzGerald, C.; Zhang, J. (2008). In Search of a GPS Device for Measuring
Personal Travel. Special Issue of Transportation Research C, on Emerging
Commercial Technologies, 16, 350-369.
Stopher, P.R. (2009). Collecting and Processing Data from Mobile Technologies. In Bonnel,
P.; Lee-Gosselin, M.; Zmud, J.; Madre, J.-L. (editors), Transport Survey Methods:
Keeping Up With a Changing World, Emerald Group Publishing Ltd., England, pp.
361-391.
Stopher, P.; Wargelin, L. (2010). Conducting a Household Travel Survey with GPS. Paper to
be presented to the World Conference on Transport Research, Lisbon, Portugal, July.
TRB (1996). Conference on Household Travel Surveys: New Concepts and Research
Needs. Transportation Research Board Conference Proceedings No.10.
Wagner, D.P. (1997). Lexington Area Travel Data Collection Test: GPS for Personal Travel
Surveys. Final Report for OHIM, OTA, and FHWA, USDOT.
Wolf, J.; Guensler, R.; Bachman, W. (2001). Elimination of the Travel Diary: An Experiment
to Derive Trip Purpose from GPS Travel Data. Transportation Research Record
1768, pp. 124-134.
Wolf, J.; Loechl, M.; Thompson, M.; Arce, C. (2003). Trip Rate Analysis in GPS-Enhanced
Personal Travel Surveys. In Stopher, P.R.; Jones, P.M. (editors), Transport Survey
Quality and Innovation, Pergamon Press, Oxford, England, pp. 483-498.
Wolf, J. (2006). Applications of New Technologies in Travel Surveys. In Travel Survey
Methods: Quality and Future Directions, Stopher, P. and Stecher, C. (editors),
Elsevier, Oxford, pp. 531-544.
Wolf, J. (2009). Mobile Technologies: Synthesis of a Workshop. In Bonnel, P.; Lee-Gosselin,
M.; Zmud, J.; Madre, J.-L. (editors), Transport Survey Methods: Keeping Up With a
Changing World, Emerald Group Publishing Ltd., England, pp. 393-401.