
Comparing GPS and Prompted Recall Data Records STOPHER, Peter, PRASAD, Christine, and ZHANG, Jun

12th WCTR, July 11-15, 2010 – Lisbon, Portugal


COMPARING GPS AND PROMPTED RECALL DATA RECORDS

Peter Stopher, Institute of Transport and Logistics Studies, The University of Sydney

Christine Prasad, Institute of Transport and Logistics Studies, The University of Sydney

Jun Zhang, Institute of Transport and Logistics Studies, The University of Sydney

ABSTRACT

Global Positioning System (GPS) devices as a substitute for conventional travel diaries have

been considered for some time. However, only in the past year or so has a

serious effort been made to trial replacement of travel diaries with GPS. Apart from the

refinements in the technology to obtain sufficiently accurate measurements of position and to

have battery power that will last sufficiently long, the major barrier to the widespread

adoption of GPS as a means of undertaking household travel surveys is the information

required for travel demand modelling that is not collected by the GPS device. GPS devices

are, of course, very accurate at recording the time and positional characteristics of travel.

However, the GPS device cannot record modes of travel used, trip purposes, or number of

occupants in private vehicles, all of which are important attributes normally acquired in a

household travel survey. Some researchers have developed interactive GPS devices, in

which respondents are asked to either respond to questions asked on a PDA screen about

these non-GPS characteristics of travel or, in a simpler process for the respondent, to press

certain buttons on a GPS device that record the mode and purpose. Such procedures are

possibly useful, but have to rely on the memory of the respondent to enter such data at the

time of travel, somewhat undermining the low burden and non-reliance on respondent

memory of the passive GPS device.

The alternative approach, and the one that is the focus of this paper, is to develop software

processes that are able to deduce, with sufficient accuracy, the missing data from a

combination of the GPS records, other data collected from respondents, and data available

in GIS records. The authors of this paper have developed software that is largely able to

deduce these characteristics, but the level of accuracy has largely been unchecked in the

past. As part of an ongoing GPS-only survey in the Greater Cincinnati region, the consultant

team is undertaking a prompted recall survey on a sample of households. In this paper, we

report on comparisons between the results from the processing software and the prompted


recall web survey, with respect to identifying trips, mode of travel, purpose of travel, and

number of household members present in a privately-owned vehicle. These results, when

collected from the entire survey (due to be completed in August 2010) will be used to

develop refinements to the processing software, either through refinements to the present

rules used or through the addition of Artificial Intelligence (AI) procedures that are able to

learn from the prompted recall results. In the latter case, it would be expected that the future

of GPS-only surveys would necessarily include a small subsample prompted recall survey to

provide the needed data to teach the AI procedure for each new GPS survey undertaken.

However, if satisfactory results can be obtained from a refinement of rules, it may be possible

that future GPS surveys will be able to be done without a prompted recall component.

INTRODUCTION

In the mid-1990s, GPS was first trialled (Wagner, 1997) as a means to measure people's

travel, as a direct outcome of a Conference held in Irvine, California on Household Travel

Surveys (TRB, 1996). At that time, selective availability was still in place, GPS technology

was in its infancy, and devices were cumbersome and required an external power source.

Over a little more than a decade, selective availability has been turned off and the technology

has improved enormously, as summarised by Wolf (2009). Since the outset of GPS use, the

idea in the minds of the profession has been that one day GPS might replace the

conventional interview or self-administered household travel survey (Wolf et al., 2001).

However, in the early years of the development of GPS surveys, it was clear that neither the

technology nor the processing software was yet ready for such a replacement to take place.

Rather, most of the use of GPS was to validate travel surveys (Wolf et al., 2003; Stopher,

2009) and for evaluation of travel behaviour changes aimed at reducing daily vehicle

kilometres of travel (VKT) (Stopher, 2009). However, these uses of GPS served both to

provide the opportunity to improve and change the design of the devices and also to develop

increasingly sophisticated processing software (Stopher et al., 2008).

With these developments in mind, the Ohio Department of Transportation (ODOT)

commissioned the first GPS-only Household Travel Study, which is taking place at the time

of writing this paper in the Greater Cincinnati Area of the Ohio-Kentucky-Indiana (OKI)

metropolitan planning organisation region. This study was commissioned in early 2009, with

a pilot survey to be conducted in April of that year and with the expectation of the main

survey being conducted over a 12-month period from about August 2009 to August 2010. In

the original plan for this study, it was expected that about 4,000 households would eventually

be included in the sample, with all households using GPS devices for a period of two to three

days. GPS devices were to be given to all members of a household aged 12 years and over,

and brief diary surveys were to be used for those members of the household under the age

of 12. In addition, a subsample of about 1,200 households would be asked to undertake a

Prompted Recall survey (details of which are provided in the next section of this paper). The

survey has two purposes. The first purpose is to provide an opportunity to upgrade and

improve the processing software, so that more complete and more accurate data can be

obtained from the GPS records. The second purpose is to provide a database that can


support the continued updating and improvement of travel demand models for the OKI

region, and to do this with processed GPS data for the first time.

In the pilot survey (which is the main focus of this paper), it was intended to recruit 250

households that would use the GPS devices, and to recruit 100 of these households to

undertake the prompted recall survey. As a result of lower response rates than expected, the

pilot survey resulted in a sample of 120 households and 228 persons who provided GPS

data. More details of the pilot survey are provided in Stopher and Wargelin (2010). It was

decided to recruit all pilot survey households to undertake the prompted recall survey. This

resulted in 35 households providing usable prompted recall data from a total of 46

individuals. The prompted recall survey provided data for one day of travel for each person

who completed it. The focus of this paper is a comparison of the prompted recall data with

the GPS data, with a particular focus on how well the processing software functions and

identification of where improvements seem possible to make.

THE PROMPTED RECALL SURVEY

The prompted recall survey first appeared very early in the development of GPS applications

in transport (Bachu et al., 2001) and was then developed further in a number of subsequent

studies. Stopher et al. (2002) used a small pilot study to investigate the concept further,

subsequently transitioning the survey from a paper prompted recall to an Internet version

(Stopher and Collins, 2005). This was followed by a number of further developments in

Internet-based prompted recall surveys in the next few years (Lee-Gosselin et al., 2006; Li

and Shalaby, 2008; Auld et al., 2009). Auld et al. (2009) provide a more detailed history of

the development of prompted recall surveys over the past eight years or so.

The concept of the prompted recall survey is that respondents who have earlier carried a

GPS device with them for a day or more are subsequently sent information that allows

display of the travel recorded on the GPS device. They are then asked to provide additional

information about the travel, such as the mode of travel, the purpose of travel and the size of

their travel party, as well as to indicate if there are any errors in the processed GPS data.

The maps that display the travel recorded by the GPS device therefore act as a memory

prompt to the individual, then allowing the individual to respond to questions about the travel.

In the earliest form of the prompted recall survey, maps of each day of travel undertaken by

the respondent were printed and incorporated within a paper survey that then asked for

further information about the travel and also offered the respondent the opportunity to

indicate if there were errors in the processing or if there were gaps in the GPS record. In

general, however, the paper survey was rather clumsy, in that the respondent could

generally indicate only limited information about the displayed trips and correction of the

processing. Indicating that a mapped stop was not a stop, or that a stop had been omitted at

a certain location, or that entire travel had been omitted was generally difficult to

accommodate in a paper format.

Thus, the transition of the prompted recall survey from paper to the Internet, providing an

interactive environment in which respondents could indicate corrections to the GPS


processed record, was extremely important to the continuing use of the prompted recall

survey. There remain two problems associated with the prompted recall survey, however.

The first is that the survey requires that respondents are familiar with maps and map reading,

and have the ability to understand the implications of a series of trips shown on a map. Ability

to read a map may require a higher level of literacy than is often required for standard paper

and pencil surveys. Second, the survey requires access to and familiarity with the Internet.

This necessarily reduces the proportion of households and household members who could

respond to a prompted recall survey administered over the Internet.

At the outset of the Greater Cincinnati Area Household Travel Survey (GCAHTS), it was

proposed to conduct prompted recall surveys by both paper and pencil and the Internet.

However, as the specification of the survey was developed in an Internet environment, it

rapidly became clear that a comparable survey by paper and pencil could not be developed

within reasonable resources. The decision was made, therefore, that the prompted recall

survey would be conducted only by Internet. While this could be considered to generate

some bias in the responses, there is, in fact, no need for the prompted recall survey to be

undertaken by a representative sample of the population, because the purpose of the

survey is not to expand the prompted recall results to the entire population of the region.

In the case of the GCAHTS, the purposes of the prompted recall survey are to provide

“ground truth” about the travel undertaken by a subsample of people, against which to check

the results of the processing of GPS data, and also to provide a data source for potentially

improving the processing software. Neither of these uses demands a representative sample.

There is no question that a representative sample would be nice to have, but it is not a

requirement for the use of the data.

Whilst the decision was made to limit the prompted recall survey to an Internet version, it was

also decided to develop the prompted recall survey by using Google® Maps, so that

respondents would be likely to find some similarity between the prompted recall survey and

maps that they may already be familiar with from their own use of the Internet. It is

necessary for processing to be undertaken on the GPS data, prior to creating the maps for

the prompted recall survey. The procedures for data processing and analysis and creation of

the prompted recall survey are described in the next section of this paper.

DATA PROCESSING AND ANALYSIS AND GENERATION OF THE PROMPTED RECALL SURVEY

In this survey, the GPS devices that were used for the pilot and are being used in the main

survey are Atmel BTT-08 devices with various modifications to the hardware and the

firmware, as specified by the Institute of Transport and Logistics Studies (ITLS). The device

is shown in Figure 1. Each person 12 years of age or older in each sampled household is

asked to carry one of these devices (identified to that person for the duration of the

household's use of the device) for about three days. The device is set to record position

every second and is equipped with a vibration sensor. If no vibration is sensed for 3


minutes, the device turns itself off. As soon as the device is vibrated again, it turns on and

seeks a position. If the time it has been off is less than about an hour, then the position is

usually acquired within a matter of 10 to 15 seconds. However, if the device has been off for

more than an hour, position acquisition may take from 10 or 15 seconds up to about a minute

or so, depending on the speed of movement and location of the device.

When the survey period is completed by a household, the devices (which are logging

devices) are returned to the survey team who download the data. What is obtained is a

modified stream of data from the GPS device, giving the second-by-second position of the

device for the two or three days during which the sampled respondent had the device. These

data are then processed by a series of software programs developed at ITLS. The first of

these programs uses a number of rules to delete spurious data (generally the data collected

whilst the device is at rest at the end of a trip, or possibly in the middle of a trip when there is

a lengthy delay in movement, such as may occur at a traffic signal) and to split the data

stream up into what are assumed to be individual trips. At the completion of this process,

maps are generated by the software, along with a summary file showing the assumed start

and end locations of each trip, the time (to the nearest second) when the trip started and

ended, and some of the other characteristics of the trip (distance, elapsed time, average

speed, etc.).
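
As a rough illustration of the kind of summary file described above, the sketch below computes distance, elapsed time, and average speed for one trip from second-by-second position fixes. The `Fix` record, the use of the haversine formula, and the function name `summarise_trip` are assumptions made for illustration only; the paper does not describe the ITLS software at this level of detail.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


@dataclass
class Fix:
    """One GPS record (hypothetical layout, not the device's actual format)."""
    t: float    # seconds since the start of the day
    lat: float  # decimal degrees
    lon: float  # decimal degrees


def haversine_m(a: Fix, b: Fix) -> float:
    """Great-circle distance between two fixes, in metres."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(h))


def summarise_trip(fixes):
    """Summarise one trip given its fixes in time order."""
    if not fixes:
        raise ValueError("a trip needs at least one fix")
    # Sum the distances between consecutive fixes.
    distance_m = sum(haversine_m(p, q) for p, q in zip(fixes, fixes[1:]))
    elapsed_s = fixes[-1].t - fixes[0].t
    avg_speed_kmh = distance_m / elapsed_s * 3.6 if elapsed_s > 0 else 0.0
    return {
        "start": (fixes[0].lat, fixes[0].lon, fixes[0].t),
        "end": (fixes[-1].lat, fixes[-1].lon, fixes[-1].t),
        "distance_m": distance_m,
        "elapsed_s": elapsed_s,
        "avg_speed_kmh": avg_speed_kmh,
    }
```

Summing point-to-point distances is sensitive to positional noise while the device is at rest, which is one reason the at-rest data need to be removed before trips are summarised.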

Because it is not possible to craft rules that will work 100 percent of the time, the next step in

the process is what ITLS calls 'map editing'. In this process, trained staff at ITLS review the

maps for each day using TransCAD software and look for possible spurious data that may

not have been deleted in the initial processing, for possible stops in a trip that the software

identified as a single trip, and trips that might be split into two or that may be missing due to

loss of GPS signal. Trips may not be split correctly by the software due to a rule that dictates

that an identifiable stop (after removing spurious data) must last for at least 120 seconds to

define the end of a trip. Because there are a number of activities that will take less than 120

seconds to accomplish that should also define the end of a trip (such as picking up or

dropping off a person), and also because the deletion of spurious data is also done

conservatively by the software, it is necessary to inspect the map and make some edits to

the list of trips provided by the software processing. An example of this is shown in Figure 2.

This figure shows part of what the processing software identified as a single trip. However, it

can be seen that there are some clusters of points at three locations and another location

where the respondent appears to have travelled to a point and then returned, without a stop

of any noticeable duration. These all suggest the possibility that the trip should be broken

into a number of different trips.

Figure 1: Atmel BTT-08 GPS Device

Figure 2: Part of a 'Single' Trip Identified by the Software

The map editing process, in this case, would have removed the agglomeration of data points

at each of the three locations on the left and bottom of the map, would have inserted stops at

those locations, and would also have created a stop at the end of the trip near the top of the

map. Edits of this type are needed so that the respondent is not expected to understand how

such agglomerations of data points occur and to provide a map that is more clearly

representative of the travel undertaken. All editing of trips is done to a trip list in text format

so the original visual map that was generated will remain unchanged.
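
The 120-second stop rule mentioned earlier, and the reason it leaves short stops for map editing to catch, can be sketched as follows. This is a minimal illustration and not the ITLS software: the input layout (time and speed pairs), the speed threshold `SPEED_EPS` used to decide that the device is at rest, and the function name `split_trips` are all assumptions. Only the 120-second minimum comes from the paper.

```python
MIN_STOP_S = 120   # from the paper: a stop must last at least 120 seconds
SPEED_EPS = 0.5    # m/s; assumed threshold for "at rest" (not from the paper)


def split_trips(fixes, min_stop_s=MIN_STOP_S, speed_eps=SPEED_EPS):
    """Split a time-ordered list of (t_seconds, speed_m_s) fixes into trips.

    At-rest fixes are dropped. A trip ends only when the at-rest period
    before movement resumes lasted at least `min_stop_s` seconds.
    """
    trips, current = [], []
    stop_start = None  # time of the first at-rest fix in the current pause
    for t, speed in fixes:
        if speed > speed_eps:
            # Movement resumes: close the trip if the pause was long enough.
            if stop_start is not None and current and t - stop_start >= min_stop_s:
                trips.append(current)
                current = []
            stop_start = None
            current.append((t, speed))
        elif stop_start is None:
            stop_start = t
    if current:
        trips.append(current)
    return trips
```

With this rule, a 60-second pause to drop off a passenger is absorbed into the surrounding trip, whereas a 150-second pause ends it. Stops of the first kind are the ones that have to be inserted by hand during map editing.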

After map editing is complete, the data are then run through several processes prior to

producing the data for the web survey. One of the processes applies the changes from the

amended trip list file to the original trip database to remove data points, and split or join trips.

Another process compares the address information collected from respondents to the

locations of trip ends in the modified trip list and records any matches for input into the web

survey, so that home, work, educational establishment, or grocery shopping locations can be

shown on the map and the possible purpose of the trip can be shown.
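
The address-matching step can be pictured with the sketch below, which labels a trip end with the nearest known place if one lies within a tolerance. It assumes that respondent addresses have already been geocoded to coordinates; the 200-metre tolerance `MATCH_RADIUS_M` and the function names are hypothetical, since the paper does not state how matches are decided.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000.0
MATCH_RADIUS_M = 200.0  # assumed tolerance; the paper does not give one


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in metres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    h = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(h))


def label_trip_end(lat, lon, known_places, radius_m=MATCH_RADIUS_M):
    """Return the label of the closest known place within radius_m, else None.

    known_places maps a label such as "home" or "work" to a (lat, lon) pair.
    """
    best_label, best_distance = None, radius_m
    for label, (place_lat, place_lon) in known_places.items():
        d = distance_m(lat, lon, place_lat, place_lon)
        if d <= best_distance:
            best_label, best_distance = label, d
    return best_label
```

A matched label such as "home" or "work" could then be offered to the respondent as the probable purpose of the trip, leaving unmatched trip ends for the respondent to describe.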

Figure 3 shows an example of the web survey as it was used in the pilot survey, displaying at

the outset to the respondent the full day's worth of travel that was recorded by the GPS. This

is intended to orientate them and to show the overall task that the respondent is to undertake

as part of the web survey. The survey then proceeds by displaying to the respondent one trip


at a time as shown in Figure 4. With each trip displayed, the respondent is given a number of

options. In the pilot survey, respondents were allowed to amend the time that a trip started,

the time it ended, the trip's distance and speed, as well as to fill in the purpose of the trip and

the mode of travel used. Respondents could also indicate if a stop had not occurred where it

was shown (i.e., combining two or more trips into one), or if a stop had been made that was

not shown on the map (i.e., splitting a trip into two or more separate trips).

Figure 3: Overview of Travel from the Pilot Version of the Prompted Recall Survey

Figure 4: Details of One Trip on the Pilot Version of the Prompted Recall Survey


In the questions shown in Figure 4, when the respondent clicks on a red question mark, a

window pops up with a list of available responses to the questions about the activity, means

of travel, and accompanying household members. Not all of the steps to create the URLs for

the web survey are explained here, because some require a detailed knowledge of other

aspects of the survey and would be excessive for this paper. It is sufficient to understand the

broad process used to generate the web survey, as described here.

RESULTS OF THE PILOT SURVEY

In this section, the results of the pilot survey are described. As noted in the introduction, out

of 120 households and 228 persons who carried GPS devices in the pilot survey, a total of

46 people in 35 households responded to the prompted recall survey with information

sufficient for analysis. This represents a response rate of 29 percent for households and 20

percent for persons. No incentives were offered for completing the prompted recall survey

and, in the pilot survey, respondents were not forewarned about the prompted recall survey

at the time of recruitment to undertake the GPS survey. (Both of these were changed for the

main survey, with incentives offered for completing the prompted recall and a clear indication

given in the recruitment for the GPS survey that respondents would be asked to complete a

prompted recall survey subsequently.) The response rates for the pilot prompted recall,

whilst low, are neither surprising nor damaging. Assuming that Internet penetration is on the

order of 74 percent in the US, it should be expected that probably only about 88 of the

households that did the GPS would have Internet available. There are no statistics readily

available about cartographic literacy in the world or in the US, so it is not known what

proportion of those households with Internet would also be able to read maps. However, one

can speculate that possibly no more than 70 percent of the households with the Internet

would also have a household member who is cartographically literate, which would further

reduce the potential for response to the prompted recall to about 62 households. At that

assumed level of Internet penetration and cartographic literacy, the actual response rate for

the prompted recall survey may be closer to 56 percent than 29 percent. Furthermore, within

the homes of those respondents, by no means all family members will be familiar with the

Internet nor able to read maps, which is likely to reduce further the number of persons who

could respond. Overall, for a prompted recall survey conducted without prior warning and

without incentive in the United States, the response rates seem reasonable.
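
The arithmetic behind these response rates can be retraced as follows. The counts are taken from the text; the 74 percent Internet penetration and 70 percent cartographic literacy figures are the paper's own assumptions, and the variable names are illustrative.

```python
households_gps, persons_gps = 120, 228   # provided GPS data in the pilot
households_pr, persons_pr = 35, 46       # provided usable prompted recall data

# Raw response rates: roughly 29 percent and 20 percent.
household_rate = households_pr / households_gps
person_rate = persons_pr / persons_gps

internet_share = 0.74       # assumed Internet penetration (from the text)
map_literate_share = 0.70   # speculative map-literacy share (from the text)

# Households plausibly able to respond at all.
households_with_internet = households_gps * internet_share          # just under 89
eligible_households = households_with_internet * map_literate_share  # about 62

# Response rate among plausibly eligible households: roughly 56 percent.
adjusted_household_rate = households_pr / eligible_households

print(f"household rate: {household_rate:.1%}")
print(f"person rate: {person_rate:.1%}")
print(f"adjusted household rate: {adjusted_household_rate:.1%}")
```

The unrounded product 120 x 0.74 is 88.8, which the text gives as about 88; the adjusted rate comes out at about 56 percent whether the rounded or the unrounded intermediate value is carried forward.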

The first step in the analysis of the data was to prepare a file of the GPS trips and the

prompted recall trips placed side-by-side so that one could see the correspondence or lack

of it between the prompted recall editing done by respondents and the map-edited GPS trips.

There were a total of 301 trips identified either by the respondents or the GPS or both. An

initial detailed review of the results provided the following information.
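
Before turning to that review, the side-by-side file can be pictured as an outer join of the two trip lists. The sketch below assumes that each trip already carries a shared key (here a hypothetical pair of person identifier and trip sequence number); in practice the correspondence between respondent edits and GPS trips may need manual judgement, and the function name `align_trips` is illustrative only.

```python
from collections import Counter


def align_trips(gps_trips, recall_trips):
    """Place GPS and prompted recall trips side by side.

    Both arguments map a trip key to a trip record. Returns a list of
    (key, gps_record, recall_record, status) rows, where a missing record
    is None and status is "match", "gps_only", or "recall_only".
    """
    rows = []
    for key in sorted(set(gps_trips) | set(recall_trips)):
        gps = gps_trips.get(key)
        recall = recall_trips.get(key)
        if gps is not None and recall is not None:
            status = "match"
        elif gps is not None:
            status = "gps_only"
        else:
            status = "recall_only"
        rows.append((key, gps, recall, status))
    return rows


# Toy data: person 1 has three GPS trips but confirms only two and adds one.
gps = {(1, 1): "home-school", (1, 2): "school-home", (1, 3): "home-shop"}
recall = {(1, 1): "home-school", (1, 3): "home-shop", (1, 4): "shop-home"}
counts = Counter(status for _, _, _, status in align_trips(gps, recall))
```

Tallying the status column of such a file gives the three groups discussed next: matched trips, trips shown only by the GPS, and trips reported only by respondents.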


Trip Analysis

Out of 301 trips, there were 9 GPS trips that, on further examination, appeared to be

spurious and should have been deleted in the map editing process. There were an additional

6 trips that were added in map editing that respondents did not agree with and which should

probably not have been added. Thus 15 out of 301 trips were clear map editing errors. This

is an error rate of less than 5 percent and can be reduced further. A total of 208 trips showed

a match between the GPS and the Prompted Recall survey, although further comment is

made on these subsequently. There were 56 trips that the GPS records showed but that

respondents did not identify as trips they had made. After further scrutiny of the travel that

respondents admitted to making and further checks on the nature of the trips shown by the

GPS, these 56 trips were all categorised as being genuine trips. These are also discussed

further shortly. There were also 22 trips that were identified by respondents to the prompted

recall survey which did not appear in the GPS record. These are also discussed further.

Overall, the GPS results, following map editing, claimed that respondents to the prompted

recall survey had made 279 trips. On the other hand, respondents claimed to have made 230

trips. After removing those trips that were erroneously added by map editing, the GPS

devices showed 264 trips, where respondents claimed to have made 230 trips. This

represents about 14 percent underreporting even with a prompted recall survey available,

which is quite surprising, even though validation surveys of trip diaries have shown

underreporting of the order of 20-30 percent of trips (Wolf, 2006). It is surprising because, in

the case of validation studies, respondents simply fill out the diaries although they are also

carrying GPS devices which are later downloaded to tell what actual travel was performed. In

this survey, however, respondents are provided with the evidence of what they did first and

then asked to edit it. Because the simplest strategy would be to agree that the GPS was

correct, which would leave the GPS and prompted recall reporting the same number of trips,

it is surprising that respondents took the trouble to delete trips that they did not believe that

they made. It is therefore worthwhile to look in more detail at what happened.
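
As a check on the trip counts quoted above, the totals can be rebuilt from the categories given in the text. The text does not state which denominator underlies the figure of about 14 percent, so the sketch below reports the shortfall against both totals; the variable names are illustrative.

```python
matched = 208            # trips agreed by GPS and prompted recall
gps_only_genuine = 56    # GPS trips that respondents did not acknowledge
map_edit_errors = 9 + 6  # spurious trips left in, plus trips wrongly added
recall_only = 22         # trips reported by respondents but absent from the GPS list

gps_total = matched + gps_only_genuine + map_edit_errors  # 279
recall_total = matched + recall_only                      # 230
all_trips = gps_total + recall_only                       # 301
gps_corrected = gps_total - map_edit_errors               # 264

# Share of all trips that were clear map editing errors: just under 5 percent.
map_edit_error_rate = map_edit_errors / all_trips

# Shortfall of respondent-reported trips relative to the corrected GPS count.
shortfall = gps_corrected - recall_total                  # 34
shortfall_vs_gps = shortfall / gps_corrected              # about 13 percent
shortfall_vs_recall = shortfall / recall_total            # about 15 percent
```

The two ratios bracket the quoted figure: the shortfall of 34 trips is roughly 13 percent of the corrected GPS count and roughly 15 percent of the respondent-reported count.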

One of the first conclusions that can be drawn is that some respondents do not understand

the definition of a trip, as used in the transport planning profession, a hardly surprising result.

For example, one respondent apparently drove a child to school and then returned home.

The GPS record showed this as two trips (one to school and one back home). However, the

respondent disagreed and edited the trip to be a single trip from home back to home, via

school. Later in the day, this same respondent made a multi-stop trip, where the respondent

spent about a minute in one stop (obviously a stop by reason of the travel into and then back

out of that location), and then spent 1 minute and 5 seconds at another location (similarly

obvious by the travel into and out of the location), but joined the three trips together to

represent a single trip. There were eight respondents that had this problem and who insisted

that a multi-stop tour was actually a single trip, where the GPS processing either prior to or

following map editing had split the tour into its constituent trips. This accounted for 23 of the

trips that the GPS reported and that respondents did not report as separate trips. For each of

these cases, one match between GPS and prompted recall was counted, since the combined

trip was reported by the respondent, and the split trips were counted as being GPS only trips.


The remaining 33 GPS trips all appear to be genuine travel that respondents did not agree

that they had taken. In some cases, the trips are necessary for the respondent to return

home, where they clearly left from home on the first trip of the following day. In other cases,

there are clear trips along the highway network that respondents deleted. Two cases were

walking trips made probably from a parked car to a destination or from an origin to a parked

car, where the respondent simply deleted the trip. In neither case did the respondent suggest

that the trip should have been added into and combined with the car trip.

Among the 22 trips that respondents reported that were not in the prompted recall survey, it

was found that 11 of these trips were also map editing errors, in that the trips appeared in the

original GPS record but had been edited out by the map editing process. Therefore, these

must be considered as additional map editing errors and bring the total number of such

errors up to 26, or about 8.6 percent. Of the remaining 11 trips that respondents inserted,

where no trip was shown on the prompted recall survey, one trip appears to be a trip that

was missed by the GPS at the beginning of the day and one appears to be a trip missed at

the end of the day. A further six trips occur in two records where the respondent claimed to

have undertaken travel that did not match in any possible way the travel recorded by the

GPS and presented to them in the prompted recall survey. The remaining 3 trips were

reported by respondents as occurring somewhere in the middle of a day in which there were

matching trips before and after, but these trips were omitted. Thus, out of 22 trips that

respondents claimed to have been made that did not appear in the web survey, 50 percent or

11 actually were recorded by the GPS devices, but were incorrectly deleted in map editing. A

further six trips are in the records of two respondents who completely disagreed with the

GPS record, while two are trips that occurred at the beginning or end of the day and were not

picked up by the GPS devices. These two, plus three other trips appear to be the only

genuine trips that were possibly not picked up by the GPS. Therefore, there are probably no

more than 5 trips in total that were not picked up by the GPS devices, for a total level of error

of about 1.8 percent.
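The breakdown of the 22 respondent-added trips can be checked with a short tally. This is a minimal sketch: the category labels are illustrative, and the denominator of 279 GPS trips is an assumption taken from the GPS trip total quoted in the mode analysis, since the text does not state which base the 1.8 percent uses.

```python
# Counts are taken from the paragraph above; the category labels are illustrative.
respondent_added = {
    "deleted_in_map_editing": 11,        # present in raw GPS data, removed by map editors
    "total_mismatch_records": 6,         # two respondents whose reports matched nothing
    "missed_at_start_or_end_of_day": 2,  # one at the start of a day, one at the end
    "missed_mid_day": 3,                 # reported between otherwise matching trips
}

total_added = sum(respondent_added.values())

# Assumption: 279 GPS trips (the total quoted in the mode analysis) is the denominator.
gps_trips = 279

possibly_missed = (
    respondent_added["missed_at_start_or_end_of_day"]
    + respondent_added["missed_mid_day"]
)
miss_rate_percent = 100 * possibly_missed / gps_trips
map_editing_share = 100 * respondent_added["deleted_in_map_editing"] / total_added

print(f"Respondent-added trips: {total_added}")
print(f"Share caused by map editing: {map_editing_share:.0f} percent")
print(f"Trips possibly missed by GPS: {possibly_missed} ({miss_rate_percent:.1f} percent)")
```

Under that assumed denominator, the five possibly missed trips correspond to the error level of about 1.8 percent quoted above.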

It is also interesting to note that respondents changed the times of starting and ending of a

number of trips, sometimes by very large amounts. Some prompted recall times were

changed by an hour or more, while others were changed by 20 to 50 minutes.

These changes suggested that, in the main survey, respondents should not be given the

option to change the times, because apparently many people do not remember times

accurately and editing of times is not appropriate, given that the GPS device cannot lie about

the times at which people travelled. There were also two cases, as noted earlier, where there

was no possible match between what the respondent provided on the prompted recall and

what the GPS showed for any of the three days that the respondent had the GPS device.

This type of total mismatch has also been found in validation surveys and, so far, lacks

explanation.

Finally, it is worth noting that the average trip rate from the GPS survey computes to 5.74

trips per person per day, and even the results from the prompted recall give a person trip rate

of 5.00 trips per person per day. These rates are noticeably higher than those normally found

in diary surveys, but are comparable to those reported in most GPS surveys. With the

addition to the GPS records of the trips that were deleted in map editing that should not have

been, the rate would increase slightly to 5.98 trips per person per day. This figure suggests

that the GPS survey is working well. Given that almost all trips were by car, there would be

little or no difference between the linked and unlinked trip rate in this case.

Mode and Purpose Analysis

Although respondents were asked to fill in the purpose of their travel (by means of identifying

the nature of the activity at each place to which they travelled) and also to choose their mode

of travel from a drop-down list of modes, 18 respondents refused to supply either or both the

purpose and mode of travel. The software that ITLS has developed classifies travel mode

into car, public transport, walk, bicycle, and other, while purpose is classified into home,

work, education, shopping, social-recreational, and other, with the first four being identified

generally from the address information collected at the time of the survey.

Respondents were asked to identify mode of travel from a considerably more detailed list,

including distinguishing between car driver and car passenger, and between school bus and

regular bus. Following the pilot survey, attempts have been made to add into the software

the ability to identify school bus trips, but such was not the case in the pilot survey. However,

the reported modes of travel in the pilot survey were only walk, car driver and car passenger.

In total, respondents provided mode for 123 of the 229 trips that they reported in the

prompted recall, with 106 missing any entry for mode. Of these, five trips were walking trips,

108 were car driver, and 10 were car passenger. No other modes were reported by

respondents in the prompted recall from the pilot survey. From the processing of the GPS

data, mode was identified for 255 of the 279 GPS trips, with 24 being categorised as

unknown. Of these 255 trips with a processed mode, 31 were identified as walk trips, 6 as

bicycle, and 218 as car (either driver or passenger).

Cross-tabulating the mode as identified by GPS processing with that identified by

respondents in the prompted recall survey shows that there are 95 cases where both the

software and the respondent identified the mode as car. Of these, 88 were car drivers and 7

were car passengers. Interestingly, only 2 of the unknown modes from the software were

identified as car driver by the respondents and no other unknown modes from the processing

were identified by respondents. Two trips were identified by both the GPS processing and

the prompted recall as walk, while 9 of the walk trips identified by the software were reported

as car trips by respondents to the prompted recall. One bicycle trip from the processing was

identified by the prompted recall respondent as a car trip and one as a car passenger trip.

Overall, on mode, the processing did a remarkably accurate job, with 97 out of 110 trips

showing that both respondents and the software agreed on the mode. This is 88 percent

accuracy from the software.
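The 88 percent figure can be reproduced from the counts in the preceding paragraph. The sketch below is illustrative only: the dictionary layout and the `collapse` helper are assumptions, and respondent-reported car driver and car passenger are collapsed to the single car class that the software uses. Where the text gives only "car" for the respondent's answer, that label is kept as is.

```python
# (software mode, respondent mode) -> number of trips, as quoted in the text.
crosstab = {
    ("car", "car driver"): 88,
    ("car", "car passenger"): 7,
    ("unknown", "car driver"): 2,
    ("walk", "walk"): 2,
    ("walk", "car"): 9,
    ("bicycle", "car"): 1,
    ("bicycle", "car passenger"): 1,
}


def collapse(respondent_mode):
    """Map the detailed respondent modes onto the software's coarser classes."""
    car_modes = {"car", "car driver", "car passenger"}
    return "car" if respondent_mode in car_modes else respondent_mode


compared = sum(crosstab.values())
agreed = sum(
    count
    for (software_mode, respondent_mode), count in crosstab.items()
    if software_mode == collapse(respondent_mode)
)
accuracy_percent = 100 * agreed / compared

print(f"{agreed} of {compared} trips agree ({accuracy_percent:.0f} percent)")
```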

For purpose, four variables are recorded in the data for each trip: an origin type, an

origin activity, a destination type, and a destination activity. The difference between

the type and the activity is that the type identifies the nature of what is at the origin or

destination (e.g., home, primary workplace, secondary workplace, school, retail, etc.), while

the activity defines what the person does there. There are ten types and 15 activities. At

present, the software for trip purpose identification can identify home, work, education,

grocery shopping, and some social-recreational trips. All other purposes and origin or

destination types are defined as 'other'. Table 1 shows the frequencies of occurrence of each

of the origin and destination types from the software and also the same information from the

prompted recall.

Table 1: Frequencies of Origin and Destination Type from Software and Prompted Recall

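| Type | GPS Software Origin (Number) | GPS Software Origin (Percent) | GPS Software Destination (Number) | GPS Software Destination (Percent) | Prompted Recall Origin (Number) | Prompted Recall Origin (Percent) | Prompted Recall Destination (Number) | Prompted Recall Destination (Percent) |
|---|---|---|---|---|---|---|---|---|
| Home | 82 | 29.6 | 76 | 27.3 | 71 | 31.0 | 54 | 23.6 |
| Primary Workplace | 23 | 8.3 | 23 | 8.3 | 30 | 13.1 | 32 | 14.0 |
| Secondary Workplace | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Volunteer Job | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| School (Daycare, K-12) | 2 | 0.7 | 2 | 0.7 | 2 | 0.9 | 2 | 0.9 |
| School (College, Vocational) | 2 | 0.7 | 2 | 0.7 | 3 | 1.3 | 3 | 1.3 |
| Retail | 17 | 6.1 | 17 | 6.1 | 22 | 9.6 | 24 | 10.5 |
| Other habitual address | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| New place | 151 | 54.5 | 158 | 56.8 | 101 | 44.1 | 114 | 49.8 |
| Out of Study Area | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| TOTAL | 277 | 100.0 | 278 | 100.0 | 229 | 100.0 | 229 | 100.0 |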

From Table 1, it appears that the software falls slightly below the prompted recall in correctly

identifying the origin and destination types, with primary workplace and retail both falling

significantly below the proportions for the prompted recall. By the same token, the

percentage indicated as a new place is higher with the software than with the

prompted recall. It is useful, then, to compare the results between the software and the

prompted recall, to see with what frequency the software gets the purpose correct. The

results are shown in Table 2. From this table, it can be seen that the GPS processing

software has been quite successful in identifying the origin type, with agreement between the

prompted recall and the software on 174 out of 206 comparable records, or over 84 percent.

Table 2: Cross-Tabulation of GPS Software and Prompted Recall Origin Types

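| GPS Software (rows) / Prompted Recall (columns) | Home | Primary Workplace | School (Daycare, K-12) | School (College, Vocational) | Retail | New | Total |
|---|---|---|---|---|---|---|---|
| Home | 61 | 3 | 0 | 0 | 0 | 2 | 66 |
| Primary Workplace | 0 | 13 | 0 | 0 | 1 | 4 | 18 |
| School (Daycare, K-12) | 0 | 0 | 2 | 0 | 0 | 0 | 2 |
| School (College, Vocational) | 0 | 0 | 0 | 2 | 0 | 0 | 2 |
| Retail | 0 | 1 | 0 | 0 | 10 | 2 | 13 |
| New | 3 | 7 | 0 | 0 | 9 | 86 | 105 |
| TOTAL | 64 | 24 | 2 | 2 | 20 | 94 | 206 |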
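The agreement figure quoted for Table 2 is the sum of the diagonal cells divided by the number of comparable records. A minimal sketch of that calculation follows; the variable and function names are illustrative, and the per-type shares it prints are simple row percentages derived from the table rather than results reported in the paper.

```python
# Table 2 counts: rows are GPS software types, columns are prompted recall types,
# both in the same order.
labels = [
    "Home",
    "Primary Workplace",
    "School (Daycare, K-12)",
    "School (College, Vocational)",
    "Retail",
    "New",
]
table2 = [
    [61, 3, 0, 0, 0, 2],
    [0, 13, 0, 0, 1, 4],
    [0, 0, 2, 0, 0, 0],
    [0, 0, 0, 2, 0, 0],
    [0, 1, 0, 0, 10, 2],
    [3, 7, 0, 0, 9, 86],
]


def diagonal_agreement(matrix):
    """Return (agreeing records, all records) for a square cross-tabulation."""
    agreeing = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return agreeing, total


agreeing, total = diagonal_agreement(table2)
print(f"Overall: {agreeing} of {total} ({100 * agreeing / total:.1f} percent)")

# Share of each software-assigned type that the respondent confirmed.
for i, label in enumerate(labels):
    row_total = sum(table2[i])
    print(f"{label}: {table2[i][i]} of {row_total} confirmed")
```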

Table 3 shows the same type of cross-tabulation for the origin activity. In this case, where the
software is currently only able to distinguish the six options of home, work, school,
social-recreational, shop, and other, 65 of the 129 comparisons are exactly correct. However, the
software is currently not able to identify volunteer work, pick-up and drop-off, personal
business, eat meal, go for a drive, work-related, and school-related activities. All of these
should therefore have been identified as 'other' by the software, with the possibility of
work-related being classified as work, school-related as school, and volunteer work as work. On
that basis, another 22 origin activities can be considered as being correct, bringing the total
to 87, or 67 percent.

Table 3: Cross-Tabulation of GPS Software and Prompted Recall Origin Activity

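| GPS Software (rows) / Prompted Recall (columns) | Home | Paid Work | Vol. Work | Pick up/Drop Off | Social, Rec., Relig. | Shop | Pers. Bus. | Eat Meal | Go for a Drive | Work Related | School Related | Other | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Home | 38 | 1 | 1 | 1 | 0 | 0 | 2 | 0 | 0 | 2 | 1 | 0 | 46 |
| Paid Work | 0 | 11 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 13 |
| School | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 4 |
| Social, Rec., Rel. | 2 | 2 | 0 | 1 | 1 | 3 | 1 | 1 | 1 | 1 | 0 | 0 | 13 |
| Shop | 0 | 0 | 0 | 0 | 0 | 9 | 0 | 0 | 0 | 0 | 0 | 0 | 9 |
| Other | 2 | 11 | 0 | 2 | 2 | 7 | 5 | 5 | 0 | 4 | 0 | 6 | 44 |
| TOTAL | 42 | 25 | 2 | 6 | 3 | 19 | 8 | 6 | 1 | 7 | 3 | 7 | 129 |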
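Because the software distinguishes fewer activities than the prompted recall, Table 3 is not square, and an exact match has to be defined by pairing row and column labels. The sketch below uses an assumed pairing (the `exact_pairs` mapping) that reproduces the 65 exact matches. It does not attempt the relaxed count of 22 further matches, because the rules for that count are not specified in enough detail here.

```python
# Prompted recall activities, in the column order of Table 3.
recall_activities = [
    "Home", "Paid Work", "Vol. Work", "Pick up/Drop Off", "Social, Rec., Relig.",
    "Shop", "Pers. Bus.", "Eat Meal", "Go for a Drive", "Work Related",
    "School Related", "Other",
]

# Table 3 counts, keyed by the activity assigned by the GPS software.
table3 = {
    "Home":               [38, 1, 1, 1, 0, 0, 2, 0, 0, 2, 1, 0],
    "Paid Work":          [0, 11, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1],
    "School":             [0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 2, 0],
    "Social, Rec., Rel.": [2, 2, 0, 1, 1, 3, 1, 1, 1, 1, 0, 0],
    "Shop":               [0, 0, 0, 0, 0, 9, 0, 0, 0, 0, 0, 0],
    "Other":              [2, 11, 0, 2, 2, 7, 5, 5, 0, 4, 0, 6],
}

# Assumed pairing of software activity to the identical prompted recall activity.
# "School" has no identically named prompted recall column, so it has no exact pair.
exact_pairs = {
    "Home": "Home",
    "Paid Work": "Paid Work",
    "Social, Rec., Rel.": "Social, Rec., Relig.",
    "Shop": "Shop",
    "Other": "Other",
}

exact_matches = sum(
    table3[software][recall_activities.index(recall)]
    for software, recall in exact_pairs.items()
)
comparisons = sum(sum(row) for row in table3.values())

print(f"Exact matches: {exact_matches} of {comparisons}")
```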

The same comparisons can be made for destination type and destination activity. These are

shown in Tables 4 and 5. In the case of the destination type, the result was similar to that

for the origin type, with the software computing the correct destination type on 161 out of 207

comparable trips, or almost 78 percent of cases being correctly identified. For the destination

activity, the results were also similar to the origin activity with the software giving a correct

result for 77 of 129 destinations, or almost 60 percent correct. These results indicate that the

software is doing a reasonably good job, although it would be desirable to try to improve it

further, which is one of the tasks to be undertaken from the main survey.

Table 4: Cross-Tabulation of GPS Software and Prompted Recall Destination Types

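| GPS Software (rows) / Prompted Recall (columns) | Home | Primary Workplace | School (Daycare, K-12) | School (College, Vocational) | Retail | New | Total |
|---|---|---|---|---|---|---|---|
| Home | 42 | 0 | 0 | 0 | 1 | 7 | 50 |
| Primary Workplace | 0 | 15 | 0 | 0 | 1 | 5 | 21 |
| School (Daycare, K-12) | 0 | 0 | 2 | 0 | 0 | 0 | 2 |
| School (College, Vocational) | 0 | 0 | 0 | 2 | 0 | 0 | 2 |
| Retail | 0 | 2 | 0 | 0 | 8 | 4 | 14 |
| New | 5 | 9 | 0 | 0 | 12 | 92 | 118 |
| TOTAL | 47 | 26 | 2 | 2 | 22 | 108 | 207 |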

Table 5: Cross-Tabulation of GPS Software and Prompted Recall Destination Activity

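| GPS Software (rows) / Prompted Recall (columns) | Home | Paid Work | Vol. Work | Pick up/Drop Off | Social, Rec., Relig. | Shop | Pers. Bus. | Eat Meal | Work Related | School Related | Other | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Home | 29 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 32 |
| Paid Work | 0 | 13 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 16 |
| School | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 4 |
| Social, Rec., Rel. | 3 | 2 | 0 | 0 | 1 | 2 | 1 | 1 | 1 | 1 | 0 | 12 |
| Shop | 1 | 1 | 0 | 0 | 0 | 8 | 0 | 0 | 1 | 0 | 0 | 11 |
| Other | 7 | 12 | 0 | 4 | 2 | 9 | 6 | 5 | 4 | 0 | 5 | 54 |
| TOTAL | 40 | 28 | 2 | 7 | 3 | 20 | 8 | 6 | 6 | 3 | 6 | 129 |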

One aspect of the prompted recall that has not been investigated so far is the estimation of

car occupancy and party size for non-car travel. It is planned that this will be developed

further in the main survey. In the pilot survey, there were 87 prompted recall responses that

showed only one person on the trip, with 23 having 2 persons and 12 having three persons.

In the main survey, one change that has been made is to ask for identification of the

members of the household that are accompanying each person. It is expected that software

will be developed to match people together within a household, where the travel recorded by

the GPS appears to be almost identical, and derive occupancy from this. However, this is a

future development at this time.
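One way such household matching might work is sketched below. This is a hypothetical illustration, not the software under development: the `Trip` record, the tolerance values, and the example coordinates are all invented for the sketch. Two trips by different household members are treated as shared when their start and end times and their end points nearly coincide, and party size is one plus the number of matched companions.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Trip:
    """Hypothetical trip record for one household member."""
    person: str
    start_min: float  # minutes after midnight
    end_min: float
    origin: tuple     # (latitude, longitude) in degrees
    dest: tuple


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000 * asin(sqrt(h))


def travelled_together(t1, t2, max_minutes=2.0, max_metres=100.0):
    """Assumed rule: different people, near-identical times and end points."""
    return (
        t1.person != t2.person
        and abs(t1.start_min - t2.start_min) <= max_minutes
        and abs(t1.end_min - t2.end_min) <= max_minutes
        and haversine_m(t1.origin, t2.origin) <= max_metres
        and haversine_m(t1.dest, t2.dest) <= max_metres
    )


def party_size(trip, household_trips):
    """One plus the number of household members on a near-identical trip."""
    companions = {t.person for t in household_trips if travelled_together(trip, t)}
    return 1 + len(companions)


# Invented example: a parent and child travel together, another adult travels later.
household = [
    Trip("parent", 480.0, 495.0, (40.0000, -84.0000), (40.0100, -84.0200)),
    Trip("child", 480.5, 495.4, (40.0001, -84.0001), (40.0101, -84.0199)),
    Trip("other_adult", 500.0, 530.0, (40.0000, -84.0000), (40.0500, -84.1000)),
]
sizes = [party_size(t, household) for t in household]
print(sizes)
```

The tolerances would need to be calibrated against prompted recall responses that list accompanying household members, which is the information the main survey now requests.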

CONCLUSIONS

Based on the results of the pilot study, it was decided to emphasise to respondents at the

time of the recruitment that there was likely to be a prompted recall follow-up survey. In

addition, it was decided to offer an incentive for completion of the prompted recall survey.

Although such incentives are not considered to be the most effective, because of budgetary

restrictions, it was decided to offer a draw for prizes for completion of the prompted recall

survey. These strategies appear to be having a good effect, because, at the time of writing,

the response rate in the main survey is running at nearly 40 percent for the prompted recall.

This represents a major improvement on the 29 percent response rate of the pilot survey.

Changes have also been made in the training of map editors, so that the errors from map

editing of the pilot survey are unlikely to occur in the main survey. This training has included

instructions to not add trips at the beginning or end of the day, to look more carefully for

spurious GPS data points that do not represent travel, and to be particularly careful to not

delete GPS data that could potentially be a real trip. It is likely that this additional training will

reduce significantly the 8.6 percent error rate of the pilot map editing.

The GPS devices appear to be functioning extremely well, given an error level of about 1.8

percent in identifying travel. Compared to the underreporting of conventional travel surveys

of 20 to 30 percent, this represents a much higher level of accuracy than has been achieved

in previous travel surveys. In terms of the functioning of the software, it appears that the trip

identification is working very well. It is not possible to determine the exact accuracy of the trip

identification software, because of the map editing process that changes and corrects some

of the resulting trip identification. However, by comparing the results of the software with the

map edited results, it is hoped to be able to propose some further improvements in the

software that will reduce the amount of effort required for the map editing itself. Overall,

however, it appears that trip identification is working well.

Mode identification is currently considered to be working relatively well, given an 88 percent

accuracy level. Again, by studying the results where the prompted recall and the GPS

software differed, it is expected that some improvements can be made to the

software. The goal would be to achieve between 90 and 95 percent accuracy.

The results of purpose identification, which in this application did not make use of any

land-use information, are considered very reasonable, although it is hoped that they can be improved

beyond present levels. The results of this exercise showed that the type of origin and

destination were correctly identified in around 80 percent of cases, which is lower than would be desired, but

still quite reasonably high. Improvements in this will be sought. Several changes have

already been made to the purpose identification software which should show improvements

in this area. The match on activities was less favourable, with around 60-67 percent

matching. Again, the programming improvements that have already been made for the main

survey are likely to show some improvement here. Further study of the situations where the

software fails to identify the activity correctly may lead to further improvements in the

software.

Overall, this pilot study shows the value of the prompted recall in validating and helping to

improve the software, and also shows that even a relatively small sample of prompted recall

surveys, not randomly selected, can provide considerable assistance to improving the GPS-

only survey. The results also show the potential value of a GPS-based household travel

survey with the much higher trip rates measured and the detail that is available on the routes

and the geography and timing of travel. Once again, as in all of the GPS validation surveys of

conventional household travel surveys, this study shows that people still have difficulty in

reporting the travel that they undertake. People do not recall the times at which travel takes

place in many cases. Some people have little idea of their travel, as shown by those who

reported in the prompted recall something that did not resemble what the GPS recorded in

any way at all. Such records occur in conventional household surveys. In this case, the

prompted recall results can simply be ignored, while when they occur in a conventional

survey, they will result in the introduction of significant error.

REFERENCES

Auld, J.; Williams, C.; Mohammadian, A.; Nelson, P. (2009). An Automated GPS-Based

Prompted Recall Survey with Learning Algorithms. Transportation Letters, 1 (1), 59-79.

Bachu, P.K.; Dudala, T.; Kothuri, S.M. (2001). Prompted Recall in Global Positioning System

Survey: Proof-of-Concept Study. Transportation Research Record 1768, pp. 106-113.

Lee-Gosselin, M.E.; Doherty, S.T.; Papinski, D. (2006). An Internet-Based Prompted Recall

Diary with Automated GPS Activity-trip Detection: System Design. Proceedings of the

85th Annual Meeting of the Transportation Research Board, January 2006,

Washington, DC.

Li, Z.J.; Shalaby, A.S. (2008). Web-Based GIS System for Prompted Recall of GPS-Assisted

Personal Travel Surveys: System Development and Experimental Study.

Proceedings of the 87th Annual Meeting of the Transportation Research Board,

January 2008, Washington, DC.

Stopher, P.R.; Bullock, P.; Horst, F. (2002). Exploring the Use of Passive GPS Devices to

Measure Travel. Proceedings of the 7th International Conference on Applications of

Advanced Technologies to Transportation, edited by K.C.P. Wang, S. Madanat, S.

Nambisan, and G. Spring, ASCE, Reston VA, pp. 959-967.

Stopher, P.R.; Collins, A. (2005). Conducting a GPS Prompted Recall Survey over the

Internet. Proceedings of the 84th Annual Meeting of the Transportation Research

Board, January 2005, Washington, DC.

Stopher, P.; FitzGerald, C.; Zhang, J. (2008). In Search of a GPS Device for Measuring

Personal Travel. Special Issue of Transportation Research C, on Emerging

Commercial Technologies, 16, 350-369.

Stopher, P.R. (2009). Collecting and Processing Data from Mobile Technologies. In Bonnel,

P.; Lee-Gosselin, M.; Zmud, J.; Madre, J.-L. (editors), Transport Survey Methods:

Keeping Up With a Changing World, Emerald Group Publishing Ltd., England, pp.

361-391.

Stopher, P.; Wargelin, L. (2010). Conducting a Household Travel Survey with GPS. Paper to

be presented to the World Conference on Transport Research, Lisbon, Portugal, July.

TRB (1996). Conference on Household Travel Surveys: New Concepts and Research

Needs. Transportation Research Board Conference Proceedings No.10.

Wagner, D.P. (1997). Lexington Area Travel Data Collection Test: GPS for Personal Travel

Surveys. Final Report for OHIM, OTA, and FHWA, USDOT.

Wolf, J.; Guensler, R.; Bachman, W. (2001). Elimination of the Travel Diary: An Experiment

to Derive Trip Purpose from GPS Travel Data. Transportation Research Record

1768, pp. 124-134.

Wolf, J.; Loechl, M.; Thompson, M.; Arce, C. (2003). Trip Rate Analysis in GPS-Enhanced

Personal Travel Surveys. In Stopher, P.R.; Jones, P.M. (editors), Transport Survey

Quality and Innovation, Pergamon Press, Oxford, England, pp. 483-498.

Wolf, J. (2006). Applications of New Technologies in Travel Surveys. In Travel Survey

Methods: Quality and Future Directions, Stopher, P. and Stecher, C. (editors),

Elsevier, Oxford, pp. 531-544.

Wolf, J. (2009). Mobile Technologies: Synthesis of a Workshop. In Bonnel, P.; Lee-Gosselin,

M.; Zmud, J.; Madre, J.-L. (editors), Transport Survey Methods: Keeping Up With a

Changing World, Emerald Group Publishing Ltd., England, pp. 393-401.