pure.southwales.ac.uk · web viewdata from national mapping agencies, typically well-documented and...

Investigating geospatial data usability from a health geography perspective using sensitivity analysis: the example of potential accessibility to primary healthcare Robin Frew a* , Gary Higgs b , Jenny Harding c and Mitchel Langford b a Faculty of Computing, Engineering and Science, University of South Wales, Pontypridd CF37 1DL b GIS Research Centre, Wales Institute of Socio-Economic Research, Data and Methods (WISERD), University of South Wales, Pontypridd CF37 1DL c Ordnance Survey, Explorer House, Adanac Drive, Southampton SO16 0AS * Corresponding author. E-mail [email protected] 1

Post on 06-Jun-2020

Investigating geospatial data usability from a health geography perspective

using sensitivity analysis: the example of potential accessibility to primary

healthcare

Robin Frew a*, Gary Higgs b, Jenny Harding c and Mitchel Langford b

a Faculty of Computing, Engineering and Science, University of South Wales, Pontypridd CF37 1DL

b GIS Research Centre, Wales Institute of Socio-Economic Research, Data and Methods (WISERD), University of South Wales, Pontypridd CF37 1DL

c Ordnance Survey, Explorer House, Adanac Drive, Southampton SO16 0AS

* Corresponding author. E-mail [email protected]

Abstract

Network distance and travel times are two popular methods of measuring potential

geographic accessibility and networks are also used in gravity model-based approaches such

as floating catchment area (FCA) techniques. Although some research has been conducted to

assess the effectiveness of the representation of demand- (population) or supply-

(destinations) side characteristics within such models, there have been few attempts to assess

the implications of using alternative sources of network data. This study employs a

sensitivity analysis approach to assess accessibility to GP surgeries in south Wales using

proprietary and open sources of network data. Results suggest that there are significant

differences between access scores derived from the use of networks which purport to portray

the same features. Furthermore, the pattern of differences varies between urban and rural

areas. Case studies are used to show that the actual representation of network-based features,

often overlooked in previous research, can have important implications for the findings from

such studies. We conclude by suggesting that the use of sensitivity analysis to assess

geospatial data usability has a wider relevance for studies that involve the use of a range of

GIS-based techniques in different application areas.

Keywords

Accessibility analysis; floating catchment area (FCA) methodologies; alternative sources of

network data; data usability; sensitivity analysis.

1. Introduction

This paper argues for the need to examine the usability of geospatial data sources when

applied to ‘typical’ GIS-based analytical tasks, such as those undertaken in health geography

studies. Our primary focus concerns the use of network data sets in examining spatial

variations in potential accessibility to General Practitioner (GP) surgeries. By taking the

novel approach of applying sensitivity analysis and comparing results obtained from using

different sources of spatial data within models typically used to assess geographical variation

in access to health facilities, assessments can be made on the usability, or appropriateness, of

such data sets in context. The number and variety of sources of spatial data have increased in

recent years leading to wider debates regarding the quality and usability of such data,

particularly in the light of the increased availability of Free and Open Source (FOS) or free-

to-use data including volunteered geographic information (VGI) for GIS modelling

applications (Goodchild and Li, 2012; Haklay, 2010a; Senaratne et al., 2016).

Data from national mapping agencies, typically well-documented and assumed to be of the

highest quality available, is often expensive, which makes the option of cost-free data sets

tempting for many users, particularly those working in the public and third sectors in periods

of austerity and reduced IT financial budgets. Concerns over VGI data quality and trust may

be higher than those relating to proprietary GI (Goodchild, 2007), but advantages of other

usability aspects such as currency and cost have meant that crowd-sourced and VGI

geospatial data are increasingly used in GIS studies. This has led to a number of recent

studies comparing usability issues of VGI products, such as OpenStreetMap (OSM), with

those of ‘official’ sources of digital data in GIS applications (e.g. Brovelli et al., 2016; Du et

al., 2016).

The concept of geospatial data usability is closely related to that of data quality, with Cai and

Zhu (2015) amongst others actually classifying usability as a data quality element. However,

with the user experience and the context of the use recognised as key, both in the publication

of International Standards (ISO 9241-210, 2010) and the academic literature (see for example

Haklay (2010b) and Brown et al. (2012)), a wider range of usability factors beyond those of a

data-centric view of data quality is increasingly recognised (Figure 1). Drawing on ISO

9241-210 (2010), geospatial data usability can be defined as the extent to which geospatial data

can be used to achieve specified goals with effectiveness, efficiency and satisfaction, in a

specified context of use. These three key characteristics of usability (effectiveness, efficiency

and satisfaction) can be split into many component elements, all of which contribute to the

usability of the data (as shown in Figure 1), with the importance of each element varying

according to the particular context and task, and with the potential to be grouped and

classified in several different ways, again dependent on the particular context.

[FIGURE 1 INSERTED ABOUT HERE]

Previous usability studies involving geospatial data have tended to involve a battery of

techniques including: timing how long a task takes to complete, assessing how well a task is

completed, collating and assessing the resources needed to complete a task, and gauging user

satisfaction compared to expectations (Harding and Pickering, 2007). Much of this is a

subjective, qualitative process involving time-consuming interviews and questionnaires

(Harding, 2012). This study takes a quantitative approach to address one aspect of usability:

namely effectiveness. The characteristics of each dataset all contribute differently to this

aspect. By conducting sensitivity analysis on the different permutations of spatial data for the

travel network representation in accessibility models, variations in results are highlighted in

order to draw attention to the advantages and limitations of such data sources in this context.

This objective approach involves examining how uncertainty in the output of a system

(numerical or otherwise) can be apportioned to different sources of uncertainty in its inputs.

The most significant sources of uncertainty can then become the focus for further research

(European Commission, 2015). Though rarely applied in geographical contexts, sensitivity

analysis is regularly used in the financial industry, in business planning and in the fields of

medicine and health (Czitrom, 1999).

This study draws on the findings from a study of spatial variations in potential accessibility to

primary health care facilities. There is a considerable literature on different approaches to

measuring accessibility, especially in health studies where the accessibility of a population to

a variety of medical facilities has come under considerable scrutiny (Higgs, 2004).

Traditionally, such approaches have included relatively straightforward container and

coverage methods which produce easily understood results from simple calculations but may

be less appropriate at certain spatial scales or for smaller geographical areas. More recently,

studies such as those of Burkey (2012), Delamater (2013) and Fransen et al. (2015) have

drawn attention to the potential of more sophisticated tools for measuring accessibility using

gravity-based approaches which incorporate sources of public transport data and networks

(Biba et al, 2010; Mao and Nekorchuk, 2013; Langford et al, 2016).

Despite the relative plethora of studies investigating the application of these techniques, the

use of health-based accessibility analysis to assess the usability of GI data is much less

common. In particular, whilst previous studies such as Phibbs and Luft (1995), Bertazzon and

Olson (2008), Apparicio et al. (2008) and Boscoe et al. (2013) have compared various

distance-measurement methods such as Euclidean, Manhattan and true network distance, few

have examined the implications of using different sources of network-based data. In one of

the few examples to date, Jones (2010) used sensitivity analysis to compare walking times to

medical facilities in the West Midlands using networks based on three Ordnance Survey (OS)

products (OS MasterMap® Integrated Transport Network™ Layer, Meridian® 2, and OS

VectorMap® District) together with OpenStreetMap (OSM). Differences of up to 4% in

populations within walkzones (equivalent to 40,000 people) were identified between

networks, and these differences were investigated in order to identify causes. The investigation found that

some routes were omitted from some products due to generalisation, while features such as

pedestrian bridges and footpaths were often only mapped in OSM.

The present study extends the research conducted by Jones by including alternative sources

of purpose-built network representations in an analysis of accessibility to primary health

services in South Wales. Each data source is considered in the role of deriving Closest

Distance measures as well as a more sophisticated measure of accessibility based on

enhanced two step floating catchment area (E2SFCA) techniques (Luo and Qi, 2009). Figure

2 illustrates the sensitivity analysis process adopted, wherein the use of multiple iterations

highlighted anomalies, the underlying causes of which were then investigated. Effectively,

sensitivity analysis was used to ‘stress test’ the geospatial data to derive a quantitative

assessment of its usability for the task rather than basing such an assessment on, for example,

the quality of the data alone, and this provided an objective method of comparing datasets.

[FIGURE TWO INSERTED ABOUT HERE]

The E2SFCA method is a form of gravity model which incorporates levels of supply and

demand as additional accessibility factors (Wang and Luo, 2005). It has been extensively

studied and modified from its original derivation in the early 2000s (Luo and Wang, 2003).

The E2SFCA method measures provider-to-population ratios within a user-defined distance

threshold of each supply location, then sums these ratios for all supply locations found within

the distance threshold of each demand centre, giving an accessibility measure for each

demand centre. By taking a case study approach using Closest Distance and E2SFCA, we

draw on the results of such models of potential geographic accessibility to GP surgeries in

two local authority areas within south Wales to highlight the potential implications of

including different sources of network representation. Every theoretical approach to

measuring accessibility has its own advantages and disadvantages, and in the absence of

detailed data on service utilisation, each tries to represent the real-world experiences of those

accessing such services or features without going through an expensive and time-consuming

series of in-depth surveys and observations or long-term travel diary exercises.
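
For readers unfamiliar with the technique, the two steps described above can be written compactly for the simplest case of a binary catchment (a facility is either inside or outside the threshold distance, with no distance weighting). The notation is introduced here for illustration only and is not taken from the cited papers: S_j is the supply at facility j (for example, the number of GPs), P_k the demand at population centre k, d_kj the network distance between them, and d_0 the threshold distance.

```latex
% Step 1: supply-to-demand ratio at each facility j
R_j = \frac{S_j}{\sum_{k \,:\, d_{kj} \le d_0} P_k}

% Step 2: accessibility score at each demand centre i
A_i = \sum_{j \,:\, d_{ij} \le d_0} R_j
```

In the enhanced form (Luo and Qi, 2009) the sums are additionally weighted by distance zone; with all weights set to one the expressions reduce to those shown.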

This paper does not revisit the strengths and limitations of alternative methods of measuring

accessibility. Instead the focus is on the use of alternative sources of network-based features

using two techniques commonly used within the health geography literature, addressing the

extent to which the choice of data source affects the results. The remainder of this paper is

structured as follows: in section two the primary data sources, study area and methodological

approach are described in more detail. The results obtained from the approaches taken are

reported in section three of the paper. In order to try to explain such trends, a case study

approach is adopted to explore how different scenarios impacting on accessibility ‘on-the-

ground’ can be used to begin to understand the patterns that arise from using different

network representations within such models. In section four, the implications of such

scenarios for those charged with understanding patterns of access to health facilities are

outlined before the policy implications of such studies are re-iterated in the conclusion.

2. Methods

2.1 Study Area and Datasets

Two neighbouring south Wales local authority areas were chosen for this study: the City and

County of Cardiff; and the Vale of Glamorgan County (Figure 3). Cardiff is the largest city

in Wales with a population of 346,090 in the Unitary Area at the time of the 2011 census.

The Vale has a population total just over a third of Cardiff’s (126,336), has just one main

urban area in the port town of Barry, and covers over twice the area (340 km² against

150 km²). The study area thus contains various landscapes from inner city, to suburban,

through to rural, with which to compare findings from applying alternative network data sets

in E2SFCA models.

Four network datasets were used in the comparative analysis, selected to represent a range of

detail and perceived quality in terms of pedestrian accessibility: three came from the UK

national mapping agency, Ordnance Survey (OS), while OpenStreetMap (OSM) was the one

FOS dataset adopted. Two of the OS datasets are commercial products but are available free

of charge to public bodies via the OS Public Sector Mapping Agreement: OS MasterMap

Integrated Transport Network Layer (ITN), and ITN with Urban Paths (UP). The third OS

dataset was the Open Data product OS Open Roads (OR). ITN is Britain’s most complete

road network (Ordnance Survey, 2016) and is frequently used for measuring accessibility in

GB by drive time/distance and walking. The UP dataset links with ITN to include pedestrian

routes in towns but has not been widely used in accessibility studies to date. Open Roads is a

simplified network derived from the same data source as that of ITN. OSM’s website

(openstreetmap.org/) allows users to contribute to a map anywhere in the world, allowing

crowd-sourcing and unrestricted editing by users on the basis that multiple contributions

eventually achieve a correct outcome. OSM enables new roads or paths to be mapped using

the local population as citizen surveyors, potentially before a full ‘official’ ground or aerial

survey takes place, and the inclusion of paths as well as roads gives OSM the opportunity to

provide better coverage of pedestrian routes than proprietary datasets. In this study the

January 2014 versions of the ITN and UP network data were compared to OSM network data

obtained from a third-party website, Mapzen (metro.teczno.com), with a data currency date of

21 Dec 2013. Open Roads was launched in March 2015, with data currency dating from its launch.

[FIGURE THREE INSERTED ABOUT HERE]

The source of data for the features of interest (GP surgeries) was Points of Interest (PoI), a

location-based directory of business, transport, health, education and leisure services in

Britain created and maintained by PointX (Ordnance Survey, 2015) which is available free-

of-cost for academic use via Digimap (http://digimap.edina.ac.uk/). Esri ArcMap 10.2 and

ArcGIS Network Analyst extension were used for the GIS processes. The locations of General

Practitioner (GP) surgeries were extracted using ArcGIS for the study area itself plus an 8 km buffer

area to account for potential cross-border travel and to minimise edge effects as suggested by

previous research findings (e.g. Ngui and Apparicio, 2011, comparing E2SFCA scores and

distances to medical clinics in Montreal). The resulting distribution of features is shown in

Figure 4. At the time of the study, the Vale had 21 surgeries located within its boundary and

103 including the buffer; Cardiff had 63 within its boundary and 114 within the buffered area.

Supply-level figures required for E2SFCA calculations were the number of GPs located at

each practice, with these figures obtained from the Welsh Government (WG) in December

2014 (http://wales.gov.uk/statistics-and-research/general-medical-practitioners/?lang=en). It

was acknowledged that some branch surgeries may offer a lesser level of accessibility than

that indicated through the use of GP numbers, as some GPs share their time between part-

time branch surgeries. The likelihood therefore is that levels of accessibility are artificially

increased for some service points in the study area.

[FIGURE FOUR INSERTED ABOUT HERE]

Census Output Areas (OAs) were chosen as the unit of population representation with

population totals recorded by the 2011 Census of Population; OAs are the smallest unit of

census aggregation in the UK, with an average population of approximately 300, and form

the building blocks for other spatial units. OAs are designed to have similar population sizes

and to be as socially homogenous as possible, according to tenure of household and dwelling

type (Office for National Statistics, 2011). Although Lloyd (2016) recommended OA-level

analysis for population studies, the limitations of OAs were also noted: very large OAs may

be a poor representation of what should be a continuous population surface. The population

of each OA polygon was represented by a population-weighted centroid, of which there were

1077 in Cardiff and 412 in the Vale. GIS-compatible OA polygons and centroids, along with

details of each OA’s usual night-time resident population (URP) at the census date of 27

March 2011, were available from the UK Data Service Census Support webpages of the

Edina website (http://census.edina.ac.uk/). The relevant demand figure for GP surgery usage

(as required for E2SFCA calculations) was obtained from Welsh Government statistics

(Welsh Government, 2013), which indicated that 17% of the population of Cardiff had made

recent use of GP services (i.e. within 2 weeks of the survey date), as had 19% of the population

of the Vale of Glamorgan. The appropriate proportion of the URP of each polygon was

therefore calculated by applying the GP usage figures for each unitary authority area to its

constituent OAs (an acknowledged limitation in the absence of detailed GP surgery utilisation

data at small area levels).
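
The demand calculation described above amounts to scaling each OA's usual resident population by the usage rate of its parent authority. A minimal sketch follows; the two rates are those quoted in the text, while the OA codes and population counts are hypothetical placeholders rather than census values.

```python
# Authority-wide GP usage rates quoted in the text (Welsh Government, 2013).
GP_USAGE_RATE = {"Cardiff": 0.17, "Vale of Glamorgan": 0.19}


def expected_demand(usual_resident_population, authority):
    """Scale an OA's usual resident population by its authority-wide usage rate."""
    return usual_resident_population * GP_USAGE_RATE[authority]


# Hypothetical OAs: (code, authority, usual resident population).
output_areas = [
    ("OA_A", "Cardiff", 300),
    ("OA_B", "Vale of Glamorgan", 300),
]

# Demand value attached to each population-weighted centroid.
demand = {code: expected_demand(pop, la) for code, la, pop in output_areas}
```

Because a single rate is applied to every OA in an authority, any within-authority variation in GP usage is not captured, which is the limitation acknowledged above.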

2.2 Closest Distance and Enhanced Two-Step Floating Catchment Area (E2SFCA) analysis

Distances were calculated with the ArcGIS™ Origin-Destination (OD) Cost Matrix tool in

the Network Analyst extension, with OA centroids as origins and GP surgeries as

destinations. One of the main advantages of using distance to measure accessibility is that the

results, in absolute units, are easily understood by researchers and policy makers (Talen and

Anselin, 1998). For this measure it was assumed the population would use their nearest

destination feature, a typical assumption in this type of study, though difficult to confirm

without intensive and expensive study of actual GP patient travel behaviours. The E2SFCA

technique was also used to compare with the results from the distance measures. Floating

catchment area models incorporate the influence of supply capacity and demand population

levels within a catchment area around the points of population representation and the

destination features. E2SFCA calculations were made using a bespoke plug-in to ArcGIS™

developed by Langford et al. (2014) and provided by the authors. A simple, binary approach

was taken regarding the catchment areas: a facility was either within the threshold distance of

a demand centre and was therefore a potential destination; or it was outside the threshold

distance and therefore not a potential destination. It was acknowledged that use of a

distance decay element could reflect a more realistic view of the real-life situation, allowing a

closer facility to exert more of a ‘pull’ on a given population than a more distant one, but no

empirical studies have been carried out to validate suitable distance-decay models for specific

services.
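
The binary catchment approach described above can be sketched in a few lines of plain Python. This is an illustrative reimplementation under stated assumptions, not the ArcGIS plug-in used in the study: the function name, the 1,500 m threshold and all input values are hypothetical, and the distance matrix stands in for the output of an OD cost matrix calculation.

```python
def two_step_fca(distance, supply, demand, threshold):
    """Binary-catchment two-step floating catchment area scores.

    distance[i][j] is the network distance from demand centre i to facility j.
    """
    n_demand, n_supply = len(demand), len(supply)

    # Step 1: supply-to-demand ratio for each facility, counting only the
    # demand centres that lie within the threshold distance of it.
    ratios = []
    for j in range(n_supply):
        reachable = sum(
            demand[i] for i in range(n_demand) if distance[i][j] <= threshold
        )
        ratios.append(supply[j] / reachable if reachable > 0 else 0.0)

    # Step 2: for each demand centre, sum the ratios of all facilities
    # within the threshold distance.
    return [
        sum(ratios[j] for j in range(n_supply) if distance[i][j] <= threshold)
        for i in range(n_demand)
    ]


# Hypothetical inputs: three demand centres, two surgeries, distances in metres.
distance = [[500, 3000], [1200, 900], [4000, 700]]
supply = [2, 3]          # GPs per surgery
demand = [50, 60, 40]    # expected users per demand centre
scores = two_step_fca(distance, supply, demand, threshold=1500)
```

Swapping in a distance matrix derived from a different network while holding supply, demand and threshold fixed is the kind of single-input change that the sensitivity analysis compares.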

2.3 Statistical analysis

Statistical analysis was conducted on the results from all distance and E2SFCA calculations

in order to assess levels of similarity and difference. The statistical distribution of the metric

values was always highly skewed and did not conform to a normal distribution. As all results

arose from the repeated retesting of the same datasets, the most appropriate statistical test for

similarity was Spearman’s rank correlation coefficient, and for levels of difference the

Friedman and Wilcoxon tests were used. Spearman’s test has been used in previous

studies involving accessibility and proximity, including those of Burgoine et al. (2013) who

investigated proximity to food outlets as part of a study into obesity, and that of Ngui and

Apparicio (2011). Friedman tests (the non-parametric alternative to one-way ANOVA with repeated measures) were

used on sets of results to assess whether there were any differences within the scores. If the

results from the Friedman tests were significant, indicating the existence of differences

somewhere within the set of results, then Wilcoxon signed-rank tests (the non-parametric

alternative to the paired-samples t-test) were used on a pair-by-pair basis to

identify the specific differences between each and every paired set of results. All statistical

analysis was carried out using IBM SPSS Statistics software. Destination Overlap was used as an

additional measure alongside Closest Distance and E2SFCA.

This metric showed the extent to which any change in network or population representation altered

the identity of the nearest destination feature. Whereas Closest Distance calculates the

actual distance to the nearest destination feature, Destination Overlap compares the identity

of the closest destination across the different networks. For example, if the same specific

destinations were identified using all four networks then the Destination Overlap would be

100% for each paired comparison.
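
Destination Overlap as described above reduces to a simple pairwise comparison of nearest-destination assignments. The sketch below is one way it might be computed; the function name, the OA and surgery identifiers, and the assignments are all hypothetical.

```python
def destination_overlap(nearest_a, nearest_b):
    """Percentage of origins assigned the same nearest destination by two networks.

    Each argument maps an origin identifier to the identifier of its nearest
    destination under one network dataset.
    """
    shared_origins = nearest_a.keys() & nearest_b.keys()
    if not shared_origins:
        raise ValueError("the two results have no origins in common")
    same = sum(1 for o in shared_origins if nearest_a[o] == nearest_b[o])
    return 100.0 * same / len(shared_origins)


# Hypothetical nearest-surgery assignments under two networks.
nearest_net1 = {"OA1": "GP_A", "OA2": "GP_B", "OA3": "GP_B", "OA4": "GP_C"}
nearest_net2 = {"OA1": "GP_A", "OA2": "GP_B", "OA3": "GP_C", "OA4": "GP_C"}

overlap = destination_overlap(nearest_net1, nearest_net2)
```

In this example three of the four origins keep the same nearest surgery, so the overlap is 75%. Two networks can yield very similar distances while still disagreeing on which surgery is nearest, which is why the measure is reported separately from Closest Distance.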

3. Results

3.1 Spearman’s Rank Correlation and Wilcoxon tests

A comparison of distances from population weighted OA centroids to their nearest GP

surgery location highlighted potential differences arising from the use of alternative network

data sets. Spearman’s Rank Correlation coefficients were significant at the 0.01 level for

both authorities, with those of the Vale of Glamorgan even higher. To explore further the

levels of similarity, the Friedman statistical test of difference was employed. Despite the

high correlations, Friedman tests implied some differences existed within the data for both

counties. Thus, Wilcoxon paired comparisons were carried out on all combinations for

Cardiff and the Vale in an attempt to identify which comparisons exhibited differences. The

results of the Spearman and Wilcoxon tests are shown in Tables 1 to 4. Results from this

statistical analysis suggest that not all comparisons from the Wilcoxon tests for Cardiff were

significant at the < .001 level, with the comparison between ITN and OSM significant at the

0.05 level. All results for the Vale, however, were significant at the < .001 level, indicating a

statistically highly significant difference between results when using different combinations

of networks, suggesting no two network datasets would be interchangeable in this context.
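
It may seem counter-intuitive that results can be both highly correlated and significantly different. The toy example below, using hypothetical distances, shows how this arises: if one network returns consistently longer distances than another, the rank order of origins is unchanged (a Spearman coefficient of 1) while every paired difference has the same sign, which is the pattern a paired test such as the Wilcoxon signed-rank test detects. The simple rank function assumes no tied values, and the code is not a substitute for the SPSS tests reported here.

```python
def ranks(values):
    """Rank values from 1 to n (assumes no ties, for brevity)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rank correlation for untied data."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical closest distances (metres) for five origins on two networks;
# the second network is uniformly 10% longer.
network_a = [400.0, 650.0, 900.0, 1200.0, 2100.0]
network_b = [d * 1.1 for d in network_a]

rho = spearman_rho(network_a, network_b)
differences = [b - a for a, b in zip(network_a, network_b)]
all_longer = all(d > 0 for d in differences)
```

Here `rho` is 1.0 and `all_longer` is true: identical rankings, yet a systematic difference in every pair.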

[TABLES ONE, TWO, THREE AND FOUR INSERTED ABOUT HERE]

The results derived from E2SFCA scores are presented in Tables 5 to 8. Again, OA-level

accessibility scores in the Vale of Glamorgan were more highly correlated than

those of Cardiff, with all correlations in both areas significant at the 0.01 level. The high

correlations for the Vale were also reflected in Wilcoxon paired comparison test results. All

four network combinations returned Z scores that were not significant at the 5% level,

indicating the lack of statistically significant differences between these combinations. All

comparisons from Cardiff were statistically significant at the < .001 level. Using E2SFCA

rather than distance as the measure of accessibility therefore resulted in greater

similarity in results, as shown by the greater number of non-significant Wilcoxon

comparisons for E2SFCA.

[TABLES FIVE, SIX, SEVEN AND EIGHT INSERTED ABOUT HERE]

3.2 Destination overlaps

Destination overlaps highlighted how these statistical results affected actual outcomes

involving the identification of the same destination depending on the combination of

networks used and are reported in Tables 9 and 10 for Cardiff and Vale GP surgeries,

respectively. The destination overlaps for the Vale are, in general, higher than those of

Cardiff. However, it is also notable that none of the datasets used produced identical results

to any other, the highest levels being 97.3% for Cardiff (ITN versus OR) and 98.8% for the

Vale (also ITN versus OR).

[TABLES NINE AND TEN INSERTED ABOUT HERE]

3.3 Summary of results

Table 11 provides a summary of results for the distance measures, illustrating the range

within the results obtained using each of the network datasets. These provide more

information as to the levels of similarity and difference within results, which will be

discussed in the next section. The results of the Closest Distance and E2SFCA models of

accessibility were all significantly correlated; however, tests of difference indicated that

significant differences also existed in the results of Closest Distance. Differences in E2SFCA

results for the Vale were not significant, indicating a greater level of similarity between

outcomes. Even with these strong correlations and indications of similarity, none of the

compared network datasets produced an identical pattern of origin-destination journeys.

[TABLE ELEVEN INSERTED ABOUT HERE]

3.4 Case study investigations

Several case studies were conducted in order to illustrate the effects of different network

representations, showing where effects are considerable but also scenarios where differences

are less marked. Walkzones were created using ArcGIS Service Areas representing walking

times of 5, 10, 15, 20 and 30 minutes at a steady pace of 2.6 km/h, which represents the typical

pace of an infirm walker or that of a parent with small children (Road Research Laboratory,

1965). Walkzones provide visual indications of areas within defined limits, and in this study

are used to indicate differences or similarities between networks in different geographical

contexts. Figures 5, 6 and 7 show the walkzones for different contexts within the study areas

with graphical representation of the populations enclosed by the relevant walkzones provided

in Figure 8. The use of a Sensitivity Analysis approach enabled the identification of areas for

closer investigation in order to expand on the potential factors influencing results obtained in

the preceding accessibility analysis.
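
For reference, the walking times above translate into network distance cut-offs as follows. This is a simple unit conversion using the 2.6 km/h pace stated in the text; the function name is illustrative, and how the cut-offs were entered into the Service Area tool is not specified here.

```python
WALK_SPEED_KMH = 2.6  # pace stated in the text


def walk_distance_m(minutes, speed_kmh=WALK_SPEED_KMH):
    """Distance in metres covered in the given number of minutes."""
    metres_per_minute = speed_kmh * 1000.0 / 60.0
    return metres_per_minute * minutes


# Cut-off distance (rounded to the nearest metre) for each walkzone.
cutoffs = {m: round(walk_distance_m(m)) for m in (5, 10, 15, 20, 30)}
```

At this pace the 5, 10, 15, 20 and 30 minute walkzones correspond to roughly 217 m, 433 m, 650 m, 867 m and 1,300 m of network distance.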

3.5 Inner city case study

Figure 5 shows the walkzones around the surgeries in a densely-populated inner city area of

Cardiff. The patterns of walkzones are similar, and although there are differences present

between the network representations, most are not easily discernible to the naked eye. The

populations detailed in Figure 8(a) confirm there is little difference in the populations within

the various walkzones. The performance of OSM in this context indicates there is little

difference between it and the proprietary datasets in such urban contexts.

[FIGURE FIVE INSERTED ABOUT HERE]

3.6 Urban fringe case study

Figure 6 shows the situation in suburban areas on the fringes of Cardiff, with two GP

surgeries located in a large peripheral estate of predominantly social housing. Differences

between the networks are discernible, and the patterns of the walkzones here show

considerable variation, particularly those using the Urban Path dataset, Figure 6(b). These

differences are reflected in the population coverages reported in Figure 8(b), where the UP

network includes considerably more population than the others. The other three datasets

perform similarly up to the 30-minute walkzone, at which point the boundaries of the OSM

zones fall short of several OA centroids, which were included in the zones of the other

networks. As an example, one centroid with a high population (of over 900) was omitted

from the OSM zones because one road is present in the other three network datasets but not in

OSM. The location of this road is indicated at ‘A’ in Figure 6. As OSM aims to map

footpaths and cycle paths in addition to roads, in theory OSM results should be closer to

those of UP than ITN and OR, although this is not the case in this location. As in similar

areas, differences between ITN and UP may reduce as distances increase, particularly where

physical barriers to travel affect both roads and footpaths. In several of the suburban areas

around the study areas, the architecture of the road network exerted a considerable influence

on results, especially with Urban Paths. Road layouts involving crescents and cul-de-sacs

resulted in Closest Distance results being considerably reduced when footpaths linking the

‘closed’ ends of roads are included in the network. Relatively modern housing developments

in the study areas frequently featured such road layouts, so the inclusion of Urban

Paths resulted in lower Closest Distance figures and greater walkzone coverage.
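
The effect of footpath links on Closest Distance can be illustrated with a minimal sketch. The node names, edge lengths and the shortest_distance helper below are hypothetical and are not taken from any of the study datasets; the sketch simply applies Dijkstra's algorithm to a toy cul-de-sac layout with and without a linking footpath.

```python
import heapq


def shortest_distance(edges, source, target):
    """Return the shortest path length (metres) over an undirected edge list."""
    graph = {}
    for a, b, length in edges:
        graph.setdefault(a, []).append((b, length))
        graph.setdefault(b, []).append((a, length))

    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, length in graph.get(node, []):
            candidate = dist + length
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return float("inf")  # target not reachable


# Hypothetical layout: a home at the closed end of a cul-de-sac and a
# surgery on a neighbouring road. All lengths are invented.
roads = [
    ("home", "junction_1", 300.0),
    ("junction_1", "junction_2", 400.0),
    ("junction_2", "surgery", 300.0),
]
footpath = [("home", "surgery", 150.0)]  # link between the 'closed' ends

roads_only = shortest_distance(roads, "home", "surgery")
with_paths = shortest_distance(roads + footpath, "home", "surgery")
```

With these invented lengths the road-only route is 1000 m, whereas including the footpath reduces it to 150 m, which is the kind of reduction described above for crescent and cul-de-sac layouts.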

[FIGURE SIX INSERTED ABOUT HERE]

3.7 Rural case study

Figure 7 illustrates the pattern around two GP surgeries in a village setting in the more rural

Vale of Glamorgan. Although results from ITN and Open Roads appear similar, the UP and

OSM networks produced appreciably different results. In population terms (Figure 9(c)) UP

has considerably more coverage, with the other three datasets having similar results.

Although all UP zones are larger than the others, one example from the 30-minute walkzone

is provided, where a series of steps and paths down a steep hill in UP connects two relatively-

recent housing developments, pathways which do not feature in the other networks (with the

area noted at ‘B’ in Figure 7, and photographed in Figure 8).

[FIGURES SEVEN, EIGHT AND NINE INSERTED ABOUT HERE]

The graphs in Figure 9 illustrate the differences, or otherwise, in population coverage due to

network differences. In the rural context, for example, the results for OSM are almost

identical to those of ITN and Open Roads. The UP walkzone covers a much larger

population, despite OSM purporting to map more types of footpaths. The difference in

population covered by the 30-minute walkzone is over 1000 people, with the OSM

population figure 32% lower than UP, a considerable difference in the context of a small,

rural town. Even in this context the OSM figures were identical to those of ITN. The urban

fringe case study produced greater differences at the 30-minute level with OSM reaching

40% of the population covered by UP. In this context the 30-minute OSM area was over

20% below that of ITN, indicating the relatively poor coverage of OSM in this suburban area.
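
The population comparisons reported above can be expressed as a simple calculation. The following sketch assumes that each OA centroid carries a population and a network walk time to the nearest surgery; the centroid values and function names are illustrative assumptions and are not figures from the study.

```python
def walkzone_population(centroids, limit_minutes):
    """Sum the population of centroids whose walk time is within the limit."""
    return sum(pop for pop, minutes in centroids if minutes <= limit_minutes)


def percent_lower(value, reference):
    """Percentage by which value falls below reference."""
    return 100.0 * (reference - value) / reference


# Hypothetical (population, walk time in minutes) pairs for the same three
# centroids measured over two different networks.
up_centroids = [(900, 28), (400, 12), (350, 22)]
osm_centroids = [(900, 34), (400, 13), (350, 34)]

up_pop = walkzone_population(up_centroids, 30)
osm_pop = walkzone_population(osm_centroids, 30)
shortfall = percent_lower(osm_pop, up_pop)
```

In this invented case a single missing link pushes two centroids beyond the 30-minute limit in one network, so the enclosed population changes abruptly rather than gradually; this is why individual omissions can have a large effect on walkzone totals.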

4. Discussion

Findings from the study of access to GP surgeries suggest that varying the network dataset in

the context of geographical accessibility has a statistically significant effect on the results of

both distance and FCA-type accessibility models as well as influencing the choice or

identification of the nearest feature. The choice of accessibility model also affected results,

with patterns arising from the gravity model different from those of the distance measure.

The urban/rural context also affected results, with those from the rural Vale generally having

less marked differences than those in urban Cardiff. The examples provided in Section 3

highlight underlying issues with the network datasets, where not all networks perform the

same, and performance differs according to context and area. Detailed investigations of the

reasons for such trends in terms of the ways in which networks are presented by the different

sources used in this study enable a comparison of the advantages and limitations of each data

set (Table 12).

It was apparent through the examination of individual results and identification of underlying

issues that none of the network datasets used in this study adequately and accurately

represented pedestrian journeys. ITN, generally seen as the definitive proprietary network

dataset and treated as such in many studies, including those measuring pedestrian

accessibility (Jones, 2010), was not designed for pedestrian travel and only included lengths

traversable by motor vehicles. Urban Paths (UP) included footpaths in cities, towns and large

villages, but not in truly rural areas or footpaths which were not ‘permanent’ (and therefore

excluded tracks and rights of way across open spaces and through woodlands, etc.). Open

Roads, derived from the same source data as ITN, was also aimed at vehicular travel. OSM

coverage in areas outside the centre of Cardiff, particularly in suburban and urban fringes,

was poor, and many features which OSM purports to include in its maps, such as footpaths

and cycleways, were simply not recorded (or at least not yet).

Previously-expressed concerns regarding OSM urban drop-off (Haklay, 2010a; Zielstra and

Zipf, 2010) and issues relating to completeness and lack of thematic attributes (Maue and

Schade, 2008) are largely confirmed by the findings of this study. Both quality and coverage

of OSM data in rural and urban fringe settings result in poor usability, whereas the purpose of

the datasets and data-gathering policy serve to render ITN and UP less usable in pedestrian

accessibility contexts. These results suggest that ITN was not sufficiently usable in the

context of pedestrian accessibility largely because it was not designed to fulfil such a

purpose. Urban Paths, despite being designed to represent pedestrian travel and increasing

the pedestrian accessibility to certain destination locations, does not map certain features such

as ‘informal’ paths which people often actually use, and therefore does not reflect actual

pedestrian behaviour. OSM, while potentially incorporating many of these informal

pathways, had issues of quality, coverage and trust which affected its usability in pedestrian

accessibility contexts.

[TABLE TWELVE INSERTED ABOUT HERE]

Given the aims of both ITN-UP and OSM to map footpath networks, there was a potential for

results from these sources to be very similar. However, as illustrated by the low Destination

Overlap results this was not the case here. This indicates some of the main issues with both

these data sets: OSM coverage levels in rural areas (including towns within the Vale) and in

the suburbs of Cardiff are low, with many pedestrian features missing, resulting in routes

using UP being able to access destinations by a shorter route, and the differences involved

also resulting in alternative locations being identified as the nearest. The use of only one tool

to assess the differences in accessibility may indicate high levels of similarity (with high

correlations and non-significant differences being identified); however, by using several

approaches the differences between the datasets are highlighted, illustrating the potential

dangers of choosing (or simply using) any one dataset without regard to context.
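
As an illustration of how a destination overlap figure could be derived, the sketch below assumes that overlap is the percentage of origins (for example OA centroids) for which two networks identify the same nearest surgery. This definition, the function name and the sample identifiers are assumptions made for the example only.

```python
def destination_overlap(nearest_a, nearest_b):
    """Percentage of shared origins assigned the same nearest destination.

    nearest_a, nearest_b: dicts mapping origin id -> nearest destination id,
    one dict per network dataset.
    """
    shared = nearest_a.keys() & nearest_b.keys()
    if not shared:
        raise ValueError("the two result sets have no origins in common")
    same = sum(1 for origin in shared if nearest_a[origin] == nearest_b[origin])
    return 100.0 * same / len(shared)


# Hypothetical nearest-surgery assignments from two networks.
network_one = {"oa1": "gp_a", "oa2": "gp_a", "oa3": "gp_b", "oa4": "gp_c"}
network_two = {"oa1": "gp_a", "oa2": "gp_b", "oa3": "gp_b", "oa4": "gp_c"}

overlap = destination_overlap(network_one, network_two)
```

A measure of this kind complements correlation and difference tests, because two networks can return closely correlated distances while still assigning some origins to different surgeries.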

The greater distances involved in the Vale of Glamorgan and the smaller number of potential

destinations meant that correlations of results and estimates of difference showed a greater

level of similarity compared to the results from Cardiff. This was particularly marked with

E2SFCA, where the results indicated that all networks returned statistically similar results in

the Vale, but not for Cardiff. This urban-rural divide illustrates the dangers of assuming

results from one area can be applied to others with dissimilar population and distribution of

features. This study highlights the influence network choice had on the outcomes of typical

accessibility measures obtained using GIS processes: different network datasets (ITN, UP or

OSM) produce different results. Different methods of measuring accessibility (nearest

distance or E2SFCA) also produce different results. These differences are not only

statistically significant, they also result in the identification of different destinations, as

shown by destination overlap results. This study also illustrates the benefits of using a

sensitivity analysis approach to identify specific causes of variations in network output.

Researchers should therefore think carefully about their choice of datasets to be used in

accessibility studies to ensure they are most appropriate for the context of use, as the impact

of such choices has not been widely examined in the health geography literature. Further

work is required to confirm if these results would apply equally in different contexts: whether

geographically (to ascertain if the results reported here are replicated for other areas, and

whether similar differences are identified in larger cities and more rural locations); for

different destination features (for example in assessing accessibility to hospitals or other

types of health facilities/services); or for a different mode of active travel, such as cycling.

With regard to accessibility issues, several instances were identified (similar to those relating

to pedestrian travel) where barriers were mapped which were traversable in the real world,

and where actual barriers to travel (such as flights of steps) were not identified as such.
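
For readers unfamiliar with the E2SFCA measure referred to above, the following is a minimal sketch of its two steps in the general form described by Luo and Qi (2009). The stepped weights, travel times, populations and supply values are invented for illustration and are not the parameters used in this study.

```python
def e2sfca(populations, supplies, travel_minutes, zone_weights):
    """Two-step floating catchment scores with stepped distance-decay weights.

    populations:    {origin: population}
    supplies:       {site: capacity, e.g. number of GPs}
    travel_minutes: {(origin, site): network travel time in minutes}
    zone_weights:   [(upper_limit_minutes, weight), ...] in ascending order
    """
    def weight(minutes):
        for upper, w in zone_weights:
            if minutes <= upper:
                return w
        return 0.0  # outside the outermost catchment zone

    def time(origin, site):
        return travel_minutes.get((origin, site), float("inf"))

    # Step 1: supply-to-demand ratio for each site, with demand weighted
    # by the travel-time zone of each origin.
    ratios = {}
    for site, capacity in supplies.items():
        demand = sum(pop * weight(time(origin, site))
                     for origin, pop in populations.items())
        ratios[site] = capacity / demand if demand > 0 else 0.0

    # Step 2: for each origin, sum the weighted ratios of reachable sites.
    return {
        origin: sum(ratio * weight(time(origin, site))
                    for site, ratio in ratios.items())
        for origin in populations
    }


# Hypothetical inputs: two centroids, one surgery, three catchment zones.
populations = {"oa1": 1000, "oa2": 500}
supplies = {"gp1": 3.0}
travel_minutes = {("oa1", "gp1"): 5.0, ("oa2", "gp1"): 25.0}
zone_weights = [(10.0, 1.0), (20.0, 0.5), (30.0, 0.25)]

scores = e2sfca(populations, supplies, travel_minutes, zone_weights)
```

Because the travel times enter both steps, substituting one network dataset for another changes the zone in which a centroid falls and therefore alters both the supply ratios and the final scores.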

5. Conclusions

This study has made a number of contributions to the existing literature base: firstly, the case

is made for consideration of usability of geospatial data beyond that of data quality.

Secondly, the use of sensitivity analysis in the context of geospatial data usability has been

shown to illuminate issues of similarity and difference, and also to identify specific issues of

data usability and quality which would have otherwise been missed, thus confirming

sensitivity analysis as an objective, quantitative addition to the techniques used in usability

assessments. The results of this research should encourage policy makers and health planners

to consider carefully the sources of the data used in planning the provision of services in

different types of urban (and rural) areas, to take account of the context of use, and to

choose the sources of data that are most appropriate to the study in question. Improving the

accuracy of accessibility will have particular impact with respect to issues of socio-economic

status and areas of deprivation. For example, this study highlighted that coverage,

completeness and quality of VGI network data in large, peripheral housing estates may be

poor, but that such poor quality may be hidden in the ‘noise’ of neighbouring urban areas, so

putting residents of such areas at a relative disadvantage in city-wide studies. Similar

investigations considering different types of destination would also confirm the wider

applicability of these results. Accessibility to schools, sports and leisure facilities, green

space, essential services or transport networks all have relevance to health, welfare, exclusion

or environmental justice agendas.

No attempt has been made in this paper to compare network coverage or accessibility to, for

example, indices of deprivation, in order to investigate deeper social issues of accessibility.

Nor has any attempt been made to look at non-geographic factors of accessibility, such as the

relative wealth or poverty of the populations in these areas. Potential geographic accessibility

was assessed rather than actual accessibility, which requires in-depth and lengthy

investigation into actual use levels of each type of feature, and into individual motivations for

choices made. A number of research questions follow on from the types of analyses

conducted here; for example, to what degree does the representation of demand affect

accessibility-type outcomes? Does the use of a finer, disaggregated population representation

also result in significant changes? What are the levels of OSM coverage in deeply rural

areas? All these questions have relevance to decision-makers who may be relying on and

using geospatial data that is inappropriate on many levels. If reliance is placed on geospatial

data which is used simply because it is free-to-use or already possessed, owned or

licensed by the organisation involved, with no regard to the context of use, the results could

vary widely from those using alternative data sets. Awareness amongst decision-makers

must be raised as to the implications and caveats over their choice of data, whether of

network, population representation, or method of locating features, all of which will be

dependent on the methods used to undertake journeys and the wider geographical context.

Further research is needed to explore such issues. For example, examining smaller, urbanised

areas in isolation rather than an entire metropolitan area may confirm the findings indicated

here, that in densely-populated cities and central areas of towns with comprehensive and

complex road networks, any of these four network datasets could produce similar results

when used interchangeably. This could be particularly relevant in cases where an expensive

proprietary dataset may be effectively replaced by an equally-usable free-to-use dataset

without having significant repercussions on the results from such analysis.

6. Acknowledgements

The research for this study formed part of a PhD project sponsored by Ordnance Survey. This

paper has been prepared for information purposes only. It is not designed to constitute

definitive advice on the topics covered and any reliance placed on the contents of this paper is

at the sole risk of the reader.

7. References

Apparicio, P., Abdelmajid, M., Riva, M. and Shearmur, R., (2008) ‘Comparing alternative

approaches to measuring the geographical accessibility of urban health services: Distance

types and aggregation-error issues’, International Journal of Health Geographics, 7(1), p.1.

Bertazzon, S. and Olson, S. (2008) ‘Alternative distance metrics for enhanced reliability of

spatial regression analysis of health data’, in International Conference on Computational

Science and Its Applications, Berlin, Heidelberg, pp. 361-374.

Biba, S., Curtin, K. and Manca, G. (2010) ‘A new method for determining the population with

walking access to transit’, International Journal of Geographical Information Science, 24(3),

pp. 347-364.

Boscoe, F., Henry, K. and Zdeb, M. (2013) ‘Nationwide comparison of driving distance

versus straight-line distance to hospitals’, Professional Geographer, 64(2), pp. 188-196.

Brovelli, M.A., Minghini, M., Molinari, M. and Mooney, P. (2016) ‘Towards an automated

comparison of OpenStreetMap with authoritative road datasets’, Transactions in GIS,

forthcoming.

Brown, M., Sharples, S., Parker, C., Bearman, N., Maguire, M., Forrest, D., Haklay, M. and

Jackson, M. (2012) ‘Usability of Geographic Information: Current challenges and future

directions’, Applied Ergonomics, 44(6), pp. 855-865.

Burgoine, T., Alvanides, S. and Lake, A. (2013) ‘Creating ‘obesogenic realities’; do our

methodological choices make a difference when measuring the food environment?’

International Journal of Health Geographics, 12(1), pp. 33-41.

Burkey, M. (2012) ‘Decomposing geographic accessibility into component parts: methods

and an application to hospitals’, Annals of Regional Science, 48, pp. 783-800.

Cai, L. and Zhu, Y. (2015) ‘The challenges of data quality and data quality assessment in the

big data era’, Data Science Journal, 14(2), pp. 1-10.

Czitrom, V. (1999) ‘One-Factor-at-a-Time versus Designed Experiments’, The American

Statistician, 53(2), pp. 126-131.

Delamater, P. (2013) ‘Spatial accessibility in suboptimally configured health care systems: a

modified two-step floating catchment area (M2SFCA) metric’, Health and Place, 24, pp. 30–

43.

Du, H., Alechina, N., Jackson, M. and Hart, G. (2016) ‘A method for matching crowd-sourced

and authoritative geospatial data’, Transactions in GIS, forthcoming.

European Commission (2015) Joint research centre: sensitivity analysis. Available at:

https://ec.europa.eu/jrc/en/samo (Accessed: 3 June 2016).

Fransen, K., Neutens, T., De Maeyer, P. and Deruyter, G. (2015) ‘A commuter-based two-

step floating catchment area method for measuring spatial accessibility of daycare centers’,

Health and Place, 32, pp. 65-73.

Goodchild, M. (2007) ‘Citizens as sensors: the world of volunteered geography’,

GeoJournal, 69(4), pp. 211-221.

Goodchild, M. and Li, L. (2012) ‘Assuring the quality of volunteered geographic

information’, Spatial Statistics, 1, pp. 110-120.

Haklay, M. (2010a) 'How good is volunteered geographical information? A comparative

study of OpenStreetMap and Ordnance Survey datasets', Environment and Planning B:

Planning and Design, 37, pp. 682-703.

Haklay, M. (2010b) 'Usability of geographical information - the case of Code-point Open',

PoVeSham, 1 July. Available at: http://povesham.wordpress.com/2010/07/01/usability-of-

geographicalinformation-the-case-of-code-point-open/ (Accessed: 6 February 2013).

Harding, J. and Pickering, E. (2007) 'Spatial data usability: towards a usability assessment

framework based on usability factors in contexts of use', AGILE Pre Conference Workshop

on Spatial Data Usability, 10th AGILE Conference, Aalborg, Denmark.

Harding, J. (2012) 'Usability of geographic information - factors identified from qualitative

analysis of task-focused user interviews', Applied Ergonomics, 44(6), pp. 940-947.

Higgs, G. (2004) ‘A literature review of the use of GIS-based measures of access to health

care services’, Health Services & Outcomes Research Methodology, 5, pp. 125-145.

International Standards Organisation (2010) ISO 9241-210:2010 - Ergonomics of human-

system interaction. Human-centred design for interactive systems. British Standards Online

[Online]. Available at: https://bsol-bsigroup-com.ergo.glam.ac.uk/en/My-BSI/My-

Subscriptions/BSOL/Search/Search-Results/?src=s&s=c&snc=Y&bwc=F&q=9241-

210&ib=1 (Accessed: 27 February 2013).

International Standards Organisation (2013) ISO 19157:2013 - Geographic information –

data quality. Geneva: International Organization for Standardization.

Jones, S. (2010) ‘Open geographical data, visualisation and dissemination in public health

information’, AGI Geocommunity 2010. Available at:

http://www.agi.org.uk/storage/geocommunity/presentations/SamuelJones.pdf

(Accessed: 5 February 2014).

Langford, M., Higgs, G. and Fry, R. (2014) ‘USWFCA Installation and Usage Instructions’

Installation Instructions available at:

https://www.researchgate.net/publication/

261437885_USWFCA_Installation_and_Usage_Instructions

(Accessed: 28 July 2014)

Langford, M., Higgs, G. and Fry, R. (2016) ‘Multi-modal Two-step Floating Catchment Area

Analysis of Primary Health Care Accessibility’, Health & Place, 38, pp. 70-81.

Lloyd, C. (2016) ‘Spatial scale and small area population statistics for England and Wales’,

International Journal of Geographical Information Science, 30(6), pp. 1187-1206.

Luo, W. and Qi, Y. (2009) ‘An enhanced two-step floating catchment area (E2SFCA) method

for measuring spatial accessibility to primary care physicians’, Health & Place, 15, pp. 1100-1107.

Luo, W. and Wang, F. (2003) 'Measures of spatial accessibility to health care in a GIS

environment: synthesis and a case study in the Chicago region', Environment and Planning

B: Planning and Design, 30, pp. 865–884.

Mao, L. and Nekorchuk, D. (2013) ‘Measuring spatial accessibility to healthcare for

populations with multiple transportation modes’, Health and Place, 24, pp. 115-122.

Maue, P. and Schade, S. (2008) ‘Quality of geographic information patchworks’, 11th

AGILE International Conference on Geographic Information Science, University of Girona,

Spain. Available at: http://plone.itc.nl/agile_old/conference/2008-girona/PDF/111_DOC.pdf

(Accessed: 18 April 2013).

Ngui, N. and Apparicio, P. (2011) ‘Optimizing the two-step floating catchment area method

for measuring spatial accessibility to medical clinics in Montreal’, BMC Health Services

Research, 11(166).

Office for National Statistics (2011) Guidance and methodology: A beginner’s guide to UK

geography – Output Area. Available at:

http://www.ons.gov.uk/ons/guide-method/geography/beginner-s-guide/census/output-area--

oas-/index.html

(Accessed: 27 November 2015).

Ordnance Survey (2015) User guides: Points of Interest. Available at:

https://www.ordnancesurvey.co.uk/docs/user-guides/points-of-interest-user-guide.pdf

(Accessed: 5 October 2015).

Ordnance Survey (2016) OS MasterMap Integrated Transport Network Layer. Available at:

https://www.ordnancesurvey.co.uk/business-and-government/products/itn-layer.html

(Accessed: 7 June 2016).

Phibbs, C. and Luft, H. (1995) ‘Correlation of travel time on roads versus straight line

distance’, Medical Care Research and Review, 52(4), pp. 532-542.

Road Research Laboratory (1965) ‘Research on Road Traffic’, HMSO.

Senaratne, H., Mobasheri, A., Ali, A., Capineri, C. and Haklay, M. (2016) ‘A review of

volunteered geographic information quality assessment methods’, International Journal of

Geographical Information Science. [Online]. Available at:

http://www.tandfonline.com/doi/abs/10.1080/13658816.2016.1189556

(Accessed: 10 June 2016).

Talen, E. and Anselin, L. (1998) 'Assessing spatial equity: an evaluation of measures of

accessibility to public playgrounds', Environment and Planning A, 30, pp. 595-613.

Wang, F. and Luo, W. (2005) ‘Assessing spatial and nonspatial factors for healthcare access:

towards an integrated approach to defining health professional shortage areas’, Health and

Place, 11, pp. 131-146.

Welsh Government (2013) Welsh Health Survey 2013: Health service use. Available at:

http://wales.gov.uk/docs/statistics/2014/140930-health-survey-la-lhb-2012-13-en.xls

(Accessed: 9 December 2014).

Zielstra, D. and Zipf, A. (2010) ‘A comparative study of proprietary geodata and volunteered

geographic information for Germany’, 13th AGILE International Conference on Geographic

Information Science, Guimarães, Portugal. Available at: http://koenigstuhl.geog.uni-

heidelberg.de/publications/2010/Zielstra/AGILE2010_Zielstra_Zipf_final5.pdf

(Accessed: 18 April 2013).

LIST OF TABLES

Table 1. Correlations of distance results for Cardiff GP surgeries, using Spearman’s Rank

Correlation Coefficients. All coefficients significant at the 0.01 level.

Table 2. Differences between distance results for Cardiff GP surgeries. Below diagonal:

Wilcoxon Z scores. Above diagonal: significance level (plain = significant at < .001; bold =

significant at 0.05).

Table 3. Correlations of distance results for Vale GP surgeries, using Spearman’s Rank

Correlation Coefficients. All coefficients significant at the 0.01 level.

Table 4. Differences between distance results for Vale GP surgeries. Below diagonal:

Wilcoxon Z scores. Above diagonal: significance level (all results significant at the < .001

level).

Table 5. Correlations of E2SFCA results for Cardiff GP surgeries, using Spearman’s Rank

Correlation Coefficients. All coefficients significant at the 0.01 level.

Table 6. Differences between E2SFCA results for Cardiff GP surgeries. Below diagonal:

Wilcoxon Z scores. Above diagonal: significance level (all results significant at the < .001

level).

Table 7. Correlations of E2SFCA results for Vale GP surgeries, using Spearman’s Rank

Correlation Coefficients. All coefficients significant at the 0.01 level.

Table 8. Differences between E2SFCA results for Vale GP surgeries. Below diagonal:

Wilcoxon Z scores. Above diagonal: significance level (all results not significant at 0.05

(5%) level).

Table 9. Destination overlaps (%) for Cardiff GP surgeries.

Table 10. Destination overlaps (%) for Vale of Glamorgan GP surgeries.

Table 11. Distribution of distance results from census OA centroids to nearest GP surgery.

Table 12. Advantages and disadvantages of the featured network datasets.

LIST OF FIGURES

Figure 1. Usability at the centre of quality, metadata, interface and utility assessments.

Based on ISO 9241-11 (ISO, 2010) and ISO 19157:2013 (ISO, 2013).

Figure 2. Sensitivity analysis as used in this study with an OFAT (one factor at a time)

approach, where one factor (the network) was altered in each iteration while the others

remained constant.

Figure 3. The location of the two study areas in south Wales.

Figure 4. Distribution of GP surgeries within the study areas plus an 8km buffer.

Figure 5. Example of walkzones around GP surgeries (in green) in an inner city context.

Figure 6. Example of walkzones around GP surgeries (in green) in an urban fringe context.

Figure 7. Example of walkzones around GP surgeries (in green) in a village context.

Figure 8. Steps and pathways in Urban Paths (as indicated in Figure 7 as ‘B’) connecting

housing developments on the periphery of a rural village, which other networks do not

feature.

Figure 9. Populations within the various walkzones shown in Figures 5 to 7.

       ITN     UP      OR
UP     .915
OR     .995    .914
OSM    .966    .891    .967

Table 1. Correlations of distance results for Cardiff GP surgeries, using Spearman’s Rank Correlation Coefficients. All coefficients significant at the 0.01 level.

Abbreviations: ITN – Ordnance Survey Integrated Transport Network; UP – OS ITN with Urban Paths network dataset; OR – OS Open Roads network dataset; OSM – OpenStreetMap network dataset.

       ITN        UP         OR         OSM
ITN               < .001     < .001     .030
UP     -18.705               < .001     < .001
OR     -8.004     -12.980               < .001
OSM    -2.168     -15.411    -4.345

Table 2. Differences between distance results for Cardiff GP surgeries. Below diagonal: Wilcoxon Z scores. Above diagonal: significance level (plain = significant at the < .001 level; bold = significant at 5%).

Abbreviations: ITN – Ordnance Survey Integrated Transport Network; UP – OS ITN with Urban Paths network dataset; OR – OS Open Roads network dataset; OSM – OpenStreetMap network dataset.

       ITN     UP      OR
UP     .954
OR     .994    .961
OSM    .988    .945    .981

Table 3. Correlations of distance results for Vale GP surgeries, using Spearman’s Rank Correlation Coefficients. All coefficients significant at the 0.01 level.

Abbreviations: ITN – Ordnance Survey Integrated Transport Network; UP – OS ITN with Urban Paths network dataset; OR – OS Open Roads network dataset; OSM – OpenStreetMap network dataset.

       ITN        UP         OR         OSM
ITN               < .001     < .001     < .001
UP     -11.242               < .001     < .001
OR     -6.434     -8.117                < .001
OSM    -6.803     -4.158     -5.117

Table 4. Differences between distance results for Vale GP surgeries. Below diagonal: Wilcoxon Z scores. Above diagonal: significance level (all results significant at the < .001 level).

Abbreviations: ITN – Ordnance Survey Integrated Transport Network; UP – OS ITN with Urban Paths network dataset; OR – OS Open Roads network dataset; OSM – OpenStreetMap network dataset.

       ITN     UP      OR
UP     .896
OR     .976    .897
OSM    .940    .865    .923

Table 5. Correlations of E2SFCA results for Cardiff GP surgeries, using Spearman’s Rank Correlation Coefficients. All coefficients significant at the 0.01 level.

Abbreviations: ITN – Ordnance Survey Integrated Transport Network; UP – OS ITN with Urban Paths network dataset; OR – OS Open Roads network dataset; OSM – OpenStreetMap network dataset.

       ITN        UP         OR         OSM
ITN               < .001     < .001     < .001
UP     -6.844                < .001     < .001
OR     -11.513    -5.509                < .001
OSM    -3.659     -5.633     -4.533

Table 6. Differences between E2SFCA results for Cardiff GP surgeries. Below diagonal: Wilcoxon Z scores. Above diagonal: significance level (all results significant at the < .001 level).

Abbreviations: ITN – Ordnance Survey Integrated Transport Network; UP – OS ITN with Urban Paths network dataset; OR – OS Open Roads network dataset; OSM – OpenStreetMap network dataset.

       ITN     UP      OR
UP     .962
OR     .988    .960
OSM    .978    .956    .970

Table 7. Correlations of E2SFCA results for Vale GP surgeries, using Spearman’s Rank Correlation Coefficients. All coefficients significant at the 0.01 level.

Abbreviations: ITN – Ordnance Survey Integrated Transport Network; UP – OS ITN with Urban Paths network dataset; OR – OS Open Roads network dataset; OSM – OpenStreetMap network dataset.

       ITN       UP        OR        OSM
ITN              .335      .284      .880
UP     -.965               .555      .726
OR     -1.070    -.591               .241
OSM    -.151     -.351     -1.172

Table 8. Differences between E2SFCA results for Vale GP surgeries. Below diagonal: Wilcoxon Z scores. Above diagonal: significance level (no results significant at the 0.05 (5%) level).

Abbreviations: ITN – Ordnance Survey Integrated Transport Network; UP – OS ITN with Urban Paths network dataset; OR – OS Open Roads network dataset; OSM – OpenStreetMap network dataset.

        ITN     UP      OR
UP      89.6
OR      97.3    88.7
OSM     91.5    85.7    91.7

Table 9. Destination overlaps (%) for Cardiff GP surgeries.

Abbreviations: ITN – Ordnance Survey Integrated Transport Network; UP – OS ITN with Urban Paths network dataset; OR – OS Open Roads network dataset; OSM – OpenStreetMap network dataset.
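One way to read a destination overlap percentage is as the share of origins that are allocated the same nearest destination under two networks. The sketch below works on that assumption, which is an interpretation rather than a definition taken from the caption; the function name, origin identifiers and surgery identifiers are all hypothetical.

```python
def destination_overlap(nearest_a, nearest_b):
    """Percentage of shared origins with the same nearest destination.

    Each argument maps an origin ID to the ID of its nearest destination
    as found on one network.
    """
    shared = nearest_a.keys() & nearest_b.keys()
    if not shared:
        raise ValueError("the two allocations have no origins in common")
    same = sum(1 for origin in shared if nearest_a[origin] == nearest_b[origin])
    return 100.0 * same / len(shared)


# Hypothetical allocations of four census areas to their nearest surgery.
nearest_itn = {"OA1": "GP_A", "OA2": "GP_A", "OA3": "GP_B", "OA4": "GP_C"}
nearest_osm = {"OA1": "GP_A", "OA2": "GP_B", "OA3": "GP_B", "OA4": "GP_C"}

overlap = destination_overlap(nearest_itn, nearest_osm)
print(f"Destination overlap: {overlap:.1f}%")
```

Under this reading, a value below 100% means that some origins are routed to a different surgery when the network is changed, even if the distances involved are similar.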


        ITN     UP      OR
UP      96.1
OR      98.8    95.4
OSM     90.8    88.1    90.0

Table 10. Destination overlaps (%) for Vale of Glamorgan GP surgeries.

Abbreviations: ITN – Ordnance Survey Integrated Transport Network; UP – OS ITN with Urban Paths network dataset; OR – OS Open Roads network dataset; OSM – OpenStreetMap network dataset.

Distance (m) to nearest feature

        Cardiff                                   Vale
        Mean    Median  SD      Range   IQR       Mean    Median  SD      Range   IQR
ITN     813     716     512     3328    646       1366    1006    1356    8077    992
UP      713     633     440     3013    601       1275    862     1358    8077    831
OR      807     706     510     3326    648       1356    1005    1366    8150    987
OSM     817     725     512     3004    643       1342    986     1357    8034    940

Table 11. Distribution of distance results from census OA centroids (Cardiff n = 1077; Vale n = 412) to nearest GP surgery.

Abbreviations: ITN – Ordnance Survey Integrated Transport Network; UP – OS ITN with Urban Paths network dataset; OR – OS Open Roads network dataset; OSM – OpenStreetMap network dataset; SD – standard deviation; IQR – interquartile range.
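The summary measures in Table 11 can be reproduced for any set of distances with the Python standard library, as in the minimal sketch below. The sample distances are hypothetical. The sketch assumes the sample standard deviation and the default quartile method of `statistics.quantiles`; other software may use different quartile conventions, so the IQR in particular can differ slightly between tools.

```python
from statistics import mean, median, quantiles, stdev


def summarise(distances):
    """Return mean, median, SD, range and IQR for a list of distances."""
    q1, _, q3 = quantiles(distances, n=4)  # quartiles, 'exclusive' method
    return {
        "mean": mean(distances),
        "median": median(distances),
        "sd": stdev(distances),  # sample standard deviation
        "range": max(distances) - min(distances),
        "iqr": q3 - q1,
    }


# Hypothetical distances (m) from six origins to their nearest surgery.
sample = [400, 600, 700, 800, 1000, 1300]
summary = summarise(sample)
for name, value in summary.items():
    print(f"{name}: {value:.0f}")
```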


Network: ITN
Advantages: Comprehensive. The ‘gold standard’ of UK network data. High quality. Designed for travel by motor vehicle.
Relevant usability factors (advantages): Completeness; accuracy; consistency; error rate; purpose; trust.
Disadvantages: Not designed for journeys by cycle and on foot. Expensive to obtain by non-academic or public sector bodies.
Relevant usability factors (disadvantages): Cost; purpose.

Network: Urban Paths
Advantages: Comprehensive where covered. High quality.
Relevant usability factors (advantages): Accuracy; consistency; trust.
Disadvantages: Not national coverage – only applies to urban areas of 5 km² and over. Supplied with ITN, not available as a stand-alone product. Does not include ‘informal’ paths.
Relevant usability factors (disadvantages): Cost; content; purpose.

Network: Open Roads
Advantages: Comprehensive UK coverage. Open data (free to use). Simplified version of ITN. Good for travel by motor vehicle.
Relevant usability factors (advantages): Completeness; consistency; error rate; purpose; trust; cost; content.
Disadvantages: Limited for journeys by cycle and on foot.
Relevant usability factors (disadvantages): Caveats on use; purpose.

Network: OSM
Advantages: Open data. VGI. Updated in real time. Cycle map layer. Aims to map paths used by pedestrians, whether permanent or not.
Relevant usability factors (advantages): Cost; content; popularity; currency.
Disadvantages: VGI, therefore uncertainty over content quality. Unclear classifications, lack of definitions. Data drop-off with distance from large urban areas.
Relevant usability factors (disadvantages): Completeness; consistency; accuracy; error rate; trust.

Table 12. Advantages and disadvantages of the featured network datasets.

Abbreviations: ITN – Ordnance Survey Integrated Transport Network; Urban Paths – OS ITN with Urban Paths network dataset; Open Roads – OS Open Roads network dataset; OSM – OpenStreetMap network dataset.


Figure 1. Usability at the centre of quality, metadata, interface and utility assessments. Based on ISO 9241-11 (ISO, 2010) and ISO 19157:2013 (ISO, 2013).


Figure 2. Sensitivity analysis as used in this study with an OFAT (one factor at a time) approach, where one factor (the network) was altered in each iteration while the others remained constant.
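The OFAT design described in the Figure 2 caption can be sketched as a loop that changes a single input while every other input stays at its baseline value. The code below is an illustrative sketch only: `toy_model`, the detour multipliers and the baseline values are hypothetical stand-ins for real network routing and are not drawn from the study.

```python
# Hypothetical detour multipliers standing in for real network routing.
DETOUR = {"ITN": 1.30, "UP": 1.15, "OR": 1.29, "OSM": 1.31}


def toy_model(network, straight_line_m, threshold_m):
    """Toy accessibility model: is the network distance within the threshold?"""
    network_distance = straight_line_m * DETOUR[network]
    return network_distance <= threshold_m


def ofat_runs(baseline, factor, alternatives, model):
    """Run `model` once per alternative value of `factor`.

    All other inputs are held at their baseline values in every run.
    """
    results = {}
    for value in alternatives:
        inputs = dict(baseline)  # copy, so the baseline is never mutated
        inputs[factor] = value   # alter only the factor under study
        results[value] = model(**inputs)
    return results


baseline = {"network": "ITN", "straight_line_m": 650, "threshold_m": 800}
results = ofat_runs(baseline, "network", ["ITN", "UP", "OR", "OSM"], toy_model)
for network, within_threshold in results.items():
    print(network, within_threshold)
```

Because only the network changes between runs, any difference in the output can be attributed to the network dataset rather than to the other model inputs.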


© Crown copyright and database rights 2016 OS

Figure 3. The location of the two study areas in south Wales.


© Crown copyright and database rights 2016 OS

Figure 4. Distribution of GP surgeries within the study areas plus an 8 km buffer.


Figure 5. Example of walkzones around GP surgeries (in green) in an inner city context. Panels: a) ITN; b) UP; c) Open Roads; d) OSM. © Crown copyright and database rights 2016 OS


Figure 6. Example of walkzones around GP surgeries (in green) in an urban fringe context. Panels: a) ITN; b) ITN with Urban Paths; c) Open Roads; d) OSM. © Crown copyright and database rights 2016 OS


Figure 7. Example of walkzones around GP surgeries (in green) in a village context. Panels: a) ITN; b) ITN with Urban Paths; c) Open Roads; d) OSM. © Crown copyright and database rights 2016 OS


Figure 8. Steps and pathways in Urban Paths (marked ‘B’ in Figure 7) connecting housing developments on the periphery of a rural village, which the other networks do not feature.


Figure 9. Populations within the various walkzones shown in Figures 5 to 7.
