avenues for developing the uk’s national geospatial metadata service
7/27/2019 Avenues for developing the UKs National Geospatial Metadata Service
Title: Avenues for developing the UK's National Geospatial Metadata Service
Authors:
James K. Batcheller
Bruce M. Gittings
Institute of Geography
School of GeoSciences
University of Edinburgh
Drummond Street
Edinburgh EH8 9XP
Tel: +44 (0) 131 650 2558
FAX: +44 (0) 131 650 2524
Corresponding author
Abstract:
The state of public sector geospatial data sharing and exchange in the UK, as facilitated
by the gigateway service, is currently at a crossroads. Ambiguities surrounding its
purpose, direction, funding and custodianship continue to persist in the face of
increasing demands placed upon the service, such as legal requirements (INSPIRE, PSI)
and rising user expectations. A well-defined strategy addressing the political,
commercial and technological considerations involved in advancing the service is
therefore needed if these uncertainties are to be countered and demands met. The
current work aims to provide for the technical aspects of such a strategy by considering
potential avenues for development. Accordingly, proprietary and open source
approaches are examined in the context of facilitating metadata publication (production,
integrity, delivery), enhancing the service infrastructure (interoperability, future-
proofing) as well as addressing end-user considerations (data visualisation, data access).
The resulting roadmap outlines a technical evolution of gigateway, proposing a service
better equipped to face the challenges of both the present and the future.
Keywords: gigateway, geospatial metadata, metadata service, SDI.
Introduction
The advent of the World Wide Web (WWW) and the Internet has revolutionised how
all kinds of information can be accessed and exchanged - geospatial information no less
than other forms. From modest beginnings as point-to-point transfer via FTP1 and
e-mail, through the origins of customised interactive web-based mapping, as seen in the
postings of Xerox's Palo Alto Research Centre (PARC) in 1993 (Putz, 1994; Harder,
1998) to distributed online metadata services and clearinghouses offering catalogues of
records detailing geospatial dataset attributes and how to procure them, and geospatial
one-stop shops offering an integrated access point to disparate geospatial data resources,
widespread data dissemination is currently driven as never before. Sourcing, accessing
and retrieving data for analysis and display have been made easier, with implications for
public, private and academic sectors ranging from the stimulation of intellectual
endeavours to improved data management practices and enhanced visibility of potentially
marketable geospatial products.
In the public sector, efforts have been given further impetus through the introduction of
legislation at both national and international level. In the United States for instance,
President Clinton's Executive Order 12906 (1994)2 demanded the creation of a
coordinated National Spatial Data Infrastructure (NSDI) to support public and private
sector applications of geospatial data, with the key goals of avoiding wasteful
duplication of effort and promoting effective and economical management of
resources. More recently, European Union directives such as the sharing of Public
1 File Transfer Protocol
2 http://govinfo.library.unt.edu/npr/library/direct/orders/20fa.html
Sector Information (PSI, 2003) and the INfrastructure for SPatial InfoRmation in
Europe (INSPIRE, 2004) have formalised requirements that member states facilitate
location of and access to geospatial assets for the purpose of formulation,
implementation, monitoring and evaluation of Community policy-making3.
Such has been the perceived worth of web-enabling geospatial holdings that the
forerunning national initiatives have in recent times been augmented by local, regional
and international schemes, as well as those in the private and academic sectors (Guptill,
1999; Tulloch and Robinson, 2000; Higgins et al., 2003). Prime examples include the
UK's public sector geospatial metadata portal gigateway, its academic counterpart
Go-Geo!4, the Environmental Systems Research Institute's (ESRI) Geography Network5 and
the Federal Geographic Data Committee's (FGDC) National Geospatial Data
Clearinghouse6 (precipitated by Clinton's Executive Order).
The benefits of web-enabling data assets are nevertheless not without their own
particular problems. Questions as to whether users can effectively find quality,
compatible and appropriate data for their needs are balanced by resource,
implementation and maintenance issues for data providers. Additional complications
arise on consideration of the political issues involved in supporting a geospatial data
sharing initiative, particularly in governmental sectors. Concerns as to where service
ownership lies, its strategic goals, its sources of revenue, how it is promoted and who
3 http://www.ec-gis.org/inspire/
4 http://www.gogeo.ac.uk/
5 http://www.geographynetwork.com/
6 http://clearinghouse1.fdgc.gov/
constitute the target community are just some of the factors which impact upon a
service's performance.
Such are the challenges that currently face the UK's national geospatial data sharing
initiative gigateway. With the rapid and ongoing evolution of spatially aware software
and services offered over the Internet, it can be reasoned that end-user expectations
have also evolved, arguably beyond what the service can currently offer. From a data
provider's perspective, active participation is arguably driven more by the desire to be
seen to contribute or through some form of compulsion (e.g. contractual obligations,
mandates from a higher authority, legislation) than the recognition of potential benefits
that may be accrued. As for the gigateway service itself, it is currently at a crossroads.
Ambiguities surrounding its purpose, technological expectations, ongoing source of
funding (as currently enshrined within the NIMSA7 agreement), coupled with doubts as
to whether the Association for Geographic Information (AGI) shall continue to act as
custodian have led to the national geospatial metadata service facing a somewhat
uncertain future.
It is in this light that the timeliness of a re-examination of how public geospatial
metadata is published in the UK via the gigateway service is argued. If confidence in
the service is to be maintained, particularly amongst those on whom gigateway's
ongoing success is dependent (i.e. the contributing community), it is crucial that the
7 The National Interest Mapping Services Agreement - a contract between the Office of the Deputy Prime Minister
(ODPM, now Department of Communities and Local Government) and the Ordnance Survey (OS) under which the
ODPM funds, or part funds, (mapping) activities that meet established criteria for being in the national interest (NIMSA
Review Group Report, 2004).
initiative be seen to move forward with vision and purpose. A well-defined strategy
addressing the political, commercial and technical considerations involved in advancing
gigateway is therefore imperative if the investment and goodwill already accrued by the
national geospatial metadata service are to be maintained. It is the aim of the current
work to provide a basis for the technical aspects of such a strategy by analysing the
current service, identifying improvement opportunities and elaborating potential
development paths. Each stage of the geospatial metadata lifecycle, from production to
publication and beyond, is consequently investigated, with the goal of eliminating,
circumventing or diminishing barriers to metadata delivery.
Background
Metadata
The increased availability of geospatial computing technologies has not only fed the
demand for geospatial data with which to perform required analyses (Guptill, 1999;
Deng, 2002), it has resulted in large volumes of such data being produced - not only by
GIS professionals and organisations, but also by those not traditionally considered as
geodata producers (Schweitzer, 1998; Mathys, 2004). As data are clearly critical to the
functioning of GIS, enough so to be referred to as its fuel (Vermeij, 2001; ESRI, 2002),
this surfeit could be viewed positively. Nevertheless there are complications. As Tsou
(2002) observes, the storage and management of geospatial data are in themselves major
challenges. How data are located in what can amount to a needle in a geospatial
haystack; whether such geodata, once located, are fit for the desired purpose;
whether they are compatible, up-to-date and of sufficient quality, all impart their own
particular issues, even without contemplating data accessibility, copyright, licensing,
potential procurement costs and training.
Regardless of information medium or application domain, it is clearly important to
document data assets so as to facilitate efficient storage and management (Göbel and
Lutze, 1998). Geospatial data are documented by metadata, or data that describes data
(Hart and Phillips, 2001; Vermeij, 2001; Tsou, 2002; Hobona et al., 2004). Just as
geospatial data are abstractions of the real world, for requirements such as analyses and
representation, geospatial metadata are similar abstractions of the data itself. Used not
only to describe a range of dataset attributes, metadata also assist in the location,
evaluation, comparison, access and exploitation of geographical datasets (Luo et al.,
2003; OGC, 2005).
The gigateway metadata service
Arising from several predecessors, most notably the National Geospatial Data
Framework (NGDF) and askGIraffe, gigateway's raisons d'être remain those of its
forerunners: to increase the use of geospatial data; to facilitate development of markets
for data and services; and to future proof investments and enhance decision-making
through use of better information (Gigateway, 2003). The service works towards these
objectives through the support of a distributed web-based network, focussed on serving
discovery metadata - a subset or profile of a more elaborate metadata standard,
designed to provide a means of identifying where the data described might be found.
Users query metadata through a web-based form on a central portal (see Figure 1) using
keywords and geographical extents. Queries are then transmitted to the clients (nodes)
of participating organisations, which execute searches on indexed metadata. Results are
returned to the central gateway (portal) where they are collated and sent to the user's
browser. Retrieved metadata specify where the original data may be located.
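The query flow described above (broadcast to nodes, local search, collation at the portal) can be illustrated with a minimal in-memory sketch. It is not the Z39.50/Isite implementation: the `Record`, `Node` and `portal_search` names, the bounding-box convention (west, south, east, north) and the sample records are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A hypothetical discovery metadata record held by one node."""
    title: str
    keywords: frozenset
    bbox: tuple  # (west, south, east, north), assumed convention
    node: str


def overlaps(a, b):
    """True when two (west, south, east, north) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class Node:
    """Stands in for a participating organisation's search engine."""

    def __init__(self, name, records):
        self.name = name
        self.records = records

    def search(self, keyword, bbox):
        kw = keyword.lower()
        return [r for r in self.records
                if kw in r.keywords and overlaps(r.bbox, bbox)]


def portal_search(nodes, keyword, bbox):
    """Send the query to every node, then collate and order the results."""
    results = []
    for node in nodes:
        results.extend(node.search(keyword, bbox))
    return sorted(results, key=lambda r: r.title)


# Invented sample holdings spread across two nodes.
nodes = [
    Node("node-a", [
        Record("Flood zones", frozenset({"flood", "hydrology"}),
               (-3.5, 55.8, -3.0, 56.0), "node-a"),
    ]),
    Node("node-b", [
        Record("Bedrock geology", frozenset({"geology"}),
               (-6.0, 50.0, 2.0, 59.0), "node-b"),
        Record("Flood defences", frozenset({"flood"}),
               (-1.0, 51.0, 0.5, 52.0), "node-b"),
    ]),
]

hits = portal_search(nodes, "flood", (-4.0, 55.0, -2.0, 57.0))
for hit in hits:
    print(hit.node, hit.title)
```

Each hit carries the name of the node that supplied it, mirroring the way retrieved metadata point back to where the original data are held.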
Figure 1. The distributed gigateway service architecture.
Context and rationale
The initial tenet of the NGDF was as a fully-fledged NSDI (Davey and Murray, 1996),
but budgetary constraints, the lack of integrated GI-centric solutions and the need for
progress led to the identification of a National Metadata Service as the priority technical
deliverable8. The first tangible service created was askGIraffe in mid-2000, based on a
distributed search standard developed by the library community and manifested in the
8 A fully-fledged NSDI is being revisited through the UK GI Strategy developed in 2006.
[Figure 1 diagram labels: Browser; Central Portal; three Z39.50 Search Engines, each with its Metadata Indexes; Metadata Management Systems]
freely available Isite9 package previously deployed successfully in the US by the
FGDC. While Isite still forms the core of gigateway, a number of proprietary and open
source solutions have since become available, some of which were developed
specifically with the geospatial community in mind. Consequently, some of the
technical barriers to developing the UK's initiative beyond the basic metadata service
currently in operation have been removed.
Despite these circumstances and the resources afforded to the service since its inception,
there has been a notable deficit of comprehensive analyses aimed at reviewing the
technical options open to what has now evolved into gigateway. The deficit may be
considered even more curious in light of the aforementioned PSI and INSPIRE
directives, which are predicted to place significant demands on member states
including the provision of metadata services (Rackham, 2004). Furthermore, as one of
INSPIRE's goals is the establishment of an EU-wide data framework based upon the
SDIs of member states, there is a clear need to consider the technological options
relating to not only how gigateway may be moved forward, but also to what can be done
to address some of the challenges it currently faces.
The gigateway metadata publication workflow
The provision of geospatial metadata sustains gigateway; the continued success of the
service therefore relies on those who use it to publish their metadata records. To
safeguard existing contributions and attract new ones, perceived or actual barriers to
participation must be addressed. Currently the path from metadata production to
9 http://www.awcubed.com/Isite/index.html
publication is characterised by a series of distinct steps punctuated by extensive human
intervention (Figure 2). Whilst human input is important in assessing quality,
opportunities for increased automation certainly exist, speeding the publication process
and hence removing an obstacle to metadata contribution.
[Figure 2 diagram labels: geospatial dataset; detailed metadata; metadata repository; discovery metadata; local gigateway node; remote gigateway node; operations - create, update, document, store, subset, retrieve, host, post, index, publish]
Figure 2. The gigateway metadata publication workflow. Metadata are created / updated on creation / update of datasets. Datasets may be internally documented by detailed or discovery metadata, but only discovery metadata are indexed and exposed to the gigateway service. A formal metadata repository may be used but is not currently assumed to be exposed for query directly.
Gigateway's communication infrastructure
The ISO Z39.50 communication protocol10, embodied in Isite, remains central to
gigateway. Influenced at the time by its success in the US and low cost of
implementation, the choice of Isite was also motivated by the lack of any workable
alternative. Subsequent developments have however seen an increase in the number of
commercial and non-commercial solutions. These potentially offer the opportunity to
reinvigorate the service, thus enhancing the number of metadata records available, as
well as providing a future development path. Accordingly, means for advancing the
service are examined in the context of metadata publication (production, integrity,
delivery), the service infrastructure (interoperability, future-proofing) as well as end-
user considerations (data visualisation, data access).
Metadata characteristics
The success of the service (or indeed any service which depends on metadata) relies on
three critical aspects: quality, quantity and accessibility. Metadata quality refers not
only to whether a metadata record is manifested in a way that is compliant with a
specific standard (and hence is exchangeable) but whether it is unambiguously
indicative of the dataset it depicts, is complete and up-to-date. Consistent provision of
quality, fit for purpose metadata helps to assure user confidence in the service,
providing impetus for return visits and in turn enhancing its reputation (Rackham,
2004).
10 ANSI/NISO Z39.50-1995 Information Retrieval (Z39.50): Application Service Definition and Protocol Specification.
Also known as the OGC Web Catalog Services protocol Version 1 or ISO 23950
For a service to be of any utility, the quantity of metadata records offered should meet
users' expectations. A paucity of records provides little motivation to use the service, as
chances of locating appropriate data will be low.
Metadata records are of minimal utility if they are not accessible, regardless of quality
or quantity. Metadata accessibility in this context relates not only to the ability to locate
and retrieve the desired items, but also to their being presented in a consistent format and
conforming to employed standards. A combination of a well-designed user interface and an
effective underlying search engine is necessary to ensure that the user is presented
with the best-fit records, ordered appropriately. Metadata that users find complicated or
time-consuming to locate, access or understand will do little to popularise the hosting
service.
The aforementioned factors are clearly inter-related. A vast quantity of metadata is
pointless in the absence of assurance of quality, whilst a restricted set of high quality
metadata is of limited value11.
Development approaches
Metadata generation
The perception of metadata generation as being a tedious, expensive or unnecessary
drain on time and resources presents a significant obstacle to the production process -
even where the need for quality metadata is recognised. Streamlining the overall process
would serve to alleviate such concerns and help counter the human bottleneck.
11 Gigateway Advisory Group Meeting, 17th November 2004: http://www.gigateway.org.uk/aboutus/aboutus.html
The ability of modern GIS packages to handle metadata has enabled tighter integration
of data editing and metadata composition into standard workflows. Once completed,
metadata either resides with the data (easing the management and update of both), or it
is copied to a central organisational repository or database. Metadata destined for
exposure via gigateway should comply with the UK GEMINI12 standard - a profile of
the ISO standard 19115/19139 Geographic Information: Metadata and the UK's
e-Government Metadata Standard (eGMS). Currently, metadata stored in most GIS
packages would need to be manually copied into an appropriate metadata editor (e.g. the
gigateway-sponsored MetaGenie13), and / or manually augmented to achieve
compliance. Preparation of discovery metadata thus represents at least a duplication of
effort, as record elements existing elsewhere must be re-entered. The consequent
requirement to populate even the minimum required fields manually clearly tends
toward the tedious.
If geospatial datasets are created, manipulated and documented in proprietary GIS
software, then the development environments included within such packages can be
leveraged to programmatically populate metadata elements gleaned from the user's
computing environment on dataset creation or update. Completed metadata can then be
output, validated automatically, complemented with human-mediated quality control
measures and exported for eventual publication on an organisational, sectoral or
national portal.
12 GEo-spatial Metadata INteroperability Initiative
13 http://www.gigateway.org.uk/metadata/metagenie.html
As the UK's current market-leading GIS software, used extensively across the public,
private and academic sectors, ESRI's ArcGIS suite is an obvious candidate for a
solution oriented towards the gigateway service. In an approach similar to that outlined
by Vermeij (2001), the ArcCatalog component of ArcGIS can be tailored using a
custom metadata editing screen. Metadata elements displayed for completion are
dictated by entries contained in an XML Stylesheet (XSL) conforming to a detailed
metadata standard (ISO, eGMS). Mandatory GEMINI fields can be made compulsory to
ensure that metadata later extracted for publication purposes comply with gigateway's
discovery format.
Metadata items may be automatically populated through the programmatic
interpretation of dataset elements and system variables (inherent metadata),
complemented by pre-prepared metadata templates for commonly-used values (author
metadata) and completed manually by the metadata creator (descriptive metadata,
necessitating human intervention). The conceptual steps are outlined in Figure 3.
Completed metadata can then be validated against an appropriate schema that checks
compliance and verifies that all mandatory elements are populated. Thus what was once
a time-consuming endeavour for the metadata creator can be reduced to a limited
authoring step and quality control.
extract inherent metadata from dataset + complement with pre-defined author metadata + complete with descriptive metadata = a minimum set of mandatory fields
Figure 3: Stages of automating metadata production. Elements requiring user input are reduced to those of recyclable author metadata and dataset-specific descriptive metadata.
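The three stages in Figure 3 might be sketched as follows. This is an illustration in plain Python rather than the ArcCatalog customisation discussed in the text, and it rests on several assumptions: the field names and the `MANDATORY` list are hypothetical and are not the GEMINI element set, and the dataset is a throwaway file created only for the demonstration.

```python
import os
import tempfile
from datetime import datetime, timezone

# Hypothetical mandatory fields, for illustration only (not the GEMINI list).
MANDATORY = ("title", "abstract", "originator", "contact_email",
             "file_format", "metadata_date")


def inherent_metadata(path):
    """Stage 1: values gleaned from the dataset file and the system clock."""
    stat = os.stat(path)
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    return {
        "file_name": os.path.basename(path),
        "file_format": os.path.splitext(path)[1].lstrip(".").lower(),
        "size_bytes": stat.st_size,
        "dataset_modified": modified.date().isoformat(),
        "metadata_date": datetime.now(timezone.utc).date().isoformat(),
    }


def build_record(path, author_template, descriptive):
    """Stages 1-3: inherent + reusable author template + descriptive input."""
    record = {}
    record.update(inherent_metadata(path))
    record.update(author_template)
    record.update(descriptive)
    return record


def missing_mandatory(record):
    """Validation: list the mandatory fields that are absent or blank."""
    return [f for f in MANDATORY if not str(record.get(f, "")).strip()]


# Demonstration with a temporary file standing in for a real dataset.
with tempfile.TemporaryDirectory() as tmp:
    dataset = os.path.join(tmp, "roads.shp")
    with open(dataset, "wb") as fh:
        fh.write(b"\x00" * 16)
    author = {"originator": "Example Council",
              "contact_email": "gis@example.org"}
    draft = build_record(dataset, author, {"title": "Road centrelines"})
    complete = build_record(
        dataset, author,
        {"title": "Road centrelines",
         "abstract": "Centre lines of adopted roads."})

print(missing_mandatory(draft))     # the abstract has not been written yet
print(missing_mandatory(complete))
```

Only the title and abstract are typed by the metadata creator here; the remaining values come from the file itself or from the reusable author template.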
Metadata integrity
Once preparation of standard compliant metadata is complete, the question of
management arises. For a contributing organisation, the importance of aligning the
provision of metadata to gigateway with internal metadata services is critical to ensure
the initiative's continuing success14. Given the range of contributing organisations, this
alignment is not trivial: with their own particular internal procedures, resources and
guidelines, it is not surprising that storage techniques diverge from one organisation to
the next (Tyler, 2002). Metadata can be stored alongside the data they describe,
facilitating easy update; detached from the data within a DBMS in order to take
advantage of inherent data management features; as text-based files to enable upkeep
via simple text editors, or in any combination of the aforementioned. Additional
difficulties appear as metadata are infrequently authored or edited where they are
exposed, resulting in multiple metadata instances embodied in one or more standards.
Here, metadata must not only be copied to where they are indexed and exposed but also
transformed to conform to discovery metadata specifications. No matter the scenario,
14 Gigateway Advisory Group Meeting, 17th November 2004: http://www.gigateway.org.uk/aboutus/aboutus.html
redundancy results in potential loss of integrity and the formation of discrete
information silos which suffer from update latency, requiring cascading updates and
related version control management.
Metadata integrity issues can be addressed by migrating storage in its entirety to the
database paradigm. By merging multiple metadata instances into one database
repository, or a formal distributed database, potential sources of inconsistency are
eliminated, while providing a secure, robust and manageable storage solution. With
most GIS vendors offering DBMS-driven solutions, metadata composition could be
closely integrated within data editing workflows.
Access to database-held discovery metadata necessary for participation in gigateway
can be achieved in two principal ways. Where organisations wish to exercise the full
benefits of formal database management (Date, 2003), metadata can be exposed directly,
although this will involve the provision of a Z39.50 interface15, with possible
performance implications. Exporting database-held metadata as text files which may
then be exposed remains more straightforward: once the relevant database record is
updated, a new file is exported, indexed and made available to the Z39.50 service as
normal.
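The export route can be sketched in a few lines. The example below assumes a hypothetical single-table SQLite schema (`metadata` with `id`, `title`, `abstract`, `updated`) and an invented XML layout; neither reflects gigateway's actual storage or the GEMINI encoding. It writes one file per record changed since the last export and leaves indexing to the Z39.50 software.

```python
import os
import sqlite3
import tempfile
import xml.etree.ElementTree as ET


def export_changed(conn, out_dir, since):
    """Write one XML file per record updated after `since` (ISO date string)."""
    rows = conn.execute(
        "SELECT id, title, abstract, updated FROM metadata "
        "WHERE updated > ? ORDER BY id",
        (since,),
    ).fetchall()
    written = []
    for rec_id, title, abstract, updated in rows:
        root = ET.Element("record", id=str(rec_id))
        ET.SubElement(root, "title").text = title
        ET.SubElement(root, "abstract").text = abstract
        ET.SubElement(root, "updated").text = updated
        path = os.path.join(out_dir, f"{rec_id}.xml")
        ET.ElementTree(root).write(path, encoding="utf-8",
                                   xml_declaration=True)
        written.append(path)
    return written


# Invented sample data held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadata "
             "(id INTEGER PRIMARY KEY, title TEXT, abstract TEXT, updated TEXT)")
conn.executemany("INSERT INTO metadata VALUES (?, ?, ?, ?)", [
    (1, "Flood zones", "Indicative flood extents.", "2005-01-10"),
    (2, "Road centrelines", "Adopted roads.", "2005-06-02"),
])

out_dir = tempfile.mkdtemp()
exported = export_changed(conn, out_dir, since="2005-03-01")
print([os.path.basename(p) for p in exported])
```

Because ISO-formatted dates sort lexicographically, a plain string comparison is enough to pick out records changed since the previous export.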
For organisations wishing to maintain current management practices based on a range
of unconnected tools, metadata integrity problems and data silos may be addressed
using a system based around formal synchronisation (Figure 4). Developing the
15 For example Compusult's MetaManager Toolkit, ESRI's ArcIMS Metadata Service and Intergraph's GeoConnect
Metadata Management Server
approach of Dunfey et al. (In press), a highly formalised procedure, a synchronisation
file (which acts as a road map for the system) or a synchronisation daemon can each provide
a means of reconciling otherwise unconnected metadata instances. However, complexities
associated with the synchronisation of multiple files would suggest that a way forward
based on a centralised DBMS is preferable.
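A synchronisation file of the kind mentioned above could, under simple assumptions, be a small JSON document naming the master copy and the instances to be kept in step with it. The sketch below is hypothetical throughout (file layout, key names and function names are invented, and it is not the procedure of Dunfey et al.); it also omits the transformation from detailed to discovery metadata.

```python
import hashlib
import json
import os
import shutil
import tempfile


def digest(path):
    """Content hash used to decide whether two instances differ."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def synchronise(sync_file):
    """Bring every target listed in the synchronisation file up to date."""
    with open(sync_file, encoding="utf-8") as fh:
        plan = json.load(fh)
    master = plan["master"]
    actions = []
    for target in plan["targets"]:
        if not os.path.exists(target):
            shutil.copyfile(master, target)       # new instance
            actions.append(("copy", target))
        elif digest(target) != digest(master):
            shutil.copyfile(master, target)       # stale instance
            actions.append(("overwrite", target))
    return actions


# Demonstration in a scratch directory.
work = tempfile.mkdtemp()
master = os.path.join(work, "master.xml")
stale = os.path.join(work, "intranet_copy.xml")
absent = os.path.join(work, "node_copy.xml")
with open(master, "w", encoding="utf-8") as fh:
    fh.write("<record><title>Flood zones (2005 revision)</title></record>")
with open(stale, "w", encoding="utf-8") as fh:
    fh.write("<record><title>Flood zones</title></record>")

sync_file = os.path.join(work, "sync.json")
with open(sync_file, "w", encoding="utf-8") as fh:
    json.dump({"master": master, "targets": [stale, absent]}, fh)

first_run = synchronise(sync_file)
second_run = synchronise(sync_file)
print(first_run)
print(second_run)
```

A daemon would simply call `synchronise` on a schedule or when the master changes. Even in this toy form, every additional storage location adds an entry to maintain, which illustrates why the centralised alternative is attractive.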
Node hosting
Organisations can contribute metadata to gigateway by transferring records to an
existing node (e.g. gigateway's centrally managed repository) or by exposing them on a
node of their own. A distributed service architecture, where organisations are
Figure 4: Metadata synchronisation. A metadata master copy is updated / created and synchronisation is initiated. Pre-existing metadata is updated or overwritten according to storage strategy employed; new instances are imported or copied. Discovery metadata can be directly copied or exported to the gigateway node; detailed metadata must first be transformed.
[Figure 4 diagram labels: metadata source(s); synchronisation file; daemon; flat-file storage; database storage; gigateway node; operations - create, update, import, export, copy, overwrite, transform, query / response]
encouraged to host their own node, was always a design goal aimed to foster a sense of
proprietorship and participation amongst contributors (Nanson et al., 1995). Despite this
encouragement, there remain institutions with significant data holdings that cite
internal political problems and technical issues16 as the cause of their inability or
unwillingness to host a node. While surmounting these political obstacles may well
pose the greater challenge, options do exist to address the technological concerns
relating to node installation and maintenance.
Currently, mounting a node involves the installation and configuration of a number of
distinct software packages. Perceptions that the process requires a high level of
expertise result in setup being left to IT departments, outsourced to consultancies or
indefinitely postponed where financial resources are insufficient. Adoption of the
solutions proposed here will further exacerbate this. To circumvent these problems, the
necessary components can be bundled into an automated installation, empowering non-
specialists to easily set up and configure contributory nodes. A barrier to contribution
amongst potential contributors can thereby be lowered, providing for the exposure of
previously untapped geospatial resources. Nevertheless, important preconditions such as
service level agreements and quality guarantees should be enforced to guard against
casual participation which could negatively impact upon the gigateway service and
users' confidence in it.
This model does not suit all, however: organisations may not have sufficient numbers
of metadata records to justify contributing in such a way, they may not have the
16 AGI gigateway Advisory Group Meeting Minutes, 18th May 2005: http://www.gigateway.org.uk/aboutus/aboutus.html
resources to install and maintain the required hardware and software, or they may
simply be unwilling for reasons of cost, effort and so on. While transferring hosting
responsibility elsewhere may get round local issues of node maintenance and the related
costs, what will result is a further disconnect between the metadata and the data they
describe.
Hosting by proxy
Participating organisations opting not to host their own metadata may expose their
holdings from nodes mounted elsewhere, e.g. the central repository managed by the
gigateway service. Submission may be by bulk transfer (e-mail, FTP, CD/DVD); those
choosing the gigateway repository have the further option of submitting via the
MetaGenie online editor. This comprises a web-based form which is completed to
describe each dataset, generating records which still need to be manually processed.
Regardless of approach, resources are necessary to assure the metadata is appropriate
for publication.
To counter these manual processing requirements and consequent update latency
concerns, an automated metadata harvesting facility could be introduced to the metadata
generation-publication workflow, using for instance the library community's Open
Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)17. Standing in
contrast to the approach employed by Z39.50 solutions, OAI-PMH retrieves metadata in
bulk into a central repository. Conceptually it may be viewed as substituting one node
type (Z39.50) for another (OAI-PMH), but as there is little maintenance overhead
17 http://www.openarchives.org/
aside from creating a web accessible folder, the approach may mitigate some concerns
associated with its management. Moreover, as the protocol is HTTP-based, no
additional configuration and security measures are necessary beyond those required
for standard web servers (Amin, 2003). Using this as a method for contribution,
participating organisations can at will deposit validated metadata into the web
accessible folder from where they will be automatically harvested - no dialogue need be
opened between supplier and host.
Update latency concerns meanwhile can be alleviated by scheduling frequent harvests.
Furthermore, as long as metadata quality, validity and adherence to GEMINI can be
assured, processing resources at the host site are spared and metadata can be exposed
immediately.
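A scheduled harvest of this kind reduces, on the harvester's side, to issuing an HTTP request and reading the identifiers of the records returned. The sketch below builds a standard OAI-PMH `ListRecords` request (the `verb`, `metadataPrefix` and `from` arguments are part of the protocol) and parses a response. The base URL, the sample response and the use of the `oai_dc` prefix are assumptions for illustration; a GEMINI-specific metadata format would need its own prefix, and no network call is made here.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix, since=None):
    """Build a ListRecords request, optionally limited to recent changes."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if since:
        params["from"] = since  # selective harvesting by datestamp
    return f"{base_url}?{urlencode(params)}"


def harvested_identifiers(response_xml):
    """Return identifiers of live records, skipping those flagged as deleted."""
    root = ET.fromstring(response_xml)
    identifiers = []
    for header in root.iterfind(".//oai:record/oai:header", OAI_NS):
        if header.get("status") == "deleted":
            continue
        identifiers.append(
            header.findtext("oai:identifier", namespaces=OAI_NS))
    return identifiers


# Invented response, trimmed to the parts the parser reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2005-06-02</datestamp>
      </header>
    </record>
    <record>
      <header status="deleted">
        <identifier>oai:example.org:2</identifier>
        <datestamp>2005-06-03</datestamp>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>"""

url = list_records_url("http://example.org/oai", "oai_dc", since="2005-06-01")
ids = harvested_identifiers(SAMPLE_RESPONSE)
print(url)
print(ids)
```

Passing the date of the previous harvest as `since` keeps each scheduled run small, which is what makes frequent harvesting practical.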
Metadata currency and quality
The role of providers does not end with metadata submission: they are responsible for
ensuring that their metadata continue to accurately reflect the associated data. Within
the GEMINI standard, currency is partially catered for through the Date of update of
metadata element. There is an argument that, no matter how frequently a dataset is
revised - indeed, if at all - a regularly maintained Date of update of metadata field
confers confidence in the currency of the metadata record. For static or infrequently
updated datasets, however, an old Date of update of metadata is likely to suggest that the
data asset is outmoded and therefore less useful. Given that it would be inappropriate to
update the Date of update of metadata field where a review has been undertaken, but no
actual update has taken place, this highlights the need for a Date of metadata reviewed
field, currently absent from GEMINI.
Evidence from observations of the service18 nevertheless suggests that such elements are
mostly ignored by producers after publishing their metadata. This could be tackled
using a regular automated email notification mechanism based upon the Date element
and Email address of distributor metadata fields. A quality stamp19 associated with each
metadata item would complement this and enhance user confidence. Providing a system
by which metadata can be rated for quality, either independently or by user feedback,
would allow records to be evaluated at a glance as well as place an onus on contributors
to maintain metadata quality, thus assuring their reputation. Additionally, using the
Date elements as criteria for evaluating quality provides impetus to distributors to
review and maintain records on a systematic basis.
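The notification mechanism suggested above might be sketched as follows. This is a minimal Python example under assumptions: the record dictionaries, their keys (`title`, `date_of_update`, `distributor_email`) and the one-year threshold are hypothetical, and actual delivery (for instance via `smtplib`) is deliberately left out.

```python
from datetime import date, timedelta
from email.message import EmailMessage


def review_reminders(records, today, max_age_days=365):
    """Build one reminder email per record whose metadata date is too old."""
    cutoff = today - timedelta(days=max_age_days)
    reminders = []
    for record in records:
        if record["date_of_update"] < cutoff:
            msg = EmailMessage()
            msg["To"] = record["distributor_email"]
            msg["Subject"] = f"Metadata review due: {record['title']}"
            msg.set_content(
                f"The metadata for '{record['title']}' was last updated on "
                f"{record['date_of_update'].isoformat()}. Please review it."
            )
            reminders.append(msg)
    return reminders
```

The same date comparison could feed a quality rating, so that records which are reviewed on schedule are visibly distinguished from those that are not.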
Data access
Complications relating to metadata not accurately reflecting the underlying data extend
beyond contributors, affecting the end-users of the service. There is a twofold problem:
do the records returned unambiguously represent the data sought, and if so, how are the
data obtained? Whilst some standards (e.g. gigateway's Discovery Metadata
Specification, the forerunner to GEMINI) attempt to address these concerns by
providing a sample field (containing a visual representation of the data) as well as the
contact details for the data distributor, neither fully addresses this issue. Evidence from
existing records suggests that the sample field is rarely used, arguably as it requires
more effort. Similarly, supplier contact details may not guide the user directly to the
data even when traditional contact routes (telephone, FAX, postal address) are
supplemented by a web URL[20]. The latter typically signposts the distributor's
homepage, where the data must again be located. The degree of separation between
data and metadata consequently disrupts workflow efficiency for the prospective user
and can result in the ordering of an inappropriate product.
[18] AGI gigateway Advisory Group Meeting Minutes, 17th Nov 2004: http://www.gigateway.org.uk/aboutus/aboutus.html
[19] Guidelines for creating gigateway approved metadata exist but the proposed quality stamp has yet to be effectively
associated with hosted records. Further details may be found in:
http://www.gigateway.org.uk/metadata/downloads/Gigateway_metadata_guidelines_ukgemini.pdf
Providing an efficient means of accessing a more current representation of the data prior
to procurement will go some way towards alleviating this problem. The UK GEMINI
standard presents an improved treatment for visualisation by providing a field[21] for a
URL pointing directly to a representation of the data, a facility not currently exploited by
gigateway. Whether the data are licensed or freely available, presenting a 'shop window'
for them via a live preview enhances the probability that the data shall be pursued.
Complementing this with a facility for immediate download will boost workflow
efficiency as well as help realise one of the basic objectives of gigateway: to promote
geospatial asset exchange.
The use of the Browse graphic element could be extended to contain a URL pointing to,
for instance, an OGC[22]-compliant Web Feature Server (WFS) such that custodians of
unlicensed data have an opportunity to deliver the actual data via the same means as a
visualisation. For the provider, resources required to administer such a service are offset
by the time spared from fulfilling data requests; for the user, waiting times associated
with data procurement are significantly reduced.
[20] Uniform Resource Locator
[21] The Browse graphic element
[22] Open Geospatial Consortium
Providing access to commercial data clearly requires the inclusion of a transaction
model which takes account both of direct payments and of transactions subject to service-level
licensing agreements (Figure 5). Visualisation can be permitted via an OGC-compliant
Web Map Service (WMS), which renders a picture of the data, and not the data itself.
Subscribing organisations could download the data following a secure login, enabling
retrieval in volumes or units dictated by the licensing model agreed. Individual users
can be catered for through solutions provided by Internet Payment Service Providers
(IPSPs e.g. PayPal).
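To make the distinction between previewing and downloading concrete, the sketch below builds the two kinds of request URL that a Browse graphic element might hold. It assumes WMS 1.1.1 and WFS 1.0.0 key-value requests; the server address, layer names and the choice of the British National Grid (EPSG:27700) are hypothetical values chosen for illustration.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, width=400, height=400):
    """Build a WMS 1.1.1 GetMap request, which returns a rendered image only."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",
        "SRS": "EPSG:27700",  # British National Grid, assumed for the example
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{base_url}?{urlencode(params)}"


def wfs_getfeature_url(base_url, type_name):
    """Build a WFS 1.0.0 GetFeature request, which returns the features themselves."""
    params = {
        "SERVICE": "WFS",
        "VERSION": "1.0.0",
        "REQUEST": "GetFeature",
        "TYPENAME": type_name,
    }
    return f"{base_url}?{urlencode(params)}"
```

For example, `wms_getmap_url("http://example.org/wms", "roads", (0, 0, 700000, 1300000))` yields a preview link suitable for licensed content, whereas the WFS link would only be exposed once access to the data has been granted.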
Underlying architecture
While provided in the context of the architecture of gigateway, none of the solutions
elaborated herein are considered tightly-bound to the current service infrastructure. The
reality is that all geospatial data-sharing initiatives based on metadata will encounter
similar issues regarding metadata authoring, quality, currency, accessibility and
integrity. With this in mind, the beginnings of a more extensive overhaul of gigateway
can be contemplated.
Figure 5: Adding data visualisation, access and transaction support to gigateway.
OGC-compliant Web Feature Services (WFS) provide visualisation and access to free and
purchased data. Web Map Services (WMS) provide a means of visualising licensed content
without providing access to the underlying data. Purchased vector data can be downloaded as
a feature set via WFS; imagery can be downloaded in compressed file format. [Diagram not
reproduced. Labelled components: user, user / corporate account, transaction processor, WMS,
WFS, metadata, dataset. Labelled interactions: account creation, account verification, license
agreement, approve transaction, procure, visualise, download.]
Since the rollout of askGIraffe, the UK metadata service has exclusively employed the
Z39.50-based Isite. Flexible and efficient for near transparent querying of multiple
metadata repositories, it is nevertheless argued by some that the Z39.50 protocol is
functionally limited in a number of ways, particularly when it comes to the
representation of results (Tsou, 2002). Troll and Moen (2001) question Z39.50's
ongoing utility given its complexity and interoperability handicaps. Different
flavours provide varying degrees of support for spatial searching, and the protocol's
ability to scale is also called into question (Medyckyj-Scott et al., 2001; Amin, 2003).
Rocha and Henriques (2004) meanwhile argue that the changing face of geographical
information services, with increased demand for mobile solutions, real-time, data-ready
applications and the long-term aim of data retrieval in the absence of human mediation,
dictates the adoption of a different paradigm.
The emerging OGC Catalogue Service Specification 2.x (OGC, 2005) aims to provide
for such a different paradigm. Adhering to the trend in which the development of
geographical information technologies continues to be more closely aligned with the
mainstream IT industry and interoperability efforts (Higgins et al., 2005), the
Specification details an open, standard interface that enables diverse but conformant
applications to perform discovery, browse and query operations against distributed and
potentially heterogeneous catalog servers[23]. The Specification defines a number of
communication protocols (bindings) based on CORBA, HTTP and a new iteration of Z39.50,
and adherence to it enables the creation of custom applications through the use of
application profiles. Interoperability between different bindings is enabled through the
use of a minimal abstract OGC_Common Catalogue Query Language, providing further
support for spatial query constructs including DISJOINT, INTERSECT, WITHIN and
OVERLAP (OGC, 2005).
[23] OGC Press release http://www.opengeospatial.org/press/?page=pressrelease&prid=188
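The meaning of the four spatial query constructs named above can be illustrated for the simplest case, that of axis-aligned bounding boxes such as those held in discovery metadata. The Python sketch below is a simplified reading for illustration only: the `BBox` type and function names are hypothetical, and the Specification itself defines these operators over general geometries rather than boxes.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    minx: float
    miny: float
    maxx: float
    maxy: float


def disjoint(a, b):
    """True when the two boxes share no point at all."""
    return a.maxx < b.minx or b.maxx < a.minx or a.maxy < b.miny or b.maxy < a.miny


def intersects(a, b):
    """True when the boxes share at least one point (the negation of disjoint)."""
    return not disjoint(a, b)


def within(a, b):
    """True when box a lies entirely inside box b."""
    return b.minx <= a.minx and b.miny <= a.miny and a.maxx <= b.maxx and a.maxy <= b.maxy


def overlaps(a, b):
    """True when the interiors share area but neither box contains the other."""
    interiors_meet = (
        a.minx < b.maxx and b.minx < a.maxx and a.miny < b.maxy and b.miny < a.maxy
    )
    return interiors_meet and not within(a, b) and not within(b, a)
```

A catalogue search for datasets within a user's area of interest would then reduce to applying `within` (or `intersects`, for a more permissive search) to each record's extent.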
Despite outlining a more sophisticated, yet open, treatment for geospatial resource
discovery, the Catalogue Service Specification 2.x remains an abstract specification
with few well-tested or mature implementations; the communication protocol
predominantly relied upon remains a legacy version of Z39.50. OAI-PMH is a notable
exception, but is promoted as a complementing rather than an alternative technology
(Breeding, 2002). Commercially-developed solutions[24] meanwhile do provide
sophisticated alternatives that integrate data storage, querying, middleware, desktop and
Web clients into a coherent software stack, but concerns relating to cross-platform
support and community acceptance may preclude their adoption.
However, the availability of the OGC Geospatial Portal Reference Architecture (OGC,
2004; Figure 6) provides a new basis for commercial and open source solutions. The
architecture offers specifications which allow a core system to be implemented, for
example the GeoNetwork Metadata Catalogue Server, a collaborative development
effort led by the FAO, UNEP and WFP[25]. Implementing the architecture's portal and
catalogue components, GeoNetwork continues to be based on Z39.50, thus offering the
potential for incorporation within or replacement of the current gigateway architecture.
Whilst not offering a departure from the protocol as espoused by Tsou (2002) or Rocha
and Henriques (2004), its open, modular architecture provides scope to replace the
communication protocol as laid out in the OGC's Catalogue Service Specification, as
well as allowing interoperability with national and international schemes, as
propounded by the INSPIRE directive.
[24] For instance, ESRI's GIS Portal Toolkit and products from MapInfo and Intergraph
[25] Food and Agriculture Organisation, the United Nations Environment Programme and World Food Programme
Figure 6: The OGC's Portal Reference Architecture (adapted from GeoNetwork homepage
http://193.43.36.138/). GeoNetwork provides for core Portal and Catalog Services, into which
existing Portrayal Services (e.g. MapServer, GeoServer) and emerging Data Services may be
incorporated. [Diagram not reproduced. Labelled components, connected via the Internet:
Portal Services (viewers, web query interfaces, access management); Data Services (content
access, data processing); Catalog Services (data discovery, service discovery, data querying);
Portrayal Services (features, coverages, maps).]
Further considerations
Prospective efforts to reinvigorate the gigateway service will predictably be fraught
with difficulty. Future visions of how the service is manifested aside, questions as to the
prudence of jeopardising a long history of investment in the current technology,
infrastructure and expertise certainly arise. Even with its oft-perceived limitations, the
current infrastructure's track record is proven within the UK context, underpinning what
remains a popular and dependable service. Considering the diversity of gigateway's
stakeholders and the resistance to change witnessed in some quarters, strong reasoning
for any proposed modifications will be necessary. Even if a consensus is forthcoming,
damage to gigateway's reputation could prove fatal if an enhanced service proves
unreliable or does not live up to user expectations. Of course, initiating and maintaining
prospective changes in service paradigm are contingent on whether the necessary
financial and human resources are forthcoming.
Some of the development paths elaborated above raise further issues. With respect to
coupling automated metadata generation with dataset editing workflows, the lack of
open, standard geo-interfaces or Application Programming Interfaces (APIs) across the
GIS industry currently precludes the creation of a universal solution, thereby
necessitating the development of package-specific strategies.
Automating metadata management and submission processes will serve to reduce the
resources necessary for contribution to gigateway, but does underline the need for quality
and validation safeguards to ensure that inappropriate records are not exposed on the
service. Any implementation of the solutions suggested above should therefore be
supplemented with systematic human-mediated quality control performed by
appropriately trained users, whether on a spot-check basis or through brute-force
evaluation of all metadata items processed. Similar deliberations are necessary if there is
to be a system supporting the independent accreditation of metadata posted on the service.
As for quality benchmarks, steps aimed at converging the current 'gigateway approved'
stamp with international accreditation schemes (such as those guided by ISO) should be
made to facilitate cross-application compatibility and adoption.
For organisations with few records, or datasets that change infrequently, manually
generating, updating and submitting records may well represent the preferred way
forward. Similarly, preference for retaining some manual control over automated
processes should not be discounted, particularly for those already with well-defined
protocols in place or those reluctant to yield control to what may be perceived as a
'black box' procedure. In any case, focus should remain on promoting quality
contribution to gigateway, not the excessive imposition of further layers of complexity
on the process where it is neither wanted nor warranted.
DBMS techniques would by their nature provide for better management and integrity of
metadata within the gigateway service. While there is an argument that suggests this
would significantly add to the complexity of the system, the well-established interfaces
to DBMS based on SQL (Structured Query Language) should render such components
appropriately modular. Although cost may be raised as a concern, free and open
source software (FOSS) such as MySQL and Postgres offers viable options.
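The integrity benefits of a DBMS can be shown with a short sketch in which the database itself, rather than application code, rejects incomplete or inconsistent records. SQLite is used here purely because it ships with Python and so keeps the example self-contained; it stands in for MySQL or Postgres, and the table and column names are hypothetical rather than drawn from GEMINI.

```python
import sqlite3

# Constraints declared once in the schema are enforced on every insert.
SCHEMA = """
CREATE TABLE metadata_record (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    distributor_email TEXT NOT NULL,
    date_of_update TEXT NOT NULL,
    west REAL NOT NULL,
    east REAL NOT NULL,
    south REAL NOT NULL,
    north REAL NOT NULL,
    UNIQUE (title, distributor_email),
    CHECK (west <= east AND south <= north)
)
"""


def open_catalogue(path=":memory:"):
    """Create a catalogue database (in memory by default) with the schema above."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_record(conn, title, email, updated, west, east, south, north):
    """Insert one record; sqlite3.IntegrityError is raised if a constraint fails."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO metadata_record "
            "(title, distributor_email, date_of_update, west, east, south, north) "
            "VALUES (?, ?, ?, ?, ?, ?, ?)",
            (title, email, updated, west, east, south, north),
        )
```

Because the interface is plain SQL, the same statements would need only minor dialect changes to run against another DBMS, which is the modularity argued for above.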
Z39.50 has been criticised for its failings in relation to geographical metadata. However,
the advent of geo-centric extensions, together with more recent developments associated
with the OGC Catalogue Service Specifications, overcomes some of these concerns. The
ability to access metadata and aggregate search results through more modern mechanisms
such as HTTP GET and POST requests integrates metadata access more closely with
standard web-based systems. The key, for gigateway, is to provide a transparent
transition from the old to the new.
Both proprietary and FOSS solutions have been discussed; each has its place, with
its own particular advantages and disadvantages. Proprietary systems can be argued to
offer stability and less risk, and to provide buy-in to a ready-made, hopefully well-tested
product complete with support. Yet they can prove expensive. FOSS can provide a less
expensive alternative, although it is rarely completely free, often requiring specialised
expertise whether in-house or out-sourced. What is crucial is to ensure the modularity of
components linked by standardised interfaces such that there should be no dependence
on either proprietary or FOSS because these components can be readily replaced.
Additional flexibility can be conferred by providing the aforementioned software
complete with its source code, whether crafted in proprietary or open environments.
While universally applicable solutions are presented, enabling access to the inner
workings of such software will ease integration efforts with incumbent configurations
that invariably differ between organisations. Moreover, by providing support for
facilities similar to those of the online open source communities (e.g. SourceForge),
namely a code repository and a user forum, enthusiastic participants can further
develop, discuss and distribute provided solutions in a collaborative setting to the
benefit of the wider participating community.
While the current work is presented in the context of gigateway and the GEMINI
discovery schema, it is important that any implemented solution not be tightly bound to
any one particular standard. The state of flux and delays associated with standard
stabilisation efforts (GEMINI itself is yet to be finalised), the need to implement
metadata profile extensions and the emergence of new profiles and schemas all make
the ability to substitute one standard for another a functional requirement for the
adopted solutions.
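One way of keeping a solution loosely bound to any single standard, as argued above, is to hold records under neutral internal field names and to map them to a given standard's element names only at the point of export. The Python sketch below illustrates the idea; the internal field names, the `PROFILES` table and in particular the pairing of GEMINI elements with Dublin Core terms are assumptions made for the example, not an authoritative crosswalk.

```python
# Hypothetical crosswalks from internal field names to element names.
PROFILES = {
    "gemini": {
        "title": "Title",
        "updated": "Date of update of metadata",
        "contact": "Email address of distributor",
    },
    "dublin_core": {
        "title": "dc:title",
        "updated": "dcterms:modified",
        "contact": "dc:publisher",
    },
}


def export_record(record, profile):
    """Rename a record's internal fields to the chosen standard's elements."""
    mapping = PROFILES[profile]
    missing = set(record) - set(mapping)
    if missing:
        raise KeyError(f"profile {profile!r} has no element for {sorted(missing)}")
    return {mapping[field]: value for field, value in record.items()}
```

Supporting a new or revised profile then means adding one entry to the table rather than altering the code that authors, stores or harvests the records.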
accessing the data such surrogates depict and to provide linkages to other, similar
schemes at national and international level.
Considering the diverse nature of stakeholders involved in gigateway, any decision on
how to evolve the service will never be based purely on the technological. Indeed, there
remains an urgent need to resolve the aforementioned political issues and to garner
consensus amongst both those directing the service and those contributing to it, not only
regarding a future direction but also regarding where the funding for service upkeep,
improvement and potential overhaul shall be sourced from. Fundamental decisions must be
made relating to the overall objectives of the service and how it should be manifested, such as
whether it should persist as a metadata service, or whether opportunities presented by
promising technologies should be taken to broaden gigateway's scope, as suggested
above. Any assessment shall clearly be tempered by a number of considerations.
Interoperability with other services must remain a critical factor, particularly in light of
legislative requirements at national and European level. The need to maintain the
service's standing in the face of emerging initiatives more in tune with both contributor
and consumer expectations is also crucial to avoid perceptions of complacence and the
resulting implications for numbers contributing to and exploiting gigateway. Whatever
path is ultimately taken, the overall objective should not only be the realisation of a
service befitting an internationally visible initiative, but one that its users view as
being fit for purpose.
References
Amin, S. (2003). The Open Archives Initiative Protocol for Metadata Harvesting: An
Introduction. DRTC Workshop on Digital Libraries: Theory and Practice, Bangalore,
India: DRTC.
Breeding, M. (2002). The Open Archives Initiative.
Accessed 04-06-2006.
Date, C. J. (2003). An Introduction to Database Systems, Eighth Edition. Boston, MA:
Addison Wesley.
Davey, A. and Murray, K. (1996). Update on the National Geospatial Database -
Collaboration between Organisations. In AGI 96 Conference Proceedings: Geographic
Information Towards the Millenium, Birmingham, UK. AGI.
Deng, Y. (2002). The Metadata Architecture for Data Management in Web-based
Choropleth Maps.
Accessed 04-08-
2006.
Dunfey, R. I., Gittings, B. M. and Batcheller, J. K. (In press). Towards an Open
Architecture Vector GIS. Computers and GeoSciences.
ESRI (2002). Metadata and GIS. Available from http://www.esri.com/.
Gigateway (2003). Discovery Metadata Specifications. Available from
http://www.gigateway.org.uk/.
Göbel, S. and Lutze, K. (1998). Development of meta databases for geospatial data in
the WWW. In Proceedings of the 6th international symposium on Advances in
geographic information systems, Washington, United States. ACM Press.
Guptill, S. G. (1999). Metadata and data catalogues. In P. Longley, M. F. Goodchild, D.
J. Maguire and D. W. Rhind, Geographical Information Systems (pp.677-692).
Chichester: Wiley.
Harder, C. (1998). Serving Maps on the Internet - Geographic Information on the World
Wide Web. Redlands, CA: ESRI, Inc.
Hart, D. and Phillips, H. (2001). Metadata Primer - A "How To" Guide on Metadata
Implementation. Accessed 04-08-2006.
Higgins, C., Medyckyj-Scott, D. and Reid, J. (2003). A Community Specific SDI - the
Case of UK Academia. In Geodaten- und Geodienste-Infrastrukturen - von der
Forschung zur praktischen Anwendung, Münster, Germany. University of Münster.
Higgins, C., Robertson, A. and McGarva, G. (2005). Edinburgh University Data
Library Geographic Information Standards: Final Report. Available from
http://www.edina.ac.uk/.
Hobona, G., James, P. and Fairbairn, D. (2004). Facilitating Data Discovery In
Environmental Data Clearinghouses Through Spatial Data Mining. In Proceedings of
the GIS Research UK 12th Annual Conference, Norwich, UK. University of East
Anglia.
Luo, Y., Wang, X. and Xu, Z. (2003). Extension of Spatial Metadata and Agent-based
Spatial Data Navigation Mechanism. In GIS'03: Proceedings of the 11th ACM
international symposium on Advances in geographic information systems, New Orleans,
LA, USA. ACM.
Mathys, T. (2004). The Go-Geo! Portal Metadata Initiatives. In Proceedings of the GIS
Research UK 12th Annual Conference, Norwich, UK. University of East Anglia.
Medyckyj-Scott, D., Chappell, C., Pradhan, A. and O'Hanlon, C. (2001). A geo-spatial
data resource discovery tool for UK Further and Higher Education - Project Overview
and Recommendations. Available from http://www.edina.ac.uk/.
Nanson, B., Smith, N. and Davey, A. (1995). What is the British National Geospatial
Database? In AGI 95 Conference Proceedings: Expanding Your World, Birmingham,
UK. AGI.
OGC (2004). Geospatial portal reference architecture: a community guide to
implementing standards-based geospatial portals (OGC Draft Report No OGC 04-039).
Open Geospatial Consortium.
OGC (2005). OGC Catalogue Services Specification 2.0.1. (OGC implementation
specification 04-021r3).
Putz, S. (1994). Interactive information services using world-wide web hypertext.
Computer Networks and ISDN Systems, 27, 273-280.
Rackham, L. (2004). An Independent Review of the Sustainability of a UK Metadata
Service for Geographically Related Information. Available from
http://www.gigateway.org.uk/.
Rocha, J. G. and Henriques, P. R. (2004). Towards XML Web Services based
Clearinghouses. In Proceedings 7th Global Spatial Data Infrastructure Conference,
Bangalore, India.
Schweitzer, P. N. (1998). GIS and Metadata - Putting Metadata in Plain Language.
Accessed 04-06-2006.
Troll, D. and Moen, B. (2001). Report to the DLF on the Z39.50 Implementers' Group -
Moving Towards the Future of Z39.50. Issues and Options Based on ZIG Meeting
Discussions December 6-7, 2000. Available from http://www.diglib.org/.
Tsou, M.-H. (2002). An Operational Metadata Framework for Searching, Indexing, and
Retrieving Distributed Geographic Information Services on the Internet. In Egenhofer
M. and Mark, D. Geographic Information Science (GIScience 2002): Lecture Notes in
Computer Science Vol. 2478 (pp.313-332). Berlin: Springer-Verlag.
Tulloch, D. L. and Robinson, M. (2000). A progress report on a U.S. National Survey of
Geospatial Framework Data. Journal of Government Information 27, 285-298.
Tyler, G. T. (2002). Managing Metadata: Developing technical solutions for the
askGIraffe geospatial metadata gateway. Unpublished MSc. Thesis, University of
Edinburgh, Edinburgh.
Vermeij, B. (2001). Implementing European Metadata Using ArcCatalog: ArcUser
July-September 2001. Available from http://www.esri.com/news/arcuser/.