Commission européenne, B-1049 Bruxelles / Europese Commissie, B-1049 Brussel - Belgium. Telephone: (32-2) 299 11 11. Office: 05/45. Telephone: direct line (32-2) 2999659.

Commission européenne, L-2920 Luxembourg. Telephone: (352) 43 01-1.

Joint Research Centre (JRC)

Spatial Data on the Web tools and guidance for data providers

ELISE initiative

Date: 12/10/2017 Doc. Version: v1.0


Document Control Information

Settings Value

Document Title: Spatial Data on the Web tools and guidance for data providers

Project Title: ELISE initiative

Document Author: Clemens Portele

Project Owner: Francesco Pignatelli

Project Manager: Constance Vervalcke

Doc. Version: v1.0

Sensitivity: Public

Date: 12/10/2017

Status: Work in progress

Document Approver(s) and Reviewer(s):

NOTE: All Approvers are required. Records of each approver must be maintained. All Reviewers in the list are considered required unless explicitly listed as Optional.

Name | Role | Action | Date
NN | Quality Assurance Manager | Review |
Alexander Kotsev, Michael Lutz | | Review v0.1 | 16/05/2017
Robin S. Smith, Alexander Kotsev, Andrea Perego | | Review of v0.2 and creation of v0.3 | 24/07/2017
Francesco Pignatelli | Project Owner | |

Document history:

The Document Author is authorized to make the following types of changes to the document without requiring that the document be re-approved:

Editorial, formatting, and spelling

Clarification

To request a change to this document, contact the Document Author or Owner.

Configuration Management: Document Location

The latest version of this controlled document is stored on the CITnet ELISE wiki at the following link: https://webgate.ec.europa.eu/CITnet/confluence/x/LJLNJg


TABLE OF CONTENTS

1 SCOPE
2 MOTIVATION AND OVERVIEW
3 TERMS AND DEFINITIONS
3.1 General remarks
3.2 Spatial Thing
4 SHARING SPATIAL DATA – THE CHALLENGE
4.1 Why are traditional Spatial Data Infrastructures not enough?
4.2 Datasets and distributions
4.2.1 The status quo
4.2.2 Publishing datasets in the Data on the Web Best Practices
4.2.3 Publishing datasets in the ISO and OGC standards
4.2.4 Publishing datasets in INSPIRE
4.3 “INSPIRE datasets”
4.4 The “INSPIRE – What If …?” workshop
5 BEST PRACTICES FOR SHARING SPATIAL DATA
5.1 Overview
5.2 Web principles for spatial data
5.2.1 Spatial data Identifiers
5.2.2 Indexable data
5.2.3 Linking data
5.3 Spatial data
5.3.1 Spatial data encoding
5.3.2 Geometries and coordinate reference systems
5.3.3 Relative positioning
5.3.4 Spatial links
5.3.5 Data versioning
5.4 Spatial data access
5.5 Spatial metadata
5.6 Non-spatial aspects
5.6.1 Data Vocabularies
5.6.2 Data Licenses
5.6.3 Data Provenance and Quality
5.6.4 Data Preservation
5.6.5 Feedback
5.6.6 Data Enrichment
5.6.7 Republication
6 ADDRESSING THE CHALLENGE
6.1 The approach
6.1.1 General remarks
6.1.2 Using a proxy
6.1.3 ldproxy
6.1.4 Evolution of INSPIRE
6.2 Download services supporting a proxy layer
6.2.1 General
6.2.2 Web Feature Service (WFS)
6.2.3 Atom
6.2.4 Sensor Observation Service (SOS)
6.2.5 Web Coverage Service (WCS)
6.3 Access to a dataset distributed using WFS via a “landing page”
6.4 Data formats for vector data - HTML
6.5 Data formats for vector data - GeoJSON
6.6 Data formats for vector data – RDF / JSON-LD
6.7 Data access API for vector data
6.8 Coordinate reference systems
6.9 Sitemaps
6.10 Fault-tolerant proxy
6.11 Datasets and distributions
6.12 “INSPIRE datasets”
6.13 Improve metadata content
6.14 Additional ideas
6.14.1 Ideas for the INSPIRE geoportal
6.14.2 Ideas for “INSPIRE in Practice”
APPENDIX 1: REFERENCES AND RELATED DOCUMENTS


EXECUTIVE SUMMARY

Access to data is increasingly happening over the Web and this trend will continue. Web technologies are key to several other technological trends which are important in the context of spatial data, e.g. Cloud offerings or the Internet-of-Things (IoT). Therefore, sharing data should be done in ways that are consistent with the practices that reflect how the Web is used today.

Spatial Data Infrastructures (SDIs) started to use web services and open formats already in the late 1990s / early 2000s. The technical architectures of SDIs have been quite stable since then. In parallel, the Web and the related practices and expectations with respect to data “on the Web” have evolved.

This report starts by describing and illustrating some known challenges with respect to using SDIs to share spatial data on the Web. These are cases where the concepts and approaches that are used in INSPIRE – and other SDIs – are not compatible with the expectations of developers and users who are familiar with the Web, but not with SDIs.

Two new Best Practice documents related to publishing spatial data on the Web, i.e. the Data on the Web Best Practices (W3C, 2017) and the Spatial Data on the Web Best Practices (W3C/OGC, 2017), are analysed with respect to how they relate to the current INSPIRE Technical Guidance documents as well as to the INSPIRE implementation in practice.

The results of the analysis are the basis for recommendations for experiments and technical measures to explore how INSPIRE could better support developers and users on the Web.

The recommendations are based on the assumption that any changes to INSPIRE will follow an evolutionary path. INSPIRE should build on the existing legal and guidance framework, and – at the same time – create the capability to explore and test new technical options. A useful approach to such experimentation is to build additional components for testing new approaches on top of the existing infrastructure, e.g. using a layer with proxies.

Updating technical guidelines, or the legal framework, should only be considered once the benefits of the change for users of INSPIRE have been proven in practice and clearly outweigh the cost of adapting/migrating the infrastructure for all data providers, service providers, users and software vendors/developers to new revisions.

This report is part of the work on “Location Interoperability tools” in the ELISE initiative of the European Commission.

It targets everyone interested in the technical evolution of INSPIRE and the ways in which spatial data is published and used. This includes those responsible for Spatial Data Infrastructures (SDI) as well as providers and users of spatial data. A general understanding of the Web architecture and the INSPIRE technical guidance is assumed.


1 SCOPE

This report is part of the work on “Location Interoperability tools” in the ELISE initiative. It

analyses how the emerging best practices and tools for sharing Spatial Data on the Web can complement the existing INSPIRE technical guidelines;

provides recommendations on how to remove obstacles for 'mainstream' Web developers and users that limit the discovery, access and use of INSPIRE data.

The document targets everyone interested in the technical evolution of INSPIRE and the ways in which spatial data is published and used. This includes those responsible for Spatial Data Infrastructures (SDI) as well as providers and users of spatial data. A general understanding of the Web architecture and the INSPIRE technical guidance is assumed.

This work builds on the results of the “Spatial Data on the Web” testbed commissioned by Geonovum (Netherlands) [1] and the related work by W3C and OGC on Best Practices for sharing Data on the Web [2] and Spatial Data on the Web [3]. It also includes results of the discussions at the “INSPIRE – What if…?” workshop at the March 2017 OGC TC meeting [4].

This document does not consider how to address mechanisms for authentication and access control.

This document is intended to facilitate a discussion about additional and voluntary guidance in order to improve the readiness of INSPIRE data for the Web. Any potential change proposals to Technical Guidance documents should only be considered by the INSPIRE Maintenance and Implementation Group after thorough experimentation and evidence that new solutions bring clear benefits to users.

2 MOTIVATION AND OVERVIEW

Access to data is increasingly happening over the Web and this trend will continue. Web technologies are also key to several other technological trends which are important in the context of spatial data, e.g. Cloud offerings or the Internet-of-Things (IoT). Therefore, sharing data should be done in ways that are consistent with the practices that reflect how the Web is used today.

SDIs started to use web services and open formats already in the late 1990s / early 2000s. However, the Web and the related practices and expectations with respect to data “on the Web” have evolved during this period, too.

Chapter 4 describes and illustrates some challenges with sharing spatial data on the Web.

Chapter 5 is an analysis of the two relevant Best Practice documents related to publishing spatial data on the Web and how they relate to the current INSPIRE Technical Guidance documents as well as the INSPIRE implementation in practice:

Data on the Web Best Practices, W3C Recommendation, 31 January 2017 [DWBP]

Spatial Data on the Web Best Practices, W3C Working Group Note, 11 May 2017 [SDWBP]

This analysis focuses on the emerging broader picture and is intended as a basis to discuss and agree on which gaps / incompatibilities / opportunities should be explored in more detail. A result can be, for example, ideas for experiments, proposals for potential changes to the existing technical guidelines or proposals for new technical guidelines or documentation.

The current focus is on improvements related to discovery (e.g. indexing by search engines), linkability (persistent URIs for datasets and spatial objects) and sharing meaningful content for humans and non-geo-developers (useful HTML, easy to use "API" / formats / CRS).

Chapter 6 discusses recommendations for technical measures, by considering how to address key gaps identified in the assessment of the Best Practice documents and the current INSPIRE Technical Guidelines.

The recommendations are based on the assumption that any changes to INSPIRE will follow an evolutionary path. INSPIRE should build on the existing legal and guidance framework, and – at the same time – create the capability to explore and test new technical options.

A useful approach to such experimentation is to build additional components for testing new approaches on top of the existing infrastructure as this will also expose compatibility issues, while providing a re-usable implementation option.

Updating technical guidelines, or the legal framework, should only be considered once the benefits of the change for users of INSPIRE have been proven in practice and clearly outweigh the cost of adapting/migrating the infrastructure for all data providers, service providers, users and software vendors/developers to new revisions. This applies in particular in the case of changes that would require modifications in applications using data from INSPIRE.

Success criteria need to be agreed and used to measure value. Conformance and interoperability should not be goals in themselves; the ultimate goal is ready access to and use of the content of INSPIRE.

3 TERMS AND DEFINITIONS

3.1 General remarks

This document assumes familiarity with the INSPIRE framework including its legal documents and the Technical Guidelines. It also assumes a general understanding of the architecture of the Web and the terminology used in this context. However, some of the terms used in the Best Practice documents are relatively new and need an introduction.

3.2 Spatial Thing

The Spatial Data on the Web Best Practices basically use the term “Spatial Thing” where ISO/TC 211 and OGC use “feature” and INSPIRE uses “spatial object”. The main reason is that “feature” is understood by many outside of the geo-community as a capability of a system, application or component.

Another subtle point is that the term “Spatial Thing” is meant to refer to both the real-world phenomenon and its abstraction (i.e., the feature). However, this aspect has almost no practical impact as it is normal in models of spatial data to associate both properties of the real-world phenomenon (e.g. its height) and of the digital record (e.g., timestamp of the last update) with a feature.


A consequence of this in INSPIRE is the distinction between external object identifiers (the identifier of a feature) and thematic identifiers (the identifier of a real-world phenomenon). A feature in INSPIRE may carry both identifiers. Multiple features of the same real-world phenomenon may exist in different datasets (e.g., at different scales or maintained by different authorities) and, although they will have different external object identifiers, they should use the same thematic identifier.
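The relationship between the two kinds of identifier can be illustrated with a minimal sketch. The class, the field names, the URIs and the identifier values below are hypothetical and chosen only for illustration; they are not taken from any INSPIRE schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A feature (spatial object) as published in one dataset."""
    external_object_id: str  # identifies this digital record, unique per feature
    thematic_id: str         # identifies the real-world phenomenon
    dataset: str


def group_by_phenomenon(features):
    """Group features from different datasets by their shared thematic identifier."""
    groups = defaultdict(list)
    for feature in features:
        groups[feature.thematic_id].append(feature)
    return dict(groups)


# Hypothetical example: the same lake is represented in two datasets at different scales.
features = [
    Feature("https://example.org/id/hydro-250k/lake.17", "LAKE-0042", "hydro-250k"),
    Feature("https://example.org/id/hydro-10k/waterbody.9313", "LAKE-0042", "hydro-10k"),
    Feature("https://example.org/id/hydro-10k/waterbody.9400", "LAKE-0077", "hydro-10k"),
]

groups = group_by_phenomenon(features)
same_lake = [f.dataset for f in groups["LAKE-0042"]]
print(same_lake)
```

In this sketch the external object identifiers all differ, while the shared thematic identifier is what allows a client to recognise that two records describe the same lake.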

4 SHARING SPATIAL DATA – THE CHALLENGE

4.1 Why are traditional Spatial Data Infrastructures not enough?

The Data on the Web Best Practices (DWBP) provide a set of recommendations that are applicable to the publication of all types of data on the Web. Those best practices cover aspects including data formats, data access, data identifiers, metadata, licensing and provenance.

The Spatial Data on the Web Best Practices (SDWBP) add detail and additional best practices to provide more specific guidance for spatial data with the goal to integrate spatial data better within the wider Web of data. The Spatial Data on the Web Best Practices discuss why “traditional” SDIs – like INSPIRE – are not enough to meet the needs or expectations of developers and users familiar with today’s Web:

Finding, accessing and using data disseminated through SDIs based on OGC Web services is difficult for non-expert users. There are several reasons, including:

In SDIs, catalog services are intended to be used for discovering spatial assets, not the general-purpose search engines of the Web. OGC Web services do not address indexing of their content by those search engines.

By design, the catalog services only provide access to metadata - and in general metadata that is focused on the needs of expert users - not the data itself.

Users cannot just “follow links” to access data, it is typically necessary to construct some kind of query to access data. Often these queries are complex to define, requiring in-depth knowledge of both data structure and the domain-specific query language.

In addition, it is often difficult for non-expert users to understand and use the data. This can be caused, in part, by complexities specific to the geospatial domain that are difficult for non-experts (e.g., the handling of coordinates in different coordinate reference systems), but hard to avoid entirely. At the same time, datasets often address requirements of expert communities with diverse needs, resulting in comprehensive, but complex specifications that cover many edge cases, too. Moreover, geospatial data is typically available in formats that are not easy to process for non-expert users.
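To make the point about constructing queries concrete, the following sketch builds a WFS 2.0 GetFeature request using key-value-pair encoding. The endpoint URL and the feature type name are assumptions made for illustration; a real request needs the values advertised by the actual service.

```python
from urllib.parse import urlencode


def wfs_getfeature_url(endpoint, type_name, count=10):
    """Build a WFS 2.0 GetFeature request using key-value-pair encoding."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,  # the user must already know the feature type name
        "count": count,          # limit the number of features returned
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and feature type, for illustration only.
url = wfs_getfeature_url("https://example.org/wfs", "ad:Address", count=5)
print(url)
```

Even this simplest form requires the user to know the protocol, its version and the name of the feature type in advance, which is different from following a link found on a web page.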

However, SDIs are a key component of the broader spatial data ecosystem. Such infrastructures typically include policies, workflows and tools related to the management and curation of spatial datasets, and provide mechanisms to support the rich set of capabilities required by the expert community of data providers and users. Our goal is to help spatial data publishers build on these foundations to enable the spatial data from SDIs to be fully integrated with the Web of data.


When your starting point is an SDI, you should at least read the following best practices. These provide the most important extra steps that should be taken to bring spatial data from SDIs to the Web:

Best Practice 1: Use globally unique persistent HTTP URIs for Spatial Things

Best Practice 2: Make your spatial data indexable by search engines

Best Practice 3: Link resources together to create the Web of data

Best Practice 12: Expose spatial data through 'convenience APIs'

The rest of the best practices provide more detail on specific aspects of publishing spatial data on the Web, such as metadata, geometries, CRS information, versioned data, and so on.

To illustrate the difficulties for non-expert users, Table 1 compares the typical steps for discovering and evaluating data in INSPIRE with how most people would expect to find data on the web.

Table 1: Discovering, evaluating and using datasets

Typical steps in INSPIRE¹

1. Open the geoportal in a web browser using its URL
2. Navigate to dialog that supports searching datasets
3. Enter search parameters, typically using structured search parameters (using the free text search “addresses Netherlands” returns no results, but an address dataset can be found using the structured search selecting the data theme “addresses” and the origin “Netherlands”)
4. Browse through results and determine dataset to evaluate in more detail
5. Copy WFS GetCapabilities URL, open a WFS client and use the client to access the data
6. Analyse the dataset to determine, if it provides the necessary information and if yes, use the client to download it or use the WFS interface to access the data
7. Use the address data in the application

Typical steps expected by Web users

1. Enter search terms, e.g. “address dataset Netherlands” in the address/search bar of a web browser
2. Browse through results and determine, if one is an address dataset covering the Netherlands or has a link to such a dataset
3. Browse through the dataset to determine if it provides the necessary information and if yes, click to download it or study the online API documentation and examples to access the data
4. Use the address data in the application

1 The future INSPIRE geoportal under development is expected to support these steps in a more seamless way.

For the non-expert, the process in INSPIRE is much harder and almost impossible to complete without additional help for several reasons:

Users need to know in advance that data has to be searched for by using a geoportal.

Browsing through the dataset descriptions based on the structured ISO 19115 metadata is helpful for GIS / SDI experts, but can be confusing for non-experts.

Using an OGC web service GetCapabilities URL requires prior knowledge, as the URL just returns an XML document, but no information about how to process it. In addition, a specific client is needed to access the dataset as there is no link to the data from the capabilities document.

To access and evaluate the dataset, a client supporting the OGC web service URL is needed and is, in general, not readily available to non-experts.

Spatial data, including in INSPIRE, is often presented in ways that are not easy to understand by non-experts. Often it is necessary to study several documents (INSPIRE technical guidance documents, standards) to understand the data.
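The point about the capabilities document can be illustrated with a short sketch of what a client has to do before it can request any data. The XML below is a heavily simplified, hypothetical excerpt of a WFS 2.0 capabilities response; real documents are much larger and the feature type names depend on the service.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical excerpt of a WFS 2.0 capabilities document.
CAPABILITIES_XML = """
<wfs:WFS_Capabilities xmlns:wfs="http://www.opengis.net/wfs/2.0" version="2.0.0">
  <wfs:FeatureTypeList>
    <wfs:FeatureType>
      <wfs:Name>ad:Address</wfs:Name>
      <wfs:Title>Addresses</wfs:Title>
    </wfs:FeatureType>
    <wfs:FeatureType>
      <wfs:Name>ad:ThoroughfareName</wfs:Name>
      <wfs:Title>Thoroughfare names</wfs:Title>
    </wfs:FeatureType>
  </wfs:FeatureTypeList>
</wfs:WFS_Capabilities>
"""

NS = {"wfs": "http://www.opengis.net/wfs/2.0"}


def list_feature_types(capabilities_xml):
    """Return (name, title) pairs for the feature types offered by the service."""
    root = ET.fromstring(capabilities_xml.strip())
    result = []
    for feature_type in root.findall("wfs:FeatureTypeList/wfs:FeatureType", NS):
        name = feature_type.findtext("wfs:Name", namespaces=NS)
        title = feature_type.findtext("wfs:Title", default="", namespaces=NS)
        result.append((name, title))
    return result


feature_types = list_feature_types(CAPABILITIES_XML)
print(feature_types)
```

The document lists what the service offers, but contains no link that leads directly to the data; the client still has to construct a separate request for each feature type.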

It should be noted that there have been attempts in the past to improve the search for content that is shared via OGC web services. An example is the project BOLEGWEB (BOrderLEss Geospatial WEB)². However, most of these activities focussed on the metadata only. Harvesting catalogues and the capabilities of OGC Web Services and publishing them in HTML is only a small part of what is described in the Spatial Data on the Web Best Practices. The main challenges of making the data itself discoverable are not addressed by any approach restricted to metadata about datasets or services. For example, the BOLEGWEB catalog will help users who know how to handle OGC Web Services, but it does not let them (or web crawlers) discover or access any data.

4.2 Datasets and distributions

4.2.1 The status quo

A keyword search in most geoportals returns several results for a single dataset, typically at least:

The result for the dataset

One or more results for view services³

One or more results for download services⁴

The relationship between these results – i.e., that they represent the same dataset – is typically not obvious to the user.

2 https://bolegweb.geof.unizg.hr/

3 In INSPIRE, view services allow users to have a sample of what the dataset looks like without being able to query it.

4 In INSPIRE, download services allow users to select data in a dataset and download it.


From a user perspective, in particular a user that is not familiar with SDIs, this is confusing. For most users, the services are “just” different ways to share the dataset.

In most cases, users will search for data – after all, it is a spatial data infrastructure. Metadata and view/download services only exist to support discovering, evaluating and accessing the data.

An expert user or a tool may be interested only in the subset of datasets that may be distributed via a WFS or WMS, because this is what suits their intended use and their tooling. In general, however, no one outside of the SDI community will search for a service but, rather, only for data.

4.2.2 Publishing datasets in the Data on the Web Best Practices

The Data on the Web Best Practices distinguish between “datasets” and “distributions”. The concepts and their definitions are taken from the Data Catalog Vocabulary (DCAT)⁵ with the following definitions:

Dataset: a collection of data, published or curated by a single agent, and available for access or download in one or more formats; a dataset does not have to be available as a downloadable file

Distribution: represents a specific available form of a dataset; each dataset might be available in different forms; these forms might represent different formats of the dataset or different endpoints; examples of distributions include a downloadable CSV file, an API or an RSS feed

This implies that, to most users, a service (such as a WFS or a WMS) is simply an additional distribution of a dataset.
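The dataset/distribution relationship described above can be sketched as a small data structure. The classes only loosely mirror dcat:Dataset and dcat:Distribution; the field names, the publisher and all URLs are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Distribution:
    """One specific available form of a dataset (cf. dcat:Distribution)."""
    title: str
    access_url: str
    media_type: str


@dataclass
class Dataset:
    """A collection of data published by a single agent (cf. dcat:Dataset)."""
    title: str
    publisher: str
    distributions: list = field(default_factory=list)


def search_hit(dataset):
    """Render one search result per dataset, listing its distributions."""
    lines = [f"{dataset.title} ({dataset.publisher})"]
    for dist in dataset.distributions:
        lines.append(f"  - {dist.title}: {dist.access_url} [{dist.media_type}]")
    return "\n".join(lines)


# Hypothetical dataset with a file download and two services as distributions.
addresses = Dataset(
    title="Addresses",
    publisher="Example Mapping Agency",
    distributions=[
        Distribution("CSV download", "https://example.org/data/addresses.csv", "text/csv"),
        Distribution("WFS", "https://example.org/wfs", "application/gml+xml"),
        Distribution("WMS", "https://example.org/wms", "image/png"),
    ],
)

hit = search_hit(addresses)
print(hit)
```

With this structure, a search returns a single result for the dataset, and the services appear as alternative ways of obtaining it rather than as separate results.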

See Figure 1 for the role of these concepts in the context of publishing a dataset.

Figure 1 – Publication of a dataset (from the Data on the Web Best Practices)

5 https://www.w3.org/TR/vocab-dcat/

4.2.3 Publishing datasets in the ISO and OGC standards

Today’s SDIs follow similar concepts, but with subtle differences. While the DCAT definition of dataset is, in general, compatible with the definition in INSPIRE (a spatial data set is an identifiable collection of spatial data), which in turn was taken from ISO 19115, the concept of distributions is different. While “distributions” are also part of ISO 19115:2003 (see Figure 2), there are differences:

Only one distribution exists per dataset, meaning that ISO 19115:2003 clearly has a different understanding of “distributions” than DCAT or the W3C Data on the Web Best Practices6.

Another clear difference is that information about available formats is not related to information about referenced online resources and it is unclear which link supports which format(s).

The ISO 19115 model is mainly designed to support offline distributions via hardware media like magnetic disks or tapes. See the data type MD_Medium and the values of the code lists MD_MediumFormatCode and MD_MediumNameCode. The data type CI_OnlineResource is basically just a URI with some optional descriptive free text fields that are of limited use to software agents.

Figure 2 - Distributions in ISO 19115:2003

In addition, the OGC web service standards do not use the concept of a dataset, so it is unclear whether an OGC web service operates on one or more datasets, and in the latter case which of the data belongs to which dataset.

6 The new revision ISO 19115-1:2014 supports multiple distributions per dataset.

4.2.4 Publishing datasets in INSPIRE

The INSPIRE Regulation on metadata7 does not make explicit use of the concept of distributions and only requires the capability to associate URLs that link to the dataset or to information about the dataset. That is, some of the links may point to distributions, but there is no way for software to determine which ones.

Version 2.0 of the Technical Guidelines for metadata recommends providing additional information for humans (e.g., name and description) and using one of the following resource locators (TG Recommendation 1.8):

direct access for downloading the described data set,

a service metadata (capabilities) document of a Spatial Data Service used for providing this data set,

a service WSDL document of a Spatial Data Service used for providing this data set (if SOAP binding is available),

a client application that directly accesses the described data set, or

a web page with further instructions for accessing the described data set.

This implies that dataset metadata in INSPIRE may provide one or more links to related resources, but it does not require information about the semantics of the link (a file, a web service or an API, an application, a web page, etc.) nor about the formats / media types of the linked resource.

On the other hand, spatial data services that operate on one or more datasets must reference the dataset using the “coupled resource” element.

In the sense of the Data on the Web Best Practices, the distributions in INSPIRE are the view and download services (or other spatial data services providing some kind of access to a dataset).

As a result, INSPIRE metadata also includes the relationship between a dataset and its distributions (the download and view services), but it is the distributions that reference the dataset, not the other way round. In addition, a service (distribution) may operate on more than one dataset.
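Because the reference points from the service to the dataset, a client that wants to list the distributions of a dataset has to invert the relationship. The following Python sketch illustrates this under stated assumptions: the record structure, identifiers and the key name `coupled_resources` are hypothetical stand-ins for parsed service metadata, not an actual INSPIRE or ISO encoding.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical, already-parsed service metadata records. Each service lists
# the datasets it operates on, mirroring the direction of the
# "coupled resource" element (service -> dataset).
services = [
    {"id": "wfs-1", "type": "download", "coupled_resources": ["ds-roads", "ds-rail"]},
    {"id": "wms-1", "type": "view", "coupled_resources": ["ds-roads"]},
]


def distributions_per_dataset(service_records: List[dict]) -> Dict[str, List[str]]:
    """Invert service -> dataset references into dataset -> services."""
    index = defaultdict(list)
    for service in service_records:
        for dataset_id in service["coupled_resources"]:
            index[dataset_id].append(service["id"])
    return dict(index)


by_dataset = distributions_per_dataset(services)
```

In this sketch, `by_dataset["ds-roads"]` lists both services, and the download service appears under two datasets, which reflects that a service may operate on more than one dataset.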

It should be noted that GeoDCAT-AP [5] seems to take a different view and does not consider the download and view services as distributions.

4.3 “INSPIRE datasets”

This gets more confusing to users in the cases where a data provider has decided to publish the distribution of a dataset that conforms to the INSPIRE data specifications as a new dataset.

If a data provider transforms the information in a dataset to provide a representation that is based on the INSPIRE application schemas and, for example, encodes the data according to the INSPIRE GML application schemas, this data is not a new dataset, it is a new distribution of the existing dataset8.

7 See http://inspire.ec.europa.eu/Legislation/Metadata/6541

8 In other cases, where INSPIRE is used to reorganise the existing datasets of the organisation (for example, merging information from two original datasets to cover the scope of a data theme as well as possible), the resulting dataset can be considered a new dataset.

The INSPIRE Directive, as well as the Implementing Rules and Technical Guidelines, distinguish whether a dataset is in scope of INSPIRE or not, but publishing an existing dataset that is in scope of INSPIRE in accordance with the requirements of the Implementing Rules and/or Technical Guidelines does not create a new dataset.

Sometimes the term “INSPIRE dataset” is used (which is not defined by INSPIRE itself). When this is the case, it is typically used in one of the following meanings:

for a dataset that is within the scope of INSPIRE;

for a new dataset distribution that has been created for INSPIRE.

The second meaning arguably differs at least from how it was foreseen in the INSPIRE Directive, which states that “spatial data sets shall be made available in conformity with the implementing rules either through the adaptation of existing spatial data sets or through transformation services [enabling spatial data sets to be transformed with a view to achieving interoperability]”. That is, a transformed, interoperable distribution is still considered to be the same dataset.

This also implies that the INSPIRE view and download services should be coupled to the existing dataset (in the “coupled resource” metadata element).

In practice, not all data providers seem to follow this approach; some publish the representation of an existing dataset that meets the INSPIRE interoperability requirements as a new dataset in their catalogues.

Note that in that case, a geoportal user will get even more results for the same dataset from the geoportal search, as existing data in an organisation’s format and the INSPIRE dataset are both likely to appear.

In the terminology of the Data on the Web Best Practices, the INSPIRE view and download services are simply new distributions of the existing dataset. This should be kept in mind when considering how to share spatial data on the Web.

4.4 The “INSPIRE – What If …?” workshop

INSPIRE implementation is now progressing across the EU, and INSPIRE data, services and principles are being proposed for ensuring interoperability across sectors. At the same time, new digital technologies (smartphones, 5G mobile networks, cloud computing, Internet of Things, e-platforms …) are transforming the economy and society and are imposing new policy challenges and opportunities. In this context, the workshop aimed at having an open discussion and brainstorming about what the INSPIRE infrastructure could look like if we had to design it today, integrating new data sources and exploiting new ICT opportunities (with a time horizon for implementation by 2025-2030).

10 position papers9 were received before the workshop addressing one or several of the following questions:

What standards and technologies should the infrastructure be based on?

What architectural pattern would you recommend? What should be the main components of the infrastructure?

9 The call is available here: http://inspire.ec.europa.eu/sites/default/files/inspire_what_if._call_for_position_papers.pdf

How would you organise the implementation process and make it cost-efficient?

How would you ensure a wide adoption and use of the infrastructure?

The workshop was attended by more than 40 participants and featured short position statements from the authors of the position papers and discussions in break-out groups on the four topics listed above.

The main conclusions and recommendations from the discussions were:

There were no revolutionary ideas (in the discussions or position papers).

No one knows what the future will bring, so rather than focusing on a reorientation towards a particular set of new technologies, we need to make sure that the infrastructure is flexible enough to allow for technological change (whatever it may be).

Any changes in the infrastructure should be step-wise and based on experimentation and proven benefits (compared with the current approach).

We need to define success factors for INSPIRE, and it should not be just "compliance".

We may need a more flexible system for technical compliance to lower the entry barrier, e.g. similar to the 5-star system for linked open data.

Use more web standards (e.g. dereferenceable http URIs, RESTful APIs and JSON) in addition to, or on top of, the current SDI technologies and standards.

Good examples and reference implementations will make implementation easier.

5 BEST PRACTICES FOR SHARING SPATIAL DATA

5.1 Overview

This chapter highlights information from the Data on the Web Best Practices (DWBP) and the Spatial Data on the Web Best Practices (SDWBP) that is relevant for INSPIRE. The chapter is organised by the topics that the Spatial Data on the Web Best Practices document uses to group the practices.

For each topic, the best practices are listed and an assessment is provided, outlining if and how INSPIRE implements the best practice.

In addition, where this document includes a recommendation on how to address a gap, the recommendation is referenced.

In order to provide links to the approaches and tools that stakeholders are using for implementation, links are made to the INSPIRE in Practice platform10, which provides vocabularies of tasks for implementing INSPIRE and for using INSPIRE data. Where applicable, the relevant tasks in the vocabularies are identified, too.

10 https://inspire-reference.jrc.ec.europa.eu/

5.2 Web principles for spatial data

5.2.1 Spatial data Identifiers

Identifiers are a key aspect: using persistent HTTP URIs that can be dereferenced is a precondition for sharing data on the Web.

This topic is related to the implementation task “implement identifier management and life-cycle rules” (dsi-iop-ids).

Such considerations should be seen alongside the management issues for persistent identifiers, such as their governance, organisation, financing and architecture, as previously discussed in the context of ARE3NA (ISA Action 1.17)11.

Table 2: DWBP 9

DWBP 9: Use persistent URIs as identifiers of datasets

Identify each dataset by a carefully chosen, persistent URI.

Key statements:

Persistent identifiers are an essential pre-condition for proper data management and reuse.

Developers may build URIs into their code and so it is important that those URIs persist and that they dereference to the same resource over time without the need for human intervention.

Datasets or information about datasets will be discoverable and citable through time, regardless of the status, availability or format of the data.

Assessment:

The dataset metadata must contain a “unique resource identifier”, but its value does not have to be, and in general is not, an HTTP URI. In that case it cannot be dereferenced, and persistency is not required.

In some sense, there are other potential candidates like the URL of the download service but, in general, these are not persistent either and are not identifiers of the dataset. In general, such identifiers may (and probably will) change with time due to, for example, new service URLs, changes in how the data is organised and changes in the service version supported by a service.

Links to related recommendations:

6.3 Access to a dataset distributed using WFS via a “landing page”
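One way to keep a dataset identifier stable while service URLs change is to separate the persistent URI from the current location and to resolve one to the other by redirection. The following Python sketch illustrates the idea only; the URIs, the registry and the choice of an HTTP 303 (See Other) response are assumptions for illustration and are not prescribed by INSPIRE or by this document.

```python
from typing import Dict, Optional, Tuple

# Hypothetical registry: persistent dataset URI -> current landing page URL.
REDIRECTS: Dict[str, str] = {
    "https://data.example.org/id/dataset/roads":
        "https://services.example.org/v1/roads/landing-page.html",
}


def resolve(persistent_uri: str,
            redirects: Dict[str, str] = REDIRECTS) -> Tuple[int, Optional[str]]:
    """Return (HTTP status code, Location) for a persistent URI."""
    target = redirects.get(persistent_uri)
    if target is None:
        return 404, None
    return 303, target  # 303 See Other: the description lives elsewhere


def move_service(persistent_uri: str, new_target: str,
                 redirects: Dict[str, str] = REDIRECTS) -> None:
    """Record a new location; the persistent URI itself never changes."""
    redirects[persistent_uri] = new_target
```

When the provider moves to a new service URL or service version, only the registry entry is updated via `move_service`; URIs that developers have built into their code continue to dereference.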

Table 3: DWBP 10, SDWBP 1

DWBP 10: Use persistent URIs as identifiers within datasets

Reuse other people's URIs as identifiers within datasets where possible.

SDWBP 1: Use globally unique persistent HTTP URIs for Spatial Things

11 See https://joinup.ec.europa.eu/asset/are3na-reuse/document/governance-persistent-identifiers

Use stable HTTP URIs to identify Spatial Things, re-using commonly used URIs where they exist and it is appropriate to do so.

Key statements:

Intended Outcome: Spatial Things become part of the Web’s global information space enabling them to be linked with other Spatial Things and other resources and for those links to be durable. In other words, spatial data becomes part of the Web of Data.

Assessment:

While INSPIRE recommends implementing this practice for spatial objects12, i.e., the items in a spatial dataset, this guidance is typically not followed:

While a property “inspireId” exists in most spatial object types in the INSPIRE application schemas, it is typically optional. In practice, many datasets published using the INSPIRE application schemas do not provide persistent identifiers.

Where identifiers are provided, most data comes with identifiers for spatial objects that are not URIs, and it is unclear how persistent these identifiers really are.

When data is published using a Direct Access Download Service using WFS 2.0, the gml:id attribute of each feature must be stable over time, too, to comply with the WFS requirements. This means that these identifiers may be used to persistently identify a spatial object, too. It is unclear, however, if data publishers implement their workflows accordingly.

Links to related recommendations:

6.2 Download services supporting a proxy layer

6.3 Access to a dataset distributed using WFS via a “landing page”
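Where an “inspireId” with a namespace and a local identifier is available, an HTTP URI for the spatial object could be derived from it. The Python sketch below shows one possible pattern; the base URI, the path layout and the sample values are assumptions for illustration and do not represent a pattern mandated by INSPIRE.

```python
from urllib.parse import quote

# Hypothetical base URI under the control of the data provider.
BASE_URI = "https://data.example.org/id"


def spatial_object_uri(namespace: str, local_id: str, base_uri: str = BASE_URI) -> str:
    """Build an HTTP URI from the namespace and local identifier of an object.

    Both parts are percent-encoded so that characters such as '/' or spaces
    in an identifier cannot change the structure of the URI path.
    """
    return f"{base_uri}/{quote(namespace, safe='')}/{quote(local_id, safe='')}"
```

For example, `spatial_object_uri("EX.ROADS", "segment-42")` yields `https://data.example.org/id/EX.ROADS/segment-42`. Such a URI is only useful if the provider also keeps the underlying identifier stable over time and makes the URI dereferenceable.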

Table 4: DWBP 11

DWBP 11: Assign URIs to dataset versions and series

Assign URIs to individual versions of datasets as well as to the overall series.

Key statements:

Intended Outcome: Humans and software agents will be able to refer to specific versions of a dataset and to concepts such as a 'dataset series' and 'the latest version'.

Assessment:

Currently no guidance exists on how to identify individual versions of a dataset in INSPIRE.

For large or frequently updated datasets it may often be unrealistic to provide more than the latest version of a dataset.

12 http://inspire.ec.europa.eu/ids

INSPIRE does not have any recommendations related to long-term data curation and best practices are not being shared.

Links to related recommendations:

None. It would first need to be clarified whether there is a need for such URIs.

5.2.2 Indexable data

Search engines are the common starting point for people looking for content on the Web. However, as far as search engines are concerned, something is only 'on the Web' if it has an HTTP URI and, when this URI is dereferenced, information is returned, usually in the form of an HTML page.

This topic is not related to any INSPIRE implementation tasks, but it can be related to the generic tasks related to making use of INSPIRE, namely:

“use other ways to discover data, map layers or services” (u-ps-fe-soth, not yet online); and

“evaluate if the data, map layers and services are suited for the objectives of use” (u-ps-fe-eval, not yet online).

Table 5: SDWBP 2

SDWBP 2: Make your spatial data indexable by search engines

Search engines should be able to crawl spatial data on the Web and index Spatial Things for direct discovery by users.

Key statements:

To enable humans and Web-crawlers to find HTML pages for the Spatial Things, the "landing page" [of a dataset] needs to include hyperlinks that can be followed. Where you have a larger collection of Spatial Things, you should support paging through the collection.

A pre-condition for this best practice is SDWBP 1: Use globally unique persistent HTTP URIs for Spatial Things, as persistent identifiers are essential to support reliable indexing and linking. Traditionally, spatial datasets have not been maintained with stable identifiers for Spatial Things, but stable identifiers are a must for sharing spatial data on the Web. Sharing spatial data is more than "just" making the dataset available on the Web.

It is important to keep in mind that the HTML representations should not mainly be designed for search engines. Instead, they should present the data in a clear and understandable way to human users. The webpage about a given Spatial Thing should be useful to a user so that they link to it when they share information about that Spatial Thing. This behaviour will, typically, also improve the ranking of these pages in search results.

Assessment:

This is not implemented by INSPIRE but could be implemented on top of the current INSPIRE service network (with limitations).

A planned ELISE deliverable, Improved interoperability tools for publishing spatial data on the Web (D2.1.2), continues investigating this best practice by building on the results of the Geonovum testbed and the discussions in the W3C/OGC Spatial Data on the Web Working Group.

Links to related recommendations:

6.2 Download services supporting a proxy layer

6.3 Access to a dataset distributed using WFS via a “landing page”

6.4 Data formats for vector data - HTML

6.9 Sitemaps
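As a hedged illustration of how pages for Spatial Things could be announced to crawlers, the following Python sketch writes a minimal sitemap in the format of the Sitemaps protocol. The page URLs are hypothetical, and the sketch ignores the size limits per sitemap file that a large collection would need to respect by splitting the list over several files.

```python
import xml.etree.ElementTree as ET
from typing import Iterable

# Namespace defined by the Sitemaps protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(page_urls: Iterable[str]) -> str:
    """Return a sitemap document listing one <url><loc> entry per page."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for page_url in page_urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page_url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical HTML pages, one per spatial object.
sitemap_xml = build_sitemap([
    "https://data.example.org/id/EX.ROADS/segment-41",
    "https://data.example.org/id/EX.ROADS/segment-42",
])
```

Each listed URL is expected to return an HTML page for the spatial object when dereferenced, which in turn depends on the persistent identifiers discussed in the previous section.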

5.2.3 Linking data

Links are important. They enable humans and software tools to discover other, related resources. Search engines use links to prioritise and refine search results.

So far, linking data to other data is an unusual practice for most organisations maintaining spatial data. If links are included, they mostly occur between resources in the same dataset. For this reason, the Spatial Data on the Web Best Practices considers “the widespread use of links within data … as one of the most significant departures from contemporary practices used within SDIs”.

Linking requires the implementation of the Best Practices discussed in the previous sections, i.e. persistent identifiers for all resources that may be targets of links.

Table 6: SDWBP 3

SDWBP 3: Link resources together to create the Web of data

Bind Spatial Things into the Web of data using links to other resources, providing sufficient information for a user to determine whether the target resource specified in a link will be of use.

Key statements:

Use formats that support Web linking.

Links can be identified and traversed by humans and software agents.

Sufficient information is provided to help humans and software agents determine whether the traversal of a given link meets their goals.

Assessment:

While GML is a format that is designed to support Web linking, those capabilities are often used only to a limited extent:

Most links foreseen in the INSPIRE application schemas are within a dataset.

In addition, in practice, where links to spatial objects in other datasets are foreseen, those are rarely populated.

Adding additional links beyond those included in the INSPIRE application schemas requires a schema extension, which is an obstacle for providing additional links to other resources.

This can imply that while linking is or can be supported, it is not common practice. Providers of spatial data very rarely add, maintain and publish links to related resources, in particular to information outside of the dataset.

An exception to some extent is the use of thematic identifiers, which in some way reference the real-world object – and other information resources that also use that identifier. However, since these are not persistent HTTP URIs that can be dereferenced, their usability from a Web perspective is limited.

Links to related recommendations:

6.4 Data formats for vector data - HTML and 6.6 Data formats for vector data – RDF / JSON-LD include recommendations for adding some dynamic links. Other links would first need to be added to (and maintained in) the datasets.

5.3 Spatial data

5.3.1 Spatial data encoding

The Best Practices do not recommend any particular format. Different formats suit different purposes and it is important to consider how the use of the data can be made as simple as possible for developers and end users.

This topic is related to the INSPIRE implementation tasks “use encoding specified in technical guidance” (dsi-iop-enc-tg) and “use additional encodings” (dsi-iop-enc-add).

Table 7: DWBP 12, DWBP 14, SDWBP 4

DWBP 12: Use machine-readable standardized data formats

Make data available in a machine-readable, standardized data format that is well suited to its intended or potential use.

DWBP 14: Provide data in multiple formats

Make data available in multiple formats when more than one format suits its intended or potential use.

SDWBP 4: Use spatial data encodings that match your target audience

Represent spatial data in a way that matches the needs of the target audiences.

Key statements:

Machines will easily be able to read and process data published on the Web and humans will be able to use computational tools typically available in the relevant domain to work with the data.

Make data available in a machine-readable standardized data format that is easily parseable including but not limited to CSV, XML, HDF5, JSON and RDF serialisation syntaxes, such as RDF/XML, JSON-LD or Turtle.

Spatial data is used by a range of user communities, each with their own purposes, knowledge and preferred tools. Data publishers should consider which communities and purposes they want to mainly serve and make appropriate choices for the approach to encoding data. In general terms, data’s usefulness and value is increased when it can be used for more purposes. This might involve providing data in several different formats.

As many users as possible will be able to use the data without first having to transform it into their preferred format.

Assessment:

DWBP 12 is implemented using XML-based, standardised formats as the default encoding. GML is used as the default encoding for data (including standardised GML application schemas like OM-XML for observations and measurements), and the ISO/TS 19139 schemas are used for metadata.

For raster data with 2D grids (elevation, orthophotos), INSPIRE uses TIFF / GeoTIFF.

These are encodings that are commonly used in SDIs and by the expert community. Tools and libraries exist to handle the data, although limitations exist, for example, if XML structures become complex.

The discussions at the “INSPIRE - What if …”-workshop in March 2017 indicated that INSPIRE should explore additional, simpler feature encodings (at a level of complexity of the GML Simple Feature Level 0 profile / GeoJSON), possibly dropping some information that is rarely available or rarely used, to simplify the use of the data.

Additional formats beyond the standard encoding are explicitly allowed by INSPIRE, although they are rarely being provided by communities and individual data providers.

However, in the “Fitness for purpose” activity that is part of the current INSPIRE Maintenance and Implementation Work Programme13, the development of additional simplified encodings is being discussed as part of a new work item.

As many developers currently favour JSON over XML, providing data also in a JSON-based format like GeoJSON would help developers with a preference for JSON / JavaScript.

For those developers, image formats other than TIFF (e.g., JPEG or PNG) would be easier to handle and render in a JavaScript environment, too.

In addition, RDF-based formats are under investigation as potential additional formats in INSPIRE to target the open data / e-government communities and simplify the use of spatial data for them14.

In order for spatial data to be used easily and reliably by target users, it is important that they have access to libraries or other software components that support the formats that are easy for them to use. In general, their availability should be considered before adding an additional encoding.

Links to related recommendations:

6.5 Data formats for vector data - GeoJSON

6.6 Data formats for vector data – RDF / JSON-LD
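To illustrate what a simpler, JSON-based encoding could look like for a developer, the following Python sketch encodes a point feature as GeoJSON using only the standard library. The feature identifier, property names and coordinates are invented for illustration; the sketch is not an INSPIRE encoding rule. Note that GeoJSON positions are given in longitude, latitude order.

```python
import json
from typing import Any, Dict


def to_geojson_feature(feature_id: str, lat: float, lon: float,
                       properties: Dict[str, Any]) -> Dict[str, Any]:
    """Encode a point as a GeoJSON Feature (WGS 84, longitude before latitude)."""
    return {
        "type": "Feature",
        "id": feature_id,
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,
    }


# Hypothetical sample object with a flat set of simple properties.
feature = to_geojson_feature(
    "segment-42", lat=50.84, lon=4.38, properties={"roadClass": "motorway"}
)
geojson_text = json.dumps(feature)
```

The flat `properties` object reflects the trade-off mentioned above: nested or rarely used information from the GML application schema would either be dropped or flattened to keep the encoding simple.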

13 https://ies-svn.jrc.ec.europa.eu/projects/2016-1

14 See https://joinup.ec.europa.eu/asset/are3na-reuse/description

Table 8: DWBP 13

DWBP 13: Use locale-neutral data representations

Use locale-neutral data structures and values, or, where that is not possible, provide metadata about the locale used by data values.

Key statements:

Humans and software agents will be able to interpret the meaning of strings representing dates, times, currencies and numbers etc. accurately.

Assessment:

This is one of the design guidelines in the Generic Conceptual Model of INSPIRE15, which has been implemented in the INSPIRE data specifications.

For dates, times, numbers, measurements, etc., the basic data types from ISO 19103 are used.

In addition, free text attributes are avoided whenever possible. The use of code lists and enumerations is recommended whenever possible.

However, it should be understood that when presenting the data in HTML pages for both human users and search engines, the locale-neutral data representations should be translated to human readable labels, potentially supporting different locales.

Links to related recommendations:

None, this practice is fully implemented in INSPIRE. However, given issues about the HTML representation, this issue needs to be considered in terms of 6.4 Data formats for vector data - HTML.
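The step from locale-neutral values to human readable labels can be sketched as follows. The code list value, the labels and the date patterns per locale are assumptions chosen for illustration; a real implementation would typically take labels from a code list register and use a localisation library.

```python
from datetime import date
from typing import Dict

# Hypothetical labels for a code list value, per locale.
LABELS: Dict[str, Dict[str, str]] = {
    "en": {"motorway": "Motorway"},
    "de": {"motorway": "Autobahn"},
}

# Hypothetical date patterns per locale.
DATE_FORMATS: Dict[str, str] = {"en": "%d/%m/%Y", "de": "%d.%m.%Y"}


def present(record: Dict[str, str], locale: str) -> Dict[str, str]:
    """Translate a locale-neutral record into display strings for one locale."""
    valid_from = date.fromisoformat(record["validFrom"])  # ISO 8601 date
    return {
        "roadClass": LABELS[locale][record["roadClass"]],
        "validFrom": valid_from.strftime(DATE_FORMATS[locale]),
    }


# The stored record stays locale-neutral: a code value and an ISO 8601 date.
record = {"roadClass": "motorway", "validFrom": "2017-10-12"}
```

The data itself is unchanged; only the HTML presentation differs, for example `present(record, "de")` compared with `present(record, "en")`.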

5.3.2 Geometries and coordinate reference systems

Location information is a common constituent of spatial data and can be an important 'hook' for finding information and for integrating different datasets. The INSPIRE Directive captures this in Article 7(4) by requiring that “the definition and classification of spatial objects” as well as “the way in which those spatial data are geo-referenced” are specified in the INSPIRE implementing rules.

Table 9: SDWBP 5, SDWBP 6, SDWBP 7, SDWBP 8

SDWBP 5: Provide geometries on the Web in a usable way

Geometry data should be expressed in a way that allows its publication and use on the Web.

SDWBP 6: Provide geometries at the right level of accuracy, precision, and size

Geometry data should be provided at levels of accuracy, precision, and size fit for their use on the Web.

SDWBP 7: Choose coordinate reference systems to suit your user's applications

15 See http://inspire.ec.europa.eu/documents/inspire-generic-conceptual-model

Consider your user's intended application when choosing the coordinate reference system(s) used to publish spatial data.

SDWBP 8: State how coordinate values are encoded

Provide enough information for users to determine how coordinate values are encoded.

Key statements:

The format chosen to express geometry data should:

o Support the dimensionality of the geometry

o Support the coordinate reference system you need

o Be supported by the software tools used within particular data user communities - the geospatial and Web communities use different tools often suited to different geometry formats

o Keep geometry definitions to a level of detail and size that is appropriate for the intended applications - Web applications do not typically require detailed geometries.

Check that geospatial data is available, as a minimum, in a global coordinate reference system: for vector data, this should be WGS 84 Lat/Long (EPSG:4326) or WGS 84 Lat/Long/Elevation (EPSG:4979); for raster data this should be Web Mercator (EPSG:3857).

Assessment:

GML is used as the standard encoding in most INSPIRE data specifications. Through its use of XML and XLink, GML is designed for use on the Web and has clear rules about how to encode coordinate values and state the CRS used.

The restriction to Simple Feature geometries in most INSPIRE application schemas already simplifies the use of the geometries, as those geometries are typically supported by libraries.

Currently, INSPIRE download services do not support the retrieval of simplified geometries, e.g. for rendering or analysis at a small scale.

For Web developers, it would also be beneficial if all data were published in WGS 84 (latitude, longitude, plus elevation, where available) due to the use of that datum in GPS and its broad support in data formats and APIs. A similar situation can be found for the Web Mercator projection for raster data using 2D grids, as this projection is the de facto standard for web mapping.

Links to related recommendations:

6.4 Data formats for vector data - HTML

6.5 Data formats for vector data - GeoJSON

6.6 Data formats for vector data – RDF / JSON-LD

6.7 Data access API

6.8 Coordinate reference systems
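
The relationship between the two global coordinate reference systems named in SDWBP 8 can be illustrated with a minimal sketch of the forward Web Mercator projection. It assumes the commonly used spherical formulas with a radius of 6378137 m; the function name, the constant names and the clamping latitude are choices made for this illustration and are not taken from INSPIRE or the Best Practice. Note that the function takes longitude first, whereas EPSG:4326 defines the axis order latitude, longitude.

```python
import math

# Radius of the sphere used by Web Mercator (EPSG:3857), in metres.
EARTH_RADIUS_M = 6378137.0
# Web Mercator is conventionally cut off near this latitude (assumption of this sketch).
MAX_LATITUDE_DEG = 85.0511287798


def wgs84_to_web_mercator(lon_deg: float, lat_deg: float) -> tuple[float, float]:
    """Project a WGS 84 longitude/latitude pair (degrees) to EPSG:3857 metres."""
    # Clamp the latitude so that the logarithm below stays finite near the poles.
    lat_deg = max(-MAX_LATITUDE_DEG, min(MAX_LATITUDE_DEG, lat_deg))
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y
```

In practice a data provider would normally rely on an established projection library rather than hand-written formulas; the sketch only shows why publishing in both systems is a cheap, deterministic transformation.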

5.3.3 Relative positioning

Sometimes, instead of using geometry and coordinates to describe a location, a location may be best described in relation to another location by giving a relative position.

Table 10: SDWBP 9

SDWBP 9: Describe relative positioning

Provide a relative positioning capability in which one entity can be positioned relative to another entity.

Key statements:

Intended Outcome: It should be possible to describe the location of an entity in relation to one or more other entities or places, instead of specifying its own geocentric position or geometry.

Assessment:

The INSPIRE Generic Conceptual Model discusses the benefits of using relative positioning capabilities in detail (see chapter 13 and annex D). However, such a capability has been rarely used in the INSPIRE application schemas. The reason for this is probably that such a capability is difficult to handle for users of GIS tools and map viewers, which often require explicit geometries.

The INSPIRE Generic Network Model (and the application schemas for transport networks and utility networks) supports such a capability using linear referencing. Linear referencing is commonly used by some thematic domains, but the data is difficult to handle for many users, as mentioned above.

Links to related recommendations:

None. Currently, more support for relative positioning capabilities does not seem to be important for increasing the usability of INSPIRE data on the Web.
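
To illustrate why linear referencing is harder to consume than explicit geometries, the following minimal sketch resolves a linear reference (a distance measured along a network link) into a coordinate. It assumes planar coordinates in a projected coordinate reference system and a link geometry given as a list of vertices; the names `Point` and `locate_along` are hypothetical and do not correspond to elements of the INSPIRE Generic Network Model.

```python
import math

Point = tuple[float, float]


def locate_along(line: list[Point], distance: float) -> Point:
    """Return the point at `distance` along `line`, measured from its first vertex."""
    if len(line) < 2:
        raise ValueError("a line needs at least two vertices")
    if distance < 0:
        raise ValueError("distance must not be negative")
    remaining = distance
    # Walk the segments until the one containing the requested distance is found.
    for (x0, y0), (x1, y1) in zip(line, line[1:]):
        segment = math.hypot(x1 - x0, y1 - y0)
        if segment > 0 and remaining <= segment:
            t = remaining / segment
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
        remaining -= segment
    raise ValueError("distance exceeds the length of the line")
```

A client that only has the linear reference needs both the link geometry and a routine like this before it can place the object on a map, which is the additional effort referred to in the assessment above.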

5.3.4 Spatial links

This topic builds on section 5.2.3 (Linking data). It looks at the resources to use as both the source and target of links in spatial data, as well as the common categories of link relation types that might be used.

Table 11: SDWBP 10

SDWBP 10: Use appropriate relation types to link Spatial Things

Ensure that hyperlinks between Spatial Things and related resources use appropriate semantics.

Key statements:

Spatial (often topological) relationships can often be derived mathematically based on geometry - but this can be computationally expensive. Such relationships can be asserted (e.g. stating that a particular object is contained within another object, such as a particular island being in a given lake), thereby removing the need to do geometry-based calculations. A useful secondary benefit is that these relationships are easier for humans to understand.

Where Spatial Things are of common interest to multiple agents, it is almost inevitable that a given Spatial Thing will end up being identified with several URIs. Given necessary due diligence, multiple identifiers may be linked, thereby supporting the combination of multiple sets of information and yielding new perspectives on Spatial Things.

Assessment:

In general, INSPIRE application schemas do not include spatial links. For example, GIS tools can derive the topological relationship where needed, so there is no need for the expert community to have those represented explicitly.

The wider use of spatial data on the web, however, may benefit from knowing about such relationships in particular cases. For example, a fire service may need to know that a certain building is contained within a fenced area on an industrial site.

Links to related recommendations:

6.4 Data formats for vector data - HTML and 6.6 Data formats for vector data – RDF / JSON-LD include recommendations for adding some dynamic links. Other links would need to be added first (and maintained) in the datasets.
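
The difference between deriving and asserting a spatial relationship can be sketched as follows. The example stores asserted "within" links between Spatial Things and answers a containment question by following those links, without any geometry processing. All URIs are hypothetical, and the relation is merely modelled on the "within" relation of the OGC Simple Features / GeoSPARQL family; how such links would be encoded in INSPIRE data is not defined by this sketch.

```python
# Hypothetical URIs; each entry asserts "key is within value".
ASSERTED_WITHIN = {
    "https://example.org/id/building/42": "https://example.org/id/fenced-area/7",
    "https://example.org/id/fenced-area/7": "https://example.org/id/industrial-site/1",
}


def is_within(thing: str, container: str, links: dict[str, str]) -> bool:
    """Follow asserted 'within' links from `thing` upwards until `container` is found."""
    seen = set()  # guards against accidental cycles in the asserted links
    current = thing
    while current in links and current not in seen:
        seen.add(current)
        current = links[current]
        if current == container:
            return True
    return False
```

For the fire service example, `is_within("https://example.org/id/building/42", "https://example.org/id/industrial-site/1", ASSERTED_WITHIN)` answers the question with two dictionary look-ups. The price is that the links have to be created and maintained by the data provider.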

5.3.5 Data versioning

Datasets and the spatial objects in them may change over time.

Table 12: DWBP 7

DWBP 7: Provide a version indicator

Assign and indicate a version number or date for each dataset.

Key statements:

Humans and software agents will easily be able to determine which version of a dataset they are working with.

Assessment:

Implemented: revision information is included in the metadata.

However, there may be a risk that datasets are updated without updating the metadata as these are often separate processes.

The current separation of metadata from the datasets in SDIs (metadata is stored and managed in catalogues, often under different governance arrangements) makes this prone to inconsistencies. This is particularly complicated in cases where data change frequently.

In addition, the concept of versions of datasets is not well-defined when the dataset is comprised of near-real-time data, such as observations from sensor networks.

Links to related recommendations:

6.13 Improve metadata content

Table 13: DWBP 8

DWBP 8: Provide version history

Provide a complete version history that explains the changes made in each version.

Key statements:

Humans and software agents will be able to understand how the dataset typically changes from version to version and how any two specific versions differ.

Assessment:

Currently not implemented. No information about the change history in a dataset is provided.

Links to related recommendations:

None at the moment, although in the long term it would be useful to have a clearer mechanism to identify how datasets in INSPIRE change. It would be interesting to receive comments from stakeholders that have an opinion why this should or should not be addressed in INSPIRE.

Table 14: SDWBP 11

SDWBP 11: Provide information on the changing nature of spatial things

Spatial data should include metadata that allows a user to determine when it is valid for.

Key statements:

Users are provided with the most recent version of information about a Spatial Thing and its attributes by default.

Users can determine the time-period for which data is applicable.

If a version history of changes is available, users can browse through a set of changes to see how a Spatial Thing and its attributes have changed over time.

When publishing information about a Spatial Thing that is subject to change there are four approaches to consider in response to a change:

o simply updating the description of the Spatial Thing;

o republish the entire dataset with a new URI;

o providing a series of immutable snapshots that describe the Spatial Thing at various points in its lifecycle; and

o capturing a time-series of data values within an attribute of the Spatial Thing.

Assessment:

Partly implemented through the INSPIRE object life-cycle information.

This does not cover all aspects discussed in the Best Practice due to limitations in the base standards. The INSPIRE Generic Conceptual Model states: “The topic of managing and publishing multiple versions of a spatial object in a consistent way is not fully addressed by the relevant international standards […]. The current INSPIRE data specifications are therefore only fully specified for spatial data sets that only publish the last version of a spatial object (valid or retired). If historic versions are maintained and provided, additional specification work is needed with regard to the consistency of the spatial objects at any time.”

In practice, most spatial datasets seem to follow the approach to “update the description of the Spatial Thing” in response to a change, but the approach to publish a series of immutable snapshots is also used frequently (e.g. yearly snapshots).

In some specific cases, a time-series is also used in INSPIRE application schemas, mostly for data observed by sensors.

Links to related recommendations:

None at the moment, although in the long term it would be useful to have a clearer mechanism to identify how datasets in INSPIRE change. It would be interesting to receive comments from stakeholders that have an opinion why this should or should not be addressed in INSPIRE.
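
The key statements of SDWBP 11 (latest version by default, a determinable validity period, browsable history) can be sketched with a small data structure. The sketch assumes that each version of a spatial object carries a begin and an optional end of its life-cycle interval, in the spirit of the INSPIRE object life-cycle information; the class, the field names and the sample values used with it are illustrative assumptions, not a normative model.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ObjectVersion:
    object_id: str
    begin_lifespan_version: datetime
    end_lifespan_version: Optional[datetime]  # None: still the current version
    name: str


def version_at(versions: list[ObjectVersion], when: datetime) -> Optional[ObjectVersion]:
    """Return the version whose life-cycle interval [begin, end) contains `when`."""
    for v in versions:
        ended = v.end_lifespan_version is not None and when >= v.end_lifespan_version
        if v.begin_lifespan_version <= when and not ended:
            return v
    return None


def latest(versions: list[ObjectVersion]) -> Optional[ObjectVersion]:
    """Return the most recent version, i.e. the one to be served by default."""
    return max(versions, key=lambda v: v.begin_lifespan_version, default=None)
```

A dataset that only publishes the last version of each object corresponds to a list with a single entry per object; `version_at` only becomes useful once historic versions are maintained, which is exactly the case the Generic Conceptual Model describes as not fully specified.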

5.4 Spatial data access

Both Best Practice documents include an in-depth discussion about access to data on the Web and the range of available options. The Spatial Data on the Web Best Practices states:

“Making data available on the Web requires data publishers to provide some form of access to the data. There are numerous mechanisms available, each providing varying levels of utility and incurring differing levels of effort and cost to implement and maintain. Publishers of spatial data should make their data available on the Web using affordable mechanisms to ensure long-term, sustainable access to their data.

“When determining the mechanism to be used to provide Web access to data, publishers need to assess utility against cost. In order of increasing usefulness and [increasing] cost[s]:

Bulk-download or streaming of the entire or pre-defined subsets of a dataset

Generalized spatial data access API

Bespoke API designed to support a particular type of use”

INSPIRE follows a similar approach by requiring support for downloading a dataset or a pre-defined subset of a dataset (although also via an API, not just using HTTP) and only requiring support for user-defined queries (“direct access”) where feasible.

This topic is related to the INSPIRE implementation task “provide and operate download service” (dls) and the ‘use task’ “use INSPIRE resources” (u-u, not yet online).

Table 15: DWBP 17, DWBP 18

DWBP 17: Provide bulk download

Enable consumers to retrieve the full dataset with a single request.

DWBP 18: Provide Subsets for Large Datasets

If your dataset is large, enable users and applications to readily work with useful subsets of your data.

Key statements:

Bulk access provides a consistent means to handle the data as one dataset. Individually accessing data over many retrievals can be cumbersome and, if used to reassemble the complete dataset, can lead to inconsistent approaches to handling the data.

Large datasets can be difficult to move from place to place. It can also be inconvenient for users to store or parse a large dataset. Users should not have to download a complete dataset if they only need a subset of it. Moreover, Web applications that tap into large datasets will perform better if their developers can take advantage of “lazy loading”, working with smaller pieces of a whole and pulling in new pieces only as needed. The ability to work with subsets of the data also enables offline processing to work more efficiently. Real-time applications benefit in particular, as they can update more quickly.

Assessment:

INSPIRE download services must support bulk download (“pre-defined dataset download service”) and a dataset can be split into subsets. Currently four options are specified in Technical Guidelines: an Atom feed, WFS (for vector data), WCS (for coverage/raster data) and SOS (for observation data).

A WCS-based pre-defined dataset download service also supports user-defined subsets of a coverage (dataset) by providing an envelope / bounding box to restrict the domain of the coverage.

The best practice states that the download should include the dataset metadata, but typically this is not the case in the INSPIRE pre-defined dataset download service options.

Links to related recommendations:

6.3 Access to a dataset distributed using WFS via a “landing page”
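
The “lazy loading” of subsets described in the key statements can be sketched for a WFS-based download service. The helper below builds WFS 2.0 GetFeature requests in KVP encoding that retrieve a feature type page by page using the `count` and `startIndex` parameters. The endpoint URL, the feature type name and the function name are hypothetical, and the sketch assumes that the service has response paging enabled, which is not guaranteed for every INSPIRE download service.

```python
from urllib.parse import urlencode


def get_feature_page_url(base_url: str, type_name: str, page: int, page_size: int) -> str:
    """Build a WFS 2.0 GetFeature (KVP) request for one page of a feature type."""
    if page < 0 or page_size < 1:
        raise ValueError("page must be >= 0 and page_size >= 1")
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,
        "count": page_size,               # number of features per page
        "startIndex": page * page_size,   # offset of the first feature of the page
    }
    return f"{base_url}?{urlencode(params)}"
```

For example, `get_feature_page_url("https://example.org/wfs", "ad:Address", 2, 100)` requests the third block of 100 features. Omitting `count` and `startIndex` turns the same request into a bulk download of the whole feature type.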

Table 16: DWBP 19

DWBP 19: Use content negotiation for serving data available in multiple formats

Use content negotiation in addition to file extensions for serving data available in multiple formats.

Key statements:

A resource, such as a dataset, can have many representations. The same data might be available as JSON, XML, RDF, CSV and HTML. These multiple representations can be made available via an API, but these should be made available from the same URL using content negotiation to return the appropriate representation.

Assessment:

In general, this is not implemented. OGC web services typically do not support content negotiation, but offer the choice of format through other mechanisms only, for example, a request parameter “outputFormat”.

Links to related recommendations:

6.7 Data access API
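
How content negotiation could be mapped onto the existing “outputFormat” mechanism can be sketched as follows. The code parses an HTTP Accept header, ranks the media types by their q-values and selects the corresponding `outputFormat` value for the request to the underlying service. The mapping table is a hypothetical example (the formats a real service supports are listed in its Capabilities document), and falling back to a default format instead of answering with a "406 Not Acceptable" status is a simplification made for this sketch.

```python
# Hypothetical mapping from media types to WFS "outputFormat" values.
OUTPUT_FORMATS = {
    "application/gml+xml": "application/gml+xml; version=3.2",
    "application/geo+json": "application/geo+json",
    "text/html": "text/html",
}


def parse_accept(header: str) -> list[tuple[str, float]]:
    """Parse an HTTP Accept header into (media type, q-value) pairs, best first."""
    entries = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality if no q parameter is given
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so equal q-values keep the order given by the client.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def negotiate_output_format(accept_header: str, default: str = "application/gml+xml") -> str:
    """Pick the outputFormat for the best media type the client accepts."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in OUTPUT_FORMATS:
            return OUTPUT_FORMATS[media_type]
        if media_type == "*/*":
            return OUTPUT_FORMATS[default]
    return OUTPUT_FORMATS[default]
```

A thin proxy in front of a WFS could use such a function to serve all representations of a dataset from one URL while still passing an explicit `outputFormat` to the service behind it.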

Table 17: DWBP 20

DWBP 20: Provide real-time access

When data is produced in real time, make it available on the Web in real time or near real-time.

Key statements:

Applications will be able to access time-critical data in real time or near real time, where real-time means a range from milliseconds to a few seconds after the data creation.

Assessment:

Most INSPIRE datasets are not produced or provided in real-time.

Exceptions may be datasets where observations from sensors are made available, which may include near real-time data creation and access. For those cases, the use of the Sensor Observation Service is foreseen using the Observation and Measurements model.

However, strictly speaking, the implementing rules only require that updates are made available at the latest 6 months after the change (see footnote 16), i.e. there is no legal requirement to provide near real-time access to spatial data.

Links to related recommendations:

Probably not necessary. If the data is used, data providers would see value in keeping the INSPIRE distributions up-to-date with the latest data available in other distributions of the dataset – including access to near real-time data where it is in the scope of the INSPIRE Directive.

Table 18: DWBP 21

DWBP 21: Provide data up to date

Make data available in an up-to-date manner, and make the update frequency explicit.

Key statements:

16 This is the default requirement which may be changed for a spatial data theme. However, the option to change this was not used for any spatial data theme.

The availability of data on the Web should closely match the data creation or collection time, perhaps after it has been processed or changed. Carefully synchronising data publication to the update frequency encourages consumer confidence and data reuse.

Data on the Web will be updated in a timely manner so that the most recent data available online generally reflects the most recent data released via any other channel.

Assessment:

As mentioned in the assessment of DWBP 20 above, updates are only required to be made available within 6 months of the change in the data set. Currently, the data provider determines how up-to-date a dataset is.

In this context, if the interoperable representation is managed in an INSPIRE-specific data store, special attention is required to keep the source dataset and the INSPIRE distribution synchronised.

Support for change-only updates could make update frequency transparent and simplify frequent updates.

Links to related recommendations:

This is probably not necessary. If the data is used, data providers would see value in keeping the INSPIRE distributions up-to-date with the latest data available in other distributions of the dataset.

Table 19: DWBP 22

DWBP 22: Provide an explanation for data that is not available

For data that is not available, provide an explanation about how the data can be accessed and who can access it.

Key statements:

Consumers will know that data that is referred to from the current dataset is unavailable or only available under different conditions.

Assessment:

INSPIRE supports the concept on a property level (“voidable” and reasons for nil values), but no analysis has taken place to understand the extent to which data providers make such explanations available in a consistent way.

The “Fitness for Purpose” activity (see 5.3.1) has also discussed the "voidable" concept. It is very likely that information that is unavailable would simply be dropped in a simplified encoding (again, see 5.3.1).

On a feature level, no mechanism exists to provide this information.

Links to related recommendations:

None. This would need to be included in the dataset metadata in some form first, but it is unclear how valuable it would be. It would be interesting to receive comments from stakeholders that have an opinion why this should or should not be addressed in INSPIRE.

Table 20: DWBP 23

DWBP 23: Make data available through an API

Offer an API to serve data if you have the resources to do so.

Key statements:

An API offers the greatest flexibility and means for consumers to process data. It can enable real-time data usage, filtering on request, and the ability to work with the data at an atomic level. If your dataset is large, frequently updated, or highly complex, an API is likely to be the best option for publishing your data.

Assessment:

Implemented through the INSPIRE spatial data services that operate on spatial data sets, in particular download and view services.

Download services make the data itself available.

View services process the data and render it as geo-referenced bitmap images.

Additional spatial data services operating on one or more spatial data sets can make other capabilities available, including for spatial analysis or other processing tasks.

Links to related recommendations:

6.7 Data access API

Table 21: DWBP 24

DWBP 24: Use Web Standards as the foundation of APIs

When designing APIs, use an architectural style that is founded on the technologies of the Web itself.

Key statements:

APIs that are built on Web standards leverage the strengths of the Web. For example, using HTTP verbs as methods and URIs that map directly to individual resources helps to avoid tight coupling between requests and responses. This makes for APIs that are easy to maintain and that are readily understood and used by many developers. The statelessness of the Web can be a strength in enabling quick scaling, and using hypermedia enables rich interactions with your API.

Assessment:

Most of the current OGC web service standards predate the Web as it is used today and, thus, are founded only partially on the technologies of the Web. The services often do not conform to HTTP 1.1.

A “REST binding” is now part of the WMTS standard and one will be part of the next revision of WFS, but in general the KVP, XML and - in some standards - SOAP bindings are what is used in practice.

Links to related recommendations:

6.7 Data access API

Table 22: DWBP 25

DWBP 25: Provide complete documentation for your API

Provide complete information on the Web about your API. Update documentation as you add features or make changes.

Key statements:

Developers are the primary consumers of an API and the documentation is the first clue about its quality and usefulness. When API documentation is complete and easy to understand, developers are probably more willing to continue their journey to use it.

Check the Time To First Successful Call (i.e. being capable of doing a successful request to the API within a few minutes will increase the chances that the developer will stick to your API).

Assessment:

All versions of the INSPIRE technical guidance and the standards are published on the INSPIRE website and the websites of the standards organisations.

A challenge is that understanding the API (including the data content and the formats) requires a potential user to study many complex documents, which at this time are typically available as PDF only, not as HTML. As standards, they are mostly focussed on stating requirements clearly, but are typically weaker on providing examples or tutorials.

However, additional artefacts, such as feature catalogues and other registers made accessible through the INSPIRE Registry (see footnote 17) and spatial objects via the Interactive Data Specifications toolkit (see footnote 18), are already available in HTML / as web applications.

For APIs on the Web, the most commonly used specification format is the OpenAPI Specification (OAS), originally known as the Swagger Specification. Use of these tools and specifications will help others to use the API quickly. However, OpenAPI is designed for APIs that are consistent with the Web (see DWBP 24 above), so this is not a good fit for the current INSPIRE network services.

Links to related recommendations:

6.3 Access to a dataset distributed using WFS via a “landing page”

6.7 Data access API

Table 23: DWBP 26

DWBP 26: Avoid Breaking Changes to Your API

Avoid changes to your API that break client code, and communicate any changes in your API to your developers when evolution happens.

17 http://inspire.ec.europa.eu/registry/

18 http://inspire-regadmin.jrc.ec.europa.eu/dataspecification/

Key statements:

When developers implement a client for your API, they may rely on specific characteristics that you have built into it, such as the schema or the format of a response. Avoiding breaking changes in your API minimises breakage to client code. Communicating changes when they do occur enables developers to take advantage of new features and, in the rare case of a breaking change, take action.

Assessment:

This will be more important in the future, as INSPIRE is still in the implementation phase.

However, the revision of the GML application schemas from version 3 to version 4 already was a breaking change.

Links to related recommendations:

None at this time. This is a general recommendation that the Member State experts in the INSPIRE Maintenance and Implementation Group (MIG) need to consider when discussing the technical evolution of INSPIRE.

Table 24: SDWBP 12

SDWBP 12: Expose spatial data through 'convenience APIs'

If you have a specific type of application in mind for your data, tailor a spatial data access API to meet that goal.

Key statements:

Providing access to spatial data via bulk download or generalised spatial data access APIs [like WFS, WCS or SOS] may be too complex for application developers with relatively simple requirements.

Convenience APIs are tailored to meet a specific goal; enabling a user to engage with complex data structures using (a set of) simple queries, including spatial search.

Help users to get working with the data quickly to achieve common tasks.

The API provides both machine readable data and human readable HTML mark-up. The human-readable mark-up will also support search engines’ Web crawlers to enable the indexing of spatial data.

Assessment:

Considering DWBP 24 and 25 and the request for simpler encodings, an additional simple data access API for INSPIRE data that also addresses the provision of data in HTML, simple spatial queries and paged access should be worth exploring.

The best practice has details on what such an API should support.

Links to related recommendations:

6.7 Data access API
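
The kind of simple spatial query with paged access mentioned in the assessment can be sketched in a few lines. The example filters an in-memory list of features by a bounding box and returns one page of results together with the offset of the next page. The feature structure, the parameter names `limit` and `offset` and the response keys are assumptions made for this illustration; the Best Practice itself describes what a convenience API should support.

```python
BBox = tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def bbox_intersects(a: BBox, b: BBox) -> bool:
    """True if two bounding boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def query_features(features: list[dict], bbox: BBox, limit: int = 10, offset: int = 0) -> dict:
    """Return one page of the features whose bounding box intersects `bbox`."""
    matches = [f for f in features if bbox_intersects(f["bbox"], bbox)]
    page = matches[offset:offset + limit]
    # Tell the client where the next page starts, or None if this was the last one.
    next_offset = offset + limit if offset + limit < len(matches) else None
    return {"numberMatched": len(matches), "features": page, "nextOffset": next_offset}
```

A real implementation would delegate the filtering to the underlying data store or service and would render the same result both as machine-readable data and as HTML, but the request model exposed to application developers can remain this small.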

5.5 Spatial metadata

The provision of metadata is a fundamental requirement for sharing data on the Web as “data will not be discoverable or reusable by anyone other than the publisher, if insufficient metadata is provided” [DWBP].

This topic is related to the implementation tasks “create/maintain metadata for spatial data set” (dsi-md) and “provide and operate discovery service” (dis), as well as the ‘use tasks’ “use other ways to discover data, map layers or services” (u-ps-fe-soth, not yet online) and “evaluate if the data, map layers and services are suited for the objectives of use” (u-ps-fe-eval, not yet online).

Table 25: DWBP 1, DWBP 2, SDWBP 13

DWBP 1: Provide metadata

Provide metadata for both human users and computer applications.

DWBP 2: Provide descriptive metadata

Provide metadata that describes the overall features of datasets and distributions.

SDWBP 13: Include spatial metadata in dataset metadata

The description of datasets that have Spatial Things should include explicit metadata about their spatial extent, coverage, and representation.

Key statements:

Dataset metadata should include the information necessary to enable spatial queries within catalog services, such as those provided by SDIs.

Dataset metadata should also include the information required for a user to evaluate whether a spatial dataset is suitable for their intended application.

Assessment:

The provision of metadata has played a central role in the INSPIRE architecture. The descriptive metadata elements listed in DWBP 2 are included in INSPIRE.

In this sense, one could argue that INSPIRE implements these best practices well. However, there are also several issues that may need wider debate:

Metadata for human users is available only via the INSPIRE geoportal, but the geoportal is difficult to use and the metadata entries cannot be browsed. In addition, the metadata is not indexed by search engines.

Metadata about datasets and the services to access them is, in general, available in two ways: ISO 19115 metadata records via discovery services and OGC Web Service Capabilities documents in INSPIRE network services. There is an overlap between both metadata formats.

In practice, the metadata in the Capabilities documents is often inconsistent with the ISO 19115 dataset metadata (e.g. different contact information, bounding boxes, etc.).

The ISO 19115 metadata records are not provided with the dataset itself, but are shared separately via the discovery services. Sometimes the data and the metadata are maintained in separate processes, which is prone to lead to inconsistencies between the data and its metadata.

ISO 19115 metadata and OGC Web Service Capabilities documents are metadata that are specific to the spatial data expert community and are not easily understood by others. In addition, DCAT and schema.org could be additional options to target other communities.

Capabilities documents provide metadata for applications, not for human users. The metadata about the data is, in general, limited to information that is needed to compile requests to retrieve the data.

Most offerings in INSPIRE are available only in the official language(s) of the Member State (although some automatic translation tools are made available through the INSPIRE geoportal).

As explained in 4.2 (Datasets and distributions) and 4.3 (“INSPIRE datasets”), the way in which resources are managed in INSPIRE through metadata records differs to some extent from the best practices on the Web.

Links to related recommendations:

6.3 Access to a dataset distributed using WFS via a “landing page”

5.7 Fault-tolerant proxy

From experience, the two main reasons for failure when setting up proxy services using ldproxy are:

The schema documents advertised by the DescribeFeatureType operation of the WFS are either not accessible (outdated URLs, file or jar URLs, relative URLs, etc.) or not valid. While these are fundamental issues that should be corrected in the WFS configuration, it would also be possible to attempt to “reverse engineer” the schema information from the data in such cases. Those schemas may be incomplete and imprecise, but this does not really impact the HTML representation and would allow proxy services to be created in those cases as well.

Since support for paging is not required by the current technical guidelines (see 6.2.2), a significant number of the services will not have this capability enabled. A fall-back could be implemented that downloads the maximum number of features supported by the WFS, provides at least paged access to those features and displays a warning about the limitation.

Datasets and distributions

6.12 “INSPIRE datasets”

6.13 Improve metadata content

Table 26: DWBP 3

DWBP 3: Provide structural metadata

Provide metadata that describes the schema and internal structure of a distribution.

Key statements:

Providing information about the internal structure of a distribution is essential for others wishing to explore or query the dataset. It also helps people to understand the meaning of the data.

Assessment:

In the INSPIRE network services, this is implemented through the Capabilities documents (list of feature types, layers, etc.) and through additional operations (DescribeFeatureType, DescribeRecord, DescribeCoverage, etc.). If (and how) this information is presented to a human user depends on the application accessing the network service.

Detailed information about the schemas of “interoperable” data in INSPIRE is published on the INSPIRE website, both as data specification documents and in other forms (including UML models, a schema registry, online feature catalogues and the “Find your scope” application).

The same comment about “studying many complex documents, which are typically available as PDF only” from DWBP 25 applies here, too.

As the structural metadata is not linked from the metadata (or, for most parts, from the data), it requires knowledge about the standards and their architecture to un-ambiguously find the right documents.

That said, datasets that use the default GML encoding make reference to the rele-vant XML schemas and machines can analyse this information.

As noted above, the ongoing work on INSPIRE in RDF explores the feasibility and means to provide the structural metadata along with semantic information in a ma-chine-readable way.

It would be beneficial if the data and structural metadata could be also linked in HTML, i.e. for human users.

Links to related recommendations:

6.4 Data formats for vector data - HTML

6.13 Improve metadata content

5.8 Non-spatial aspects

5.8.1 Data Vocabularies

Clear semantics and the use of existing, standardised concepts will help others to understand the data. This topic is related to the implementation task “make interoperable representation of spatial data set available” (dsi-iop).

Table 27: DWBP 15

DWBP 15: Reuse vocabularies, preferably standardized ones

Use terms from shared vocabularies, preferably standardized ones, to encode data and metadata.

Key statements:

The use of vocabularies already in use by others captures and facilitates consensus in communities. It increases interoperability and reduces redundancies, thereby encouraging reuse of your own data. In particular, the use of shared vocabularies for metadata (especially structural, provenance, quality and versioning metadata) helps the comparison and automatic processing of both data and metadata. In addition, referring to codes and terms from standards helps to avoid ambiguity and clashes between similar elements or values.

Assessment:

This is essentially what the INSPIRE data specifications and associated resources are about. For data, INSPIRE application schemas are specified based on ISO 19109 and the associated standards like ISO 19107 or ISO 19123. For metadata, ISO 19115 is used. While ISO 19107 and ISO 19123 focus on spatial topics, many of the aspects in ISO 19115 are not specific to spatial data (e.g., lineage/provenance).

The INSPIRE application schemas re-use existing terms and definitions where possible.

Links to related recommendations:

6.3 Access to a dataset distributed using WFS via a “landing page”

Table 28: DWBP 16

DWBP 16: Choose the right formalization level

Opt for a level of formal semantics that fits both data and the most likely applications.

Key statements:

Formal semantics help to establish precise specifications that convey detailed meaning and using a complex vocabulary (ontology) that may serve as a basis for tasks such as automated reasoning. On the other hand, such complex vocabularies require more effort to produce and understand, which could hamper their re-use, comparison and linking to the datasets that use them.

Choosing a very simple vocabulary is always attractive but there is also a danger: the drive for simplicity might lead the publisher to omit some data that provides important information, such as the geographical location of the bus stops that would, in turn, prevent them from being shown on a map. Therefore, a balance has to be struck, remembering that the goal is not simply to share your data, but make it available for others to reuse it.

Assessment:

During INSPIRE’s development, a conscious decision was made for the formalisation level, consistent with the ISO/OGC standards and the practices of the stakeholders.

The INSPIRE application schemas (the vocabularies) are conceptual schemas in UML according to the standard meta-model for modelling spatial data, the General Feature Model of ISO and OGC. As conceptual schemas, they are intended to reflect the semantics without the artefacts of any implementation environment (like XML Schema, SQL DDL, Java, etc.).

The conceptual schemas are the basis for deriving implementation artefacts, like the GML application schemas to encode the data for the exchange between systems. These artefacts are derived using formalised encoding rules. This means that the derivation process can be automated and that there is also a well-defined relationship between the elements of different implementation artefacts.

If technologies and implementation preferences change over time but the view of the thematic domains does not, the technological evolution has no impact on the INSPIRE application schemas (or the text in the Implementing Rules that was derived from the INSPIRE application schemas).

Links to related recommendations:

None, no gap identified.

5.8.2 Data Licenses

This topic is related to the implementation task “specify data sharing conditions for spatial data set” (dsi-lic) and the ‘use task’ “find out about the licence conditions for using the data, map layers or services” (u-ps-fe-lic, not yet online).

Table 29: DWBP 4

DWBP 4: Provide data license information

Provide a link to or copy of the license agreement that controls use of the data.

Key statements:

The presence of license information is essential for data consumers to assess the usability of data. User agents may use the presence/absence of license information as a trigger for (potentially automated) inclusion or exclusion of data presented to a potential consumer.

Assessment:

Implemented in general, but in a weak way that is difficult to process for software agents. Fees, use and access constraints are free text fields in both ISO 19115 and Capabilities documents.

From experience, this information is not always complete or consistent within INSPIRE metadata.

Links to related recommendations:

6.13 Improve metadata content

5.8.3 Data Provenance and Quality

Without provenance, consumers have no inherent way to trust the integrity and credibility of the data being shared. The same is true for information about the data quality.

This topic is related to the implementation task “create/maintain metadata for spatial data set” and the ‘use task’ “evaluate if the data, map layers and services are suited for the objectives of use” (u-ps-fe-eval, not yet online).

Table 30: DWBP 5, DWBP 6

DWBP 5: Provide data provenance information

Provide complete information about the origins of the data and any changes you have made.

DWBP 6: Provide data quality information

Provide information about data quality and fitness for particular purposes.

Key statements:

Provenance is one means by which consumers of a dataset judge its quality. Understanding its origin and history helps one determine whether to trust the data and provides important interpretive context.

Documenting data quality significantly eases the process of dataset selection, increasing the chances of reuse.

Assessment:

Not implemented in general. While the metadata includes fields for lineage information and data quality, information about the origins, the processing (including the transformation to the INSPIRE application schemas) and the data quality is often not included in the dataset metadata.

Links to related recommendations:

6.13 Improve metadata content

5.8.4 Data Preservation

Data preservation is out-of-scope of INSPIRE. In the Data on the Web Best Practices, this is covered by two best practices:

DWBP 27: Preserve identifiers. When removing data from the Web, preserve the identifier and provide information about the archived resource.

DWBP 28: Assess dataset coverage. Assess the coverage of a dataset prior to its preservation.

5.8.5 Feedback

A mechanism for feedback has benefits for both publishers and consumers.

Table 31: DWBP 29, DWBP 30

DWBP 29: Gather feedback from data consumers

Provide a readily discoverable means for consumers to offer feedback.

DWBP 30: Make feedback available

Make consumer feedback about datasets and distributions publicly available.

Key statements:

Obtaining feedback helps publishers understand the needs of their data consumers and can help them improve the quality of their published data.

By sharing feedback with consumers, publishers can demonstrate to users that their concerns are being addressed and they can avoid submission of duplicate bug reports. Sharing feedback also helps consumers understand any issues that may affect their ability to use the data and it can foster a sense of community among them.

Assessment:

Not addressed currently.

ELISE deliverable Approach and prototype for user feedback on geospatial metadata (D2.1.3) will investigate this best practice.

New work items in OGC (the proposed OGC Quality of Service and Experience Domain Working Group and the Geospatial User Feedback Standards Working Group) are looking into this topic, too.

Links to related recommendations:

None at this time, wait for results of ELISE deliverable D2.1.3.

5.8.6 Data Enrichment

This topic is related to the implementation task “provide and operate view service” (vs) and the ‘use task’ “handle the difference between the objectives and the available resources” (u-ps-diff, not yet online).

Table 32: DWBP 31

DWBP 31: Enrich data by generating new data

Enrich your data by generating new data when doing so will enhance its value.

Key statements:

Publishing more complete datasets can enhance trust, if done properly and ethically. Deriving additional values that are of general utility saves users time and encourages more kinds of reuse. There are many intelligent techniques that can be used to enrich data, making the dataset an even more valuable asset.

Assessment:

This will mainly be done by others using INSPIRE data as “spatial reference data” (such as background/contextual mapping, enriching the data by establishing links to other data related to the features in the spatial datasets or by applications that mix and combine data from different sources).

The work on practices to extend INSPIRE specifications is related to this best practice. One particular pattern in the use of INSPIRE envisages a cooperation of a user community with one (or more) INSPIRE data providers, who would then set up data in the extended form that the user community needs ("retroactive approach").

However, in this context it is important to emphasise that INSPIRE does not require the collection of new data.

Links to related recommendations:

None at this time. However, richer data will be useful for users including more useful HTML pages for end-users (6.4 Data formats for vector data - HTML).

Table 33: DWBP 32

DWBP 32: Provide Complementary Presentations

Enrich data by presenting it in complementary, immediately informative ways, such as visualizations, tables, Web applications, or summaries.

Key statements:

Data published online is meant to inform others about its subject. But only posting datasets for download or API access puts the burden on consumers to interpret it. The Web offers unparalleled opportunities for presenting data in ways that let users learn and explore without having to create their own tools.

Assessment:

The view services provide such presentations by offering georeferenced bitmap images from the spatial data.

Besides the view services, this will mainly be done by others providing applications and value-added content.

The work on ELISE deliverable D2.1.2 also addresses complementary presentations.

Links to related recommendations:

None at this time, but there is potential in presenting the data in other ways to end-users, e.g. diagrams, charts, etc. In that sense, it is related to 6.4 Data formats for vector data - HTML.

5.8.7 Republication

This topic is related to the ‘use task’ “set up web resources” (u-u-su-web, not yet online).

Table 34: DWBP 33, DWBP 34, DWBP 35

DWBP 33: Provide Feedback to the Original Publisher

Let the original publisher know when you are reusing their data. If you find an error or have suggestions or compliments, let them know.

DWBP 34: Follow Licensing Terms

Find and follow the licensing requirements from the original publisher of the dataset.

DWBP 35: Cite the Original Publication

Acknowledge the source of your data in metadata. If you provide a user interface, include the citation visibly in the interface.

Key statements:

Publishers generally want to know whether the data they publish has been use-ful. Reporting your usage helps them justify putting effort toward data releases. Providing feedback repays the publishers for their efforts by directly helping them to improve their dataset for future users.

By adhering to the original publisher’s requirements, you keep the relationship between yourself and the publisher friendly.

Understanding the initial license will help you determine what license to select for your reuse.

Data is only useful when it is trustworthy. Identifying the source is a major indicator of trustworthiness.

Assessment:

These practices mainly address users, not the data providers in INSPIRE.

Links to related recommendations:

None, as the recommendations in this document are about improving the process of initially sharing data, not the re-publication of that data.

6 ADDRESSING THE CHALLENGE

6.1 The approach

6.1.1 General remarks

This chapter discusses potential technical measures for addressing gaps identified in the previous chapter. Each table with an assessment of one or more best practices includes links to the sections in this chapter where the identified gaps are addressed.

This document assumes – and proposes – that any changes to INSPIRE will follow an evolutionary path. The idea reflects that INSPIRE should build on the existing legal and guidance framework, and – at the same time – create the capability to explore and test new technical options (“INSPIRE labs”).

Only those gaps are addressed that

• are considered to be addressable in experiments in such INSPIRE labs;

• are expected to have the potential to increase the use and utility of data from the INSPIRE service network.

In this chapter, the proposed measures with the highest priority for experimentation are written in bold and blue. Implementing those is expected to be a large step towards sharing those datasets in a way that is consistent with the best practices. These measures would be candidates for implementation as part of ELISE deliverable D2.1.2 (Improved interoperability tools for publishing spatial data on the Web). Only a subset of all these measures can be addressed in that deliverable. Measures that are already implemented are shown in blue using the regular font.

6.1.2 Using a proxy

A useful approach to such experimentation is to build additional components for testing new ways of sharing data on top of the existing infrastructure, as this will also expose compatibility issues and may result in a re-usable implementation option. The approach was also used in the research “Spatial data on the Web on top of the existing SDI” in the Geonovum testbed [1].

The idea is illustrated in Figure 3. In the INSPIRE service network, clients connect to the discovery services (CSW) to search for data and access the data using download services.

INSPIRE supports complex requirements and powerful capabilities. For example, WFS supports:

• rich data models like those in INSPIRE application schemas,

• rich queries similar to the capabilities of SQL or SPARQL,

• processing the query result, such as transforming the data to other coordinate reference systems,

• providing the data in GML/XML, supporting modularisation of the data using XML namespaces, validation, etc.

On the other hand, as discussed in the above assessments, the powerful capabilities are difficult to use for many, they are not aligned with the current practices on the Web and they do not meet the expectations of developers familiar with those practices. The figure mentions a few of those characteristics, which are also discussed and explained in more detail below.

If the download service supports “direct access”, i.e. user-defined queries, it is feasible to provide a proxy on top of the download service and the associated metadata in the discovery service, where ‘proxying’ would make the dataset available to others in a way consistent with several of the best practices that are currently not implemented in INSPIRE or other SDIs.

As a result, the proxy approach does not require changes to the current infrastructure but, at the same time, allows gaps in the current infrastructure to be identified and ways to close them to be explored.

Note that this document does not consider how to address mechanisms for authentication and access control, i.e., it mainly targets the sharing of datasets that are published under an open data license. The reason for this is that open data is better suited for experimentation due to its availability to the public.

Figure 3 - Using a proxy for experiments

The previous chapter has identified a number of differences between the current INSPIRE technical guidance and Web best practices or user needs. Some are more important than others and the remainder of this chapter will focus on a number of recommendations and which of the differences should be investigated first.

The following measures are probably the most important for making INSPIRE (and other SDIs) aligned with today’s practices on the Web:

• Use globally unique persistent HTTP URIs for features. This is already a recommendation in INSPIRE, but rarely followed by current implementations.

• Make spatial data indexable by search engines. Discovering INSPIRE data should not require the use of a geoportal and it should be possible to browse the data, presented in a clear and understandable way, without additional tools.

• Data presented as HTML pages should be useful to a user, link to related information and encourage others to link to the page when they share related information. This best practice also implies a structure for the resources to be provided.

• Expose data through RESTful, easy-to-use APIs, supporting multiple formats and simple queries.

• Expose the data via the API in a flattened, simplified feature encoding, including GeoJSON (currently developers often prefer JSON over XML).

• Publish the data via the API in WGS 84 coordinate reference systems, too, as these are the basis of many spatial data tools and APIs on the Web (see footnote 19).

• Make datasets (and spatial objects) the main resources that are shared. Use the metadata as part of the dataset to describe it and treat downloadable files and spatial data services as distributions.
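
To illustrate what a flattened, simplified GeoJSON encoding could look like, the following minimal Python sketch collapses nested attributes into a single level of properties and emits a GeoJSON Feature with a WGS 84 point. The dotted key convention, the function names and the example attribute names are assumptions made for illustration; this document does not prescribe a particular flattening rule.

```python
import json
from typing import Any, Dict


def flatten(value: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Collapse nested dictionaries into one level with dotted keys."""
    flat: Dict[str, Any] = {}
    for key, item in value.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(item, dict):
            flat.update(flatten(item, name))
        else:
            flat[name] = item
    return flat


def to_geojson_feature(feature_id: str, lon: float, lat: float,
                       attributes: Dict[str, Any]) -> Dict[str, Any]:
    """Build a GeoJSON Feature; coordinates are longitude, latitude."""
    return {
        "type": "Feature",
        "id": feature_id,
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": flatten(attributes),
    }


if __name__ == "__main__":
    # Hypothetical feature with nested attributes, as they might result
    # from a rich application schema.
    nested = {"name": "Tower", "address": {"city": "Utrecht"}}
    print(json.dumps(to_geojson_feature("t1", 5.12, 52.09, nested), indent=2))
```

Flattening trades fidelity for ease of use: repeated or deeply structured properties need an additional convention (for example indexed keys), which is left out of this sketch.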

19 For the same reasons, map images should use the Web Mercator projection.

[Figure 3 labels: on the INSPIRE side, used by GIS experts and developers, a CSW discovery service with the dataset metadata (ISO 19115, ISO 19139/XML) and a WFS direct access download service (any model, any CRS, GML/XML, rich queries). On top of these, a proxy serves the indexed Web, the linked data Web and Web APIs, i.e. search engine crawlers and “Web developers” (schema.org, WGS 84, HTML, GeoJSON, JSON-LD, XML, content negotiation, “follow your nose”, simple API).]

In order to explore support for these technical capabilities using a proxy layer on top of the existing infrastructure, the download services need to meet certain requirements. These are discussed in 6.2.

6.1.3 ldproxy

ldproxy [6] is such a proxy. At the back end, it currently supports WFS. The first version was developed during the Geonovum testbed [1]. ELISE deliverable D2.1.2, Improved interoperability tools for publishing spatial data on the Web, is using ldproxy for experiments.

Some screenshots (Figure 4 to Figure 6, from a workshop at the INSPIRE Conference 2016, see footnote 20) are shown below to illustrate the general approach. http://www.ldproxy.net/ will be updated later in 2017 to use the improved ldproxy version.

All the data in the HTML pages is accessed directly from a WFS and presented as HTML using on-the-fly transformations.

The ldproxy service can be configured to provide the information in a more human-readable way. For example, in Figure 4, the qualified XML element name “lands2:watertorens” has been changed to “Watertorens”.

Figure 4 - Landing page of a "landscape atlas" dataset in the Netherlands

20 http://inspire.ec.europa.eu/events/conferences/inspire_2016/pdfs/2016_workshops/29%20THURS-DAY_WORKSHOPS_H1_9.00-10.30______geo4web-topic4-overview+ldproxy-20160929.pdf

Figure 5 - Page of the spatial object type "water towers" in the Netherlands

The page for the spatial object type provides paged access to all water tower features in the dataset. A map displays the location of the features shown on the current page.

The page also provides access to other representations (GeoJSON, GML, JSON-LD). Links to related information are shown as clickable links in HTML.
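
How a proxy selects among these representations is not described here; one common option, also named in Figure 3, is HTTP content negotiation on the `Accept` request header. The sketch below is a simplified illustration in Python and not a description of how ldproxy works. The function name `choose_media_type` and the order of the `SUPPORTED` list are assumptions; only exact media types, `*/*` and the `q` parameter are handled.

```python
from typing import List, Optional

# Representations named in the text, with HTML as the assumed default.
SUPPORTED = [
    "text/html",
    "application/geo+json",
    "application/gml+xml",
    "application/ld+json",
]


def choose_media_type(accept_header: str, supported: List[str]) -> Optional[str]:
    """Pick the supported media type with the highest q-value in `Accept`.

    Simplified: partial wildcards such as `text/*` and parameters other
    than `q` are ignored. Returns None if nothing acceptable is supported.
    """
    best, best_q = None, 0.0
    for part in accept_header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type, q = fields[0].lower(), 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        if media_type == "*/*":
            candidates = supported[:1]  # fall back to the default
        else:
            candidates = [m for m in supported if m == media_type]
        if candidates and q > best_q:
            best, best_q = candidates[0], q
    return best
```

A server using such a function would typically answer with HTTP status 406 (Not Acceptable) when `None` is returned.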

Figure 6 - Page of a "water tower" feature in the Netherlands

Links to images are embedded in the HTML page of the feature.

6.1.4 Evolution of INSPIRE

As discussed in Chapter 2, this document assumes a careful evolution of INSPIRE. One of the problems with the process of creating the current INSPIRE implementing rules and technical guidelines was that, typically, only the mandated implementation workflow was tested before adoption. The proposed infrastructure, however, was not verified to meet existing user needs.

Going forward, emphasis should be placed on keeping the INSPIRE baseline stable, unless the value of changing it is very clear. Success criteria need to be used to measure value for users: interoperability or conformance should not be the ultimate goals, but the use of the data and its infrastructure.

This is consistent with the general trend of the discussions at the “INSPIRE – What If …” workshop.

6.2 Download services supporting a proxy layer

6.2.1 General

This section analyses the requirements that download services should meet in order to be suitable for use in a proxy layer.

INSPIRE download services that can support the best practices would need to support both a pre-defined dataset conformance class (for bulk download of the dataset and subsets) and a direct access conformance class (for persistent identifiers and API support).

This section looks at all options for download services for which Technical Guidelines exist (WFS, Atom, SOS, WCS). One result of the analysis is that it is recommended to put a priority on spatial data that is available via WFS with support for direct access. The remainder of this chapter will, therefore, focus on spatial datasets that are available via WFS.

6.2.2 Web Feature Service (WFS)

The INSPIRE technical guidance for WFS direct access download services foresees that, to access data, a query is constructed to select the desired features and submitted to the WFS. In practice, however, this typically only works for small datasets. Most WFSs limit the number of features returned by a single query. This is not prohibited by the INSPIRE technical guidance, but most datasets consist of a larger number of features than such limits. To reliably access all features meeting a query, the WFS needs to support paged access, or at least the “startIndex” and “count” parameters to iteratively download all features matching a query. This option is not required by the current INSPIRE technical guidance.
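
The iterative download with “startIndex” and “count” can be sketched as follows in Python. The key-value parameters are those of a WFS 2.0 GetFeature request; the base URL, the feature type name and the `fetch_page` callback (which would perform the HTTP request and parse the response) are placeholders supplied by the caller. The sketch also assumes that the service returns features in a stable order between requests.

```python
from typing import Callable, Iterator, List
from urllib.parse import urlencode


def page_url(base_url: str, type_name: str, start_index: int, count: int) -> str:
    """Build a WFS 2.0 GetFeature request for one page of results."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,
        "startIndex": start_index,
        "count": count,
    }
    return f"{base_url}?{urlencode(params)}"


def iterate_features(
    fetch_page: Callable[[int, int], List[dict]], count: int
) -> Iterator[dict]:
    """Yield all features by requesting pages until a short page arrives.

    `fetch_page(start_index, count)` is a caller-supplied function that
    requests one page and returns the parsed features.
    """
    start_index = 0
    while True:
        page = fetch_page(start_index, count)
        yield from page
        if len(page) < count:
            break
        start_index += count
```

If the number of matching features is an exact multiple of `count`, the loop issues one final request that returns an empty page before it stops.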

Another issue that sometimes exists with WFS deployments is that, although WFS requires the gml:id attributes of all features to be persistent, they sometimes change. This is most likely to occur when the dataset is updated and the update process does not consider that the gml:ids must remain stable. This violates the WFS requirements and the need for persistent identifiers. See also Table 3: DWBP 10, SDWBP 7.
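
A data provider could check for this problem by comparing the gml:id values harvested before and after an update. The Python sketch below is one hypothetical way to do so; how the identifiers are harvested is left out, and the function name and result keys are illustrative.

```python
from typing import Dict, Iterable, Set


def compare_identifiers(
    before: Iterable[str], after: Iterable[str]
) -> Dict[str, Set[str]]:
    """Compare gml:id values harvested before and after a dataset update."""
    old, new = set(before), set(after)
    return {
        "kept": old & new,      # identifiers that stayed the same
        "missing": old - new,   # identifiers that disappeared
        "added": new - old,     # identifiers that appeared
    }
```

A large number of “missing” identifiers matched by a similar number of “added” ones, without a corresponding change in the source data, would suggest that identifiers were regenerated during the update.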

Note that meeting these requirements would also benefit other users of INSPIRE data, not just the Web users.

6.2.3 Atom

No specific requirements exist for Atom-based download services. However, their use in a proxy layer is limited as these download services do not provide access to features.

Therefore, the four key best practices identified in 4.1 cannot be implemented in a proxy layer on top of an Atom feed:

• Best Practice 1: Use globally unique persistent HTTP URIs for Spatial Things

• Best Practice 2: Make your spatial data indexable by search engines

• Best Practice 3: Link resources together to create the Web of data

• Best Practice 12: Expose spatial data through 'convenience APIs'

6.2.4 Sensor Observation Service (SOS)

In a way, a SOS is a specialised WFS where all features are observations. This restriction results in less generic query capabilities and the filtering mechanisms are closely tied to the feature model for observations. In INSPIRE, all SOS must support spatial filtering.

As a result, a proxy similar to the one described in 6.1.3 could also be developed for observation data. There are some differences, for example:

• Unlike WFS, the SOS interface does not support limiting the number of observations that are returned for a query. All queries must return all matching observations. This simplifies the access, but may be a challenge when presenting this in HTML.

• Temporal aspects are important in observation data and the representation in HTML should present the observation(s) not only at their spatial location(s) on a map, but also on a timeline.

• Unlike a WFS, which - in INSPIRE - always provides access to a single dataset, a SOS may be used to share multiple datasets (offerings in the SOS terminology). That is, the (simplified) resource hierarchy would be: dataset collection → dataset → observation, whereas in WFS the (simplified) hierarchy is: dataset → feature type / layer → feature.

• When considering non-XML encodings for the observation data, the following encodings are relevant:

o JSON: OGC Observations and Measurements – JSON implementation (see footnote 21)

o RDF: Semantic Sensor Network Ontology (SOSA) (see footnote 22)

Creating such a proxy layer for observation data is probably not a priority at the moment, as there do not seem to be many download services based on SOS (a simple search for “SOS” in the INSPIRE geoportal results in 2 hits).

6.2.5 Web Coverage Service (WCS)

WCS is somewhat different from WFS and a proxy would be different as a result. First of all, a WCS contains multiple, often many datasets as each coverage is a dataset. At the same time, each coverage is also a spatial object. The resource hierarchy would therefore simply be: dataset collection → dataset/feature.

Key questions are:

21 https://portal.opengeospatial.org/files/?artifact_id=64910

22 https://www.w3.org/TR/vocab-ssn/

• Coverages provided via a WCS can be quite different in their nature. Tables 2 and 3 in the Technical Guidelines for the WCS-based download services provide a good overview of the feature types for which WCS may be used in INSPIRE. As a consequence, the libraries required to process the coverage data, in particular the ranges (e.g. in TIFF, JPEG2000, netCDF, GRIB, etc.), as well as the appropriate ways to display the coverage to a user will differ from coverage to coverage.

• How to present the (list of) coverages in a useful way, if the WCS includes a large number of them?

Exploring how a proxy approach could work for coverage data will require a deeper analysis and experiments. In general, coverage data will be more difficult to handle and use for non-experts compared to vector data, so this is probably not a priority. In INSPIRE, complementary view services should exist that provide the coverage data in ways that already work in a browser (using PNG or JPEG images) and that can be understood and handled more easily by non-experts.

In any case, in order to support that coverage data can be used together with other spatial data on the Web more easily, the WCS (and the view services) should support re-projection of the coverages to the coordinate reference systems WGS 84 lat/lon (http://www.opengis.net/def/crs/EPSG/0/4326) and Web Mercator (http://www.opengis.net/def/crs/EPSG/0/3857).
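As an illustration of the re-projection requirement, the following minimal sketch builds a WCS 2.0 GetCoverage request (KVP encoding) that asks for the coverage in Web Mercator. It assumes that the server implements the WCS CRS extension, which defines the outputCrs parameter; the endpoint URL, the coverage identifier and the function name are hypothetical.

```python
from urllib.parse import urlencode

# CRS URIs as given in the text above.
WGS84 = "http://www.opengis.net/def/crs/EPSG/0/4326"
WEB_MERCATOR = "http://www.opengis.net/def/crs/EPSG/0/3857"


def get_coverage_url(endpoint, coverage_id, output_crs, media_type="image/tiff"):
    """Build a WCS 2.0 GetCoverage URL (KVP encoding) with a requested output CRS."""
    params = {
        "service": "WCS",
        "version": "2.0.1",
        "request": "GetCoverage",
        "coverageId": coverage_id,
        "format": media_type,
        # 'outputCrs' is defined by the WCS CRS extension (assumed to be supported).
        "outputCrs": output_crs,
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and coverage identifier, for illustration only.
url = get_coverage_url("https://example.org/wcs", "elevation_dem", WEB_MERCATOR)
print(url)
```

A proxy could fall back to the native coordinate reference system of the coverage if the service reports an error for the requested output CRS.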

Dealing with coverage data on the Web is also subject to continued work in the joint activity of W3C and OGC. Intermediate results are:

• CoverageJSON (footnote 23),

• Publishing and Using Earth Observation Data with the RDF Data Cube and the Discrete Global Grid System (footnote 24),

• QB4ST: RDF Data Cube extensions for spatio-temporal components (footnote 25).

6.3 Access to a dataset distributed using WFS via a “landing page”

In order to make it easier for non-expert users, access to the dataset should be provided through a single resource, a “landing page” for the dataset, accessible at the persistent URI of the dataset. See Table 2: DWBP 9.

This resource ideally

• combines information from the metadata record on the dataset and the download/view services (see Table 25: DWBP 1, DWBP 2, SDWBP 13);

• is available in HTML with schema.org annotations and perhaps DCAT annotations (see Table 5: SDWBP 2);

• is pleasant to view, easy to understand for non-experts and includes useful information (see Table 5: SDWBP 2);

23 https://www.w3.org/TR/covjson-overview/

24 https://www.w3.org/TR/eo-qb/

25 https://www.w3.org/TR/qb4st/

• has direct links to download the dataset (or pre-defined subsets) using links to the relevant resources from the pre-defined dataset download service (see Table 15: DWBP 17, DWBP 18);

• includes easy to understand information and examples about the data content and the encoding (see Table 5: SDWBP 2);

• shows the spatial extent on a map (see Table 5: SDWBP 2);

• includes easy to understand information and examples of how to access and filter the dataset via API(s) (see Table 22: DWBP 25 and the last bullet item in 6.7);

• includes links to all spatial objects to enable search engines to crawl and humans to browse the data (at least for spatial objects that represent data that contains useful information that others may link to) (see Table 3: DWBP 10, SDWBP 7 and Table 5: SDWBP 2);

• includes information about user feedback and enables users to provide feedback (see Table 31: DWBP 29, DWBP 30);

• is available publicly.

The main challenges include:

• presenting the information in ways that are useful for the users, which may involve a range of actors;

• consolidating the often-inconsistent metadata from metadata records and Capabilities documents.
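The schema.org annotations of a landing page could, for example, be embedded as JSON-LD. The following is a minimal sketch under stated assumptions: the selection of properties is one possible choice, not a profile defined by this document, and the function name and all dataset values are hypothetical.

```python
import json


def landing_page_jsonld(title, description, dataset_uri, bbox, downloads):
    """Return schema.org 'Dataset' annotations as a JSON-LD string.

    bbox is (south, west, north, east) in WGS 84 degrees;
    downloads is a list of (url, media_type) pairs.
    """
    south, west, north, east = bbox
    doc = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": title,
        "description": description,
        "url": dataset_uri,
        "spatialCoverage": {
            "@type": "Place",
            "geo": {
                "@type": "GeoShape",
                # schema.org 'box': lower corner, then upper corner ("lat lon lat lon").
                "box": f"{south} {west} {north} {east}",
            },
        },
        "distribution": [
            {"@type": "DataDownload", "contentUrl": url, "encodingFormat": media_type}
            for url, media_type in downloads
        ],
    }
    return json.dumps(doc, indent=2)


# Hypothetical dataset values, for illustration only.
snippet = landing_page_jsonld(
    "Protected sites (example)",
    "Example dataset used to illustrate landing page annotations.",
    "https://example.org/datasets/protected-sites",
    (50.3, 5.9, 52.5, 9.5),
    [("https://example.org/datasets/protected-sites.gml", "application/gml+xml")],
)
print('<script type="application/ld+json">\n' + snippet + "\n</script>")
```

The title, description and extent would be taken from the consolidated metadata, and the distribution links from the pre-defined dataset download service.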

6.4 Data formats for vector data - HTML

Feature data should be provided in HTML with schema.org annotations, and DCAT annotations where applicable, to support search engines, see Table 5: SDWBP 2 and Table 27: DWBP 15.

As with the landing page, a key challenge is to present the information in ways that are useful for the users (again, see Table 5: SDWBP 2). Some ideas for the HTML page of a feature:

• The “language-independent names” used in the INSPIRE application schemas and the encoding should be replaced by labels for humans. See Table 8: DWBP 13.

• Information about the schemas (structural metadata), including the definitions and descriptions, should be available if needed to understand the meaning, perhaps using tooltips or a link to information in the INSPIRE registry or the other resources / tools on the INSPIRE website. See Table 26: DWBP 3.

• Some values need to be translated (for example, codes to readable text, or ISO 8601 dates to readable dates). See Table 8: DWBP 13.

• Display links to images as images and other links as clickable links.

• Add a map that displays the feature on a basemap. See Table 33: DWBP 32.

• Include a map from the view service. See Table 33: DWBP 32. This is not trivial as the information is not readily available in INSPIRE. Required steps would be: search the discovery service for an infoMapAccessService that “operates on” the same dataset (coupled resource). The layer would have to be determined based on the feature type. The view service should support the Web Mercator projection (which often is not the case).

• Use coordinates to display the location in Google Street View or similar. See Table 33: DWBP 32.

• Provide links to nearby features or other INSPIRE features in the same dataset. See Table 6: SDWBP 3 and Table 11: SDWBP 10.

o In general, links to other datasets could also be derived dynamically. To be useful, the other dataset should be shared on the Web using the same principles, and its API must be known in order to query the related data (footnote 26).

• Enrich the data with links to other data (e.g. DBpedia, Eurostat, EEA). See Table 6: SDWBP 3.

• Add a feedback option. See Table 31: DWBP 29, DWBP 30. As user feedback is under investigation in a separate ELISE deliverable, “Approach and prototype for user feedback on geospatial metadata” (D2.1.3), this should wait until those results are available.

• While all features should have a persistent URI, it is unclear whether there is value in having a separate web page for each feature. For some features, it may be sufficient to present them in a table of the dataset. This will depend on whether the feature provides sufficient information for a web page that is useful for end-users.

See section 6.1.3 for some screenshots of web pages using the ldproxy tooling.

To gain a better understanding of how users are using the resources and where they refer to them, a web analytics tool (footnote 27) could be used.
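Some of the ideas for the HTML page of a feature (labels for humans, translated codes, readable dates, links shown as links or images) can be sketched as follows. This is a minimal illustration and not the ldproxy implementation; the lookup tables, labels and function names are hypothetical, and real labels could be derived from the INSPIRE registry.

```python
from datetime import date
from html import escape

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Hypothetical lookup tables, for illustration only.
PROPERTY_LABELS = {"beginLifespanVersion": "Valid from"}
CODE_LABELS = {"natura2000": "Natura 2000"}
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif")


def readable_date(value):
    """Turn an ISO 8601 date or date-time string into e.g. '12 October 2017'."""
    d = date.fromisoformat(value[:10])  # the date part is the first 10 characters
    return f"{d.day} {MONTHS[d.month - 1]} {d.year}"


def display_value(value):
    """Return an HTML fragment for a single property value."""
    text = str(value)
    if text.startswith(("http://", "https://")):
        url = escape(text, quote=True)
        if text.lower().endswith(IMAGE_SUFFIXES):
            return f'<img src="{url}" alt="">'   # links to images shown as images
        return f'<a href="{url}">{url}</a>'      # other links shown as clickable links
    if text in CODE_LABELS:
        return escape(CODE_LABELS[text])         # code translated to readable text
    try:
        return readable_date(text)               # ISO 8601 date made readable
    except ValueError:
        return escape(text)                      # anything else is shown as-is


def render_property(name, value):
    """Render one property as a definition-list entry with a human-readable label."""
    label = PROPERTY_LABELS.get(name, name)
    return f"<dt>{escape(label)}</dt><dd>{display_value(value)}</dd>"


print(render_property("beginLifespanVersion", "2017-10-12T00:00:00Z"))
```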

6.5 Data formats for vector data - GeoJSON

Besides the default GML encoding and the HTML encoding, GeoJSON should be supported, too. See Table 7: DWBP 12, DWBP 14, SDWBP 4.

As in the HTML case, the data should be easy to understand, see Table 24: SDWBP 12. This will typically involve:

• Flattening structured, complex attribute values and attributes with multiple values.

• Reducing data complexity where most features do not include the information.

• Dropping properties from the JSON data that are void/nil.

• Translating values from controlled vocabularies (e.g. codes to human-readable values).
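The flattening and the dropping of void/nil properties can be sketched as follows. This is a minimal illustration under assumptions: the dotted key names and the 1-based index for multiple values are one possible convention, not one prescribed by this document, and the sample feature is hypothetical.

```python
def flatten(value, prefix=""):
    """Flatten nested objects and arrays into a single-level dict with dotted keys.

    None values (void/nil properties) are dropped.
    """
    flat = {}
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        # Multiple values get a 1-based index appended to the key.
        items = ((str(i + 1), v) for i, v in enumerate(value))
    else:
        if value is not None:
            flat[prefix] = value
        return flat
    for key, child in items:
        name = f"{prefix}.{key}" if prefix else key
        flat.update(flatten(child, name))
    return flat


def simplify_feature(feature):
    """Return a copy of a GeoJSON feature with flattened properties."""
    return dict(feature, properties=flatten(feature.get("properties") or {}))


# Hypothetical feature, for illustration only.
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [7.1, 50.7]},
    "properties": {
        "name": {"text": "Rhine", "language": "deu"},
        "localType": ["river", "waterway"],
        "levelOfDetail": None,
    },
}
print(simplify_feature(feature)["properties"])
```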

26 In the Geonovum testbed [1] this was done with a dataset for public announcements that was ac-cessed using SPARQL. See http://geo4web-testbed.github.io/topic4/#h.4vnq8jpf6th6.

27 https://en.wikipedia.org/wiki/Web_analytics

6.6 Data formats for vector data – RDF / JSON-LD

Potential benefits of using data from INSPIRE in an RDF encoding are currently under investigation. See https://github.com/inspire-eu-rdf/.

If that work progresses, RDF encodings, most likely JSON-LD, should be considered, too. See Table 6: SDWBP 3, Table 7: DWBP 12, DWBP 14, SDWBP 4, Table 9: SDWBP 5, SDWBP 6, SDWBP 7, SDWBP 8 and Table 11: SDWBP 10.

6.7 Data access API for vector data

A simple data access API should be provided in the proxy layer that implements the recommendations in SDWBP 12, see Table 24: SDWBP 12, Table 21: DWBP 24, and Table 20: DWBP 23.

Such an API would also be a generic data access API, as it is not specific to a dataset or to how that dataset is used, but in terms of ease of use it should be closer to a ‘convenience API’ than to the rich and powerful generic data access APIs like WFS, WCS, SOS or SPARQL.

The query capabilities of the API should be limited. Focus on simplicity first and support more powerful capabilities only if there is clear value. The WFS/SOS/WCS interface is available for more powerful queries.

It should support bounding-box and neighbourhood search queries for feature types and subsets.

The API should provide access to the GML, HTML and GeoJSON encodings, using content negotiation (see Table 16: DWBP 19).
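Server-side content negotiation between the three encodings can be sketched as follows. This is a simplified illustration: it orders the media ranges of the Accept header by their q-values only and ignores the specificity rules of HTTP, the function names are hypothetical, and the media types are the ones commonly used for HTML, GeoJSON and GML.

```python
# Encodings named in the text, with commonly used media types.
SUPPORTED = ["text/html", "application/geo+json", "application/gml+xml"]


def parse_accept(header):
    """Parse an HTTP Accept header into (media_range, q) pairs, highest q first."""
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if not fields[0]:
            continue
        q = 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        ranges.append((fields[0].lower(), q))
    # sorted() is stable, so ranges with equal q keep their header order.
    return sorted(ranges, key=lambda r: r[1], reverse=True)


def negotiate(header, supported=SUPPORTED):
    """Return the best supported media type, or None if nothing is acceptable."""
    for media_range, q in parse_accept(header or "*/*"):
        if q <= 0:
            continue
        for candidate in supported:
            if media_range in ("*/*", candidate) or (
                media_range.endswith("/*") and candidate.startswith(media_range[:-1])
            ):
                return candidate
    return None


print(negotiate("text/html;q=0.5, application/geo+json"))
```

A result of None would typically be answered with HTTP status 406 (Not Acceptable), while a missing Accept header falls back to the first supported encoding.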

It could also be provided as RDF depending on the status of the work on INSPIRE in RDF (see 6.6).

The API should support interpolating geometries, if the application wants to present the features at a certain map scale. See Table 9: SDWBP 5, SDWBP 6, SDWBP 7, SDWBP 8.

The API should be documented using the OpenAPI specification (a.k.a. Swagger) and include examples. Swagger UI will allow users to play with the API in a web browser. See Table 22: DWBP 25.

6.8 Coordinate reference systems

By default, data and the API should use WGS84 geodetic coordinates in the proxy layer. See Table 9: SDWBP 5, SDWBP 6, SDWBP 7, SDWBP 8.

A question for discussion is whether the data should be provided in other coordinate reference systems, too. To support applications that need more accurate coordinates, it would probably be appropriate to provide the geometries at least in the native coordinate reference system of the dataset as well.

6.9 Sitemaps

Explore whether and how sitemaps improve the indexing by search engines. See Table 5: SDWBP 2, the related discussions in the Geonovum report [1], and SDWBP 2 (footnote 28) in the Spatial Data on the Web Best Practices [3].
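For such an experiment, sitemaps for the feature pages could be generated along the following lines. This is a minimal sketch: the feature URLs and the function name are hypothetical, and the split into several files reflects the limit of 50,000 URLs per sitemap file in the sitemaps protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50000  # limit defined by the sitemaps protocol


def build_sitemaps(urls, max_urls=MAX_URLS_PER_FILE):
    """Return a list of sitemap XML documents (strings), split at max_urls entries."""
    documents = []
    for start in range(0, len(urls), max_urls):
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for url in urls[start:start + max_urls]:
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
        documents.append(ET.tostring(urlset, encoding="unicode"))
    return documents


# Hypothetical feature URIs; a small max_urls is used to show the splitting.
feature_urls = [
    f"https://example.org/datasets/protected-sites/features/{i}" for i in range(1, 6)
]
sitemaps = build_sitemaps(feature_urls, max_urls=2)
print(len(sitemaps), "sitemap files")
print(sitemaps[0])
```

For datasets with many features, the resulting files would be listed in a sitemap index file so that search engines can discover all of them.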

6.10 Fault-tolerant proxy

From experience, the two main reasons for failure when setting up proxy services using ldproxy are:

• The schema documents advertised by the DescribeFeatureType operation of the WFS are either not accessible (outdated URLs, file or jar URLs, relative URLs, etc.) or not valid. While these are fundamental issues that should be corrected in the WFS configuration, it would also be possible to attempt to “reverse engineer” the schema information from the data in such cases. Those schemas may be incomplete and imprecise, but this does not really affect the HTML representation and would allow proxy services to be created in those cases, too.

• Since support for paging is not required by the current technical guidelines (see 6.2.2), a significant number of the services will not have this capability enabled. A fall-back could be implemented that downloads the maximum number of features supported by the WFS, provides at least paged access to those features, and displays a warning about the limitation.
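The paging fall-back described in the second item can be sketched as follows. This is a minimal illustration and not the ldproxy implementation; the class, the fetch_all callable that stands in for a single WFS GetFeature response, and the response keys are hypothetical.

```python
class PagedFallback:
    """Serve pages from a one-off download when the WFS itself cannot page."""

    def __init__(self, fetch_all, max_features):
        # fetch_all is a hypothetical callable returning the features that the WFS
        # delivers in a single response (at most max_features of them).
        self._features = list(fetch_all())[:max_features]
        # If the server limit was reached, the dataset may contain more features.
        self.truncated = len(self._features) >= max_features

    def page(self, offset=0, limit=10):
        """Return one page of the downloaded features plus an optional warning."""
        items = self._features[offset:offset + limit]
        result = {"features": items, "offset": offset, "numberReturned": len(items)}
        if self.truncated:
            result["warning"] = (
                "The service does not support paging; only the first "
                f"{len(self._features)} features are available."
            )
        return result


# Simulated service with 25 features and a server limit of 20 features.
proxy = PagedFallback(lambda: ({"id": i} for i in range(25)), max_features=20)
print(proxy.page(offset=15, limit=10))
```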

6.11 Datasets and distributions

In the presentation of the resources from INSPIRE on the Web, datasets (and spatial objects) should be the main resources that are shared. Metadata should be provided as part of the dataset to describe it, and spatial data services should be shared as distributions. See Table 25: DWBP 1, DWBP 2, SDWBP 13 and 4.2 (Datasets and distributions) and 4.3 (“INSPIRE datasets”). As explained in these sections, the way in which resources are managed in INSPIRE in metadata records differs to some extent from the best practices on the Web.

In the sense of the Data on the Web Best Practices (see Figure 1), the distributions in INSPIRE are the view and download services (or other spatial data services providing some kind of access to a dataset). A problem is, however, that a spatial data service in INSPIRE may give access to multiple datasets. In this context, the distribution can be seen as the “reified relationship” between a dataset and a service.

This topic may require further discussion first:

• A spatial data service (distribution) in INSPIRE may operate on more than one dataset.

28 direct link: https://www.w3.org/TR/sdw-bp/#indexable-by-search-engines

• GeoDCAT-AP [5] supports both approaches: services may be represented as distributions of a dataset, as described in 4.2 (Datasets and distributions), but they may also be modelled as independent entities of type “Service” in a catalogue (like datasets and series) (footnote 29).

• The new W3C Dataset Exchange Working Group (DXWG) (footnote 30) is looking into related questions (footnote 31). The progress of the work should be monitored.

6.14 “INSPIRE datasets”

Section 4.3 discussed that some data providers treat the publication of an existing dataset in INSPIRE as a new dataset, not as a distribution. It might be worth discussing this topic in the MIG to achieve an approach that is consistent and useful for users searching for spatial data. See also Table 25: DWBP 1, DWBP 2, SDWBP 13.

6.15 Improve metadata content

The metadata available in INSPIRE could be improved in several ways (in addition to the general topics in the two previous sections):

• Consistency: The metadata is often not consistent; for example, the information about a dataset in a WMS/WFS capabilities document differs from the information in the metadata records in discovery services.

• The information is not always up-to-date.

• Completeness: In some cases, metadata elements are described in the Technical Guidelines, but are not populated in the metadata records. However, the goal should not be to populate as many elements as possible, but to focus on elements that are typically useful for developers and non-experts.

See Table 12: DWBP 7, Table 25: DWBP 1, DWBP 2, SDWBP 13, Table 26: DWBP 3, Table 29: DWBP 4 and Table 30: DWBP 5, DWBP 6.

This topic cannot be addressed “on top of the existing infrastructure”; it would require effort from data providers.

6.16 Additional ideas

6.16.1 Ideas for the INSPIRE geoportal

In general, some of the proxy capabilities discussed in this chapter could also be included in the INSPIRE geoportal. For example, a capability to browse a dataset in the geoportal without a WFS, WCS or SOS client.

29 The current GeoDCAT-AP specification does not model services differently from direct file download distributions, but some work has been done in the framework of the DCAT-AP implementation guidelines on this topic. See https://www.w3.org/2016/11/sdsvoc/SDSVoc16_paper_25#modelling-service-api-based-data-access.

30 https://www.w3.org/2017/dxwg/

31 See, for example, https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID18. Other use cases under discussion are relevant, too.

6.16.2 Ideas for “INSPIRE in Practice”

The “INSPIRE in Practice” platform could be used to document good implementation examples as a reference where certain aspects that others are struggling with have been taken care of. Two examples related to identifiers:

• Implementation examples where a direct access download service using WFS is provided and where the publication process takes explicit care to keep the gml:id attributes of its features stable during updates of the dataset. See also Table 3: DWBP 10, SDWBP 7 and 6.2.2.

• Implementation examples where a data provider has developed a strategy for using persistent HTTP URIs for each feature. This is particularly useful where the URIs can be dereferenced.

APPENDIX 1: REFERENCES AND RELATED DOCUMENTS

ID | Reference or Related Document | Source or Link/Location

1 | Spatial Data on the Web using the current SDI, C. Portele, P. van Genuchten, L. Verhelst, A. Zahnen, 8 June 2016 | http://geo4web-testbed.github.io/topic4/

2 | Data on the Web Best Practices, W3C Recommendation, 31 January 2017 | https://www.w3.org/TR/dwbp/

3 | Spatial Data on the Web Best Practices, W3C Working Group Note, 11 May 2017 | https://www.w3.org/TR/sdw-bp/

4 | INSPIRE – What if…? – Workshop at OGC TC meeting, 23 March 2017 | http://inspire.ec.europa.eu/news/inspire-–-what-if…-–-workshop-ogc-tc-meeting

5 | GeoDCAT-AP: A geospatial extension for the DCAT application profile for data portals in Europe, 1.0.1 | https://joinup.ec.europa.eu/node/154143/

6 | ldproxy, GitHub repository | https://github.com/interactive-instruments/ldproxy