Persistent Identification of Agents and Objects of Global Change: Progress in the Global Change Information System. Peter Fox, RPI; Curt Tilmes, NASA; Xiaogang (Marshall) Ma, RPI; Anne Waple, NOAA; Stephan Zednik, RPI; Jin Zheng, RPI. [email protected], ctilmes@usgcrp.gov. www.globalchange.gov


Post on 25-Feb-2016


Page 1: Peter Fox, RPI Curt  Tilmes , NASA Xiaogang  (Marshall)  Ma, RPI Anne  Waple , NOAA

Persistent Identification of Agents and Objects of Global Change: Progress in the Global Change Information System

Peter Fox, RPI

Curt Tilmes, NASA

Xiaogang (Marshall) Ma, RPI

Anne Waple, NOAA

Stephan Zednik, RPI

Jin Zheng, RPI

[email protected], [email protected]

www.globalchange.gov


The Global Change Research Act and USGCRP

• USGCRP was mandated by Congress in the Global Change Research Act (GCRA) of 1990 (P.L. 101-606):

“To provide for development and coordination of a comprehensive and integrated United States Research Program which will assist the Nation and the world to understand, assess, predict, and respond to human-induced and natural processes of global change.”


U.S. Global Change Research Program

The Program:

• Coordinates Federal research to better understand and prepare the nation for global change

• Prioritizes and supports cutting edge scientific work in global change

• Assesses the state of scientific knowledge and the Nation’s readiness to respond to global change

• Communicates research findings to inform, educate, and engage the global community


Global Change Information System (GCIS)

Vision: A unified web-based source of authoritative, accessible, usable, and timely information about climate and global change for use by scientists, decision makers, and the public.


Global Change Research Act (1990), Section 106

…not less frequently than every 4 years, the Council… shall prepare… an assessment which–

• integrates, evaluates, and interprets the findings of the Program and discusses the scientific uncertainties associated with such findings;

• analyzes the effects of global change on the natural environment, agriculture, energy production and use, land and water resources, transportation, human health and welfare, human social systems, and biological diversity; and

• analyzes current trends in global change, both human-induced and natural, and projects major trends for the subsequent 25 to 100 years.


Previous National Climate Assessments

Climate Change Impacts on the United States (2000)

Global Climate Change Impacts in the United States (2009)

Target date for next NCA: 2013

http://nca2009.globalchange.gov


NCA 2009

http://nca2009.globalchange.gov


Prototype Use Case (UC-1)

Name: Discover and visit data center website of dataset used to generate report figure.

Goal: The NCA Report reader sees a figure and wants to know where the data came from.

Summary: A reader of the NCA is browsing the content via the website. He/she sees a figure and wants to know where the data came from. A reference to the publication in which the figure originated appears in the figure caption. Selecting the link to the source publication displays a page of information about the publication including, if available, the publication DOI. The page also includes references to the datasets cited in the publication. Following each dataset reference link presents a page of information about the dataset, including links back to the agency/data center webpage describing the dataset in more detail and making the actual data available for order or download.

Actors: Primary Actor - reader of the NCA

Preconditions: Reader is viewing the NCA online report

Post Conditions: Reader visits the data center dataset website

Normal Flow:

1) System is presenting the NCA report to the reader in a web site. Presentation includes report figure with caption that includes reference to source publication.

2) Reader selects publication reference in figure caption.

3) System displays information about publication, including DOI (if available).

4) Publication information includes publication dataset citations.

5) Reader selects a dataset cited by the publication.

6) System displays information about dataset including links to agency / data center webpages where more information and (potentially) data download links are available.

7) Reader selects the data center link and is redirected to data center dataset webpage.
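
The normal flow above amounts to following links from a figure to its source publication, then to the cited datasets, then to the data center page. A minimal Python sketch, assuming a hypothetical in-memory catalog (the record fields, titles, and example.gov URL are illustrative and are not the GCIS data model):

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Dataset:
    title: str
    data_center_url: str  # landing page at the agency / data center (step 6)


@dataclass
class Publication:
    title: str
    doi: Optional[str] = None  # displayed only if available (step 3)
    datasets: list = field(default_factory=list)  # dataset citations (step 4)


@dataclass
class Figure:
    caption: str
    source: Publication  # publication referenced in the caption (step 1)


def data_center_links(figure: Figure) -> list:
    """Follow figure -> publication -> datasets -> data center pages."""
    return [ds.data_center_url for ds in figure.source.datasets]


# Hypothetical records, for illustration only.
dataset = Dataset("Example dataset", "https://datacenter.example.gov/ds/1")
paper = Publication("Example source paper", doi=None, datasets=[dataset])
figure = Figure("Example figure (source: Example source paper)", paper)

print(data_center_links(figure))
```

Each hop in `data_center_links` corresponds to one reader click in the use case, so the same structure could drive the pages the system displays at steps 3 and 6.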


NCA links to GCIS entities


Key Message & Traceable Account


Key Message vs. “General” Message (early draft)


GCIS


GCIS

• Create an entity from the structured metadata about each thing; tag with related concepts.

• Identify it with a persistent, controlled identifier.

• Present with a human readable web page and a machine interface.

• Represent all relationships between items.


GCIS and W3C PROV

For GCIS, we have agents (people, projects, agencies, data centers, publishers, etc.) who are associated with activities (measuring, deriving, modeling, analyzing, authoring, publishing, archiving, distributing, visualizing, etc.) that generate or use the entities (software, data, images, figures, papers, reports, etc.) related to global change.

We assign local identifiers to each (so we can persistently resolve them) and capture and represent their relationships.

If possible, we link with external authorities: agency data centers, journal publishers, Researcher ID (researcherid.com) or ORCID (orcid.org).


W3C PROV (starting points..)

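[Diagram: the three PROV starting-point classes ENTITY, ACTIVITY, and AGENT, connected by the relations wasGeneratedBy, used, wasDerivedFrom, wasInformedBy, wasAttributedTo, wasAssociatedWith, and actedOnBehalfOf; ACTIVITY carries startedAtTime and endedAtTime.]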

Diagram from W3C PROV group and Ivan Herman
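
Each starting-point relation in the diagram connects a fixed pair of PROV classes. The sketch below records those domain and range pairs as PROV-O defines them (using the full property name `actedOnBehalfOf`) and checks statements against them. The helper name `check` and the sample identifiers are hypothetical.

```python
# Domain and range of the PROV starting-point relations (per PROV-O).
PROV_RELATIONS = {
    "wasGeneratedBy": ("Entity", "Activity"),
    "used": ("Activity", "Entity"),
    "wasDerivedFrom": ("Entity", "Entity"),
    "wasInformedBy": ("Activity", "Activity"),
    "wasAttributedTo": ("Entity", "Agent"),
    "wasAssociatedWith": ("Activity", "Agent"),
    "actedOnBehalfOf": ("Agent", "Agent"),
}


def check(subject_type: str, relation: str, object_type: str) -> bool:
    """Return True if the statement matches the relation's domain and range."""
    return PROV_RELATIONS.get(relation) == (subject_type, object_type)


# Hypothetical GCIS-style statements: (subject, type, relation, object, type).
statements = [
    ("figure-1", "Entity", "wasGeneratedBy", "plotting-run", "Activity"),
    ("plotting-run", "Activity", "used", "dataset-1", "Entity"),
    ("plotting-run", "Activity", "wasAssociatedWith", "analyst-1", "Agent"),
    ("dataset-1", "Entity", "wasAttributedTo", "data-center-1", "Agent"),
]

for subj, s_type, rel, obj, o_type in statements:
    # Every sample statement respects the domain and range table above.
    assert check(s_type, rel, o_type), (subj, rel, obj)
```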


Prototype Use Case (UC-2)

Name: Find Latest Datasets by Keyword

Goal: Search for datasets associated with the keyword “snow”, list search results by recency of publication.

Summary: User story:

I want to look for information concerning “snow.” I don’t know if it is a CLEAN word or a GCMD word or don’t even know what GCMD or CLEAN is. How would I do it, and what would I see on my monitor during the process?

Assumptions: The reader is not assumed to have knowledge regarding the GCMD Keywords (or other) vocabulary.

Actors: Primary Actor - reader of the NCA

Preconditions: TBD

Post Conditions: Reader is presented with a list of datasets associated with the keyword “snow” sorted by dataset publication date.

Normal Flow: TBD

Notes: We are looking into two user interface options for dataset selection by keyword:

1) As a free-text search where the user inputs “snow”.

2) Present the user a faceted browse interface with a vocabulary facet that presents the user with terms from a structured vocabulary. The user can manually select the term(s) which match or contain “snow”.

We intend to implement prototypes of both.
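
Both interface options reduce to filtering dataset records and sorting the hits by publication date. A minimal sketch, assuming hypothetical dataset records with a title, a keyword list, and a publication date (the records and function names are made up and do not come from GCIS, GCMD, or CLEAN):

```python
from datetime import date

# Hypothetical dataset records; titles, keywords, and dates are invented.
DATASETS = [
    {"title": "Snow cover extent", "keywords": ["Snow Cover"],
     "published": date(2011, 5, 1)},
    {"title": "Sea ice concentration", "keywords": ["Sea Ice"],
     "published": date(2012, 3, 9)},
    {"title": "Snow water equivalent", "keywords": ["Snow Water Equivalent"],
     "published": date(2012, 8, 20)},
]


def newest_first(records):
    """Sort records by publication date, most recent first."""
    return sorted(records, key=lambda r: r["published"], reverse=True)


def free_text_search(query, records=DATASETS):
    """Option 1: case-insensitive match against titles and keywords."""
    q = query.lower()
    hits = [r for r in records
            if q in r["title"].lower()
            or any(q in k.lower() for k in r["keywords"])]
    return newest_first(hits)


def facet_terms(records=DATASETS):
    """Option 2, first step: vocabulary terms offered in the facet."""
    return sorted({k for r in records for k in r["keywords"]})


def faceted_browse(selected_terms, records=DATASETS):
    """Option 2, second step: datasets tagged with any selected term."""
    chosen = set(selected_terms)
    return newest_first([r for r in records if chosen & set(r["keywords"])])


print([r["title"] for r in free_text_search("snow")])
print(facet_terms())
print([r["title"] for r in faceted_browse(["Snow Cover"])])
```

The free-text option asks nothing of the user beyond typing a word, which fits the stated assumption that readers do not know the vocabularies. The faceted option instead shows the vocabulary terms so the reader can pick from them.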


NCA links to GCIS entities


Traceable accounts…


Interagency Information Integration

GCIS can use relationships between all relevant information about global change across the agencies:

o From observations to datasets to research papers to models to analyses to organizations to people to synthesized reports to human impacts...

o Determine agency interdependencies: an EPA analysis uses a NOAA model dependent on observations from a NASA satellite.

o Can present unique interagency metrics: "How many papers referenced datasets from a specific satellite?"

o Direct users back to agency data centers for more detailed information and the actual content and data.
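
Once relationships are stored as links between items, finding interdependencies becomes a graph traversal. A minimal sketch, assuming a hypothetical "depends on" relation and made-up item names that mirror the EPA, NOAA, and NASA example above:

```python
# Hypothetical "depends on" links between items; each item is (agency, name).
DEPENDS_ON = {
    ("EPA", "analysis"): [("NOAA", "model")],
    ("NOAA", "model"): [("NASA", "satellite observations")],
    ("NASA", "satellite observations"): [],
}


def upstream(item, graph=DEPENDS_ON):
    """Return every item reachable from `item` by following dependency links."""
    seen, stack = [], list(graph.get(item, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(graph.get(current, []))
    return seen


def agencies_depended_on(item, graph=DEPENDS_ON):
    """Agencies, other than the item's own, whose work the item relies on."""
    return sorted({agency for agency, _ in upstream(item, graph)} - {item[0]})


print(agencies_depended_on(("EPA", "analysis")))
```

The indirect dependency on NASA only shows up because the traversal follows links transitively. A lookup of direct links alone would report NOAA and stop there.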


GCIS Data Mining

Structured information with relationships allows integrated data mining, searching, metrics.

o What projects provided data used to produce figures that were referenced in the 2013 NCA section about coastal sea level rise impacts?

o Which data centers hold data referenced by papers related to forests in the Midwest?

o Which agencies have people working on projects related to societal impacts of extreme weather events?

o Show me the latest papers about health impacts of air quality in California. Which datasets were used in the analysis of air quality in California?
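
Questions like these are chains of relationship lookups. The sketch below answers a variant of the first question over a handful of hypothetical triples. The predicate names and identifiers are illustrative, and a production system would more likely run such queries against a triple store.

```python
# Hypothetical (subject, predicate, object) triples.
TRIPLES = [
    ("section/sea-level", "references", "figure/1"),
    ("section/sea-level", "references", "figure/2"),
    ("figure/1", "producedFrom", "dataset/a"),
    ("figure/2", "producedFrom", "dataset/b"),
    ("dataset/a", "providedBy", "project/x"),
    ("dataset/b", "providedBy", "project/y"),
    ("dataset/c", "providedBy", "project/z"),
]


def objects(subjects, predicate, triples=TRIPLES):
    """All objects linked from any of `subjects` by `predicate`."""
    return {o for s, p, o in triples if p == predicate and s in subjects}


def projects_behind(section):
    """Follow section -> figures -> datasets -> projects."""
    figures = objects({section}, "references")
    datasets = objects(figures, "producedFrom")
    return sorted(objects(datasets, "providedBy"))


print(projects_behind("section/sea-level"))
```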


GCIS Benefits

NCA web portal, GCIS prototype

• NCA content available online

• Searchable, linkable

• Complete provenance, traceability

• Links back to source information including agency sources, scenarios, technical input

• Link to associated and applicable information and tools

• Ensure authoritative and appealing design and accessibility

• Incorporates initial indicators of change, impact and response

• Access to information about NCA process (transparency)

• Facilitates collaboration across segments of the climate science and applications community

• Construct, prototype and test the initial framework

• Use constrained scope and dedicated staff to accomplish a lot in a short time

• Ensure the system design is extensible and able to grow to meet long term GCIS needs

GCIS

• A single web site can lead back to agency global change information across the program

• A friendly, accessible entry into global change information for non-scientists

• Global, persistent, reusable identifiers for each item

• Integrated data catalog provides interagency metrics, data mining, searching, etc.

• Interagency relationships allow discovery of interdependencies and increase collaboration opportunities

• Agency information mapped into a common, consistent model with a standard vocabulary

• Concept tagging and linking improves search results for agency products


URI Schema

• URI for NCA instances consists of 3 parts: domain name, type of instance, identifier

– Domain name: data.globalchange.gov

– Types: Person, Project, Organization, Publication, etc.

– Identifiers: depends on the instance’s type; we will assign a unique id number or construct an identifier based on the instance’s unique property value.
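
The three-part scheme can be sketched as a small helper. This is an illustration only: the function name `gcis_uri`, the lowercase type segment (borrowed from the example URIs in this deck), and the choice to percent-encode the identifier are all assumptions.

```python
from urllib.parse import quote

DOMAIN = "data.globalchange.gov"  # domain name given on the slide


def gcis_uri(instance_type: str, identifier: str) -> str:
    """Join domain, instance type, and identifier into one HTTP URI."""
    # Keep "/" unescaped so path-like identifiers (such as DOIs) stay readable.
    safe_id = quote(identifier, safe="/")
    return f"http://{DOMAIN}/{instance_type.lower()}/{safe_id}"


print(gcis_uri("Organization", "NASA"))
print(gcis_uri("Project", "ACMAP"))
```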


More Examples

Person        http://data.globalchange.gov/person/<unique name>
Publication   http://data.globalchange.gov/publication/doi/<doi>
Project       http://data.globalchange.gov/project/ACMAP
Topic         http://data.globalchange.gov/topic/Human-health
Image         http://data.globalchange.gov/image/<uuid>
Figure        http://data.globalchange.gov/report/<reportid>/figure/<figureid>
Chapter       http://data.globalchange.gov/report/<reportid>/chapter/<chapterid>
Organization  http://data.globalchange.gov/organization/NASA
Model         http://data.globalchange.gov/model/<model_name>
Dataset       http://data.globalchange.gov/dataset/doi/<doi>
Platform      http://data.globalchange.gov/platform/<platform_name>
Instrument    http://data.globalchange.gov/instrument/<instrument_name>
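
Going the other way, a client can split one of these URIs into its type and identifier. A minimal sketch: the function name `parse_gcis_uri` is hypothetical, and treating the first path segment as the type is an assumption drawn from the examples above.

```python
from urllib.parse import urlparse


def parse_gcis_uri(uri: str):
    """Split a URI of the form http://<domain>/<type>/<identifier...>."""
    parsed = urlparse(uri)
    instance_type, _, identifier = parsed.path.lstrip("/").partition("/")
    return parsed.netloc, instance_type, identifier


print(parse_gcis_uri("http://data.globalchange.gov/organization/NASA"))
print(parse_gcis_uri("http://data.globalchange.gov/topic/Human-health"))
```

Nested patterns such as the figure and chapter URIs come back with the report as the type and the rest of the path as the identifier, so a fuller parser would need per-type rules.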


GCIS Ontology for NCA (subset)


Provenance Modeling Example


Linked Data Principles

1. Use URIs as names for things.

2. Use HTTP URIs so that people can look up those names.

3. When someone looks up a URI, provide useful information, using the standards.

4. Include links to other URIs, so that they can discover more things.

http://www.w3.org/DesignIssues/LinkedData.html


Linked Open Data

http://5stardata.info


Data Identifiers

• NASA Earth Science Data Systems Working Group and ESIP Federation study resulted in dataset identifier recommendations, [1] Duerr et al.

• DOI: Digital Object Identifiers provide a well-defined mechanism to attach an identifier to a digital object.

Recommendation adopted by NASA for EOSDIS:

http://earthdata.nasa.gov/wiki/main/index.php/Digital_Object_Identifiers_(DOIs)_for_EOSDIS

doi:10.5067/MEASURES/GSSTF/DATA308

http://dx.doi.org/10.5067/MEASURES/GSSTF/DATA308
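
The two forms above are the same identifier, written once as a `doi:` name and once as a resolver URL. A small sketch of converting between them; the helper names are hypothetical.

```python
RESOLVER = "http://dx.doi.org/"  # resolver used on the slide


def bare_doi(doi: str) -> str:
    """Strip an optional leading 'doi:' label."""
    return doi[4:] if doi.lower().startswith("doi:") else doi


def doi_to_url(doi: str) -> str:
    """Turn 'doi:10.5067/...' (or a bare DOI) into a resolver URL."""
    return RESOLVER + bare_doi(doi)


def split_doi(doi: str):
    """Split a DOI name into its prefix and suffix at the first slash."""
    prefix, _, suffix = bare_doi(doi).partition("/")
    return prefix, suffix


print(doi_to_url("doi:10.5067/MEASURES/GSSTF/DATA308"))
print(split_doi("doi:10.5067/MEASURES/GSSTF/DATA308"))
```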


Identifier Resolution

doi:10.5067/MEASURES/GSSTF/DATA308

A common, persistent, citable reference to that dataset.

We build GCIS-specific identifiers from those:

http://data.globalchange.gov/doi/10.5067/MEASURES/GSSTF/DATA308

Then we can resolve it (with content negotiation) on our site, and link it with identifiers for our other resources, including asserting equivalence and linking with the data center responsible for stewardship and distribution of the actual data. We can also refer and link to other repositories of information about those resources.
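
A sketch of the two steps described above: derive the GCIS identifier from the DOI, then record that it is equivalent to the DOI's resolver URL. Using `owl:sameAs` for the equivalence is an assumption (the slide does not name a property), and the helper names are hypothetical.

```python
GCIS_DOI_BASE = "http://data.globalchange.gov/doi/"  # pattern shown on the slide
DOI_RESOLVER = "http://dx.doi.org/"
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"  # assumed equivalence property


def _bare(doi: str) -> str:
    """Strip an optional leading 'doi:' label."""
    return doi[4:] if doi.startswith("doi:") else doi


def gcis_identifier(doi: str) -> str:
    """Build the GCIS-specific identifier for a DOI."""
    return GCIS_DOI_BASE + _bare(doi)


def equivalence_triple(doi: str):
    """(subject, predicate, object) linking the GCIS URI to the DOI URL."""
    return (gcis_identifier(doi), SAME_AS, DOI_RESOLVER + _bare(doi))


print(gcis_identifier("doi:10.5067/MEASURES/GSSTF/DATA308"))
print(equivalence_triple("doi:10.5067/MEASURES/GSSTF/DATA308"))
```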


Content Negotiation

http://data.globalchange.gov/doi/10.5067/MEASURES/GSSTF/DATA308

The server response from the URI depends on what you ask for:

• A traditional browser will ask for HTML, and receive and render a human readable description of the resource.

• Web services can request formal, structured XML or RDF metadata about the resource.

Our goal is to provide a curated collection of authoritative global change information, but always link back to the data center or publisher responsible for the long term stewardship of the resource.
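
The behaviour described above can be sketched as a function that picks a representation from the request's `Accept` header. This is a simplified illustration: the supported media types and the fallback to HTML are assumptions, and q-value handling is reduced to a basic sort. It is not a full implementation of HTTP content negotiation.

```python
# Representations the hypothetical server can produce, in order of preference.
SUPPORTED = ["text/html", "application/rdf+xml", "application/xml"]


def parse_accept(header: str):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if not fields[0]:
            continue
        q = 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        entries.append((fields[0], q))
    # sorted() is stable, so equal q-values keep the client's order.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def negotiate(accept_header: str) -> str:
    """Choose the media type to return for a resource URI."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in SUPPORTED:
            return media_type
        if media_type == "*/*":
            return SUPPORTED[0]
    return SUPPORTED[0]  # assumed default: the human-readable page


browser = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
print(negotiate(browser))
print(negotiate("application/rdf+xml"))
```

A browser-style header selects the HTML page, while a web service that asks only for `application/rdf+xml` gets the structured form from the same URI.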


CLEAN Vocabulary