A Web-Based Resource Model for eScience: Object Reuse & Exchange. 2008 Microsoft eScience Conference, Indianapolis, December 8, 2008. OAI-ORE Editors: Carl Lagoze (Cornell University), Herbert Van de Sompel (Los Alamos National Laboratory), Pete Johnston (Eduserv Foundation), Michael Nelson
A Web-Based Resource Model for eScience:
Object Reuse & Exchange
2008 Microsoft eScience Conference
Indianapolis, December 8, 2008
OAI-ORE Editors
• Carl Lagoze
o Cornell University
• Herbert Van de Sompel
o Los Alamos National Laboratory
• Pete Johnston
o Eduserv Foundation
• Michael Nelson
o Old Dominion University
• Rob Sanderson
o University of Liverpool
• Simeon Warner
o Cornell University
Joint work with …
ORE Technical Committee
• Chris Bizer, Freie Universität Berlin
• Les Carr, University of Southampton
• Tim DiLauro, Johns Hopkins University
• Leigh Dodds, Ingenta
• David Fulker, UCAR
• Tony Hammond, Nature Publishing Group
• Pete Johnston, Eduserv Foundation
• Richard Jones, HP Labs
• Carl Lagoze, Cornell University
• Peter Murray, OhioLINK
• Michael Nelson, Old Dominion University
• Ray Plante, NCSA and National Virtual Observatory
• Rob Sanderson, University of Liverpool
• Herbert Van de Sompel, Los Alamos National Laboratory
• Simeon Warner, Cornell University
• Jeff Young, OCLC
ORE Liaison Group
• Leonardo Candela, Consiglio Nazionale delle Ricerche - DRIVER
• Tim Cole, University of Illinois Urbana-Champaign - Aquifer
• Julie Allinson, JISC
• Jane Hunter, University of Queensland - DEST
• Savas Parastatidis, Microsoft Corporation
• Sandy Payette, Fedora Commons
• Thomas Place, University of Tilburg - DARE
• Andy Powell, Eduserv Foundation - DCMI
• Robert Tansley, Google, Inc. - DSpace
OAI Object Reuse and Exchange: Support
• The Andrew W. Mellon Foundation
• The Coalition for Networked Information
• Joint Information Systems Committee
• Microsoft Corporation
• The National Science Foundation
OAI Object Reuse and Exchange
Subject: Aggregations of Web resources
Approach: Publish Resource Maps to the Web that Instantiate, Describe, and Identify Aggregations
Aggregations
At one time it was possible to convey all scientific information about a topic in a single “convenient” medium.
Babylonian Astronomical Catalogue
Aggregations
But quickly the limitations of that medium became obvious.
text, data
1857 Astrophysics paper
Aggregations
Those limitations seem to live on.
Aggregations
“Solving” the problem with ad hoc methods.
Photo plate kept separate from text (digitized version of original plate shown)
text
1890 Astrophysics paper
Hubble optical observation (Baltimore, MD)
Basic object information (Strasbourg, France)
Aggregations
Objects of interest in eScience are by nature compound.
text
2006 Astrophysics paper
XMM-Newton X-ray observation (Vilspa, Spain)
Chandra X-ray observation (Cambridge, MA)
A1795
Aggregations!
http://arxiv.org/abs/astro-ph/0611775
Formats
Versions
Identifiers
Relationships
Splash page
Object Reuse and Exchange: A Web-Centric Approach
• The Web Architecture as the platform for interoperability
• De facto integration with existing Web applications
• Potential of adoption by other communities
• Potential of tools created by other communities
• Incorporating the “social web” (Web 2.0) in eScience
Foundations of OAI-ORE
o Web Architecture
- <http://www.w3.org/TR/webarch/>
o Semantic Web, RDF
- <http://www.w3.org/TR/rdf-primer/>
o Linked Data
- <http://linkeddata.org/>
- <http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/>
o Cool URIs for the Semantic Web
- <http://www.w3.org/TR/cooluris>
W3C Web Architecture
[Diagram: a URI identifies a Resource; Representation 1 and Representation 2 each represent the Resource and are selected through Content Negotiation.]
The tools we have to solve the interoperability problem are:
• Resource
• URI
• Representation
Semantic Web
The tools we have to solve the interoperability problem are:
• URI
• RDF
• Vocabularies
Linked Data
• Linked Data principles:
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information.
4. Include links to other URIs, so that they can discover more things.
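As a rough illustration of principles 2 and 3, the sketch below prepares (but does not send) an HTTP lookup of a name and asks for a machine-readable description. The URI is the arXiv example from this talk; the `build_lookup` helper and the choice of `application/rdf+xml` as the preferred media type are assumptions made for illustration, not part of the Linked Data principles themselves.

```python
from urllib.request import Request


def build_lookup(uri: str, media_type: str = "application/rdf+xml") -> Request:
    """Prepare an HTTP GET for `uri` that prefers `media_type` (hypothetical helper)."""
    # Principle 2: the name is an HTTP URI, so anyone can look it up.
    # Principle 3: the Accept header asks the server for useful,
    # machine-readable information about the named thing.
    return Request(uri, headers={"Accept": f"{media_type}, */*;q=0.1"})


# The request is only constructed here; nothing is sent over the network.
request = build_lookup("http://arxiv.org/abs/astro-ph/0611775")
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))
```

Whether a given server honors the `Accept` header is up to that server; the sketch only shows what a client following the principles might ask for.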
OAI Object Reuse and Exchange: The Approach
Subject: Aggregations of Web resources
Approach: Publish Resource Maps to the Web that Instantiate, Describe, and establish identity of Aggregations
Approach: Instantiate Aggregations as Resources with unique URIs on the Web
An Aggregation and the Web
• Resources of an Aggregation are distinct URI-identified Web resources
• Missing are:
o The boundary that delineates the Aggregation in the Web
o An identity (URI) for the Aggregation
Publish a Resource Map to the Web
The Resource Map Describes the Aggregation
The Resource Map and the Aggregation integrate into the Web
ORE Data Model
We want to have our cake and to eat it too (don't we all?):
o ORE should be simple and easy to use without deep understanding
- Use simple tools and rules to create Atom Resource Maps
o ORE should have a well-crafted data model that enables interoperability through well-defined semantics
- Separate design from implementation
- Future-proof ORE – today's technologies will be replaced (even HTTP?)
- Don't need to understand Data Model fully to do ORE
Aggregation: Resource that is a set of resources
This resource is an Aggregation
This resource is an Aggregated Resource
A Relationship defined in the ORE vocabulary
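One way to picture this part of the data model is as a handful of RDF-style triples, written here as plain Python tuples. This is a minimal sketch: `ore:Aggregation` and `ore:aggregates` come from the ORE vocabulary, while the `example.org` URIs and the `aggregation_triples` helper are hypothetical names used only for illustration.

```python
ORE = "http://www.openarchives.org/ore/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def aggregation_triples(aggregation_uri, aggregated_uris):
    """Return (subject, predicate, object) triples for one Aggregation."""
    # The Aggregation is itself a resource with its own URI and type.
    triples = [(aggregation_uri, RDF_TYPE, ORE + "Aggregation")]
    # Each member is linked by the ore:aggregates relationship.
    for uri in aggregated_uris:
        triples.append((aggregation_uri, ORE + "aggregates", uri))
    return triples


# Hypothetical URIs: one Aggregation with three Aggregated Resources.
triples = aggregation_triples(
    "http://example.org/aggregation/A-1",
    [f"http://example.org/resource/AR-{n}" for n in (1, 2, 3)],
)
for triple in triples:
    print(triple)
```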
Resource Map: Describes an Aggregation:
This resource is a Resource Map
Resource Map
Serialization
The resource has a representation
HTTP GET
ore:isDescribedBy
Implied as inverse of “describes”
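The inverse relationship can be sketched as a simple inference over triples: wherever a Resource Map states `ore:describes`, the matching `ore:isDescribedBy` statement is implied. The `example.org` URIs and the `with_inverse_descriptions` helper are assumptions for illustration only.

```python
ORE = "http://www.openarchives.org/ore/terms/"


def with_inverse_descriptions(triples):
    """Add the ore:isDescribedBy triple implied by every ore:describes triple."""
    implied = {
        (obj, ORE + "isDescribedBy", subj)
        for subj, pred, obj in triples
        if pred == ORE + "describes"
    }
    return set(triples) | implied


REM = "http://example.org/rem/A-1"          # hypothetical Resource Map URI
AGG = "http://example.org/aggregation/A-1"  # hypothetical Aggregation URI

stated = [(REM, ORE + "describes", AGG)]
for triple in sorted(with_inverse_descriptions(stated)):
    print(triple)
```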
Recommend use of HTTP URIs
• HTTP is the technology of today's Web
• Want to be able to cite or refer to the Aggregation but get the Resource Map describing it
o Follow Linked Data strategies to link: access URI-A, get redirected to URI-R (HTTP 303), or use a simple hash (#) URI
o Provides notion of Authority
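The two linking strategies can be sketched without any network access. In this illustration a dictionary stands in for a server's HTTP 303 responses; the `SEE_OTHER` table, the `example.org` URIs, and the `resource_map_uri` helper are all hypothetical.

```python
from urllib.parse import urldefrag

# Hypothetical stand-in for a server answering "303 See Other":
# Aggregation URI (URI-A) -> Resource Map URI (URI-R).
SEE_OTHER = {
    "http://example.org/aggregation/A-1": "http://example.org/rem/A-1.atom",
}


def resource_map_uri(aggregation_uri):
    """Return the Resource Map URI a client would end up fetching, or None."""
    base, fragment = urldefrag(aggregation_uri)
    if fragment:
        # Hash URI: a client drops the fragment before the GET, so the
        # document it retrieves is the Resource Map itself.
        return base
    # Otherwise rely on the (simulated) 303 redirect from URI-A to URI-R.
    return SEE_OTHER.get(aggregation_uri)


print(resource_map_uri("http://example.org/aggregation/A-1"))
print(resource_map_uri("http://example.org/rem/A-2.rdf#aggregation"))
```

In both cases the Aggregation keeps a URI of its own, distinct from the URI of the Resource Map that describes it.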
Multiple Resource Maps
o An Aggregation MAY be asserted and described by multiple Resource Maps
o The purpose of multiple Resource Maps is to provide descriptions of the Aggregation in multiple serializations (e.g., Atom, RDF/XML, RDFa, etc.)
o Each Resource Map MUST have only one representation
Authority
o Authoritative Resource Maps
- Get to Resource Map via Aggregation, usually created by same authority
- Multiple: MUST be minimally equivalent (same Aggregated Resources and Proxies), SHOULD assert mutual existence
o Non-authoritative Resource Maps
- Best practice is to not create them
- Assert your own Aggregation instead
- Use rdfs:seeAlso to assert relationship between two Aggregations
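A check for minimal equivalence might look like the sketch below, which reads "same Aggregated Resources and Proxies" as set equality over the objects of `ore:aggregates` and the subjects of `ore:proxyFor`. That reading, the helper names, and the `example.org` URIs are assumptions for illustration; the Resource Maps are modeled as plain lists of triples.

```python
ORE = "http://www.openarchives.org/ore/terms/"


def aggregated_and_proxies(triples):
    """Collect the Aggregated Resources and the Proxies named in a Resource Map."""
    aggregated = frozenset(o for _, p, o in triples if p == ORE + "aggregates")
    proxies = frozenset(s for s, p, _ in triples if p == ORE + "proxyFor")
    return aggregated, proxies


def minimally_equivalent(rem_a, rem_b):
    """True if both Resource Maps list the same Aggregated Resources and Proxies."""
    return aggregated_and_proxies(rem_a) == aggregated_and_proxies(rem_b)


AGG = "http://example.org/aggregation/A-1"  # hypothetical Aggregation URI
atom_rem = [(AGG, ORE + "aggregates", "http://example.org/resource/AR-1"),
            (AGG, ORE + "aggregates", "http://example.org/resource/AR-2")]
rdfxml_rem = list(reversed(atom_rem))  # same statements, different order
partial_rem = atom_rem[:1]             # one Aggregated Resource missing

print(minimally_equivalent(atom_rem, rdfxml_rem))
print(minimally_equivalent(atom_rem, partial_rem))
```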
Multiple Resource Maps
[Diagram: Resource Maps serialized as Atom, RDFa, and RDF/XML each point to the Aggregation with ore:describes. Some are labeled authoritative Resource Maps; the others are labeled non-authoritative Resource Maps.]
Not much else
Association with another resource/identifier
Adding other properties to the core
The ReM makes the assertions
Metadata about the ReM
Metadata about the Aggregation
Required
Asserting other Relationships
Aggregation is a journal
Aggregation has another version “A”
Aggregated Resources are articles
“AR-3” is by Stephen Hawking
The ReM makes the assertions
Assertions about the Aggregation.
Assertions about Aggregated Resources.
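The statements on this slide can be sketched as triples carried by one Resource Map and then grouped by subject. The `ore:` and `dcterms:` property names exist in those vocabularies; the `example.org` URIs, the `Journal` and `Article` classes, and the `assertions_about` helper are hypothetical and chosen only for illustration.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCTERMS = "http://purl.org/dc/terms/"
ORE = "http://www.openarchives.org/ore/terms/"

REM = "http://example.org/rem/A-1"          # hypothetical Resource Map
AGG = "http://example.org/aggregation/A-1"  # hypothetical Aggregation
AR3 = "http://example.org/article/AR-3"     # hypothetical Aggregated Resource

# Every statement below is made by (serialized in) the one Resource Map.
rem_triples = [
    (REM, ORE + "describes", AGG),
    (AGG, RDF_TYPE, "http://example.org/vocab/Journal"),
    (AGG, DCTERMS + "hasVersion", "http://example.org/aggregation/A"),
    (AGG, ORE + "aggregates", AR3),
    (AR3, RDF_TYPE, "http://example.org/vocab/Article"),
    (AR3, DCTERMS + "creator", "Stephen Hawking"),
]


def assertions_about(subject, triples):
    """Return the (predicate, object) pairs stated about one subject."""
    return [(p, o) for s, p, o in triples if s == subject]


for label, subject in (("Aggregation", AGG), ("Aggregated Resource AR-3", AR3)):
    print(label)
    for predicate, obj in assertions_about(subject, rem_triples):
        print("  ", predicate, obj)
```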
Limits of Assertions thus Far
• The meaning of an RDF triple is independent of the context in which it is stated
• Think of the difference:
o Carl is a man
o Carl is visiting Indianapolis
• All the triples described thus far are context independent
o Therefore they can have the URI of an aggregated resource as subject or object
o But remember that this is just the URI of the Resource and is not exclusive to its being an Aggregated Resource
• Introduce proxy URI
Proxy: Stands for resource in context of other resource
hasNext might have meaning only in context
lineage: “this came from”
Reuse of data set AR-1 in Aggregation A-2.
The ore:lineage predicate expresses the origin or provenance of data. It needs proxies because the statement depends on context.
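The reuse case can be sketched with two Proxies for the same data set, one per Aggregation, linked by `ore:lineage`. The property names `ore:proxyFor`, `ore:proxyIn`, and `ore:lineage` come from the ORE vocabulary; the `example.org` URIs and the `proxy_triples` helper are hypothetical.

```python
ORE = "http://www.openarchives.org/ore/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def proxy_triples(proxy_uri, resource_uri, aggregation_uri):
    """Triples for a Proxy standing for a resource in the context of one Aggregation."""
    return [
        (proxy_uri, RDF_TYPE, ORE + "Proxy"),
        (proxy_uri, ORE + "proxyFor", resource_uri),
        (proxy_uri, ORE + "proxyIn", aggregation_uri),
    ]


AR1 = "http://example.org/data/AR-1"         # hypothetical data set
A1 = "http://example.org/aggregation/A-1"    # where AR-1 first appeared
A2 = "http://example.org/aggregation/A-2"    # where AR-1 is reused
P1 = "http://example.org/proxy/AR-1-in-A-1"
P2 = "http://example.org/proxy/AR-1-in-A-2"

triples = proxy_triples(P1, AR1, A1) + proxy_triples(P2, AR1, A2)
# "This came from": the lineage statement links the two Proxies, not AR-1
# itself, because it only holds for AR-1 in the context of A-2.
triples.append((P2, ORE + "lineage", P1))

for triple in triples:
    print(triple)
```

Stating the same thing directly on AR-1's own URI would make it a context-free claim about the data set everywhere it appears, which is the limitation the Proxy is meant to avoid.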
ORE Deployment
arXiv.org: ORE possibilities
arXiv is an e-print archive of 500k scholarly articles
Express:
• Structure of arXiv: archives, sub-categories, articles
• Versioning: “article” (concept) and specific versions and formats
• Articles by Joe Smith – somewhat like a result set
• Constituents of an article (metadata, PDF, source, video, data, extracted references)
• Describe internal and external components (e.g. external video associated with article but on Perimeter Institute server)
• Use as part of workflow for ingest – assembly of components, possible combination with SWORD
http://www.openarchives.org/oreChem/
SCOPE Architecture