Linked Data: Principles and Practice
Joe Futrelle, Woods Hole Oceanographic Institution
WHOI / BCO-DMO, July 11, 2011
Grand challenge: whole systems
Observation and modelling of multiple systems at multiple scales
Linking data from different disciplines to get useful global results!
“... modelling complex systems will be a major research challenge for the 21st century”
- National Science Foundation
Building current practices up isn't working
• Heterogeneous tools, data formats
• Can’t get everyone in one workgroup
• Funding goes to science, not stewardship
M.C. Escher, “Tower of Babel” (1928)
Proposed solutions aren't working
• e-Journals – not machine-interpretable
• Collaboration tools
– everyone falls back on email & other p2p
• Portals and repositories – typically:
– centralized
– domain-specific
• “The Grid” – can orchestrate complex processing jobs, but that's not science
Only networks work at scale
• Desktop (single researcher): ad hoc data mgt, single-user apps
• Workgroup (community): community tools, resources, control
• Network (global): no global practice, tools, control
Or to put it another way …
Ted Nelson, Computer Lib / Dream Machines (1974)
Data is the network
There is no boundary, center, or locus of control … so it scales
linkeddata.org (2009)
Benjamin Franklin (1754)
“If you can’t tweet your dataset, it doesn’t exist”
• Links are the global currency of the internet
• The more people link to you, the more you matter (e.g., PageRank)
• If nobody can link to your data, they will choose data they can link to instead
• If someone links to your data, someone will link to them, and thus to you
• The lowest entry barrier wins
Don’t drink the Kool-Aid
• Semantic web “layer cake”
• Where do we do actual work?
– User interface?
– Applications?
• “Semantic Grid” (D. De Roure, C. Goble)
(source: World Wide Web Consortium)
Semantics = what they hear
• Shared semantics are minimal
• Maximal semantics emerge when multiple nodes act on partial information
• Validating each exchange doesn’t scale
Gary Larson (1983)
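One way to picture "multiple nodes act on partial information": each node publishes only the statements it knows, and a consumer merges them. The following is a minimal Python sketch, assuming statements are (subject, property, value) triples and that merging is plain set union. The example.org identifiers, the two nodes, and the helper names `merge` and `describe` are hypothetical.

```python
# Each node holds a partial description; all example.org identifiers are hypothetical.
DCT = "http://purl.org/dc/terms/"          # Dublin Core terms namespace
CRUISE = "http://example.org/cruise/42"    # hypothetical identifier

node_a = {
    (CRUISE, DCT + "title", "Spring bloom survey"),
}
node_b = {
    (CRUISE, DCT + "creator", "http://example.org/person/jdoe"),
    (CRUISE, DCT + "title", "Spring bloom survey"),  # overlap is harmless
}

def merge(*graphs):
    """Merge partial graphs by set union; no node needs the full picture."""
    merged = set()
    for g in graphs:
        merged |= g
    return merged

def describe(graph, subject):
    """Collect every known property of one subject."""
    out = {}
    for s, p, o in graph:
        if s == subject:
            out.setdefault(p, set()).add(o)
    return out

combined = merge(node_a, node_b)
description = describe(combined, CRUISE)
```

Nothing is validated at exchange time here; the merged description is simply richer than what either node held alone.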
Design data for network effects
• Global, persistent identification
• Open models (tolerate incompleteness)
• Transparent protocols (pass-through)
• “Graceful degradation” (cf. Dublin Core)
• Data outlives code, so data should control code, not the other way around
• Semantics matter, so they must be explicit and machine-readable (not a side effect of running code)
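A rough illustration of "open models" and "graceful degradation", as a Python sketch rather than a prescribed implementation. It assumes a record is a dictionary from property URI to value. The consumer understands two Dublin Core terms, tolerates missing ones, and passes through anything it does not recognize. The `summarize` function, the sample record, and the example.org property are hypothetical.

```python
DCT = "http://purl.org/dc/terms/"   # Dublin Core terms namespace

# Properties this consumer understands; everything else passes through.
KNOWN = {DCT + "title": "title", DCT + "creator": "creator"}

def summarize(record):
    """Return (understood, passed_through) for a dict of property -> value."""
    understood = {label: None for label in KNOWN.values()}  # missing stays None
    passed_through = {}
    for prop, value in record.items():
        if prop in KNOWN:
            understood[KNOWN[prop]] = value
        else:
            passed_through[prop] = value   # keep it, do not reject the record
    return understood, passed_through

record = {
    DCT + "title": "CTD cast 17",
    "http://example.org/vocab/maxDepth": 1200,  # hypothetical property
}
understood, extra = summarize(record)
```

The record has no creator and carries a property the consumer has never seen, yet it is still usable: the semantics live in the property URIs, not in the code that reads them.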
Practices that grow the network
• Give everything a portable identifier
• Link entities via properties = network
• Reuse existing ontologies and only build the partial ontologies that fill in the gaps (e.g., don’t re-develop Dublin Core terms)
• Emit metadata early and often; don’t assume curators will do it later (who? $?)
• “Not building a wall; building a brick” (Oblique Strategies, 1970)
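The practices above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the `http://example.org/id/` namespace, the helper names, and the sample values are hypothetical, and the literal escaping is deliberately minimal. The Dublin Core and FOAF property URIs are reused rather than reinvented, and the output is one N-Triples statement per line.

```python
import uuid

DCT = "http://purl.org/dc/terms/"        # Dublin Core terms (reused, not redefined)
FOAF = "http://xmlns.com/foaf/0.1/"      # FOAF vocabulary (reused)
BASE = "http://example.org/id/"          # hypothetical namespace for minted IDs

def mint(kind):
    """Mint a portable identifier: a URI, not a local file path or row id."""
    return f"{BASE}{kind}/{uuid.uuid4()}"

def literal(text):
    """Quote a plain string as an N-Triples literal (minimal escaping only)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'

def triple(subject, prop, obj):
    """Format one statement; as a simplification, objects starting with
    'http' are treated as links and everything else as a string literal."""
    o = f"<{obj}>" if obj.startswith("http") else literal(obj)
    return f"<{subject}> <{prop}> {o} ."

dataset = mint("dataset")
person = mint("person")

lines = [
    triple(dataset, DCT + "title", "Chlorophyll transect, July 2011"),  # sample value
    triple(dataset, DCT + "creator", person),   # a link, not a copied name
    triple(person, FOAF + "name", "A. Researcher"),
]
print("\n".join(lines))
```

Each line is a small "brick": it can be emitted the moment the fact is known and merged later with statements from anywhere else that use the same identifiers.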