open data dialog 2013 - linked data in education
Motivation Data on the Web
Some eyecatching opener illustrating growth and or diversity of web data
Linked Data and Education – Opportunities, Challenges & the case of LinkedUp
Stefan Dietze (L3S Research Center, DE,
@stefandietze, http://purl.org/dietze)
Stefan Dietze 18/11/13
„…waiting @ #berlinhbf“
„blurb… Berlin ...main station…“
„blurb… Berlin central…“
„blurb… Tiergarten … Bahnhof…“
?
„…Lehrter Bahnhof…“
HTML pages PDFs Social Data
Once upon a time (just a short while ago in fact)
“A little semantics goes a long way” (J. Hendler1)
[Diagram: nodes dbp:Berlin, dbp:populatedPlace, dbp:Berlin_Hauptbahnhof, dbp:Berlin_Central_Station, dbp:Lehrter_Bahnhof; edge labels typeOf, location, redirectOf (twice)]
Semantic Web
Adding meaning through shared vocabularies and schemas (e.g. DBpedia)
W3C standards RDF & SPARQL for data & knowledge representation and querying
Persistent URIs to reference & interlink data on the Web
1 Hendler, J., The Dark Side of the Semantic Web, IEEE Intelligent Systems, Jan/Feb 2007
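A minimal sketch of the idea above, in plain Python rather than a real RDF library: the different surface forms from the text snippets resolve to one entity once redirects are followed. The node and edge names come from the slide, but which edge connects which pair of nodes is an assumption, and `match` and `resolve` are hypothetical helper names.

```python
# Toy triple store: (subject, predicate, object) tuples.
# Node and edge names follow the slide; the exact edges are assumed.
TRIPLES = {
    ("dbp:Berlin", "typeOf", "dbp:populatedPlace"),
    ("dbp:Berlin_Hauptbahnhof", "location", "dbp:Berlin"),
    ("dbp:Berlin_Central_Station", "redirectOf", "dbp:Berlin_Hauptbahnhof"),
    ("dbp:Lehrter_Bahnhof", "redirectOf", "dbp:Berlin_Hauptbahnhof"),
}


def match(s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def resolve(entity):
    """Follow redirectOf edges until a canonical entity is reached."""
    seen = set()
    while entity not in seen:
        seen.add(entity)
        targets = match(s=entity, p="redirectOf")
        if not targets:
            break
        entity = targets[0][2]
    return entity


# Two different names end up at the same canonical entity.
canonical = {resolve(e) for e in ("dbp:Lehrter_Bahnhof", "dbp:Berlin_Central_Station")}
print(canonical)
print(match(s="dbp:Berlin_Hauptbahnhof", p="location"))
```

In practice the same pattern matching would be expressed as a SPARQL query against an RDF store; the tuple set here only stands in for that.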
[Diagram, extended: node dbp:Tiergarten with edge labels city and typeOf, shown alongside the same text snippets and source types (HTML pages, PDFs, social data)]
Use of URIs, RDF and SPARQL for exposing data
De-facto standard for sharing data on the Web
Vision: well connected graph of open Web data
350+ datasets and 32 billion triples in LOD Cloud alone
Semantic Web / Linked Data
Source: http://lod-cloud.net/state, September 2011
Media Ontology
FOAF
Gene Ontology
FMA Ontology
BIBO
Geo Ontology
DBpedia Ontology
Dublin Core
rNews
Other „incarnations“:
Google Knowledge Graph
Facebook Open Graph
http://schema.org
Linked Data for Education – How is it useful?
1. Linked Data as body of knowledge for education
Vast amount of publicly available resources and data (300+ datasets, 32 billion statements in LOD alone)
Dedicated OER and university data + „knowledge resources“ (from DBpedia to Slideshare)
2. Linked Data as set of principles and W3C standards for data sharing
RDF, SPARQL & shared vocabularies to improve interoperability of educational data
Supports Open Education Resources (OER) vision: reuse across isolated platforms
Interlinking educational Resources and the Web of Data – a Survey of Challenges and Approaches, Stefan Dietze, Salvador Sanchez-Alonso, Hannes Ebner, Hong Qing Yu, Daniela Giordano, Ivana Marenzi, Bernardo Pereira Nunes, Emerald Program: electronic Library and Information Systems, Volume 47, Issue 1 (2013).
Linked Data for Open and Distance Learning, Mathieu d’Aquin, report for the Commonwealth of Learning.
„HTTP-accessibility“ (SPARQL, URI-dereferencing)
„Structure“ & „Semantics“ (=> shared/linked vocabularies)
„Interlinked“
„Persistent“
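As a hedged illustration of "HTTP-accessibility", the sketch below builds (but does not send) the two kinds of request the principle refers to: dereferencing a URI with content negotiation, and a SPARQL protocol GET request that carries the query in the `query` parameter. The endpoint `http://example.org/sparql` and both function names are assumptions made for this example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request


def dereference_request(uri):
    """Build a GET request asking for RDF (Turtle) instead of HTML."""
    return Request(uri, headers={"Accept": "text/turtle"})


def sparql_request(endpoint, query):
    """Build a SPARQL protocol GET request: the query goes in 'query'."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


# Hypothetical example values; nothing is sent over the network here.
deref = dereference_request("http://dbpedia.org/resource/Berlin")
query = sparql_request(
    "http://example.org/sparql",
    "SELECT ?p ?o WHERE { <http://dbpedia.org/resource/Berlin> ?p ?o } LIMIT 5",
)
print(deref.get_header("Accept"))
# Decode the query string again to show what the endpoint would receive.
print(parse_qs(urlsplit(query.full_url).query)["query"][0])
```

Passing either object to `urllib.request.urlopen` would perform the actual request; whether a given server honours the `Accept` header is up to that server.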
http://linkededucation.org
http://linkeduniversities.org
How LD principles can be useful for data sharing LD as background knowledge
Trusted knowledge, exposed via established standards
Shared semantics (enrichment, disambiguation)
http://dbpedia.org/resource/Berlin
<yo:Video 8748720>
<dc:title>Pluto & the Dwarf Planets</dc:title>
…
</yo:Video 8748720>
Video
<sioc:Item 2139393292>
<title>Planetary motion & gravity</title>
…
</sioc:Item 2139393292>
Slideset
Semantics of terms? Topics/categories addressed? Relatedness of resources/entities? (types, semantics)
<po:Programme519215>
<po:Series>Wonders of the Solar System</po:Series>
<po:Episode>Emp. of the Sun</po:Episode>
<po:Actor>Brian Cox</po:Actor>
</po:Programme519215>
Programme
Combining a co-occurrence-based and a semantic measure for entity linking, B. P. Nunes, S. Dietze, M.A. Casanova, R. Kawase, B. Fetahu, and W. Nejdl, ESWC 2013 - 10th Extended Semantic Web Conference (May 2013).
Brian Cox?
Sun?
Pluto?
[Diagram: DBpedia nodes db:Pluto (Dwarf Planet), db:Sun, db:Astronomical Objects and db:Astronomy connecting the resources]
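The following sketch illustrates how shared background knowledge can relate otherwise unconnected resources: titles are linked to entities, and entities to categories. It is a deliberately naive stand-in, not the entity-linking method from the cited paper; the lookup tables and the names `link_entities` and `topics` are invented for illustration.

```python
import re

# Hypothetical label -> entity lookup and entity -> category assignments,
# loosely modelled on the DBpedia names shown on the slide.
ENTITY_LABELS = {"pluto": "db:Pluto", "sun": "db:Sun"}
ENTITY_CATEGORIES = {
    "db:Pluto": {"db:Astronomical_Objects"},
    "db:Sun": {"db:Astronomical_Objects"},
}
BROADER = {"db:Astronomical_Objects": "db:Astronomy"}


def link_entities(text):
    """Naive entity linking: exact, case-insensitive token lookup."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {ENTITY_LABELS[t] for t in tokens if t in ENTITY_LABELS}


def topics(text):
    """Categories of all linked entities, plus their broader categories."""
    result = set()
    for entity in link_entities(text):
        for category in ENTITY_CATEGORIES.get(entity, ()):
            result.add(category)
            if category in BROADER:
                result.add(BROADER[category])
    return result


video = "Pluto & the Dwarf Planets"
episode = "Emp. of the Sun"
slideset = "Planetary motion & gravity"

# The video and the episode share no words, but they share topics.
print(topics(video) & topics(episode))
# The slideset mentions no known entity, so token lookup finds nothing.
print(topics(slideset))
```

The empty result for the slideset shows the limit of plain token lookup, which is why co-occurrence and semantic measures are combined in real entity-linking approaches.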
That’s awesome, but... Hm, really?
…why are so few datasets actually used?
LD reuse and links focus heavily on trusted „reference graphs“ such as DBpedia
Long tail of LD datasets that are neither reused nor linked to (the LOD Cloud alone consists of 300+ datasets)
Explanations?
LD is more heterogeneous than we think
SPARQL Web-Querying Infrastructure: Ready for Action?, Carlos Buil-Aranda, Aidan Hogan, Jürgen Umbrich, Pierre-Yves Vandenbussche, International Semantic Web Conference 2013 (ISWC2013).
SPARQL endpoint availability over time [Buil-Aranda et al 2013]
“Availability” & “Standards” ?
Less than 50% of all SPARQL endpoints are actually responsive at any given point in time (“high reliability”)
“THE” SPARQL protocol? No, but many subsets/variants
Huge differences in response times
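To make the availability figures concrete, here is a small sketch of how such numbers can be derived from a probe log. The log is invented, the function names are hypothetical, and this is not the measurement procedure of Buil-Aranda et al.; it only shows the two ratios involved (per-endpoint availability over time, and the share of endpoints responsive at one point in time).

```python
def availability(probes):
    """Share of successful probes per endpoint: successes / total probes."""
    return {
        endpoint: sum(results) / len(results)
        for endpoint, results in probes.items()
        if results
    }


def responsive_share(probes, index):
    """Share of endpoints that answered the probe at position `index`."""
    answered = [results[index] for results in probes.values()]
    return sum(answered) / len(answered)


# Invented probe log: True = endpoint answered, False = it did not.
PROBES = {
    "http://example.org/a/sparql": [True, True, True, True],
    "http://example.org/b/sparql": [True, False, False, True],
    "http://example.org/c/sparql": [False, False, False, False],
}

print(availability(PROBES))
print(responsive_share(PROBES, 1))
```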
Shared vocabularies & schemas, but:
…still very heterogeneous [d’Aquin, WebSci13]
…data partially messy and not conformant (RDFS, schemas) [HoganJWS2012]
…even widely used reference datasets such as DBpedia are noisy [Fürber2010]
Co-occurrence graph of data types in 146 datasets: 144 vocabularies, 588 highly overlapping types, 719 properties
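A co-occurrence graph of types can be sketched as follows: two types are connected whenever some dataset uses both, and the edge weight counts such datasets. The three datasets and their type sets are invented for illustration and are unrelated to the 146 datasets analysed in the cited study.

```python
from collections import Counter
from itertools import combinations

# Invented sample: which types each dataset uses.
DATASET_TYPES = {
    "dataset_a": {"foaf:Person", "bibo:Document"},
    "dataset_b": {"foaf:Person", "bibo:Document", "aiiso:Course"},
    "dataset_c": {"aiiso:Course", "foaf:Person"},
}


def cooccurrence(dataset_types):
    """Count, for each unordered pair of types, the datasets using both."""
    edges = Counter()
    for types in dataset_types.values():
        # Sorting makes (a, b) and (b, a) map to the same edge key.
        for pair in combinations(sorted(types), 2):
            edges[pair] += 1
    return edges


edges = cooccurrence(DATASET_TYPES)
for (left, right), weight in sorted(edges.items()):
    print(left, right, weight)
```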
Assessing the Educational Linked Data Landscape, d’Aquin, M., Adamou, A., Dietze, S., ACM Web Science 2013 (WebSci2013), Paris, France, May 2013.
Using semantic web resources for data quality management, Fürber, C., Hepp, M., in Proceedings of the 17th international conference on Knowledge engineering and management by the masses (EKAW'10), Springer-Verlag, Berlin, Heidelberg, 211-225, 2010.
An empirical survey of Linked Data conformance, Hogan, A., Umbrich, J., Harth, A., Cyganiak, R., Polleres, A., Decker, S., Journal of Web Semantics 14: pp. 14–44, 2012.
(Linked) Open Data for Education in a nutshell
(Open) Educational Resources
World Wide Web
Distance Universities
Linked Open Data
MOOCs
http://linkededucation.org &
http://linkeduniversities.org
Using/exploiting Linked Data in Education ?
Lack of reliable dataset metadata about
Resource types
Topics & disciplines
Quality, currentness & availability
Provenance
Lack of links and cross-dataset references
Lack of federated query approaches
….
“LinkedUp” – Linking Web Data for Education
European project aimed at advancing take-up of open data and related technologies
http://linkedup-project.eu
Data curation
Collecting & exposing open data of educational relevance => LinkedUp Data Catalog
Profiling and linking of Web Data for education => educational data graph
http://data.linkededucation.org
Technology transfer & community-building
Disseminating knowledge & building communities (educators, computer scientists, data engineers)
Gathering stakeholder feedback: use cases and requirements
http://linkedup-challenge.org/#usecases
http://linkedup-project.eu/events
Success models: data & applications
LinkedUp Challenge to identify innovative tools & applications
Evaluation methods and approaches
http://www.linkedup-challenge.org/
Who we are
LinkedUp Network
LinkedUp Consortium
LinkedUp Advisory Board
Goal: helping data consumers to discover and use suitable datasets
Dataset selection: “LinkedUp/Linked Education cloud” (http://datahub.io/group/linked-education)
RDF (VoID) catalog of datasets (LinkedUp Catalog): classification of datasets according to, e.g., represented types, disciplines/topics, data quality, accessibility
Links and coreferences => unified view on data => Linked Education Graph
Infrastructure, unified (SPARQL) endpoint & APIs for federated querying
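As an illustration of what one entry in a VoID catalog might contain, the sketch below emits a Turtle description with a title, a SPARQL endpoint, topics and represented types. The VoID and Dublin Core terms used (`void:Dataset`, `void:sparqlEndpoint`, `void:classPartition`, `void:class`, `dcterms:title`, `dcterms:subject`) are standard vocabulary terms; the dataset, its URIs and the function `void_description` are invented, and the actual LinkedUp Catalog schema may differ.

```python
PREFIXES = (
    "@prefix void: <http://rdfs.org/ns/void#> .\n"
    "@prefix dcterms: <http://purl.org/dc/terms/> .\n"
)


def void_description(uri, title, endpoint, subjects=(), classes=()):
    """Serialise one dataset description as Turtle.

    Assumes the title contains no double quotes (no escaping is done).
    """
    lines = [
        f"<{uri}> a void:Dataset ;",
        f'    dcterms:title "{title}" ;',
        f"    void:sparqlEndpoint <{endpoint}> ;",
    ]
    lines += [f"    dcterms:subject <{s}> ;" for s in subjects]
    lines += [f"    void:classPartition [ void:class <{c}> ] ;" for c in classes]
    # Terminate the statement: swap the final ';' for '.'.
    lines[-1] = lines[-1][:-1] + "."
    return PREFIXES + "\n" + "\n".join(lines) + "\n"


# Hypothetical dataset used only for illustration.
turtle = void_description(
    uri="http://example.org/dataset/astro-lectures",
    title="Astronomy lecture recordings",
    endpoint="http://example.org/astro/sparql",
    subjects=["http://dbpedia.org/resource/Astronomy"],
    classes=["http://example.org/vocab#Lecture"],
)
print(turtle)
```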
Data curation and dataset profiling: LinkedUp approach
[Diagram: Educational Datasets; LinkedUp Catalog; automated processing to generate a descriptive VoID/RDF dataset catalog and data links]
LinkedUp Data Catalog in a nutshell
http://datahub.io/group/linked-education
http://data.linkededucation.org/linkedup/catalog/
VoID dataset catalog: browse, explore and query for datasets/types
Federated queries using type mappings
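One way to read "federated queries using type mappings" is sketched below: a query over a shared type is rewritten into one query per endpoint, using that endpoint's local type, and the results are merged. All endpoints, type IRIs and function names are hypothetical, and `fake_run_query` is a stand-in for a real SPARQL client; this is not the LinkedUp implementation.

```python
# Hypothetical mapping from one shared type to endpoint-specific type IRIs.
TYPE_MAPPINGS = {
    "ex:Course": {
        "http://example.org/a/sparql": "http://example.org/a/vocab#Course",
        "http://example.org/b/sparql": "http://example.org/b/vocab#Module",
    }
}

QUERY_TEMPLATE = "SELECT ?s WHERE {{ ?s a <{type_iri}> }} LIMIT {limit}"


def rewrite(shared_type, limit=10):
    """One query per endpoint, with the shared type replaced by the local one."""
    return {
        endpoint: QUERY_TEMPLATE.format(type_iri=type_iri, limit=limit)
        for endpoint, type_iri in TYPE_MAPPINGS[shared_type].items()
    }


def federate(shared_type, run_query):
    """Run the rewritten queries and merge the rows into one list."""
    merged = []
    for endpoint, query in rewrite(shared_type).items():
        for row in run_query(endpoint, query):
            merged.append((endpoint, row))
    return merged


def fake_run_query(endpoint, query):
    """Stand-in for a real SPARQL client; returns canned rows."""
    canned = {
        "http://example.org/a/sparql": ["http://example.org/a/course/1"],
        "http://example.org/b/sparql": [
            "http://example.org/b/module/7",
            "http://example.org/b/module/8",
        ],
    }
    return canned[endpoint]


for endpoint, row in federate("ex:Course", fake_run_query):
    print(endpoint, row)
```

Passing the query runner in as a parameter keeps the rewriting logic testable without network access.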
What’s all the data about: dataset profiling
Issue:
Considering LOD as knowledge graph, most nodes are connected
Relevance of topics (DBpedia entities & categories) for particular resources and datasets?
„Topic profile“ of a given dataset?
Goal: extracting representative „topic profile“ for datasets
How: computing normalised (DBpedia) category relevance scores from sample resource sets (scalability vs representativeness)
Applied to entire LOD cloud
DBpedia category graph
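The notion of a normalised topic profile can be sketched with a simple frequency share: for a sample of resources, count how often each category is assigned and divide by the total number of assignments. This is a simplified stand-in, not the scoring used in the cited profiling work; the sample data and the name `topic_profile` are invented.

```python
from collections import Counter


def topic_profile(sample_resources):
    """Normalised category relevance: each category's share of all
    category assignments found in the sample (scores sum to 1)."""
    counts = Counter()
    for categories in sample_resources:
        # A category counts once per resource, even if listed twice.
        counts.update(set(categories))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.most_common()}


# Invented sample: categories assigned to three sampled resources.
SAMPLE = [
    ["db:Astronomy", "db:Astronomical_Objects"],
    ["db:Astronomy"],
    ["db:Astronomy", "db:Physics"],
]

profile = topic_profile(SAMPLE)
for category, score in profile.items():
    print(category, round(score, 2))
```

Working on a sample rather than the full dataset is what trades representativeness for scalability: a larger sample gives a more faithful profile at higher cost.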
Generating structured Profiles of Linked Data Graphs, Fetahu, B., Adamou, A., Dietze, S., d’Aquin, M., Nunes, B.P., ISWC2013 – 12th International Semantic Web Conference.
http://data.linkededucation.org/linkedup/categories-explorer
http://data.linkededucation.org/
Dataset profile explorer http://data.linkededucation.org/request/pipeline/sparql
Series of 3 competitions („Veni“, „Vidi“, „Vici“) running until end of 2014
Open & focused tracks
Total prize budget of almost 40.000 EUR
LinkedUp support activities
http://www.linkedup-challenge.org/
Veni Competition
17 September 2013, Geneva
Tools and demos that analyse or integrate open web data (deadline: 27 June, 1 Open Track, 10.000 EUR awards)
22 submissions, shortlist of 8, from which:
3 winners
People's Choice Award
Final ceremony on 17 September at OKCon, Geneva
The Shortlist incl. 2nd/3rd/People’s Choice
DataConf.
KnowNodes
Mismuseos
ReCredible
YourHistory
GlobeTown - 2nd prize: http://www.globe-town.org/
WeShare - 3rd prize / People’s Choice: http://seek.cloud.gsic.tel.uva.es/weshare/
1st Place: PoliMedia - Exploring political debates & events
Cross-media analysis of political events.
Browsing parliament debates & related media coverage
Automatically generated links between debate transcripts, newspaper articles (including their original layout on the page) and radio bulletins.
Generated data available as Linked Data (http://data.polimedia.nl)
Data sources: 1) newspapers in their original layout of the historical newspaper archive, and 2) radio bulletins of the Dutch National Press Agency (ANP)
9000+ debates (1945 – 1995)
Over 3000 media links
Martijn Kleppe, Max Kemman, Henri Beunders (Erasmus Universiteit Rotterdam), Laura Hollink, Damir Juric (Vrije Universiteit Amsterdam), Johan Oomen, Jaap Blom (Nederlands Instituut voor Beeld en Geluid)
http://www.polimedia.nl/
LinkedUp Vidi Competition
Wanted: tools and demos that analyse or integrate open web data (for education)
Anyone can participate - researchers, students, developers, industry
“Open track” & “focused tracks”
20.000+ EUR worth of awards
Final awards ceremony at 11th Extended Semantic Web Conference (ESWC2014)
Submission: 14 February 2014
Outlook
http://linkedup-challenge.org/
Learning Analytics & Knowledge (LAK) Data Challenge
Analyse, apply, use, exploit the „LAK Dataset“
Finals at Learning Analytics & Knowledge Conference 2014, Indianapolis, US
Submission: 20th January
http://lak.linkededucation.org/
Thank you!
Contact http://purl.org/dietze | @stefandietze
See also (data)
http://datahub.io/group/linked-education
http://data.linkededucation.org
http://data.linkededucation.org/linkedup/catalog/
http://lak.linkededucation.org
See also (general)
http://linkedup-project.eu
http://linkedup-challenge.org
http://linkededucation.org
http://linkeduniversities.org