open education challenge 2014: exploiting linked data in educational applications

Exploiting (Linked) Web Data in Educational Applications Stefan Dietze L3S Research Center http://purl.org/dietze @stefandietze - Open Education Challenge, Berlin, 2014 - 28/10/14 1 Stefan Dietze


Exploiting (Linked) Web Data

in Educational Applications

Stefan Dietze

L3S Research Center

http://purl.org/dietze

@stefandietze

- Open Education Challenge, Berlin, 2014 -

28/10/14 1 Stefan Dietze

Introduction

Research areas

Web & data science, information retrieval, semantic web & Linked Data, data & knowledge integration

Application domains: education/TEL, Web archiving, …

Linked Data for education

Data sharing: TED, Open Courseware, mEducator, LinkedUp, LAK, …

Tutorials & workshops (e.g. “Linked Learning” series)

LinkedUniversities.org and LinkedEducation.org

W3C Linked Open Education community group

Some projects

http://www.l3s.de/

See also: http://purl.org/dietze

Exploiting Open Data for Education? (nutshell)

[Diagram labels: Social Media, (Open) Educational Resources, World Wide Web, Distance Universities, MOOCs, Linked Open Data]

How Open is Open Data?

Open Data (as in “open licensing”)

Open licensing (ODL, CC etc)

Yet: variety of approaches

APIs/feeds: SOAP, REST, etc

Diverse schemas & vocabularies

(lack of) controlled vocabularies

Reuse & interoperability?

Linked Data (technology) (as in “interoperability”)

De facto standard for Open Data on the Web

W3C standards:

Common HTTP interface: SPARQL

Common representation: RDF

Dereferenceable URIs

Shared/linked vocabularies

Linked Open Data

5-star scheme by Sir Tim Berners-Lee
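To make the "common HTTP interface" point concrete, here is a minimal sketch of how a SPARQL query travels over plain HTTP. It only builds the request and does not send it. The endpoint URL `http://dbpedia.org/sparql` and the query itself are assumptions chosen for illustration; the resource URI is the one used elsewhere in these slides.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: DBpedia's public endpoint, used purely as an illustration.
ENDPOINT = "http://dbpedia.org/sparql"

# Illustrative query: list properties and values of one resource.
QUERY = """
SELECT ?p ?o WHERE {
  <http://dbpedia.org/resource/Cambridge_MA> ?p ?o
} LIMIT 10
"""


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL Protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    # Content negotiation: ask for the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
print(request.full_url)
print(request.get_header("Accept"))
```

Passing `request` to `urllib.request.urlopen` would actually execute the query; that step is left out here so the sketch runs without network access.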

Semantic Web

Example: Google Knowledge Graph (DBpedia, Freebase, Yago etc)

W3C standards (RDF & SPARQL) for knowledge representation and querying

URIs to identify/link data

“A little semantics goes a long way” (J. Hendler [1])

[Example graph: nodes dbp:United_States, http://dbpedia.org/resource/Cambridge_MA, dbp:W3C, dbp:MIT, schema:City, ru.dbp:Кембридж_(Массачусетс); edge labels country, cityOf, typeOf, sameAs, headquarterOf]

[1] Hendler, J., The Dark Side of the Semantic Web, IEEE Intelligent Systems, Jan/Feb 2007

HTTP accessibility: persistent URIs, SPARQL
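The graph on this slide can be read as a set of subject–predicate–object triples. The sketch below stores them as plain tuples and answers simple pattern queries, which is roughly what a SPARQL basic graph pattern does. The node and edge names come from the slide, but which edge connects which pair of nodes is an assumption, since the diagram itself is not preserved in the transcript.

```python
CAMBRIDGE = "http://dbpedia.org/resource/Cambridge_MA"

# Assumption: edge directions and endpoints are guessed from the slide labels.
TRIPLES = [
    (CAMBRIDGE, "typeOf", "schema:City"),
    (CAMBRIDGE, "country", "dbp:United_States"),
    (CAMBRIDGE, "sameAs", "ru.dbp:Кембридж_(Массачусетс)"),
    (CAMBRIDGE, "cityOf", "dbp:MIT"),
    (CAMBRIDGE, "headquarterOf", "dbp:W3C"),
]


def match(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    pattern = (s, p, o)
    return [
        t for t in triples
        if all(q is None or q == v for q, v in zip(pattern, t))
    ]


# Which country is the city in?
print(match(TRIPLES, s=CAMBRIDGE, p="country"))
```

Because every node is a URI, the same lookup works across datasets: the `sameAs` triple is what lets an application follow the link into the Russian DBpedia.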

[Figure labels (vocabularies): FOAF, Gene Ontology, BIBO, Geo Ontology, DBpedia Ontology, Dublin Core, BBC Programmes]

Connected graph of open Web data (500+ datasets and 100 billion triples)

Persistent, dereferenceable URIs & content negotiation, shared/linked vocabularies

SPARQL to query via HTTP

Other „incarnations“:

Google Knowledge Graph

Facebook Open Graph

http://schema.org

http://dbpedia.org/resource/Cambridge_MA


LD is not just for your data: schema.org for discovery of content/websites

LD to ensure discoverability of content/websites (e.g. schema.org/microdata/RDFa)

Annotating HTML documents about (educational) material with schema.org (e.g. LRMI, Learning Resource Metadata Initiative)

Adopted by major sites (YouTube, LinkedIn etc.) & tool support (Drupal, WordPress)

http://schema.org

© Ramanathan V. Guha, Google, SemTech2014
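As a rough illustration of annotating a page about a learning resource, the sketch below emits a schema.org description using two LRMI-derived properties, `learningResourceType` and `educationalUse`. It uses JSON-LD rather than the microdata/RDFa syntaxes named on the slide, purely because JSON-LD is easier to generate from code. The resource name, URL and property values are hypothetical.

```python
import json


def describe_resource(name, url, resource_type, educational_use):
    """Return a schema.org description of a learning resource as a dict."""
    return {
        "@context": "http://schema.org",
        "@type": "CreativeWork",
        "name": name,
        "url": url,
        # Properties that entered schema.org via LRMI.
        "learningResourceType": resource_type,
        "educationalUse": educational_use,
    }


# Hypothetical resource, for illustration only.
doc = describe_resource(
    name="Introduction to Linked Data",
    url="http://example.org/resources/intro-linked-data",
    resource_type="lecture",
    educational_use="self study",
)

# Embed the description in an HTML page via a script element.
print('<script type="application/ld+json">')
print(json.dumps(doc, indent=2))
print("</script>")
```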

LD as body of knowledge for education

Educational datasets and vocabularies

University Linked Data: The Open University UK, http://data.open.ac.uk, Southampton University, http://education.data.gov.uk, …

Open Educational Resources metadata: mEducator, Open Learn, Open Courseware, …

Schemas: Learning Resource Metadata Initiative (LRMI), mEducator Educational Resources schema, BIBO, AIISO, …

http://linkededucation.org

http://linkeduniversities.org

Other learning-relevant data & resources

Publications & literature

(Social) media resource metadata

Domain-specific knowledge: Bioportal, Europeana, Geonames, …

Cross-domain factual knowledge: DBpedia, Freebase, …

LD as background knowledge for educational apps?

http://metamorphosis.med.duth.gr/

Title: ECG Patient case 1001 chest and limb leads

“ECG” disambiguation on Wikipedia: 9 meanings

Understanding, enriching, linking data

Title: ECG Patient case 1001 chest and limb leads

1. Understanding data: contextual disambiguation through NLP tools

dbpedia.org/resource/Electrocardiography

2. Enrichment with factual knowledge

dbpedia:Электрокардиография

category:Cardiac_procedures

dbpedia:Willem_Einthoven

3. Interlinking with related resources

bbc:ProgrammeXY

slideshare:SlidesetXY

yovisto:VideolectureXY
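The three steps (understand, enrich, link) can be sketched as a small pipeline. This is a toy under stated assumptions: the sense inventory, its context words and the second candidate sense are invented for illustration, and word overlap stands in for the NLP disambiguation tools the slide refers to. The facts and related resources are the placeholders shown on the slide.

```python
# Hypothetical sense inventory: candidate entity -> supporting context words.
SENSES = {
    "dbpedia:Electrocardiography": {"patient", "chest", "limb", "leads", "heart"},
    "example:Electrochemical_grinding": {"grinding", "metal", "machining"},
}

# Facts and related resources, taken from the slide's placeholders.
FACTS = {
    "dbpedia:Electrocardiography": [
        "category:Cardiac_procedures",
        "dbpedia:Willem_Einthoven",
    ],
}
LINKS = {
    "dbpedia:Electrocardiography": [
        "bbc:ProgrammeXY",
        "slideshare:SlidesetXY",
        "yovisto:VideolectureXY",
    ],
}


def disambiguate(title):
    """Step 1: pick the sense whose context words overlap most with the title."""
    words = set(title.lower().split())
    return max(SENSES, key=lambda entity: len(SENSES[entity] & words))


def annotate(title):
    """Steps 2 and 3: attach factual knowledge and related resources."""
    entity = disambiguate(title)
    return {
        "title": title,
        "entity": entity,
        "facts": FACTS.get(entity, []),
        "related": LINKS.get(entity, []),
    }


result = annotate("ECG Patient case 1001 chest and limb leads")
print(result["entity"])
```

The point of the sketch is the ordering: enrichment and interlinking only become possible once the ambiguous string "ECG" has been resolved to a single URI.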

LinkedUp – Linking Web Data for Education

EC-funded project aimed at advancing take-up of open data and related technologies

“Success models”: data & applications

Supporting innovative tools & applications

Evaluation methods

Technology transfer & community-building

Involving educators, developers, computer scientists, data engineers…

Data curation & profiling

Collecting & exposing open data for education

Profiling of Web Data

http://www.linkedup-challenge.org/

http://data.linkededucation.org

http://www.linkedup-project.eu/events

Community-building and collaboration: joint work on tangible outcomes (datasets, applications, ...)

http://www.linkedup-project.eu/

[Logo groups: Associated Partners, Initiatives, EC Projects]

Publishing & curating educational data

Collected & curated datasets of educational relevance

Beyond collecting: published over 50 datasets as LD together with the most important content providers, e.g. TED, OCW, SoLAR

LinkedUp catalog: most comprehensive collection of LD/Open Data for education

RDF dataset metadata

Federated queries across datasets using type mappings

http://data.linkededucation.org/linkedup/catalog/
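A rough sketch of what "federated queries across datasets using type mappings" can mean in practice: each dataset keeps its own type vocabulary, and a mapping table lets one query for a shared type reach all of them. The dataset names, resource identifiers, types and mappings below are all hypothetical; the actual LinkedUp catalog mappings are not shown in these slides.

```python
# Hypothetical datasets, each using its own type vocabulary.
DATASETS = {
    "dataset_a": [
        ("a:talk1", "rdf:type", "a:Lecture"),
        ("a:person1", "rdf:type", "foaf:Person"),
    ],
    "dataset_b": [
        ("b:course7", "rdf:type", "b:Course"),
        ("b:doc3", "rdf:type", "bibo:Document"),
    ],
}

# Hypothetical type mappings: dataset-specific type -> shared type.
TYPE_MAPPINGS = {
    "a:Lecture": "common:LearningResource",
    "b:Course": "common:LearningResource",
}


def instances_of(shared_type):
    """Find instances of a shared type across all datasets."""
    # Expand the shared type into every local type mapped to it.
    local_types = {t for t, shared in TYPE_MAPPINGS.items() if shared == shared_type}
    local_types.add(shared_type)
    hits = []
    for name, triples in DATASETS.items():
        for s, p, o in triples:
            if p == "rdf:type" and o in local_types:
                hits.append((name, s))
    return hits


print(instances_of("common:LearningResource"))
```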

http://data-observatory.org/lod-explorer

Supporting developers and data consumers

Devtalk blog: developer resource & community to aid developers

Webinars and tutorials

http://data.linkededucation.org/linkedup/devtalk/

Topic-based annotation and discovery of data

Data exploration & visualisation features


LinkedUp events, training & technology transfer: bringing stakeholders together

Stakeholders: Data Providers & Data Scientists; Developers; Users (Learners, Tutors, Teachers)

Community-building through events & communication channels/social media (cross-disciplinary, industry & academia)

Exploitation of project outcomes across communities: technology transfer

(Co-)organised approx. 20 events (tutorials, workshops, booths etc.)

More than 30 invited talks/lectures

….

LinkedUp Challenge

Series of Open Data Competitions to promote applications which exploit Linked Open Data

May – September 2013 | October 2013 – May 2014 | May 2014 – October 2014

http://www.linkedup-challenge.org/

[Bar chart: submissions and shortlisted entries per competition (Veni, Vidi, Vici); axis 0–25; bar labels: submissions 23, 14, 13; shortlist 8, 9, 10]

LinkedUp Challenge results

50 submissions of which 27 were shortlisted and supported (through travel grants, participation in events and rewards)

13 Veni, Vidi, Vici winners (grants: 1000 – 3000 €)

Authors from 23 distinct, mostly European countries

LinkedUp submissions & shortlist

Top-10 authors' origins

Country Authors
United Kingdom 21
United States 15
Netherlands 15
France 14
Spain 13
Germany 11
Italy 7
Belgium 5
Greece 4
Croatia 4

Issues (1/3) - open data is messier than we think

Accessibility of datasets?

Less than 50% of all SPARQL endpoints are actually responsive at a given point in time [Buil-Aranda2013]

[Figure: SPARQL endpoint availability over time [Buil-Aranda2013]]

“THE” SPARQL protocol? No, there are many variants & subsets

Data “quality”?

…data accuracy (eg DBpedia)? [Paulheim2013]

…vocabulary reuse/links? [D’AquinWebSci13]

…schema compliance (RDFS, schemas) [HoganJWS2012]
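The accessibility figure quoted on this slide is a share of responsive endpoints at one point in time. A minimal sketch of that measurement follows; the probe is injected as a function so the example runs offline. The endpoint URLs and their statuses are hypothetical and are not data from [Buil-Aranda2013].

```python
from typing import Callable, Iterable


def availability(endpoints: Iterable[str], is_responsive: Callable[[str], bool]) -> float:
    """Return the fraction of endpoints for which the probe succeeds."""
    endpoints = list(endpoints)
    if not endpoints:
        return 0.0
    responsive = sum(1 for endpoint in endpoints if is_responsive(endpoint))
    return responsive / len(endpoints)


# Hypothetical snapshot: endpoint URL -> did it answer the probe?
STATUS = {
    "http://example.org/a/sparql": True,
    "http://example.org/b/sparql": False,
    "http://example.org/c/sparql": False,
    "http://example.org/d/sparql": False,
}

# The stub probe simply looks up the recorded status.
share = availability(STATUS, STATUS.get)
print(f"{share:.0%} of endpoints responsive")
```

A real probe might send a cheap query such as `ASK WHERE { ?s ?p ?o }` with a timeout and treat any error as "not responsive"; given the protocol variants mentioned above, what counts as a valid answer is itself a design decision.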

[Buil-Aranda2013] Buil-Aranda, C., Hogan, A., Umbrich, J., Vandenbussche, P.-Y., SPARQL Web-Querying Infrastructure: Ready for Action?, International Semantic Web Conference 2013 (ISWC2013).

[D’AquinWebSci13] D’Aquin, M., Adamou, A., Dietze, S., Assessing the Educational Linked Data Landscape, ACM Web Science 2013 (WebSci2013), Paris, France, May 2013.

[Paulheim2013] Paulheim, H., Bizer, C., Type Inference on Noisy RDF Data, Semantic Web – ISWC 2013, Lecture Notes in Computer Science Volume 8218, 2013, pp. 510-525.

[HoganJWS2012] Hogan, A., Umbrich, J., Harth, A., Cyganiak, R., Polleres, A., Decker, S., An empirical survey of Linked Data conformance, Journal of Web Semantics 14, 2012.

Issues (2/3) – accepting inconsistency

Analyzing Relative Incompleteness of Movie Descriptions in the Web of Data: A Case Study, Yuan, W., Demidova, E., Dietze, S., Zhu, X., International Semantic Web Conference 2014 (ISWC2014)

Issues (3/3) – licensing/legal aspects

Mashing up data: legal and licensing related issues under-estimated

What license do you get when mashing up:

Nature (CC0) + DBpedia (CC-ShareAlike) + FAO (Proprietary non-commercial) => ?

Attribution: copyright violation from missing (86%) or incorrect attribution (14%) information

Terms & conditions: complexity and conflicts when merging data from different sources

Potential non-compliance from evolution of (a) LOD applications and (b) underlying datasets (and their licenses)

T&C of established datasets (length in words and pages)

Dataset Words Pages
DBpedia 7163 16
Flickr 10367 23
ConceptNet 7163 16
World Bank 7056 16
Nature 7024 16
LinkedIn 6104 14
Google+ 5740 13
Tumblr 5362 12
Twitter 4247 9
Facebook 4179 9
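The "=> ?" in the mashup question can be explored with a deliberately coarse model: reduce each source's terms to a few obligation flags, take their union, and flag combinations that cannot be satisfied together. The flags assigned to each source below are simplifying assumptions based only on the license labels on the slide, and the single conflict rule (share-alike versus non-commercial) is one well-known incompatibility, not a complete legal analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Terms:
    attribution: bool = False
    share_alike: bool = False
    non_commercial: bool = False


# Assumption: coarse flags derived from the license labels on the slide.
SOURCES = {
    "Nature": Terms(),                                      # CC0
    "DBpedia": Terms(attribution=True, share_alike=True),   # CC-ShareAlike
    "FAO": Terms(non_commercial=True),                      # proprietary non-commercial
}


def combine(terms_list):
    """Union all obligations and report known conflicts between them."""
    terms_list = list(terms_list)
    combined = Terms(
        attribution=any(t.attribution for t in terms_list),
        share_alike=any(t.share_alike for t in terms_list),
        non_commercial=any(t.non_commercial for t in terms_list),
    )
    conflicts = []
    # Share-alike demands the same license downstream, which permits
    # commercial use; a non-commercial restriction contradicts that.
    if combined.share_alike and combined.non_commercial:
        conflicts.append("share-alike vs non-commercial")
    return combined, conflicts


obligations, conflicts = combine(SOURCES.values())
print(obligations)
print(conflicts)
```

Even this toy model shows why the slide ends with a question mark: the union of obligations is easy to compute, but the result may not correspond to any license that satisfies all sources.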

Get involved!

http://www.w3.org/community/opened

http://data.linkededucation.org/linkedup/catalog/

http://data.linkededucation.org/linkedup/devtalk/

Thank you!
