OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Co-funded by the European Union
Semantic CMS Community
Semantic Technologies for CMS
September 25, 2012
Dr. Tilman Becker, DFKI GmbH, Saarbrücken, Germany
OpenCms Days, Cologne
Copyright Tilman Becker, DFKI

Upload: alkacon-software-gmbh

Post on 11-May-2015


DESCRIPTION

In this session, Tilman will present the impact of Semantic Technologies on content management systems. After a brief overview of the current state of affairs for Semantic Technologies, he will drill down by presenting some of the recent results of the EU-funded project IKS (Interactive Knowledge Stack). In IKS, DFKI, Alkacon and 12 further partners strive to bring interaction to the knowledge contained in content management systems by providing a technology stack that can be used by any CMS. The main results of IKS are two software packages: Apache Stanbol (see http://projects.apache.org/projects/stanbol.html) is a modular software stack and reusable set of components for semantic content management, focusing on storage and retrieval. VIE.js (see http://viejs.org/) is a JavaScript library for implementing decoupled content management systems and semantic interaction in web applications, thus focusing on the front end.

TRANSCRIPT

Page 1: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Co-funded by the European Union

Semantic CMS Community

Semantic Technologies for CMS

September 25, 2012

Copyright Tilman Becker, DFKI

Dr. Tilman Becker
DFKI GmbH, Saarbrücken, Germany
OpenCms Days, Cologne

Page 2: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

www.iks-project.eu

Web evolution

[Figure: Web 1.0, Web 2.0, Web 3.0, Web 4.0. Slide by Nova Spivack, Radar Networks]

Copyright IKS Consortium

Page 3: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

What is the problem?

Tilman Becker, DFKI

Page 4: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

The Semantic Web

• The vision of the Semantic Web was originally proposed by Tim Berners-Lee:
  “The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation.” [The Semantic Web, 2001]
• Standardized specification techniques for the semantic annotation of content (RDF, OWL, ...)

Page 5: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Semantic Web Stack

• W3C provides standardized specifications for Semantic Web technologies
• The Semantic Web Layer Cake is a conceptual architecture that describes a hierarchy of languages
• Each layer exploits and uses capabilities of the layers below

[Figure: Semantic Web Layer Cake. Image source: http://www.w3.org/2007/03/layerCake.svg]

Page 6: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

A Few Semantic Web Concepts

• Identification: URI
• Statements: RDF
• Queries: SPARQL
• Storage: Triple Stores
• Ontologies: OWL
• Is there anybody out there: Linked Open Data
• Semantic Lifting

Page 7: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Unique Identification of Resources

• “...more fundamental than either HTTP or HTML are URIs, which are simple text strings that refer to Internet resources -- documents, resources, people, and indirectly to anything. URIs are the glue that binds the Web together.”
• In a “Web of Data” the unique identification of entities is required

Page 8: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

How to identify resources?

• URI – Uniform Resource Identifier [RFC 3986]
  “A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies an abstract or physical resource.”
• A URI consists of five parts: scheme, authority, path, query and fragment
  URI = scheme ":" "//" authority path [ "?" query ] [ "#" fragment ]   (form with an authority component)
• Example (parts in order: scheme, authority, path, query, fragment):
  http://[email protected]:8042/over/there?name=ferret#nose
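As a rough illustration of the five components, the following sketch splits a URI with `urlsplit` from Python's standard library. Python itself and the sample URI (the well-known example from RFC 3986, section 3) are choices made here for illustration and do not come from the slides. Note that `urlsplit` calls the authority component `netloc`.

```python
from urllib.parse import urlsplit

# Example URI taken from RFC 3986, section 3 (not from the slide itself).
uri = "foo://example.com:8042/over/there?name=ferret#nose"

parts = urlsplit(uri)

# Map the standard library's attribute names onto the five URI parts.
components = {
    "scheme": parts.scheme,
    "authority": parts.netloc,
    "path": parts.path,
    "query": parts.query,
    "fragment": parts.fragment,
}

for name, value in components.items():
    print(f"{name:10s} {value}")
```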

Page 9: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

What do we need?

• We want to express the statement:
  “The brand of the car is Jaguar.”
• We need ...
  • ... a way to address the concrete resource car.
  • ... to express the property brand of the resource car.
  • ... to define the property value Jaguar for the property brand.

Page 10: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Resource Description Framework (RDF)

• “The Resource Description Framework (RDF) identifies things using Web identifiers (URIs), and describes resources with properties and property values.”
• A Resource is an object that can be identified by a URI, e.g. “http://example.org/Car”.
• A Property describes an aspect of a resource, e.g. “http://example.org/Brand”. The property is also identified by a URI.
• The Property value assigns a concrete value to a property, e.g. “Jaguar” or “http://example.org/Jaguar”.

Source: http://www.w3schools.com/rdf/

Page 11: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

RDF Statements

• RDF statements consist of subject (resource), predicate (property) and object (property value)
• Subjects (except Blank Nodes) and Predicates are always defined by URIs
• Objects can be defined by URIs and literals

[Figure: a Subject node linked by a Predicate to an Object (URI), and by another Predicate to an Object (literal)]

Page 12: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

RDF Statements - Example

• Exemplary statements:
  “The brand of the car is Jaguar.”
  “The model of the car is XF.”

Subject                  Predicate                      Object
http://example.org/Car   http://example.org/rel/Brand   http://example.org/Jaguar
http://example.org/Car   http://example.org/rel/Model   XF
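The two statements can be held in a program as plain subject/predicate/object records. The sketch below is one possible representation in Python, assuming a hypothetical `Triple` type and a simplified N-Triples rendering that does not escape special characters in literals.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str              # a URI or a plain literal
    obj_is_literal: bool

EX = "http://example.org/"

# The two statements from the slide.
statements = [
    Triple(EX + "Car", EX + "rel/Brand", EX + "Jaguar", False),
    Triple(EX + "Car", EX + "rel/Model", "XF", True),
]

def to_ntriples(t: Triple) -> str:
    """Render one statement in N-Triples syntax (no literal escaping)."""
    obj = f'"{t.obj}"' if t.obj_is_literal else f"<{t.obj}>"
    return f"<{t.subject}> <{t.predicate}> {obj} ."

lines = [to_ntriples(t) for t in statements]
print("\n".join(lines))
```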

Page 13: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Resource Description Framework (RDF)

• “The Resource Description Framework (RDF) is a language for representing information about resources...” [RDF Primer]
• W3C Standard (http://www.w3.org/RDF)
• RDF provides a graph-based data model
  • for representing metadata
  • for describing the semantics of information in a machine-accessible way

Page 14: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

RDF Serialization Formats

• RDF/XML
• N3
• N-Triples
• TriG
• TriX
• Turtle
• JSON
• JSON-LD
• RDFa

Page 15: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Semantic Web Layer Cake

[Figure: Semantic Web Layer Cake. Image source: http://www.w3.org/2007/03/layerCake.svg]

Annotations on the figure:
• Unique identification of resources
• A format for specifying structured data in a machine-readable form
• A model for describing resources with properties and property values.

Page 16: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

RDF Queries

• RDF provides a model for describing resources with properties and property values.

@prefix ex: <http://www.example.org/> .

ex:Car1 ex:Brand  ex:Jaguar .
ex:Car1 ex:Colour "Black" .
ex:Car2 ex:Brand  ex:Jaguar .
ex:Car2 ex:Colour "White" .
ex:Car3 ex:Brand  ex:VW .
ex:Car3 ex:Colour "Black" .

How do I get all black Jaguars?
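Without a query language, the question has to be answered by hand-written filtering. The sketch below, an illustration in Python rather than anything shown in the talk, stores the six statements as tuples and intersects two simple pattern matches; the helper name `subjects_with` is hypothetical.

```python
EX = "http://www.example.org/"

# The six statements from the slide as (subject, predicate, object) tuples.
triples = {
    (EX + "Car1", EX + "Brand", EX + "Jaguar"),
    (EX + "Car1", EX + "Colour", "Black"),
    (EX + "Car2", EX + "Brand", EX + "Jaguar"),
    (EX + "Car2", EX + "Colour", "White"),
    (EX + "Car3", EX + "Brand", EX + "VW"),
    (EX + "Car3", EX + "Colour", "Black"),
}

def subjects_with(predicate: str, obj: str) -> set[str]:
    """All subjects that have the given predicate/object pair."""
    return {s for (s, p, o) in triples if p == predicate and o == obj}

# A car qualifies only if it matches both patterns, hence the intersection.
black_jaguars = (subjects_with(EX + "Brand", EX + "Jaguar")
                 & subjects_with(EX + "Colour", "Black"))
print(sorted(black_jaguars))
```

Every new question needs new filtering code of this kind, which is the gap a declarative query language such as SPARQL is meant to close.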

Page 17: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

SPARQL

• SPARQL Protocol and RDF Query Language
• W3C Recommendation since 2008
• SPARQL provides a standard for querying information that is specified in RDF
• SPARQL consists of three specifications
  • Query language
  • Query results XML format
  • Data access protocol

Page 18: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Exemplary SPARQL Query

“Return the models and prices for all cars of brand ‘Jaguar’”

SPARQL Query:

PREFIX ex: <http://example.org/>
SELECT ?model ?price
WHERE {
  ?car ex:Brand ex:Jaguar .
  ?car ex:Model ?model .
  ?car ex:Price ?price .
}

• PREFIX declares namespaces for abbreviated resource identifiers.
• SELECT identifies the variables to appear in the query results.
• WHERE provides the basic graph pattern to match against the data graph.

Exemplary Result:

Model   Price
“XJ”    “79.750,00”
“XF”    “44.900,00”
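To make "matching a basic graph pattern against the data graph" more concrete, here is a minimal sketch in Python of how such matching could work. It is not how any particular SPARQL engine is implemented. The data graph is hypothetical (the slides do not show it) and was chosen so that the result agrees with the table above; the third car and its price are invented.

```python
EX = "http://example.org/"

# Hypothetical data graph chosen so the result matches the slide's table.
graph = [
    (EX + "Car1", EX + "Brand", EX + "Jaguar"),
    (EX + "Car1", EX + "Model", "XJ"),
    (EX + "Car1", EX + "Price", "79.750,00"),
    (EX + "Car2", EX + "Brand", EX + "Jaguar"),
    (EX + "Car2", EX + "Model", "XF"),
    (EX + "Car2", EX + "Price", "44.900,00"),
    (EX + "Car3", EX + "Brand", EX + "VW"),
    (EX + "Car3", EX + "Model", "Golf"),
    (EX + "Car3", EX + "Price", "17.000,00"),
]

def is_var(term):
    return term.startswith("?")

def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    new = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # A variable must keep the value it was bound to earlier.
            if new.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return new

def solve(patterns, bindings=({},)):
    """Match each triple pattern in turn, carrying variable bindings along."""
    for pattern in patterns:
        bindings = [b2 for b in bindings for t in graph
                    if (b2 := match(pattern, t, b)) is not None]
    return bindings

# The WHERE clause of the query as three triple patterns.
where = [
    ("?car", EX + "Brand", EX + "Jaguar"),
    ("?car", EX + "Model", "?model"),
    ("?car", EX + "Price", "?price"),
]

# The SELECT clause: project the bindings onto ?model and ?price.
rows = [(b["?model"], b["?price"]) for b in solve(where)]
print(rows)
```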

Page 19: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Triple Stores

• Can be categorized into 3 categories:
  • In-memory triple stores
    • Used for certain operations like benchmarking, caching, etc.
  • Native triple stores
    • Provide their own implementations (Virtuoso, Mulgara, AllegroGraph, …)
  • Non-memory, non-native triple stores
    • Are built on third-party databases (Jena SDB, Kaon, …)

Page 20: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Functionalities provided by Triple Stores

• RDBMS support
• General RDF model access
• Query language support in the store, such as RQL, SPARQL
• Some stores provide:
  • Provenance: tracking of who-said-what
  • APIs for accessing the triple store over the network
• Very few stores provide:
  • Full text search
  • Inference and rule languages

Page 21: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Example Triple Store implementations

• RDF Suite
  • Sofia Alexaki, Vassilis Christophides, Gregory Karvounarakis, Dimitris Plexousakis, Karsten Tolle. The ICS-FORTH RDFSuite: Managing Voluminous RDF Description Bases, SemWeb, 2001
  • Based on an ORDBMS model
• Sesame
  • http://www.openrdf.org/
  • Relational databases (MySQL, Postgres, Oracle)
• Jena
  • http://www.hpl.hp.com/semweb/jena2.htm
  • Relational databases (MySQL, Postgres, Oracle)
• Virtuoso
  • http://virtuoso.openlinksw.com/
  • Native RDF Quad Storage (Physical Quads)

Page 22: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Computational ontologies

• Ontologies are (software) components, expressed and managed in standard W3C languages like RDF, OWL, RIF, SPARQL
• Computational Ontologies are artifacts
  • Have a structure (linguistic, logical, etc.)
  • Their function is to “encode” a description of the world (actual, possible, counterfactual, impossible, desired, etc.) for some purpose

Page 23: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Searching for ontologies on the Semantic Web

Page 24: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

From the lessons learnt ...

• Small ontologies with explicit documentation of design rationales
• Components supported by specific functionalities
  • selection, matching, composition, etc.
• Implemented in repositories, registries, catalogues, open discussion and evaluation forums, and in new-generation ontology design tools
  • ontologydesignpattern.org
  • ODP and Watson APIs
  • NeOn ODP Plugin
  • etc.

Page 25: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Ontology Design Patterns

An ontology design pattern is a reusable successful solution to a recurrent modeling problem

Page 26: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Align CMS Representation With External Ontology

[Figure: two class hierarchies linked by an equivalentTo relation.

Representation of News Subject Codes as hierarchical ontology classes: NewsSubjectCodes, ArtsCultureEntertainment, DisasterAccident, EconomyBusinessFinance, Education, EnvironmentalIssues, Health, HealthTreatment, Illness, Medicine, SocialIssues, Disease, Obesity, EatingDisorder.

MeSH Biomedical Ontology: MeSH, Anatomy, Diseases, Organisms, BehaviorMechanisms, Psychiatry, BehaviorDisciplines, MentalDisorders, AnxietyDisorders, EatingDisorders, SleepingDisorders, SomatoformDisorders.]

Page 27: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Why is RDFS not enough?

• RDFS cannot express negations
• Defined property restrictions are global
• Missing cardinalities for properties
• Missing relations between (sub-)classes (e.g. disjointness)

Page 28: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

OWL – Web Ontology Language

• “The OWL Web Ontology Language is designed for use by applications that need to process the content of information instead of just presenting information to humans.”
• OWL has been developed as a vocabulary extension of RDF
• Explicitly represents the meaning of terms in vocabularies and the relationships between those terms (an ontology)

Source: http://www.w3.org/TR/2004/REC-owl-features-20040210/

Page 29: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

OWL – The Story

• 2004 - OWL W3C Recommendation
• 2009 - OWL 2 W3C Recommendation

OWL = Web Ontology Language

• Why not WOL?
  • Obvious pronunciation which is easy on the ear
  • Opens up great opportunities for logos
  • Owls are associated with wisdom
  • It has an interesting back story

Source: http://lists.w3.org/Archives/Public/www-webont-wg/2001Dec/0169.html
Image source: http://piqs.de

Page 30: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

schema.org

• “simple” ontology
• Designed for web search
  • Contains movies and records, but not plants and animals
• Supported by
  • Google
  • Bing
  • Yahoo!

Page 31: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Back to the Cake ...

[Figure: Semantic Web Layer Cake. Image source: http://www.w3.org/2007/03/layerCake.svg]

Annotations on the figure:
• Unique identification of resources
• A format for specifying structured data in a machine-readable form
• A model for describing resources with properties and property values.
• A language for describing a lightweight ontology.
• A language for querying information specified in RDF.
• Highly expressive ontology language for modelling complex knowledge domains.

Page 32: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Linking Open Data Project

• Is a W3C SWEO project
• Aims to make data freely available to everyone
• Aims to publish open data sets as RDF and set semantic relationships between them
  • Serves information in a machine-readable format
  • Enriches content
  • Reduces duplication
• Linked datasets are increasing rapidly
  • A large number of datasets are linked already

Page 33: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Linked Datasets As of October 2008

Page 34: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Linked Datasets As of September 2010

Page 35: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

2011

Page 36: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Access Data In The Cloud

• Follow the RDF links representing the “things”
• SPARQL Endpoints
• Ready-to-use software to discover linked data (see the next slide)

Page 37: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Linked Data Applications

• Lots of applications on top of the linked data
  • Tabulator
  • Marbles
  • OpenLink RDF Browser
  • …
• Just google
  • RDF Crawlers
  • RDF Browsers
• Also see the following link containing a number of linked data applications:
  • http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData/Applications

Page 38: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

What is “Semantic Lifting”?

• Semantic Lifting refers to the process of associating content items with suitable semantic objects as metadata to turn “unstructured” content items into semantic knowledge resources
• Semantic Lifting makes explicit “hidden” metadata in content items

Page 39: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Metadata: Variants

• Metadata exist in many forms:
  • Free text descriptions
  • Descriptive content-related keywords or tags from fixed vocabularies or in free form
  • Taxonomic and classificatory labels
  • Media-specific metadata, such as MIME types, encoding, language, bit rate
  • Media-type-specific structured metadata schemes such as EXIF for photos, IPTC tags for images, ID3 tags for MP3, MPEG-7 for videos, etc.
  • Content-related structured knowledge markup, e.g. to specify what objects are shown in an image or mentioned in a text, what the actors are doing, etc.

Page 40: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Publishing Web Content with semantic metadata

• Augmenting web content with structured information becomes increasingly important
• Several methods have emerged in recent years to include structured metadata in Web pages
  • Microformats
  • RDFa
  • Microdata (HTML5)
• Supported by the major search engines to improve search and result presentation, e.g. Google (“Rich Snippets”), Bing, Yahoo

Page 41: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Augmenting Web Content

• The HTML code contains a review of a restaurant in plain text, using only line breaks for structuring
• Without specialized information extraction analysis tools it cannot be interpreted, e.g. that it is a review (of what and when?), who the reviewer was, etc.

<div>
  L’Amourita Pizza
  Reviewed by Ulysses Grant on Jan 6.
  Delicious, tasty pizza on Eastlake!
  L'Amourita serves up traditional wood-fired Neapolitan-style pizza, brought to your table promptly and without fuss. An ideal neighborhood pizza joint.
  Rating: 4.5
</div>

Page 42: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Microformats

• Same text but additional span elements with class attributes to encode the type of contained information (hReview) and the properties of that type

<div class="hreview">
  <span class="item">
    <span class="fn">L’Amourita Pizza</span>
  </span>
  Reviewed by <span class="reviewer">Ulysses Grant</span> on
  <span class="dtreviewed">
    Jan 6<span class="value-title" title="2009-01-06"></span>
  </span>.
  <span class="summary">Delicious, tasty pizza on Eastlake!</span>
  <span class="description">L'Amourita serves up traditional wood-fired Neapolitan-style pizza, brought to your table promptly and without fuss. An ideal neighborhood pizza joint.</span>
  Rating: <span class="rating">4.5</span>
</div>

Page 43: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

RDFa

• Same text but additional attributes and span elements encoding an RDF structure:
  • namespace declaration of the used ontology
  • RDF class encoded by typeof attribute and its properties by a property attribute

<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Review">
  <span property="v:itemreviewed">L’Amourita Pizza</span>
  Reviewed by <span property="v:reviewer">Ulysses Grant</span> on
  <span property="v:dtreviewed" content="2009-01-06">Jan 6</span>.
  <span property="v:summary">Delicious, tasty pizza on Eastlake!</span>
  <span property="v:description">L'Amourita serves up traditional wood-fired Neapolitan-style pizza, brought to your table promptly and without fuss. An ideal neighborhood pizza joint.</span>
  Rating: <span property="v:rating">4.5</span>
</div>

Page 44: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Microdata (HTML5)

• Same text but additional attributes and span elements:
  • A class declaration as value of an itemtype attribute and its properties as values of an itemprop attribute

<div>
  <div itemscope itemtype="http://data-vocabulary.org/Review">
    <span itemprop="itemreviewed">L’Amourita Pizza</span>
    Reviewed by <span itemprop="reviewer">Ulysses Grant</span> on
    <time itemprop="dtreviewed" datetime="2009-01-06">Jan 6</time>.
    <span itemprop="summary">Delicious, tasty pizza in Eastlake!</span>
    <span itemprop="description">L'Amourita serves up traditional wood-fired Neapolitan-style pizza, brought to your table promptly and without fuss. An ideal neighborhood pizza joint.</span>
    Rating: <span itemprop="rating">4.5</span>
  </div>
</div>
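Once such annotations are present, a program can read the review's properties without any language analysis. The following sketch uses Python's standard `html.parser` on a shortened copy of the Microdata markup. The class `ItempropExtractor` is hypothetical and deliberately simplistic: it assumes flat, non-nested items and is not a complete Microdata parser.

```python
from html.parser import HTMLParser

class ItempropExtractor(HTMLParser):
    """Collect itemprop values from flat (non-nested) microdata markup."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._current = None   # itemprop whose text is being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemtype" in attrs:
            self.item_type = attrs["itemtype"]
        if "itemprop" in attrs:
            name = attrs["itemprop"]
            if tag == "time" and "datetime" in attrs:
                # For <time>, the machine-readable datetime is the value.
                self.properties[name] = attrs["datetime"]
            else:
                self._current = name
                self.properties[name] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        self._current = None

# Shortened version of the review markup from the slide.
markup = """
<div itemscope itemtype="http://data-vocabulary.org/Review">
  <span itemprop="itemreviewed">L'Amourita Pizza</span>
  Reviewed by <span itemprop="reviewer">Ulysses Grant</span> on
  <time itemprop="dtreviewed" datetime="2009-01-06">Jan 6</time>.
  Rating: <span itemprop="rating">4.5</span>
</div>
"""

extractor = ItempropExtractor()
extractor.feed(markup)
print(extractor.item_type)
print(extractor.properties)
```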

Page 45: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Named Entities

• Statistical Approaches: examples
  • LingPipe: Hidden Markov Models
  • OpenNLP: Maximum Entropy Models
  • Stanford NER: Conditional Random Fields
• Statistical models created by supervised learning techniques
  • Large annotated corpora required
• Customization difficult except by re-annotation/re-training
  • Not suitable for arbitrary types of named entity

Page 46: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

NER Markup for a Web Page

Page 47: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

IE Template

A Person Template (as Typed Feature Structure) instantiated from text. The template supports the extraction of various properties of a person.

Page 48: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Clustering

• Detection of classes in a data set
• Partitioning data into classes in an unsupervised way, with high intra-class similarity and low inter-class similarity
• Main variants:
  • Hierarchical clustering
    • Agglomerative
  • Partitioning clustering
    • K-Means
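As a small illustration of the partitioning variant, here is a sketch of plain k-means (Lloyd's algorithm) in Python on one-dimensional data. The data set and the naive choice of initial centroids are assumptions made for this example; real implementations typically use better initialisation and multi-dimensional distance measures.

```python
def k_means_1d(points, k, iterations=100):
    """Plain k-means (Lloyd's algorithm) on one-dimensional data."""
    centroids = list(points[:k])  # naive deterministic initialisation
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for x in points:
            nearest = min(range(k), key=lambda i: abs(x - centroids[i]))
            clusters[nearest].append(x)
        # Update step: move each centroid to the mean of its cluster.
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # converged: centroids no longer move
        centroids = updated
    return centroids, clusters

# Hypothetical data: document lengths forming two obvious groups.
lengths = [100, 110, 120, 900, 950, 1000]
centroids, clusters = k_means_1d(lengths, k=2)
print(centroids, clusters)
```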

Page 49: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

NER Evaluation

• Nobel Prize Corpus from NYT, BBC, CNN
• 538 documents (Ø 735 words/document)
  • 28948 person, 16948 organization occurrences

            Sprout   Calais   Stanford NER   OpenNLP
Precision   77,26    94,22    73,21          57,69
Recall      65,85    86,66    73,62          42,86
F1          71,10    90,28    73,41          49,18
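The F1 row is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short Python check below recomputes it from the reported precision and recall values; only the dictionary layout and function name are choices made for this example.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall (in percent) as reported on the slide.
reported = {
    "Sprout":       (77.26, 65.85),
    "Calais":       (94.22, 86.66),
    "Stanford NER": (73.21, 73.62),
    "OpenNLP":      (57.69, 42.86),
}

f1 = {name: round(f1_score(p, r), 2) for name, (p, r) in reported.items()}
print(f1)
```

The recomputed values agree with the F1 row of the table to two decimal places.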

Page 50: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

A Few Semantic Web Concepts

• Identification: URI
• Statements: RDF
• Queries: SPARQL
• Storage: Triple Stores
• Ontologies: OWL
• Is there anybody out there: Linked Open Data
• Semantic Lifting

Page 51: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Bringing it all together

• Exporting data (more datasets)
  • Grab information from your content (i.e., recognize the “entities”)
• Merging your data
  • Merge it from different data
  • Combine with different datasets/content
  • Use data to interact with (e.g., configure) web services
• Publishing Semantics/Content/Interaction
  • Enrich your content with dynamically generated, interactive information


Page 55: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

IKS Goal

A Reference Architecture for Semantically Enabled Content Management Systems

Page 56: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

What is a Semantic CMS?

Traditional CMS vs. Semantic CMS

Traditional CMS
• Atomic unit: Document
• Properties as meta-data, e.g. author tags, keywords
• Keyword search for strings in docs
• Document Management
  • Document types
  • Document workflow

Semantic CMS
• Atomic unit: Entity
• Semantic meta-data
  • Defined entity types
  • Linked entities
• Semantic search for entities and their relations
• Knowledge Management
  • Entity management
  • Ontologies

Page 57: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Building Semantic CMS

• Ask the experts:
• Top 8 CMS Customer Needs
  • The following list features the top 8 CMS capabilities that are perceived as highly relevant by CMS customers. The ranking is based on in-depth interviews with 12 IT executives of CMS customer organizations in Europe.

Page 58: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Top 8 Customer Needs

• Interoperability
• Support for Content Creation
• Workflow management
• Multi-Channel Access to Content
• Personalization
• Enrichment of Content
• Intuitive User Interface
• Enhanced Search Functionality

Page 59: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Ask the experts

Book title: Semantic Technologies in Content Management Systems - Applications, Trends and Evaluations
Editors: Wolfgang Maass, Saarland University, Germany; Tobias Kowatsch, University of St. Gallen, Switzerland
Publisher: Springer, Heidelberg, Germany
ISBN: 978-3642215490 (1st Edition. 213 p. 56 illus. Hard cover)
Year: January 31, 2012

Page 60: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

IKS guidelines

• Do not change existing CMS!
• Provide as much abstraction as possible!

Page 61: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Traditional CMS Architecture

Page 62: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Semantic CMS Architecture

Page 63: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Implementation of the Reference Architecture

• Reference implementation within the IKS project
  • IKS: An open source community to bring semantic technologies to CMS platforms
• New incubating project at the Apache Software Foundation
  http://incubator.apache.org/stanbol

Page 64: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Do Not Replace – but Extend

No need to replace your existing technology. IKS components offer service oriented integration.

[Figure: a Traditional CMS (with its Database) next to the IKS Technology Stack. Caption: Extend by Using Semantic Services]

Page 65: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Use the Concepts of the Web

Rely on the Concepts of the Web

• Integration through a RESTful web service API
• Resources are identified by their URI

[Figure: a Traditional CMS (with its Database) and the IKS Technology Stack, connected by HTTP Request and HTTP Response]
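From the CMS side, such an integration boils down to an ordinary HTTP call. The sketch below builds, but does not send, a POST request with Python's standard `urllib.request`. The host, port, the `/enhancer` path and the media types are assumptions for illustration; they are not given in the slides and should be checked against the documentation of the installed service.

```python
import urllib.request

# Assumption: a Stanbol instance on localhost:8080 exposing the Enhancer
# under /enhancer; adjust both to your installation.
STANBOL_ENHANCER_URL = "http://localhost:8080/enhancer"

def build_enhancer_request(text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying plain-text content."""
    return urllib.request.Request(
        STANBOL_ENHANCER_URL,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            "Accept": "application/rdf+xml",  # assumed response format
        },
        method="POST",
    )

request = build_enhancer_request("Tilman Becker works at DFKI in Saarbrücken.")
print(request.get_method(), request.full_url)

# Sending needs a running server, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Because the CMS only speaks HTTP to the service, nothing inside the CMS itself has to change, which matches the "do not replace, but extend" guideline.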

Page 66: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

[Figure: IKS architecture diagram ("IKS 7.0") showing the IKS Reference Implementation.

Labels in the figure: Semantic User Interaction, Semantic User Interface, Knowledge, Knowledge Access, Knowledge Extraction Pipelines, Reasoning, Knowledge Models, Knowledge Repository, Knowledge Administration, Content, Content Repository, IKS VIE, IKS VIE Widgets, Apache Stanbol RESTful API, Apache Stanbol Enhancer, Stanbol Enhancement Engine, Apache Stanbol Reasoners, Apache Stanbol Rules, Apache Stanbol Ontology Manager, Apache Stanbol ContentHub, Apache Stanbol EntityHub, Apache Stanbol FactStore, Apache Stanbol CMS Adapter (CMIS / JCR), OSGi Console, Apache Clerezza, RDF.]

Page 67: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

VIE Quick Facts

• VIE is a utility library for semantic maintenance in JavaScript
  • Offers semantic web developers a DSL to ease recurring tasks
  • Easy access to embedded semantic annotations in HTML (RDFa)
  • Easy loading of properties for entities from external services
  • Easy saving of knowledge about entities
  • Easy querying of semantic services
• VIE Widgets are web user interface components based on VIE.

Page 68: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Apache Stanbol Quick Facts

• Modular (OSGi) components implemented in Java
• Semantic Lifting
  • Enhance content
  • Link to Linked Open Data (LOD) sources
  • Store and index enhanced content for search
• Knowledge Representation & Reasoning
  • Manage ontologies
  • Apply rules to ontologies
  • Reasoning over managed ontologies

Page 69: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Service-Oriented View

[Figure: three layers.
VIE - User Interface Layer: VIE, VIE Widgets.
Apache Stanbol Service Layer: Semantic Lifting; Knowledge Representation & Reasoning.
Apache Stanbol Component Layer: Enhancer, Stanbol Enhancement Engines, ContentHub, EntityHub, FactStore, CMS Adapter, Reasoners, Rules, Ontology Manager.]

Page 70: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

www.iks-project.eu

Page:

Copyright IKS Consortium

70

www.iks-project.eu

Page:

Enhancer & Engines

Features
- Semantic lifting by automatically extracting entities from textual content
- Different enhancement engines for specific tasks
- Engines are arranged in customizable enhancement chains where one engine may rely on the output of another engine

Examples
- Language Identification Engine
- Named Entity Extraction Engine
- Geonames Engine to annotate places with additional information from geonames.org
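The chain idea can be sketched in a few lines of plain JavaScript. This is a toy model only: the three engines loosely mirror the examples named on the slide, but their logic, the KNOWN_PLACES table and the runChain helper are all hypothetical and are not Stanbol code.

```javascript
// Toy enhancement chain: each engine takes a content item and returns a
// copy with extra annotations. All names here are illustrative assumptions.

// Stand-in for an external place database such as geonames.org.
const KNOWN_PLACES = { Cologne: { country: "DE" }, Paris: { country: "FR" } };

function languageEngine(item) {
  // Naive guess: text containing the word "the" is treated as English.
  const language = /\bthe\b/i.test(item.text) ? "en" : "unknown";
  return { ...item, language };
}

function namedEntityEngine(item) {
  // Relies on the language engine's output: only English text is handled.
  if (item.language !== "en") return { ...item, entities: [] };
  const entities = Object.keys(KNOWN_PLACES)
    .filter((name) => item.text.includes(name))
    .map((name) => ({ label: name, type: "Place" }));
  return { ...item, entities };
}

function placeEngine(item) {
  // Relies on the entities found by the previous engine.
  const entities = (item.entities || []).map((entity) =>
    entity.type === "Place" ? { ...entity, ...KNOWN_PLACES[entity.label] } : entity
  );
  return { ...item, entities };
}

// Runs the engines in order, feeding each one the previous engine's output.
function runChain(engines, text) {
  return engines.reduce((item, engine) => engine(item), { text });
}

const defaultChain = [languageEngine, namedEntityEngine, placeEngine];
const result = runChain(defaultChain, "The conference takes place in Cologne.");
console.log(result.language, result.entities);
```

Because a chain is just an ordered list, a customized chain amounts to passing a different array of engines to runChain.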

Page 71: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Entityhub

Features
- Manage a network of remote sites for fast entity lookup
- Caching of externally retrieved entity information
- CRUD management of local entities

Examples
- Use DBPedia linked open data source to retrieve additional information for entities
- Use a customized vocabulary for local entities
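The combination of remote lookup, caching and local CRUD can be illustrated with a minimal sketch. It is not the Entityhub API: the EntityCache class, its method names and the fakeRemoteSite function (standing in for a remote site such as DBpedia) are assumptions made for this example, and lookups are synchronous for brevity.

```javascript
// Hypothetical "look up locally first, then remotely, then cache" store.
class EntityCache {
  constructor(remoteLookup) {
    this.remoteLookup = remoteLookup; // function(id) -> entity or null
    this.local = new Map();           // local entities plus cached remote ones
    this.remoteCalls = 0;             // counts how often the remote site is hit
  }

  // CRUD management of local entities.
  create(id, entity) {
    this.local.set(id, entity);
  }

  update(id, changes) {
    if (!this.local.has(id)) return false;
    this.local.set(id, { ...this.local.get(id), ...changes });
    return true;
  }

  remove(id) {
    return this.local.delete(id);
  }

  // Read: served locally when possible, otherwise fetched and cached.
  get(id) {
    if (this.local.has(id)) return this.local.get(id);
    this.remoteCalls += 1;
    const entity = this.remoteLookup(id);
    if (entity) this.local.set(id, entity);
    return entity;
  }
}

// Simulated remote site that knows exactly one entity.
const fakeRemoteSite = (id) =>
  id === "http://dbpedia.org/resource/Cologne"
    ? { id, label: "Cologne", type: "City" }
    : null;

const hub = new EntityCache(fakeRemoteSite);
hub.get("http://dbpedia.org/resource/Cologne"); // fetched from the remote site
hub.get("http://dbpedia.org/resource/Cologne"); // served from the cache
hub.create("urn:local:opencms-days", { label: "OpenCms Days" });
console.log(hub.remoteCalls);
```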

Page 72: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Contenthub

Features
- Document repository by indexing retrieved documents
- Supports indexing of additional semantic metadata provided along with the content
- Search facilities
  - Keyword Search
  - Faceted Search based on available semantic metadata
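To make the difference between keyword search and faceted search concrete, here is a small in-memory sketch. The documents, field names and helper functions are invented for illustration and do not reflect the Contenthub API or its index.

```javascript
// Toy document store: each document carries text plus semantic metadata.
const documents = [
  { id: "doc1", text: "OpenCms Days take place in Cologne",
    metadata: { place: "Cologne", topic: "CMS" } },
  { id: "doc2", text: "Semantic lifting with Apache Stanbol",
    metadata: { topic: "Semantics" } },
  { id: "doc3", text: "A CMS meetup in Cologne",
    metadata: { place: "Cologne", topic: "CMS" } },
];

// Keyword search: case-insensitive substring match on the text.
function keywordSearch(docs, keyword) {
  const needle = keyword.toLowerCase();
  return docs.filter((doc) => doc.text.toLowerCase().includes(needle));
}

// Facet counts: how many documents carry each value of a metadata field.
function facetCounts(docs, field) {
  const counts = {};
  for (const doc of docs) {
    const value = doc.metadata[field];
    if (value !== undefined) counts[value] = (counts[value] || 0) + 1;
  }
  return counts;
}

// Faceted search: keep documents whose metadata matches every constraint.
function facetedSearch(docs, constraints) {
  return docs.filter((doc) =>
    Object.entries(constraints).every(
      ([field, value]) => doc.metadata[field] === value
    )
  );
}

const hits = keywordSearch(documents, "cologne");
const topicFacet = facetCounts(documents, "topic");
const cmsInCologne = facetedSearch(documents, { place: "Cologne", topic: "CMS" });
console.log(hits.length, topicFacet, cmsInCologne.map((doc) => doc.id));
```

The facet counts are what a user interface would typically show next to the result list, so that a result set can be narrowed by clicking a metadata value.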

Page 73: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

CMS Adapter

Features
- Bootstrapping component to import content from a CMS into Apache Stanbol
- Import content from a CMIS/JCR compliant CMS into the Apache Stanbol Contenthub

Page 74: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

VIE & VIE Widgets

[Architecture diagram. Layers: VIE User Interface Layer (VIE, VIE Widgets); Apache Stanbol Service Layer; Apache Stanbol Component Layer. Components: Enhancer, Enhancement Engines, EntityHub, ContentHub, CMS Adapter, FactStore, Rules, Reasoners, Ontology Manager. Areas: Semantic Lifting; Knowledge Representation & Reasoning.]

Page 75: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

VIE & VIE Widgets

Features
- VIE is a JavaScript library for implementing decoupled CMS and semantic interaction in web applications
- VIE provides easy access to the semantic metadata (RDFa) within a web page
- VIE Widgets are user interface components that implement semantic user interactions

Examples
- Semantic image search
- Automatic tagging of entities
- Semi-automatic content annotation

Page 76: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

VIE: Core

- is a: JavaScript framework/library

Page 77: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

VIE: Core

- is a: JavaScript framework/library
- abstraction of semantic entities and their relations

Page 78: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

VIE: Core

- is a: JavaScript framework/library
- using: abstraction of semantic entities and their relations

Page 79: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

VIE: Core

- is a: JavaScript framework/library
- using: abstraction of semantic entities and their relations
- addressing: Web Developers
  - bringing semantics into web pages
  - without caring too much about triples/triplestores and so on

Page 80: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

VIE: Core

VIE offers an API to:
- create entities with properties
- link entities
- serialize entities (either into the HTML using RDFa or to a server)
- access semantic lifting services (e.g., Zemanta, OpenCalais, Apache Stanbol, …)
- query databases to fill

The default "ontology" that VIE is delivered with is http://schema.org, which can easily be switched or extended.
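The first three capabilities (create, link, serialize to RDFa) can be illustrated without VIE itself. The sketch below is a hypothetical mini entity model, not VIE's actual API: the Entity class, its methods and the example.org identifiers are assumptions, and only the schema.org terms Person, Organization, name and worksFor are taken from the real vocabulary.

```javascript
// Minimal HTML escaping so property values cannot break the markup.
const escapeHtml = (value) =>
  String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/"/g, "&quot;");

// Hypothetical entity abstraction: literal properties plus links to others.
class Entity {
  constructor(id, type, properties = {}) {
    this.id = id;
    this.type = type;
    this.properties = { ...properties };
    this.links = {};
  }

  // Link this entity to another one via a named property.
  link(property, target) {
    if (!this.links[property]) this.links[property] = [];
    this.links[property].push(target);
    return this;
  }

  // Serialize into HTML with RDFa attributes; the vocabulary is a parameter
  // so that the schema.org default can be swapped for another one.
  toRDFa(vocab = "http://schema.org/") {
    const literals = Object.entries(this.properties).map(
      ([name, value]) =>
        `<span property="${escapeHtml(name)}">${escapeHtml(value)}</span>`
    );
    const references = Object.entries(this.links).flatMap(([name, targets]) =>
      targets.map(
        (t) => `<a property="${escapeHtml(name)}" href="${escapeHtml(t.id)}"></a>`
      )
    );
    return (
      `<div vocab="${escapeHtml(vocab)}" resource="${escapeHtml(this.id)}" ` +
      `typeof="${escapeHtml(this.type)}">` +
      [...literals, ...references].join("") +
      "</div>"
    );
  }
}

const jobs = new Entity("http://example.org/steve", "Person", { name: "Steve Jobs" });
const apple = new Entity("http://example.org/apple", "Organization", { name: "Apple" });
jobs.link("worksFor", apple);
const html = jobs.toRDFa();
console.log(html);
```

The point of such an abstraction is the one made on the earlier slides: the web developer works with entities and properties, and the triples only appear at serialization time.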

Page 81: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

VIE: UI Widgets

On top of VIE, we gathered a set of UI widgets in a library that makes it simpler to embed VIE's power directly into a web page.

Page 82: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

VIE Widgets

VIE Widgets are a kind of jQuery UI widget, in order to:
- achieve maximum portability
- accelerate the learning curve

Page 83: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

It's about abstraction

- VIE UI Widgets ("VIE-W")
- VIE: "Edit your content with Semantics"
- VIE-2: "Edit your Semantics"
- (Semantic) Services (e.g., Stanbol Enhancer, EntityHub, Zemanta, ...)
- (Semantic) Databases (e.g., DBPedia, Geonames, ...)

Page 84: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

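Analyze with Apache Stanbol

    // Assumes v is a VIE instance with a Stanbol service registered as 'stanbol'.
    var elem = $('<p>This is a small test, where Steve Jobs sings a song.</p>');
    v.analyze({element: elem})
        .using('stanbol')
        .execute()
        .done(function(entities) {
            // Called with the entities found in the element's text.
            alert("found: " + entities.length + " entities!");
        })
        .fail(function(f) {
            // Called when the analysis request fails.
            alert("something went wrong");
        });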

Page 85: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Interaction Patterns: IP

An IP consists of four parts:
- the problem
- the pattern (i.e., the solution to the problem)
- use cases for the pattern
- how the pattern applies to the use cases

Page 86: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

An Experiment within IKS: Ambient Interaction Beyond Classical CMS

It's Thursday morning. I get site-specific weather information while I am brushing my teeth in the bathroom. Based on the weather information and my calendar, leisure event suggestions are given, e.g., "Today, 8 p.m.: Miss Marple Night at CinemaOne. Do you want to order tickets?"

Copyright by Duravit

Page 87: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Most of IKS Semantic CMS is used in the AmI Case System

[Figure: IKS Semantic CMS Architecture alongside the AmI Case System Logical Architecture. Modules marked in blue exist in both architectures.]

Page 88: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

IKS Licenses

- IKS software is licensed under business-friendly open source software licenses.
- IKS software can be freely used, changed, and distributed in your products.
- For the rare cases where artifacts use a less permissive license, you will find a notice. For example, we use models for natural language processing from the Apache OpenNLP project whose licenses are not yet clarified.

Page 89: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Get in Contact

VIE
- Homepage: http://viejs.org
- Google User Group: https://groups.google.com/forum/#!forum/viejs

Apache Stanbol
- Homepage: http://incubator.apache.org/stanbol
- Mailing list subscription: [email protected]

Page 90: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Thank you for your attention!

Acknowledgement: to all participants of IKS, especially the provider of in-depth tutorials.

Page 91: OpenCms Days 2012 - Keynote: Semantic Technologies for CMS

Co-funded by the European Union

Semantic CMS Community

Semantic Technologies for CMS

September 25, 2012

Copyright Tilman Becker, DFKI

Dr. Tilman Becker DFKI GmbH, Saarbrücken, Germany OpenCMS Days, Cologne