Co-funded by the European Union Semantic CMS Community Semantic Lifting for Traditional Content Resources Copyright IKS Consortium 1 Lecturer Organization Date of presentation

Upload: sonya-brammell

Post on 14-Dec-2015


Co-funded by the European Union

Semantic CMS Community

Semantic Lifting for Traditional Content Resources

Copyright IKS Consortium

Lecturer
Organization

Date of presentation


www.iks-project.eu


Part I: Foundations
  (1) Introduction of Content Management
  (2) Foundations of Semantic Web Technologies

Part II: Semantic Content Management
  (3) Storing and Accessing Semantic Data
  (4) Knowledge Interaction and Presentation
  (5) Knowledge Representation and Reasoning
  (6) Semantic Lifting

Part III: Methodologies
  (7) Designing Interactive Ubiquitous IS
  (8) Requirements Engineering for Semantic CMS
  (9) Designing Semantic CMS
  (10) Semantifying your CMS


What is this Lecture about?

We have learned ...
  ... how to build ontologies representing complex knowledge domains.
  ... a way to reason about knowledge.

We need a way ...
  ... to extract knowledge from content in an automatic way: Semantic Lifting


Overview

What is semantic lifting?
Core concepts
Scenarios
Requirements
Technologies
  Semantic Reengineering
  Semantic Enhancements of textual content


What is “Semantic Lifting”?

Semantic Lifting refers to the process of associating content items with suitable semantic objects as metadata to turn “unstructured” content items into semantic knowledge resources.

Semantic Lifting makes explicit “hidden” metadata in content items.


Semantic Lifting Targets

Semantic Reengineering of structured data
  Semantic Lifting harmonizes metadata representations
  Semantic Lifting reengineers data from an existing resource so that the data from the resource can be reused within a semantic repository

Semantic Content Enhancement
  Semantic Lifting generates additional metadata and annotations by semantic analysis of content items
  Semantic Lifting classifies content objects by means of semantic annotations


Structured Content

Structured content provides implicit semantics through the structure definition
  Table definitions in relational databases, XML schemata, field definitions for address books, calendars, etc.

Application programs are designed to “know” how to interpret the structures and the data within.

Semantic Lifting is used for reengineering to support data exchange and seamless interoperability between different systems.


Unstructured Content

Unstructured content
  Images, texts, videos, music, web pages composed of various types of media items
  Meaningful only to humans, not to machines

Content must be described semantically by metadata to become meaningful to machines, e.g. what the text or image is about.

Semantic Lifting is used as content enhancement


Mixed Content

No dichotomy of structured and unstructured content
  Structured databases are used to store unstructured content types, such as texts, images etc.
  Documents can be composed of unstructured content items such as free text and images as well as more structured information, e.g. tables and charts

[Figure: example document with regions labelled “Free text” and “Structured content”]

Metadata: Variants

Metadata exist in many forms:
  Free text descriptions
  Descriptive content-related keywords or tags from fixed vocabularies or in free form
  Taxonomic and classificatory labels
  Media-specific metadata, such as MIME types, encoding, language, bit rate
  Media-type-specific structured metadata schemes such as EXIF for photos, IPTC tags for images, ID3 tags for MP3, MPEG-7 for videos, etc.
  Content-related structured knowledge markup, e.g. to specify what objects are shown in an image or mentioned in a text, what the actors are doing, etc.


Metadata: Variants

Inline metadata are part of content
  ID3 tags embedded in MP3 files

Offline metadata are kept separate from content
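
As an illustration of inline metadata, the following minimal sketch reads an ID3v1 tag, which occupies the last 128 bytes of an MP3 file. The function name `parse_id3v1` and the sample values are hypothetical; ID3v2 tags, which sit at the start of a file, are not handled here.

```python
import struct

# Fixed ID3v1 layout: "TAG", title, artist, album, year, comment, genre (128 bytes).
ID3V1 = struct.Struct("<3s30s30s30s4s30sB")

def parse_id3v1(data):
    """Return the ID3v1 fields found at the end of `data`, or None if absent."""
    if len(data) < ID3V1.size:
        return None
    marker, title, artist, album, year, comment, genre = ID3V1.unpack(data[-ID3V1.size:])
    if marker != b"TAG":
        return None

    def text(raw):
        # Fields are padded with NUL bytes (or spaces) up to their fixed width.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {"title": text(title), "artist": text(artist), "album": text(album),
            "year": text(year), "genre_code": genre}

# Hypothetical sample: fake audio bytes followed by a packed tag.
sample = b"\xff\xfb" + b"\x00" * 64 + ID3V1.pack(
    b"TAG", b"Demo Song", b"Demo Artist", b"Demo Album", b"2010", b"", 12)
info = parse_id3v1(sample)
print(info)
```

Offline metadata for the same file would carry the same fields in a separate record, for example a database row keyed by the file path.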


Formal semantic metadata

Data representation in a formalism with a formal semantic interpretation that defines the concept of (logical) entailment for reasoning:
  Soundness: conclusions are valid entailments
  Completeness: every valid entailment can be deduced
  Decidability: a procedure exists to determine whether a conclusion can be deduced

Embodiments:
  Logics
  Knowledge Representation Systems, Description Logics
  Semantic Web: RDF, OWL


„Semantics“ in CMS

CMS systems provide various methods to include metadata
  Organize content in hierarchies
  Hierarchical taxonomies
  Attachment of properties to content items for metadata
  Content type definitions with inheritance

These methods are used in CMS systems in ad-hoc fashion without clear semantics. Therefore no well-defined reasoning is possible.


Semantic Lifting Usage

Content Creation and Acquisition
  Authoring content
    Support content editors in providing metadata of specified types
  Uploading external content/documents
    Automatic extraction and analysis, e.g. for indexing
  Importing content from external sources/documents
    Integration of external content into content repository
    Content needs to be transformed to match internal CMS structures and metadata schemes
  Cross-referencing/linking among CMS content items and external content
    Detect related or additional content
    Add pointers/links to related or additional content


Semantic Lifting Usage

Access to external documents and content repositories
  Semantic harmonization with CMS semantic structures
  Semantic interoperability in data exchange with other content repositories
  The CMS needs to understand the data structures used by external services and programs
    E.g. synchronization of a local calendar from Outlook with an external calendar based on the iCalendar format
    E.g. importing RDF from a Linked Data endpoint such as DBpedia
  The CMS must present its data in a form understood by external target services or programs


Semantic Lifting Usage

Publishing content with metadata
  Metadata need to be transformed into a form compatible with the publication format
    E.g. converting FreeDB metadata into ID3 tags for inclusion in an MP3 file


Publishing Web Content with semantic metadata

Augmenting web content with structured information becomes increasingly important

Several methods have emerged in recent years to include structured metadata in Web pages
  Microformats
  RDFa
  Microdata (HTML5)

Supported by the major search engines to improve search and result presentation, e.g. Google (“Rich Snippets”), Bing, Yahoo


Augmenting Web Content

The HTML code contains a review of a restaurant in plain text using only line breaks for structuring

Without specialized information extraction analysis tools it cannot be interpreted, e.g. that it is a review (of what and when?), who the reviewer was, etc.

<div>
L’Amourita Pizza
Reviewed by Ulysses Grant on Jan 6.
Delicious, tasty pizza on Eastlake!
L'Amourita serves up traditional wood-fired Neapolitan-style pizza, brought to your table promptly and without fuss. An ideal neighborhood pizza joint.
Rating: 4.5
</div>


Microformats

Same text but additional span elements with class attributes to encode the type of contained information (hReview) and the properties of that type

<div class="hreview">
  <span class="item">
    <span class="fn">L’Amourita Pizza</span>
  </span>
  Reviewed by <span class="reviewer">Ulysses Grant</span> on
  <span class="dtreviewed">
    Jan 6<span class="value-title" title="2009-01-06"></span>
  </span>.
  <span class="summary">Delicious, tasty pizza on Eastlake!</span>
  <span class="description">L'Amourita serves up traditional wood-fired Neapolitan-style pizza, brought to your table promptly and without fuss. An ideal neighborhood pizza joint.</span>
  Rating: <span class="rating">4.5</span>
</div>
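
To show why such markup is machine-readable, here is a minimal sketch that pulls the hReview properties out of the span classes using only Python's standard `html.parser`. The class `HReviewParser`, the helper `parse_hreview` and the shortened sample are hypothetical; a real microformats parser covers far more of the specification.

```python
from html.parser import HTMLParser

PROPS = {"fn", "reviewer", "dtreviewed", "summary", "description", "rating"}

class HReviewParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.open_props = []   # one entry per open <span>: property name or None
        self.fields = {}
        self.locked = set()    # properties already filled from a value-title

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        current = next((p for p in reversed(self.open_props) if p), None)
        if "value-title" in classes and current:
            # The machine-readable value overrides the human-readable text.
            self.fields[current] = attr.get("title") or ""
            self.locked.add(current)
        self.open_props.append(next((c for c in classes if c in PROPS), None))

    def handle_endtag(self, tag):
        if tag == "span" and self.open_props:
            self.open_props.pop()

    def handle_data(self, data):
        for prop in self.open_props:
            if prop and prop not in self.locked:
                self.fields[prop] = self.fields.get(prop, "") + data

def parse_hreview(html):
    parser = HReviewParser()
    parser.feed(html)
    parser.close()
    # Collapse whitespace left over from the markup layout.
    return {key: " ".join(value.split()) for key, value in parser.fields.items()}

SAMPLE = """<div class="hreview">
  <span class="item"><span class="fn">L'Amourita Pizza</span></span>
  Reviewed by <span class="reviewer">Ulysses Grant</span> on
  <span class="dtreviewed">Jan 6<span class="value-title" title="2009-01-06"></span></span>.
  <span class="summary">Delicious, tasty pizza on Eastlake!</span>
  Rating: <span class="rating">4.5</span>
</div>"""
review = parse_hreview(SAMPLE)
print(review)
```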


RDFa

Same text but additional attributes and span elements encoding an RDF structure:
  namespace declaration of the used ontology
  RDF class encoded by a typeof attribute and its properties by a property attribute

<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Review">
  <span property="v:itemreviewed">L’Amourita Pizza</span>
  Reviewed by <span property="v:reviewer">Ulysses Grant</span> on
  <span property="v:dtreviewed" content="2009-01-06">Jan 6</span>.
  <span property="v:summary">Delicious, tasty pizza on Eastlake!</span>
  <span property="v:description">L'Amourita serves up traditional wood-fired Neapolitan-style pizza, brought to your table promptly and without fuss. An ideal neighborhood pizza joint.</span>
  Rating: <span property="v:rating">4.5</span>
</div>
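
The RDF structure encoded this way can be read back as triples. The sketch below is a deliberately simplified reader, not a conforming RDFa processor: it assumes a single annotated item, flat (non-nested) property elements and prefixes declared via `xmlns:`. The class name, the blank node label `_:review` and the shortened sample are assumptions made for illustration.

```python
from html.parser import HTMLParser

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

class RDFaSketchParser(HTMLParser):
    def __init__(self, subject="_:review"):
        super().__init__()
        self.subject = subject   # blank node standing for the annotated item
        self.prefixes = {}
        self.triples = []
        self.pending = None      # property waiting for its element text
        self.text = []

    def expand(self, curie):
        prefix, sep, local = curie.partition(":")
        if sep and prefix in self.prefixes:
            return self.prefixes[prefix] + local
        return curie

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        for name, value in attr.items():
            if name.startswith("xmlns:"):
                self.prefixes[name[len("xmlns:"):]] = value
        if "typeof" in attr:
            self.triples.append((self.subject, RDF_TYPE, self.expand(attr["typeof"])))
        if "property" in attr:
            prop = self.expand(attr["property"])
            if "content" in attr:
                # An explicit content attribute wins over the element text.
                self.triples.append((self.subject, prop, attr["content"]))
            else:
                self.pending, self.text = prop, []

    def handle_data(self, data):
        if self.pending:
            self.text.append(data)

    def handle_endtag(self, tag):
        if self.pending:
            self.triples.append((self.subject, self.pending, "".join(self.text).strip()))
            self.pending = None

def rdfa_to_triples(html):
    parser = RDFaSketchParser()
    parser.feed(html)
    parser.close()
    return parser.triples

SAMPLE = """<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Review">
  <span property="v:itemreviewed">L'Amourita Pizza</span>
  Reviewed by <span property="v:reviewer">Ulysses Grant</span> on
  <span property="v:dtreviewed" content="2009-01-06">Jan 6</span>.
  Rating: <span property="v:rating">4.5</span>
</div>"""
triples = rdfa_to_triples(SAMPLE)
for triple in triples:
    print(triple)
```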


Microdata (HTML5)

Same text but additional attributes and span elements:
  A class declaration as value of an itemtype attribute and its properties as values of an itemprop attribute

<div>
  <div itemscope itemtype="http://data-vocabulary.org/Review">
    <span itemprop="itemreviewed">L’Amourita Pizza</span>
    Reviewed by <span itemprop="reviewer">Ulysses Grant</span> on
    <time itemprop="dtreviewed" datetime="2009-01-06">Jan 6</time>.
    <span itemprop="summary">Delicious, tasty pizza in Eastlake!</span>
    <span itemprop="description">L'Amourita serves up traditional wood-fired Neapolitan-style pizza, brought to your table promptly and without fuss. An ideal neighborhood pizza joint.</span>
    Rating: <span itemprop="rating">4.5</span>
  </div>
</div>


Lifting Requirements: Overview

Top-level requirements
  Semantic Associations with Content
  Semantic Harmonization
  Semantic Linking
  Interactive Lifting
  Customizability
  Semantically Transparent Structured Content Sources


Semantic Associations with Content

Unstructured content and information must be supplied with structured semantic annotations and metadata.
  Support for various content/media types
  Information extraction from text, topic classification, image tagging, …
  Support for creation of semantic annotations in content authoring


Semantic Harmonization

Metadata and annotations must be harmonized with requirements for semantic processing in the CMS
  Reengineering methods, interpreters and wrappers for all types and formats of metadata and annotations, e.g. tags, microformats, XML metadata (MPEG-7, …), ID3 tags, EXIF data, …

Ensure semantic interoperability of data and annotation schemes within the CMS and across external resources
  Ontology mapping and harmonization of annotations
    External metadata
    Metadata generated by semantic analysis


Semantic Linking

Lifting must enable the interlinking of content objects by semantic relationships.
  Internal linking of content items within the CMS
  Links to external resources, e.g. Linked Open Data
  Establish semantic relatedness of content for different views as well as different search, navigation and browsing strategies, …
    Direct semantic links among content items and metadata
    Similarity relations over sets of content items
    Clustering of content items
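
One simple way to obtain a similarity relation over content items is cosine similarity on bag-of-words vectors. The sketch below is only an illustration of that idea under stated assumptions: the function names, the threshold of 0.3 and the three sample texts are all made up, and a real system would add stop-word removal and term weighting.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words vector: token -> frequency."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def related_items(items, threshold=0.3):
    """Return (id_a, id_b, score) for every pair at or above the threshold."""
    vectors = {key: vectorize(text) for key, text in items.items()}
    ids = sorted(vectors)
    links = []
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            score = cosine(vectors[a], vectors[b])
            if score >= threshold:
                links.append((a, b, round(score, 2)))
    return links

# Hypothetical content items.
items = {
    "doc1": "wood fired pizza in a neighborhood restaurant",
    "doc2": "neighborhood pizza restaurant review",
    "doc3": "release of a new operating system",
}
links = related_items(items)
print(links)
```

Each returned pair could then be stored as a "related to" link between the two content items.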


Interactive Lifting

Lifting must interact with CMS users.
  Suggest semantic annotations during content creation
  Support for various publishing formats such as microformats, RDFa, etc.
  Automatic annotations (autotagging) with optional correction by the user
  Learning capabilities and adaptability of automatic annotation components from user feedback


Customizability

Lifting components must be customizable by CMS users/customers.
  Users must not be restricted to predefined vocabularies, ontologies, …
  Domain ontologies, terminologies, tag sets are defined by CMS users/customers.
  Browsers and editors for component resources are necessary.


Transparent Structured Content Sources

Structured content sources need to be reengineered to semantic resources
  Support uniform data access to structured content repositories, e.g. SPARQL endpoints based on D2RQ technologies for transparent access to RDF and non-RDF databases
  Extraction of ontologies from database structures, schemata, XML resources, …
  Alignment and mapping of the descriptions
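
The core idea of such reengineering (one resource per row, one property per column) can be sketched with the standard library alone. This is not how D2RQ is configured or implemented; the `http://example.org/` namespace, the `city` table and the function `table_to_ntriples` are hypothetical, and every value is emitted as a plain literal.

```python
import sqlite3

BASE = "http://example.org/"   # hypothetical namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def table_to_ntriples(conn, table, key):
    """Map each row to a resource and each non-key column to a literal-valued property."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')   # table name must be trusted
    columns = [d[0] for d in cursor.description]
    lines = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"<{BASE}{table}/{record[key]}>"
        # The table name becomes the class of the row resource.
        lines.append(f"{subject} <{RDF_TYPE}> <{BASE}schema/{table}> .")
        for column, value in record.items():
            if column == key or value is None:
                continue
            literal = str(value).replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'{subject} <{BASE}schema/{table}#{column}> "{literal}" .')
    return lines

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT, population INTEGER)")
conn.execute("INSERT INTO city VALUES (1, 'Paderborn', 146283)")
triples = table_to_ntriples(conn, "city", "id")
print("\n".join(triples))
```

The table definition supplies the implicit semantics here: the table name becomes a class and the column names become properties, which is the kind of schema-to-ontology extraction listed above.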


Semantic Reengineering of structured data sources

Focus on tools for reengineering structured data sources to RDF representations

Many tools and platforms exist for this:
  D2R Server: exhibits relational DBs as RDF
  Talis platform: Linked Open Data
  Triplify: like D2R but in PHP
  Virtuoso middleware
  Krextor/OntoCape: generating RDF from XML
  Various transformers for inducing RDF ontologies and instance data from XSD and XML

More details in presentation on Knowledge Representation (KReS)


Semantic Content Enhancements: Overview

Focus here is on textual content
  Metadata Extraction from existing content in various formats to make embedded metadata explicit
  Information Extraction from textual content:
    Named Entities
    Coreference
    Relationships
  Classification and Clustering of content items
    Statistical methods and tools
    Semantic classification based on ontological definitions


Information Extraction

Rule based approaches for shallow text analysis
  Usually based on Finite State technology: fast, robust
  Cascaded processing
  Based on templates as target structures to be filled
  Example platforms:
    GATE
    SProUT
  Can be used for nearly any kind of extraction/annotation task, including Named-Entity-Recognition (NER)
  Easy customization
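
A two-stage cascade that fills a template can be sketched with regular expressions, which are themselves finite-state patterns. The rules below are toy rules written for this example only and are not taken from GATE or SProUT; the suffix list, the role list and the template fields are assumptions.

```python
import re

# Stage 1: organisation = capitalised word(s) followed by a legal-form suffix.
ORG = re.compile(r"\b(?:[A-Z][\w&]*\s)+(?:Corp\.|Inc\.|Ltd\.|GmbH)")

# Stage 2: applied directly after an organisation match: role keyword + person name.
ROLE_PERSON = re.compile(
    r"\s+(?P<role>CEO|CTO|President)\s+(?P<person>[A-Z][a-z]+(?:\s[A-Z][a-z]+)+)")

def extract_templates(text):
    """Fill {organization, role, person} templates from '<ORG> <role> <name>' patterns."""
    templates = []
    for org in ORG.finditer(text):                  # stage 1 output ...
        match = ROLE_PERSON.match(text, org.end())  # ... is the input position of stage 2
        if match:
            templates.append({
                "organization": org.group(),
                "role": match.group("role"),
                "person": match.group("person"),
            })
    return templates

TEXT = "Microsoft Corp. CEO Steve Ballmer announced the release of Windows 7 today"
templates = extract_templates(TEXT)
print(templates)
```

Customization in this style means editing the rules or word lists, which is why rule-based systems are comparatively easy to adapt to new entity types.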


Information Extraction

Semi-supervised learning approaches
  Rule induction from corpora
  Use example annotations as seeds for bootstrapping
  Pattern rules learned from contextual features with generalization over contexts


Named Entities

Statistical Approaches: examples
  LingPipe: Hidden Markov Models
  OpenNLP: Maximum Entropy Models
  Stanford NER: Conditional Random Fields

Statistical models created by supervised learning techniques
  Large annotated corpora required

Customization difficult except by re-annotation/re-training
Not suitable for arbitrary types of named entity


NER Document Markup

[Figure: example document with NER markup]

NER Markup for a Web Page

[Figure: example web page with NER markup]

IE Template

A Person Template (as Typed Feature Structure) instantiated from text. The template supports the extraction of various properties of a person.


Classification

Assign a data item to some predefined class

Statistical classification
  Numerous methods, e.g.:
    Bayes classifiers
    K-Nearest Neighbor (KNN)
    Support Vector Machines (SVM)
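
As one concrete instance of the Bayes classifiers mentioned above, the following is a minimal multinomial Naive Bayes text classifier with add-one smoothing. The class name, the whitespace tokenization and the four training sentences are assumptions chosen to keep the sketch short.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    def __init__(self):
        self.class_docs = Counter()               # documents seen per class
        self.word_counts = defaultdict(Counter)   # class -> word -> frequency
        self.vocab = set()

    def train(self, text, label):
        self.class_docs[label] += 1
        for word in text.lower().split():
            self.word_counts[label][word] += 1
            self.vocab.add(word)

    def classify(self, text):
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            # log P(class) + sum of log P(word | class), with add-one smoothing
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical training data with two predefined classes.
classifier = NaiveBayes()
classifier.train("wood fired pizza and fresh pasta", "food")
classifier.train("tasty pizza restaurant menu", "food")
classifier.train("the team won the football match", "sport")
classifier.train("player scores goal in match", "sport")
print(classifier.classify("delicious pizza menu"))
```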


Semantic Classification


Semantic classification in Knowledge Representation Formalisms
  Infer the item’s class from the item’s properties by matching them with the class definitions: Which classes allow for these properties?

Assume that our ontology contains 2 classes with some properties
  SpatialThing: latitude, longitude
  PopulatedPlace: population

Paderborn is an object with latitude “51°43′0″N”, longitude “8°46′0″E” and a population of 146283.

Then we can infer that Paderborn is a SpatialThing, as those are the things that have latitudes and longitudes in our ontology. Also, we can infer that it is a PopulatedPlace, as those are the things that have a population.
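
The inference above can be sketched in a few lines. The sketch reads the class definitions as property domains, in the spirit of `rdfs:domain` (if a property has domain C, anything carrying that property is a C); this reading, the dictionary `PROPERTY_DOMAIN` and the function `infer_classes` are assumptions made for illustration, not a reasoner.

```python
# Each property is declared with the class it belongs to (its domain).
PROPERTY_DOMAIN = {
    "latitude": "SpatialThing",
    "longitude": "SpatialThing",
    "population": "PopulatedPlace",
}

def infer_classes(item):
    """Return the classes entailed by the properties the item actually has."""
    return sorted({PROPERTY_DOMAIN[prop] for prop in item if prop in PROPERTY_DOMAIN})

paderborn = {
    "latitude": "51°43′0″N",
    "longitude": "8°46′0″E",
    "population": 146283,
}
classes = infer_classes(paderborn)
print(classes)
```

An item with only a population would be classified as a PopulatedPlace but not as a SpatialThing, since nothing in its properties entails the latter.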


Clustering

Detection of classes in a data set
  Partitioning data into classes in an unsupervised way with
    high intra-class similarity
    low inter-class similarity

Main variants:
  Hierarchical clustering
    Agglomerative
  Partitioning clustering
    K-Means
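
A bare-bones K-Means loop makes the partitioning idea concrete. In this sketch the initial centroids are simply the first k points (an assumption made to keep the run deterministic; practical implementations usually pick them at random) and the six sample points are made up.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=20):
    """Return (centroids, clusters) after alternating assignment and update steps."""
    centroids = list(points[:k])   # deterministic initialisation: first k points
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: squared_distance(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: each centroid moves to the mean of its cluster.
        updated = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:   # converged
            break
        centroids = updated
    return centroids, clusters

# Hypothetical 2-D data with two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, k=2)
print(clusters)
```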


Tools for Classification and Clustering

Generic:
  WEKA: Java library implementing several dozen methods for data mining. Application to textual data requires special preprocessing.

Text:
  MALLET: Java library with implementations of major methods for text and document classification and clustering


Evaluation Measures

Standard evaluation measures for IE/IR etc. systems:
  Accuracy: $\mathit{acc} = \frac{tp + tn}{tp + fp + tn + fn}$
  Precision: $\mathit{prec} = \frac{tp}{tp + fp}$
  Recall: $\mathit{recall} = \frac{tp}{tp + fn}$
  F-Measure: $F = \frac{2 \cdot \mathit{prec} \cdot \mathit{recall}}{\mathit{prec} + \mathit{recall}}$

tp = true positive, tn = true negative, fp = false positive, fn = false negative
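
The four measures translate directly into code. The function name `evaluate` and the example counts are made up for this sketch; the zero checks only guard against division by zero when a count is missing.

```python
def evaluate(tp, fp, fn, tn=0):
    """Compute accuracy, precision, recall and F-measure from raw counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}

# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives, 6 true negatives.
scores = evaluate(tp=8, fp=2, fn=4, tn=6)
print(scores)
```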


Evaluation Measures: Classification

A confusion matrix which reports on the classification of 27 wines by grape variety. The reference in this case is the true variety and the response arises from the blind evaluation of a human judge.

Many-way Confusion Matrix (rows: reference, columns: response)

Reference         Cabernet  Syrah  Pinot   Precision  Recall  F-Measure
Cabernet              9       3      0       0,69      0,75     0,72
Syrah                 3       5      1       0,56      0,56     0,56
Pinot                 1       1      4       0,80      0,67     0,73
Macro average                                0,68      0,66     0,67

Overall accuracy: 0,67

Precision (Cabernet) = 9/(9+3+1)
Recall (Pinot) = 4/(1+1+4)
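
The per-class figures in the table can be recomputed from the matrix itself: precision divides the diagonal cell by its column sum (all responses of that class), recall divides it by its row sum (all references of that class). The function and variable names in this sketch are hypothetical; the counts are those of the wine example.

```python
LABELS = ["Cabernet", "Syrah", "Pinot"]
# Rows: reference (true variety); columns: response (judge's answer).
MATRIX = [
    [9, 3, 0],
    [3, 5, 1],
    [1, 1, 4],
]

def per_class_scores(matrix):
    """Return a (precision, recall, f_measure) tuple for every class."""
    size = len(matrix)
    scores = []
    for i in range(size):
        tp = matrix[i][i]
        column_sum = sum(matrix[row][i] for row in range(size))
        row_sum = sum(matrix[i])
        precision = tp / column_sum if column_sum else 0.0
        recall = tp / row_sum if row_sum else 0.0
        f_measure = (2 * precision * recall / (precision + recall)
                     if precision + recall else 0.0)
        scores.append((precision, recall, f_measure))
    return scores

scores = per_class_scores(MATRIX)
# Macro average: unweighted mean of each measure over the classes.
macro = tuple(sum(s[j] for s in scores) / len(scores) for j in range(3))
accuracy = sum(MATRIX[i][i] for i in range(len(MATRIX))) / sum(map(sum, MATRIX))

for label, (p, r, f) in zip(LABELS, scores):
    print(f"{label:10s} {p:.2f} {r:.2f} {f:.2f}")
print(f"macro      {macro[0]:.2f} {macro[1]:.2f} {macro[2]:.2f}")
print(f"accuracy   {accuracy:.2f}")
```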


Evaluation Measures: NER

Reference annotations: [Microsoft Corp.] CEO [Steve Ballmer] announced the release of [Windows 7] today

Recognized annotations: [Microsoft Corp.] [CEO] [Steve] Ballmer announced the release of Windows 7 [today]

      Counts  Entities
TP    1       [Microsoft Corp.]
TN
FP    3       [CEO] [Steve] [today]
FN    2       [Windows 7] [Steve Ballmer]

Precision: 1/(1+3) = 0,25
Recall: 1/(1+2) = 0,33
F-Measure: 2*0,25*0,33/(0,25+0,33) = 0,28 (from the rounded values; the exact value is 2/7 ≈ 0,29)
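
The counts can be reproduced by comparing annotation spans under exact matching, where a recognized entity only counts as a true positive if both of its boundaries agree with the reference. The bracket notation parser below is a hypothetical helper for this example; it assumes whitespace tokenization and no nested brackets.

```python
def bracket_spans(annotated):
    """Return the set of (start, end) token spans marked with [...] in a string."""
    spans, start = set(), None
    for index, token in enumerate(annotated.split()):
        if token.startswith("["):
            start = index
        if token.endswith("]"):
            spans.add((start, index + 1))
            start = None
    return spans

reference = bracket_spans(
    "[Microsoft Corp.] CEO [Steve Ballmer] announced the release of [Windows 7] today")
recognized = bracket_spans(
    "[Microsoft Corp.] [CEO] [Steve] Ballmer announced the release of Windows 7 [today]")

tp = len(reference & recognized)   # exact boundary matches
fp = len(recognized - reference)   # recognized but not in the reference
fn = len(reference - recognized)   # in the reference but missed

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f_measure = 2 * precision * recall / (precision + recall)
print(tp, fp, fn, round(precision, 2), round(recall, 2), round(f_measure, 2))
```

Note that [Steve] is penalized twice under exact matching: once as a false positive and once through the missed [Steve Ballmer].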


NER Evaluation

Nobel Prize Corpus from NYT, BBC, CNN
  538 documents (Ø 735 words/document)
  28948 person, 16948 organization occurrences

            Sprout  Calais  StanfordNER  OpenNLP
Precision   77,26   94,22   73,21        57,69
Recall      65,85   86,66   73,62        42,86
F1          71,10   90,28   73,41        49,18


References

Microformats: http://microformats.org/
RDFa: http://www.w3.org/TR/xhtml-rdfa-primer/
Google Rich Snippets: http://googlewebmastercentral.blogspot.com/2009/05/introducing-rich-snippets.html
Linked Data: http://linkeddata.org/guides-and-tutorials
Linked Data: Heath and Bizer, Linked Data: Evolving the Web into a Global Data Space. Morgan & Claypool, 2011. (Online: http://linkeddatabook.com/book)
Information Extraction: Moens, Information Extraction: Algorithms and Prospects in a Retrieval Context. Springer, 2006.
Text Mining: Feldman and Sanger, The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data. CUP, 2007.
