Machine Learning Techniques for the Semantic Web

Paul Dix, http://pauldix.net

Post on 01-Nov-2014

Page 1: Machine Learning Techniques for the Semantic Web

Machine Learning Techniques for the Semantic Web

Paul Dix

http://pauldix.net

Page 2: Machine Learning Techniques for the Semantic Web
Page 3: Machine Learning Techniques for the Semantic Web
Page 4: Machine Learning Techniques for the Semantic Web
Page 5: Machine Learning Techniques for the Semantic Web

Machine Learning

Page 6: Machine Learning Techniques for the Semantic Web

Semantic Web

Page 7: Machine Learning Techniques for the Semantic Web
Page 8: Machine Learning Techniques for the Semantic Web
Page 9: Machine Learning Techniques for the Semantic Web

What is Semantic Web?

Page 10: Machine Learning Techniques for the Semantic Web
Page 11: Machine Learning Techniques for the Semantic Web
Page 12: Machine Learning Techniques for the Semantic Web

Ontology

Page 13: Machine Learning Techniques for the Semantic Web

RDF

Page 14: Machine Learning Techniques for the Semantic Web
Page 15: Machine Learning Techniques for the Semantic Web

Machine Learning is about Data

Page 16: Machine Learning Techniques for the Semantic Web

actually...

Page 17: Machine Learning Techniques for the Semantic Web

Making Predictions Based on Data

Page 18: Machine Learning Techniques for the Semantic Web
Page 19: Machine Learning Techniques for the Semantic Web
Page 20: Machine Learning Techniques for the Semantic Web
Page 21: Machine Learning Techniques for the Semantic Web

FOAF

Simple Example

Page 22: Machine Learning Techniques for the Semantic Web

Marco Neumann

<http://www.marconeumann.org/foaf.rdf> <http://xmlns.com/foaf/0.1/knows> <http://community.linkeddata.org/dataspace/person/kidehen2/about.rdf> .
<http://www.marconeumann.org/foaf.rdf> <http://xmlns.com/foaf/0.1/knows> <http://www.johnbreslin.com/foaf/foaf.rdf> .
<http://www.marconeumann.org/foaf.rdf> <http://xmlns.com/foaf/0.1/knows> <http://swordfish.rdfweb.org/people/libby/rdfweb/webwho.xrdf> .
<http://www.marconeumann.org/foaf.rdf> <http://xmlns.com/foaf/0.1/knows> <http://danbri.org/foaf.rdf> .
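A minimal sketch of reading statements like these into Python tuples. It assumes every term is an IRI in angle brackets, as in the slide; literals and blank nodes are not handled. The names `TRIPLE_RE`, `parse_ntriples`, and `sample` are illustrative and not from the talk.

```python
import re

# Matches one simple N-Triples statement whose three terms are all IRIs.
# Assumption: no literals or blank nodes, which is true of the slide's data.
TRIPLE_RE = re.compile(r"<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.")

FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"


def parse_ntriples(text):
    """Return a list of (subject, predicate, object) IRI tuples."""
    return [match.groups() for match in TRIPLE_RE.finditer(text)]


# Two of the four statements from the slide.
sample = """
<http://www.marconeumann.org/foaf.rdf> <http://xmlns.com/foaf/0.1/knows> <http://danbri.org/foaf.rdf> .
<http://www.marconeumann.org/foaf.rdf> <http://xmlns.com/foaf/0.1/knows> <http://www.johnbreslin.com/foaf/foaf.rdf> .
"""

triples = parse_ntriples(sample)
# Keep only the objects of foaf:knows statements.
friends = [o for s, p, o in triples if p == FOAF_KNOWS]
```

For real FOAF files a full RDF parser would be the safer choice; the regular expression is only enough for this flat listing.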

Page 23: Machine Learning Techniques for the Semantic Web
Page 24: Machine Learning Techniques for the Semantic Web

Marco only knows 4 people?

Page 25: Machine Learning Techniques for the Semantic Web

Two Degrees Out

4 - <http://www.w3.org/People/Connolly/home-smart.rdf>
4 - <http://jibbering.com/foaf.rdf>
2 - <http://sw.deri.org/~haller/foaf.rdf>
2 - <http://sw.deri.org/~knud/knudfoaf.rdf>
2 - <http://www-cdr.stanford.edu/~petrie/foaf.rdf>

Page 26: Machine Learning Techniques for the Semantic Web

Three Degrees

9 - <http://sw.deri.org/~knud/knudfoaf.rdf>
8 - <http://www.w3.org/People/Connolly/home-smart.rdf>
7 - <http://jibbering.com/foaf.rdf>
6 - <http://www.aaronsw.com/about.xrdf>
5 - <http://sw.deri.org/~aharth/foaf.rdf>
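The slides do not say how the counts at two and three degrees were computed. One plausible reading is that each number is how many distinct paths of that length lead from Marco to the person. The sketch below implements that reading; the function name `degrees_out` and the toy `knows` data are made up for illustration.

```python
from collections import Counter


def degrees_out(knows, start, depth):
    """Count the walks of length `depth` from `start` to each person.

    `knows` maps a person to the set of people they list with foaf:knows.
    `start` and the people `start` already knows directly are left out,
    so only new candidates are ranked.
    """
    walks = Counter({start: 1})
    for _ in range(depth):
        next_walks = Counter()
        for person, count in walks.items():
            for friend in knows.get(person, ()):
                next_walks[friend] += count
        walks = next_walks
    known = knows.get(start, set()) | {start}
    return Counter({p: c for p, c in walks.items() if p not in known})


# Hypothetical toy graph, not the data behind the slide.
knows = {
    "marco": {"a", "b"},
    "a": {"c", "d"},
    "b": {"c", "marco"},
}
ranked = degrees_out(knows, "marco", 2).most_common()
```

Here `c` is reachable through both `a` and `b`, so it ranks above `d`.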

Page 27: Machine Learning Techniques for the Semantic Web

but that’s not really machine learning

Page 28: Machine Learning Techniques for the Semantic Web

Short

Page 29: Machine Learning Techniques for the Semantic Web
Page 30: Machine Learning Techniques for the Semantic Web

Machine Learning is

• How you formulate the problem

• How you represent the data

Page 31: Machine Learning Techniques for the Semantic Web

• Graphical Models

• Vector Space Models

Page 32: Machine Learning Techniques for the Semantic Web

Back to FOAF

Convert RDF triples to vector space

Page 33: Machine Learning Techniques for the Semantic Web

We Want to Find Groups of People

Page 34: Machine Learning Techniques for the Semantic Web

To make predictions on their interests...

Page 35: Machine Learning Techniques for the Semantic Web

(subject) (predicate) (object)
Paul knows Jeff
Paul knows Joe
Paul knows Marco
Jeff knows Joe

Page 36: Machine Learning Techniques for the Semantic Web

Vector Space Representation

       Jeff  Joe  Marco  Paul
Jeff         1           1
Joe    1                 1
Marco                    1
Paul   1     1    1
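A short sketch of how the four triples become this matrix. It assumes "knows" is treated as mutual, which is what the slide's matrix implies (Jeff's row has an entry for Paul even though the triple is "Paul knows Jeff"). The function name `knows_matrix` is illustrative.

```python
triples = [
    ("Paul", "knows", "Jeff"),
    ("Paul", "knows", "Joe"),
    ("Paul", "knows", "Marco"),
    ("Jeff", "knows", "Joe"),
]


def knows_matrix(triples):
    """Return (people, matrix) where matrix[i][j] is 1 if i and j know each other."""
    people = sorted({s for s, _, _ in triples} | {o for _, _, o in triples})
    index = {name: i for i, name in enumerate(people)}
    matrix = [[0] * len(people) for _ in people]
    for s, p, o in triples:
        if p == "knows":
            matrix[index[s]][index[o]] = 1
            matrix[index[o]][index[s]] = 1  # assumption: knows is mutual
    return people, matrix


people, matrix = knows_matrix(triples)
for name, row in zip(people, matrix):
    print(f"{name:6}", row)
```

Each row is one person's vector. A dense list of lists is fine for four people; at the scale discussed later in the talk a sparse structure would be needed.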

Page 37: Machine Learning Techniques for the Semantic Web

Latent Factors Analysis

• Used in Latent Semantic Indexing (LSI)

• Good for finding synonyms

• Good for finding “genres”

Page 38: Machine Learning Techniques for the Semantic Web

Latent Factors Methods

• Principal Component Analysis (PCA)

• Singular Value Decomposition (SVD)

• Restricted Boltzmann Machines (RBM)

Page 39: Machine Learning Techniques for the Semantic Web

Considerations for Semantic Web Data

• Large Data Sets

• Sparse Data Sets

Page 40: Machine Learning Techniques for the Semantic Web

Netflix Prize Research

• Movie Review Data set has similar problems

• Generalized Hebbian Algorithm for Dimensionality Reduction in NLP (Gorrell, 2006)

Page 41: Machine Learning Techniques for the Semantic Web

Reduce Dimensions

• 1m x 1m matrix with 1m people

• Reduce to 1m x 100

Page 42: Machine Learning Techniques for the Semantic Web

100 Latent Factors

Represent different groups of people based on who they know.
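To make the reduction concrete, here is a small pure-Python sketch that approximates a truncated SVD by power iteration with deflation and projects each person's row onto the top k factors. This is a stand-in chosen for readability, not the Generalized Hebbian Algorithm and not necessarily what the speaker used. The names `top_factors`, `matvec`, and `transpose` are illustrative, and the 4 x 4 input is the small "knows" matrix from the earlier example rather than a 1m x 1m one.

```python
import math
import random


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def top_factors(a, k, iters=500, seed=0):
    """Project each row of `a` onto its top-k right singular vectors.

    The vectors are approximated by power iteration on A^T A, removing
    the directions already found before each step (deflation).
    """
    rng = random.Random(seed)
    at = transpose(a)
    basis = []
    for _ in range(k):
        v = [rng.random() for _ in at]  # one entry per column of a
        for _ in range(iters):
            for u in basis:  # strip out previously found directions
                dot = sum(x * y for x, y in zip(v, u))
                v = [x - dot * y for x, y in zip(v, u)]
            v = matvec(at, matvec(a, v))  # v <- A^T A v
            norm = math.sqrt(sum(x * x for x in v))
            v = [x / norm for x in v]
        basis.append(v)
    # One k-dimensional vector per row (per person).
    return [[sum(x * y for x, y in zip(row, u)) for u in basis] for row in a]


# Rows and columns ordered Jeff, Joe, Marco, Paul.
knows = [[0, 1, 0, 1], [1, 0, 0, 1], [0, 0, 0, 1], [1, 1, 1, 0]]
factors = top_factors(knows, 2)
```

At a million rows this approach would be far too slow; a sparse SVD routine or an incremental method such as the one cited in the Netflix Prize slide would be the realistic choice.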

Page 43: Machine Learning Techniques for the Semantic Web

What the Data Might Look Like

        Factor 1  Factor 2
Paul    0.678     0.311
Joe     0.455     0.432
Jeff    0.476     0.398
Marco   0.203     0.789

Page 44: Machine Learning Techniques for the Semantic Web

Find Similar People

k Nearest Neighbors

Page 45: Machine Learning Techniques for the Semantic Web

Pick a Similarity Metric

• Euclidean Distance

• Jaccard index

• Cosine Similarity
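The three metrics can be sketched in a few lines each. Euclidean distance and cosine similarity take numeric vectors; the Jaccard index takes sets, for example sets of known people. The function names and the Jeff and Joe inputs (taken from the small "knows" example) are illustrative.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def jaccard_index(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Rows for Jeff and Joe, columns ordered Jeff, Joe, Marco, Paul.
jeff = [0, 1, 0, 1]
joe = [1, 0, 0, 1]
print(euclidean_distance(jeff, joe))
print(cosine_similarity(jeff, joe))
print(jaccard_index({"Joe", "Paul"}, {"Jeff", "Paul"}))
```

Note that Euclidean distance is a distance while the other two are similarities, so the sort order flips when ranking neighbors.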

Page 46: Machine Learning Techniques for the Semantic Web

Joe’s Similarity to Paul

((Paul(f1) - Joe(f1))^2 + (Paul(f2) - Joe(f2))^2)^(1/2)
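Applying this formula to the example factor values gives a k nearest neighbors lookup. The sketch below uses the four rows from the "What the Data Might Look Like" slide; the function names are illustrative.

```python
import math

# Factor values copied from the example table in the slides.
factors = {
    "Paul": (0.678, 0.311),
    "Joe": (0.455, 0.432),
    "Jeff": (0.476, 0.398),
    "Marco": (0.203, 0.789),
}


def distance(a, b):
    """Euclidean distance between two factor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(name, k):
    """Return the k people whose factor vectors are closest to `name`."""
    others = [
        (distance(factors[name], vec), other)
        for other, vec in factors.items()
        if other != name
    ]
    return [other for _, other in sorted(others)[:k]]


print(distance(factors["Paul"], factors["Joe"]))
print(nearest_neighbors("Paul", 2))
```

With these numbers Jeff is slightly closer to Paul than Joe is, and Marco is far from all three.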

Page 47: Machine Learning Techniques for the Semantic Web

Once We’ve Calculated Similarities

• Fill In Missing Interests

• Target Ads, Content, Products

• ???

• Profit!
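As one way to fill in missing interests, the sketch below lets a person's nearest neighbors vote on interests the person has not listed. The voting scheme, the function name `suggest_interests`, and the interest data are all hypothetical; the talk does not specify a method.

```python
from collections import Counter


def suggest_interests(person, neighbors, interests, top_n=3):
    """Rank interests held by similar people that `person` does not list yet."""
    already = interests.get(person, set())
    votes = Counter()
    for other in neighbors:
        for interest in interests.get(other, set()) - already:
            votes[interest] += 1  # one vote per neighbor per interest
    return [interest for interest, _ in votes.most_common(top_n)]


# Hypothetical interest data for illustration only.
interests = {
    "Paul": {"ruby"},
    "Jeff": {"ruby", "rdf"},
    "Joe": {"rdf", "svd"},
}
suggestions = suggest_interests("Paul", ["Jeff", "Joe"], interests)
```

Weighting each vote by similarity instead of counting neighbors equally would be a natural refinement.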

Page 48: Machine Learning Techniques for the Semantic Web

Generalizing RDF Triples to Vector Space

Page 49: Machine Learning Techniques for the Semantic Web

• Subjects are Rows

• Objects are Columns

• Predicates are values

Page 50: Machine Learning Techniques for the Semantic Web

Object 1 Object 2

Subject 1 Predicate

Subject 2

Page 51: Machine Learning Techniques for the Semantic Web

Predicates Should be Mutually Exclusive

• Paul likes Ruby

• Paul hates PHP

• Paul loves PHP

Page 52: Machine Learning Techniques for the Semantic Web

Assign Values to Predicates

• 1 = Hates

• 2 = Dislikes

• 3 = Neutral

• 4 = Likes

• 5 = Loves
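A sketch of the general conversion: subjects become rows, objects become columns, and each predicate is stored as its numeric value. A conflict such as "Paul hates PHP" and "Paul loves PHP" is rejected because one cell can hold only one value. The lower-case predicate names and the function name `ratings_matrix` are assumptions made for the example.

```python
# Values from the slide; predicate spellings are an assumption.
PREDICATE_VALUES = {"hates": 1, "dislikes": 2, "neutral": 3, "likes": 4, "loves": 5}


def ratings_matrix(triples):
    """Build a sparse {subject: {object: value}} matrix from triples.

    Raises ValueError when one subject/object pair carries two different
    predicates, and KeyError for a predicate with no assigned value.
    """
    matrix = {}
    for subject, predicate, obj in triples:
        value = PREDICATE_VALUES[predicate.lower()]
        row = matrix.setdefault(subject, {})
        if obj in row and row[obj] != value:
            raise ValueError(f"conflicting predicates for {subject!r} and {obj!r}")
        row[obj] = value
    return matrix


ratings = ratings_matrix([
    ("Paul", "likes", "Ruby"),
    ("Paul", "hates", "PHP"),
])
```

The nested dictionary only stores cells that have a value, which suits the sparse data sets mentioned earlier.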

Page 53: Machine Learning Techniques for the Semantic Web

More Applications

Page 54: Machine Learning Techniques for the Semantic Web

Supervised Learning

• Classifiers

• Ontology Mapping

• Assigning Instances to Concepts

Page 55: Machine Learning Techniques for the Semantic Web

Ontology Mapping

• Examples from Ontology A

• Examples from Ontology B

Page 56: Machine Learning Techniques for the Semantic Web
Page 57: Machine Learning Techniques for the Semantic Web

Train Classifiers

• One Classifier for each Concept in A

• One Classifier for each Concept in B

Page 58: Machine Learning Techniques for the Semantic Web

Classify Instances

• Use A Classifiers to predict which concepts B instances map to

• Use B Classifiers to predict which concepts A instances map to

Page 59: Machine Learning Techniques for the Semantic Web

Use Classified Instances

• Predict Concept Mappings

• Which concepts in A match which concepts in B
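The whole mapping procedure can be sketched end to end with a deliberately simple classifier. Everything here is an assumption for illustration: instances are short text descriptions, each per-concept classifier is a toy word-overlap scorer standing in for a real learner, and the two ontologies are made up.

```python
from collections import Counter


class ConceptClassifier:
    """Toy binary classifier for one concept, based on word counts."""

    def __init__(self, positives, negatives):
        self.pos = Counter(w for t in positives for w in t.lower().split())
        self.neg = Counter(w for t in negatives for w in t.lower().split())

    def score(self, text):
        """Fraction of words seen more often in positive than negative examples."""
        words = text.lower().split()
        if not words:
            return 0.0
        return sum(1 for w in words if self.pos[w] > self.neg[w]) / len(words)


def train(ontology):
    """ontology: {concept: [instance text, ...]} -> one classifier per concept."""
    return {
        concept: ConceptClassifier(
            examples,
            [t for c, ts in ontology.items() if c != concept for t in ts],
        )
        for concept, examples in ontology.items()
    }


def predict_mappings(source, target_classifiers, threshold=0.5):
    """Map each source concept to the target concept most of its instances hit."""
    mappings = {}
    for concept, examples in source.items():
        votes = Counter()
        for text in examples:
            best, best_score = max(
                ((c, clf.score(text)) for c, clf in target_classifiers.items()),
                key=lambda pair: pair[1],
            )
            if best_score >= threshold:
                votes[best] += 1
        if votes:
            mappings[concept] = votes.most_common(1)[0][0]
    return mappings


# Hypothetical ontologies with text instances.
ontology_a = {
    "Professor": ["teaches graduate course on databases", "teaches course on compilers"],
    "Student": ["enrolled in databases course", "enrolled in compilers seminar"],
}
ontology_b = {
    "Faculty": ["teaches seminar on networks", "teaches course on graphics"],
    "Learner": ["enrolled in networks seminar", "enrolled in graphics course"],
}

a_to_b = predict_mappings(ontology_a, train(ontology_b))
b_to_a = predict_mappings(ontology_b, train(ontology_a))
# Keep only the mappings both directions agree on.
agreed = {(a, b) for a, b in a_to_b.items() if b_to_a.get(b) == a}
```

Requiring agreement in both directions is one conservative way to combine the two sets of predictions; other combination rules are possible.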

Page 60: Machine Learning Techniques for the Semantic Web
Page 61: Machine Learning Techniques for the Semantic Web

Limitations

• One Classifier per Concept

• Large Ontologies Could be a Problem

• Ontologies should be a little similar

Page 62: Machine Learning Techniques for the Semantic Web

Unsupervised Learning

• Clustering

• Hierarchical Clustering

• Learning Ontologies from Text
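As a small illustration of hierarchical clustering, the sketch below runs single-linkage agglomerative clustering over the example factor values from the earlier slide. The choice of single linkage and the function name `single_linkage` are assumptions; the talk does not name a specific variant.

```python
import math


def single_linkage(points, target_clusters):
    """Repeatedly merge the two closest clusters until `target_clusters` remain.

    `points` maps a name to a coordinate tuple. Returns the final clusters
    and the list of merges in order, which describes the hierarchy.
    """
    clusters = [[name] for name in points]
    merges = []
    while len(clusters) > target_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(
                    math.dist(points[a], points[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, merges


# Factor values copied from the example table in the slides.
points = {
    "Paul": (0.678, 0.311),
    "Joe": (0.455, 0.432),
    "Jeff": (0.476, 0.398),
    "Marco": (0.203, 0.789),
}
clusters, merges = single_linkage(points, 2)
```

Joe and Jeff merge first, Paul joins them next, and Marco stays on his own.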

Page 63: Machine Learning Techniques for the Semantic Web

Machine Learning as Triage

• Automatically tag or recommend examples the algorithm is certain about

• Send uncertain examples to a human for review
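The triage idea reduces to a confidence threshold. In this sketch the threshold of 0.9, the function name `triage`, and the sample predictions are all hypothetical.

```python
def triage(predictions, threshold=0.9):
    """Split (item, label, confidence) predictions into two queues.

    Confident predictions are accepted automatically; the rest go to a
    human reviewer.
    """
    accepted, review = [], []
    for item, label, confidence in predictions:
        queue = accepted if confidence >= threshold else review
        queue.append((item, label))
    return accepted, review


# Hypothetical classifier output.
predictions = [
    ("doc-1", "semantic-web", 0.97),
    ("doc-2", "machine-learning", 0.55),
    ("doc-3", "semantic-web", 0.91),
]
accepted, review = triage(predictions)
```

Labels confirmed by the reviewer can be fed back as training data, which is one reason to keep the review queue separate.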

Page 64: Machine Learning Techniques for the Semantic Web

Thank You

Paul Dix

http://pauldix.net