Transcript
Page 1: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

May 2012 – GE Global Research – Prateek Jain

23rd ACM HT Conference 2012 – Prateek Jain

Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

Prateek Jain, Pascal Hitzler, Amit Sheth

Kno.e.sis Center

Wright State University, Dayton, OH

Peter Z. Yeh, Kunal Verma

Accenture Technology Labs

San Jose, CA

Page 2: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

Outline

• Introduction - Linked Open Data

• Challenges

• PLATO – Partonomic Relationship detection

• Conclusion & Future Work

Page 3: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

Tim Berners-Lee 2006

• from http://www.w3.org/DesignIssues/LinkedData.html

1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
4. Include links to other URIs, so that they can discover more things.
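The four principles can be illustrated with a minimal sketch. It assumes a tiny in-memory dictionary in place of real HTTP lookups; the `example.org` URIs, the `WEB` table, and the `look_up` and `discover` helpers are hypothetical and exist only for this illustration.

```python
# Hypothetical in-memory stand-in for the Web: each HTTP URI maps to the
# triples a server might return when that URI is looked up.
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

WEB = {
    "http://example.org/a/Dayton": [
        ("http://example.org/a/Dayton", LABEL, "Dayton"),
        ("http://example.org/a/Dayton", SAME_AS, "http://example.org/b/city/42"),
    ],
    "http://example.org/b/city/42": [
        ("http://example.org/b/city/42", LABEL, "Dayton, OH"),
    ],
}


def look_up(uri):
    """Principles 2 and 3: looking up a URI yields useful information (triples)."""
    return WEB.get(uri, [])


def discover(start):
    """Principle 4: follow links to other URIs and collect every reachable triple."""
    seen, queue, triples = set(), [start], []
    while queue:
        uri = queue.pop(0)
        if uri in seen:
            continue
        seen.add(uri)
        for s, p, o in look_up(uri):
            triples.append((s, p, o))
            # Objects that are themselves HTTP URIs can be looked up in turn.
            if o.startswith("http://"):
                queue.append(o)
    return triples


for triple in discover("http://example.org/a/Dayton"):
    print(triple)
```

Starting from one URI, the sameAs link leads to a second dataset, which is the only kind of cross-dataset link this toy example contains.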

Page 4: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

Linked Open Data 2011

Page 5: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

Linked Open Data

Number of Datasets

Date         Datasets
2011-09-19   295
2010-09-22   203
2009-07-14   95
2008-09-18   45
2007-10-08   25
2007-05-01   12

Number of triples (Sept 2011)

31,634,213,770

with 503,998,829 out-links

From http://www4.wiwiss.fu-berlin.de/lodcloud/state/

Page 6: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

Page 7: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

Page 8: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

May 2012 –IBM TJ Watson Center– Prateek Jain

Mainstream Semantic Web?

Page 9: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

Is it really mainstream Semantic Web?

• What is the relationship between the models whose instances are being linked?

• How can LOD be queried without knowing the individual datasets?

• How can schema-level reasoning be performed over the LOD cloud?

• A fundamental and conceptually important relationship, “PART OF”, is almost entirely absent from LOD

Page 10: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

PLATO Approach

Page 11: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

Our Approach

Use knowledge contributed by users

• Detection of relationships within and across datasets

LOD Cloud

Page 12: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

PLATO Approach

• PLATO generates all possible partonomically linked pairs between the entities in the dataset.
  – Utilize “strongly” associated entities

• Identify the type of each entity in the pair using WordNet.
  – Use class names
  – Gives the lexicographer files for the synsets corresponding to these entities

• Use this information to determine the applicable OWL partonomy properties.
  – Using Winston’s taxonomy
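The first step can be sketched roughly as follows. This is an illustration under stated assumptions, not the PLATO implementation: the `LEXFILE` lookup stands in for a real WordNet query and its values were not checked against WordNet, and the `APPLICABLE` mapping from lexicographer-file pairs to relation kinds in Winston's taxonomy is hypothetical.

```python
from itertools import permutations

# Illustrative class-name -> WordNet lexicographer file lookup.
# The values are assumptions for this sketch, not checked against WordNet.
LEXFILE = {
    "Cell Wall": "noun.object",
    "Cellulose": "noun.substance",
    "Dayton": "noun.location",
    "Ohio": "noun.location",
}

# Hypothetical mapping from (part type, whole type) to the relation kinds
# from Winston's taxonomy that could apply to such a pair.
APPLICABLE = {
    ("noun.substance", "noun.object"): ["stuff-object"],
    ("noun.location", "noun.location"): ["place-area"],
    ("noun.person", "noun.group"): ["member-collection"],
}


def candidate_pairs(associated):
    """All ordered (part, whole) candidates among strongly associated entities."""
    return list(permutations(associated, 2))


def applicable_relations(part, whole):
    """Relation kinds worth testing for this ordered pair (may be empty)."""
    key = (LEXFILE.get(part), LEXFILE.get(whole))
    return APPLICABLE.get(key, [])


for part, whole in candidate_pairs(["Cell Wall", "Cellulose"]):
    print(part, "->", whole, applicable_relations(part, whole))
```

Both orderings of each pair are kept at this stage, so that later evidence can decide which entity is the part and which is the whole.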

Page 13: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

Winston’s Taxonomy

Page 14: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

PLATO Approach – Step 2

• PLATO generates linguistic patterns for each applicable property based on linguistic cues suggested by Winston.
  – Cell Wall is made of Cellulose
  – Cellulose is made of Cell Wall
  – Cell Wall is partly Cellulose

• Tests the lexical patterns for each entity pair in a corpus-driven manner.
  – Using the Web as a corpus

• PLATO counts the total number of web pages that contain the pattern.
  – Parse the page and identify the occurrence of the pattern.
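Pattern generation and counting might look like the sketch below. It assumes a short list of page texts in place of the Web. The `TEMPLATES` list, the `PAGES` sample, and both helper functions are hypothetical; only the "is made of" and "is partly" cues come from the slide.

```python
# Hypothetical templates built from the cues shown on the slide.
TEMPLATES = ["{x} is made of {y}", "{x} is partly {y}"]

# Assumed stand-in for the Web: a handful of already-fetched page texts.
PAGES = [
    "In plants the cell wall is made of cellulose and other polymers.",
    "A plant cell wall is partly cellulose.",
    "The cell wall is made of cellulose fibres.",
    "Unrelated page about linked data.",
]


def generate_patterns(a, b, templates=TEMPLATES):
    """Instantiate every template in both directions for the entity pair."""
    patterns = []
    for x, y in ((a, b), (b, a)):
        for template in templates:
            patterns.append(template.format(x=x, y=y))
    return patterns


def count_pages(pattern, pages):
    """Number of pages whose text contains the pattern, ignoring case."""
    needle = pattern.lower()
    return sum(1 for text in pages if needle in text.lower())


counts = {p: count_pages(p, PAGES) for p in generate_patterns("Cell Wall", "Cellulose")}
for pattern, n in counts.items():
    print(n, pattern)
```

A plain substring match is the simplest possible test; a real corpus would likely need tokenization and tolerance for inflected or reordered phrasing.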

Page 15: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

PLATO Approach – Step 3

• Asserts the partonomy property with the strongest supporting evidence
  – Cell Wall is made of Cellulose, 48
  – Cellulose is made of Cell Wall, 10

• PLATO also enriches the schema by generalizing from the instance-level assertions.
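The selection step can be sketched with the two counts quoted above. The slide does not say how instance-level assertions are generalized to the schema, so the `generalize` helper and its `min_support` threshold are assumptions made for illustration, as are the class names in the usage example.

```python
from collections import Counter


def strongest(evidence):
    """Return the (pattern, page count) pair with the highest count."""
    return max(evidence.items(), key=lambda item: item[1])


def generalize(instance_assertions, min_support=2):
    """Assumed rule: keep a class-level (part class, relation, whole class)
    triple when at least min_support instance-level assertions back it."""
    counts = Counter(instance_assertions)
    return [triple for triple, n in counts.items() if n >= min_support]


# Page counts quoted on the slide.
evidence = {
    "Cell Wall is made of Cellulose": 48,
    "Cellulose is made of Cell Wall": 10,
}
print(strongest(evidence))

# Hypothetical instance-level results, already lifted to their classes.
lifted = [
    ("City", "place-area", "State"),
    ("City", "place-area", "State"),
    ("City", "place-area", "Country"),
]
print(generalize(lifted))
```

With these inputs, the pattern backed by 48 pages wins over the one backed by 10, and only the City/State triple reaches the assumed support threshold.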

Page 16: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

PLATO Evaluation

Page 17: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

Outreach

• Prateek Jain, Pascal Hitzler, Kunal Verma, Peter Z. Yeh and Amit P. Sheth, “Moving beyond sameAs with PLATO: Partonomy detection for Linked Data”. In Proceedings of the 23rd ACM Hypertext and Social Media conference (HT 2012), Milwaukee, WI, USA, June 25th-28th, 2012 (To Appear)

• Tool available for download at

http://wiki.knoesis.org/index.php/PLATO

Page 18: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

End Product

Page 19: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

Conclusions and Future Work

Page 20: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

Conclusions

• PLATO is an approach for partonomic relationship detection

• The approach works for both instance-level and schema-level relationships

• Evaluation performed within and across large, prominent LOD datasets

• Results validate the use of knowledge on the Web to solve tough problems

Page 21: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

Future Work

• Use incomplete knowledge for part-of relationship identification
  – Machine learning based techniques

• Release the schema mappings into the public domain

• Develop a better querying system for LOD using PLATO and BLOOMS
  – Work in progress with ALOQUS (submitted to ODBASE 2012)

• Identify and incorporate user preferences

Page 22: Moving beyond sameAs with PLATO: Partonomy detection for Linked Data

Questions?

Prateek Jain

Kno.e.sis Center

Wright State University, Dayton, OH

http://wiki.knoesis.org/index.php/Prateek

