Natural Language Processing with Graph Databases and Neo4j
![Page 1: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/1.jpg)
Natural Language Processing With Graph Databases
DataDay Texas
January 2016
William Lyon
@lyonwj
![Page 3: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/3.jpg)
Agenda
• Brief intro to graph databases / Neo4j
• Representing text as a graph
• NLP tasks
• Mining word associations
• Graph based summarization and keyword extraction
• Content recommendation
![Page 4: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/4.jpg)
Agenda
• Brief intro to graph databases / Neo4j
• Representing text as a graph
• NLP tasks
• Mining word associations
• Graph based summarization and keyword extraction
• Content recommendation
Survey of NLP methods with graphs
![Page 5: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/5.jpg)
Intro to Graph Databases / Neo4j
![Page 6: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/6.jpg)
Charts
![Page 7: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/7.jpg)
Charts Graphs
![Page 8: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/8.jpg)
Neo4j
Graph Database
• Property graph data model
• Nodes and relationships
• Native graph processing
• Cypher query language
![Page 9: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/9.jpg)
The Whiteboard Model Is the Physical Model
![Page 10: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/10.jpg)
Relational Versus Graph Models
[Figure: the same friendship data in two models. Relational model: Person, Person-Friend and Friend tables. Graph model: ANDREAS, TOBIAS, MICA and DELIA nodes connected by KNOWS relationships.]
![Page 11: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/11.jpg)
Property Graph Model Components
Nodes
• The objects in the graph
• Can have name-value properties
• Can be labeled
Relationships
• Relate nodes by type and direction
• Can have name-value properties
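[Figure: example property graph. Two PERSON nodes (name: “Dan”, born: May 29, 1970, twitter: “@dan”; name: “Ann”, born: Dec 5, 1975) and a CAR node (brand: “Volvo”, model: “V70”), connected by LOVES, LOVES, LIVES WITH, DRIVES and OWNS relationships; one relationship carries the property since: Jan 10, 2011.]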
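The components listed on this slide can be sketched as plain Python data structures. This is only an illustration of the property graph model, not how Neo4j stores data; the class names `Node` and `Relationship`, the helper `outgoing`, and the choice of which relationship carries `since` are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                                       # e.g. {"Person"}
    properties: dict = field(default_factory=dict)   # name-value pairs


@dataclass
class Relationship:
    start: Node                                       # relationships are directed
    type: str                                         # and typed, e.g. "LOVES"
    end: Node
    properties: dict = field(default_factory=dict)   # name-value pairs


dan = Node({"Person"}, {"name": "Dan", "born": "May 29, 1970", "twitter": "@dan"})
ann = Node({"Person"}, {"name": "Ann", "born": "Dec 5, 1975"})
car = Node({"Car"}, {"brand": "Volvo", "model": "V70"})

relationships = [
    Relationship(dan, "LOVES", ann),
    # Assumption: the slide text does not say which relationship holds `since`.
    Relationship(dan, "DRIVES", car, {"since": "Jan 10, 2011"}),
]


def outgoing(node, rel_type):
    """Return the nodes reached from `node` over relationships of `rel_type`."""
    return [r.end for r in relationships if r.start is node and r.type == rel_type]


print([n.properties["name"] for n in outgoing(dan, "LOVES")])
```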
![Page 12: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/12.jpg)
Cypher: Graph Query Language
CREATE (:Person {name: "Dan"})-[:LOVES]->(:Person {name: "Ann"})
[Diagram: the query annotated. Each NODE has a LABEL (Person) and a PROPERTY (name); a LOVES relationship points from Dan to Ann.]
![Page 13: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/13.jpg)
“So what does this have to do with NLP?”
“Am I in the wrong talk?”
“I thought this was going to be about text processing….”
![Page 14: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/14.jpg)
Natural Language Processing With Graphs
![Page 15: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/15.jpg)
Natural Language Processing With Graphs
Uncovering meaning from text using a graph data model.
![Page 16: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/16.jpg)
Representing Text As A Graph
“Nearly all text processing starts by transforming text into vectors.”
- Matt Biddulph www.hackdiary.com
![Page 17: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/17.jpg)
Representing text as a graph
Text Adjacency Graph
![Page 18: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/18.jpg)
![Page 19: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/19.jpg)
My cat eats fish on Saturday.
![Page 20: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/20.jpg)
![Page 21: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/21.jpg)
![Page 22: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/22.jpg)
Convert to array of words
![Page 23: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/23.jpg)
Iterate with counter variable i, from 0 to number of words - 2
![Page 24: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/24.jpg)
Get or create nodes for the words at index i and i+1
![Page 25: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/25.jpg)
Create :NEXT relationship
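The steps on the last few slides can be sketched in plain Python, using a dictionary in place of a graph database. This is a minimal in-memory illustration, not the code from the talk; the function names `tokenize` and `build_adjacency_graph` and the tokenization rule (lowercase, strip punctuation) are assumptions.

```python
import re


def tokenize(text):
    """Lowercase the text and return its words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_adjacency_graph(text):
    """Map each word to the set of words that directly follow it (NEXT)."""
    words = tokenize(text)                 # convert to array of words
    graph = {}
    for i in range(len(words) - 1):        # i from 0 to number of words - 2
        current, following = words[i], words[i + 1]
        graph.setdefault(current, set())   # get or create node for word i
        graph.setdefault(following, set()) # get or create node for word i+1
        graph[current].add(following)      # create the NEXT relationship
    return graph


graph = build_adjacency_graph("My cat eats fish on Saturday.")
for word, next_words in graph.items():
    print(word, "-[:NEXT]->", sorted(next_words))
```

In a Neo4j-backed version, the two "get or create" steps would typically map to Cypher `MERGE` on the word nodes.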
![Page 26: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/26.jpg)
Representing A Text Corpus As A Graph
![Page 27: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/27.jpg)
![Page 28: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/28.jpg)
Add followship frequency
![Page 29: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/29.jpg)
Add word counts
![Page 30: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/30.jpg)
Query Word frequency
![Page 31: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/31.jpg)
Query Word pair frequencies (collocation)
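A rough sketch of a corpus-level graph with word counts and followship frequencies, kept in memory with `collections.Counter` rather than in Neo4j. The three-sentence corpus and the function names are made up for illustration; only the first sentence comes from the slides.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_corpus_graph(sentences):
    """Return (word counts, counts of each directed word pair)."""
    word_counts = Counter()   # node property: how often each word occurs
    next_counts = Counter()   # NEXT relationship property: followship frequency
    for sentence in sentences:
        words = tokenize(sentence)
        word_counts.update(words)
        next_counts.update(zip(words, words[1:]))
    return word_counts, next_counts


# Hypothetical corpus for illustration.
corpus = [
    "My cat eats fish on Saturday.",
    "My dog eats turkey on Tuesday.",
    "My cat sleeps on Saturday.",
]
word_counts, next_counts = build_corpus_graph(corpus)

print(word_counts.most_common(3))   # query word frequency
print(next_counts.most_common(3))   # query word pair frequencies
```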
![Page 32: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/32.jpg)
NLP Tasks
![Page 33: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/33.jpg)
Mining Word Associations
![Page 34: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/34.jpg)
Word Associations
• Paradigmatic
  • words that can be substituted
  • “Monday” <-> “Thursday”
  • “cat” <-> “dog”
• Syntagmatic
  • words that can be combined with each other
  • “cold”, “weather”
  • collocations
![Page 35: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/35.jpg)
Computing Paradigmatic Similarity
1. Represent each word by its context
2. Compute context similarity
3. Words with high context similarity likely have a paradigmatic relation
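The three steps can be sketched as follows. This assumes the context of a word is the set of words immediately to its left and to its right, and it uses Jaccard similarity as the context similarity measure; the slide text does not name a measure, so treat that choice, the toy corpus, and the function names as assumptions.

```python
import re
from collections import defaultdict
from itertools import combinations


def tokenize(text):
    """Lowercase the text and return its words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_contexts(sentences):
    """Step 1: collect the left and right neighbours of every word."""
    left, right = defaultdict(set), defaultdict(set)
    for sentence in sentences:
        words = tokenize(sentence)
        for prev, nxt in zip(words, words[1:]):
            right[prev].add(nxt)
            left[nxt].add(prev)
    return left, right


def jaccard(a, b):
    """Size of the intersection over size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def paradigmatic_similarity(w1, w2, left, right):
    """Step 2: sum of left-context and right-context Jaccard similarity."""
    return (jaccard(left.get(w1, set()), left.get(w2, set()))
            + jaccard(right.get(w1, set()), right.get(w2, set())))


# Hypothetical corpus for illustration.
corpus = ["My cat eats fish on Saturday.", "My dog eats fish on Sunday."]
left, right = build_contexts(corpus)

# Step 3: rank word pairs by context similarity.
vocab = sorted(set(left) | set(right))
ranked = sorted(
    ((a, b, paradigmatic_similarity(a, b, left, right))
     for a, b in combinations(vocab, 2)),
    key=lambda item: item[2],
    reverse=True,
)
print(ranked[:3])
```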
![Page 36: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/36.jpg)
Paradigmatic Similarity
1. Represent each word by its context
![Page 37: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/37.jpg)
Paradigmatic Similarity
1. Represent each word by its context
![Page 38: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/38.jpg)
Paradigmatic Similarity
1. Represent each word by its context
Left1 Right1
![Page 39: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/39.jpg)
Paradigmatic Similarity
2. Compute context similarity
![Page 40: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/40.jpg)
Paradigmatic Similarity
2. Compute context similarity
![Page 41: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/41.jpg)
Paradigmatic Similarity
2. Compute context similarity
www.lyonwj.com/2015/06/16/nlp-with-neo4j/
![Page 42: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/42.jpg)
Paradigmatic Similarity
3. Find words with high context similarity
http://earthlab.uoi.gr/theste/index.php/theste/article/viewFile/55/37
CEEAUS corpus
![Page 43: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/43.jpg)
Paradigmatic Similarity
Example
http://www.lyonwj.com/2015/06/16/nlp-with-neo4j/
https://github.com/johnymontana/nlp-graph-notebooks
https://class.coursera.org/textanalytics-001
![Page 44: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/44.jpg)
Graph Based Summarization and Keyword Extraction
![Page 45: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/45.jpg)
![Page 46: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/46.jpg)
image credit: https://en.wikipedia.org/wiki/PageRank
https://web.eecs.umich.edu/~mihalcea/papers/mihalcea.emnlp04.pdf
https://github.com/summanlp/textrank
Keyword Extraction
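TextRank, linked above, ranks words by running PageRank over a word co-occurrence graph. The sketch below is a simplified take on that idea, not the summanlp implementation: the stopword list, the window size, the normalized PageRank variant, and all function names are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "on", "is"}


def tokenize(text):
    """Lowercase the text and return its words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cooccurrence_graph(words, window=2):
    """Undirected graph linking words that appear within `window` words."""
    graph = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1:i + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over an undirected graph."""
    nodes = list(graph)
    score = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        updated = {}
        for node in nodes:
            # Each neighbour shares its score equally among its own neighbours.
            incoming = sum(score[m] / len(graph[m]) for m in graph[node])
            updated[node] = (1 - damping) / len(nodes) + damping * incoming
        score = updated
    return score


def keywords(text, top_n=5):
    """Return the `top_n` highest-ranked words in the text."""
    words = [w for w in tokenize(text) if w not in STOPWORDS]
    scores = pagerank(cooccurrence_graph(words))
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


text = ("A graph database stores graph data, "
        "and graph queries traverse the graph relationships.")
print(keywords(text, top_n=3))
```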
![Page 47: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/47.jpg)
Summarization
Opinion mining
![Page 48: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/48.jpg)
• Opinion mining
• Summarize major opinions
• Concise and readable
• Major complaints / compliments
![Page 49: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/49.jpg)
http://kavita-ganesan.com/opinosis
1. Graph based representation of review corpus
2. Find and score candidate summaries
3. Select top scoring candidates as summary
![Page 50: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/50.jpg)
Opinion Mining - Example
• Best Buy API
• Product reviews by SKU
![Page 51: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/51.jpg)
Opinion Mining - Example
![Page 52: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/52.jpg)
Opinion Mining - Example
![Page 53: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/53.jpg)
Opinion Mining - Example
1. Graph based representation of review corpus
2. Find and score candidate summaries
3. Select top scoring candidates as summary
![Page 54: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/54.jpg)
Opinion Mining - Example
Find highest ranked paths of 2-5 words
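A heavily simplified sketch of this step: build a word graph from reviews, enumerate paths of 2 to 5 words, and rank them. Scoring a path by the sum of its edge counts is an assumption made here for brevity; Opinosis itself uses a more involved scoring scheme plus rules for valid start and end words. The reviews and function names are hypothetical.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and return its words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_word_graph(reviews):
    """Count how often each word is directly followed by another word."""
    edges = Counter()
    for review in reviews:
        words = tokenize(review)
        edges.update(zip(words, words[1:]))
    return edges


def ranked_paths(edges, min_len=2, max_len=5):
    """Return (score, phrase) for every simple path, best first."""
    successors = defaultdict(list)
    for (a, b), count in edges.items():
        successors[a].append((b, count))

    results = []

    def extend(path, score):
        if len(path) >= min_len:
            results.append((score, " ".join(path)))
        if len(path) == max_len:
            return
        for nxt, count in successors.get(path[-1], []):
            if nxt not in path:                    # keep paths simple (no repeats)
                extend(path + [nxt], score + count)

    for start in list(successors):
        extend([start], 0)
    return sorted(results, reverse=True)


# Hypothetical reviews for illustration.
reviews = [
    "Easy to read in sunlight.",
    "Really easy to read in sunlight!",
    "The screen is easy to read.",
]
for score, phrase in ranked_paths(build_word_graph(reviews))[:3]:
    print(score, phrase)
```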
![Page 55: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/55.jpg)
Opinion Mining - Demo
“Easy to read in sunlight”
“Comfortable great sound quality”
“I love this washer”
![Page 56: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/56.jpg)
Opinion Mining - Demo
“Bought this smart TV for the price”
“Easy to use this vacuum”
![Page 57: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/57.jpg)
Opinion Mining - Demo
• iPython notebook
https://github.com/johnymontana/nlp-graph-notebooks
![Page 58: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/58.jpg)
Content Recommendation
![Page 59: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/59.jpg)
Content recommendation
“Networks give structure to the conversation while content mining gives meaning.”
http://breakthroughanalysis.com/2015/10/08/ltapreriitsouda/
- Preriit Souda
![Page 60: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/60.jpg)
Using Data Relationships for Recommendations
Content-based filtering: recommend items based on what users have liked in the past.
Collaborative filtering: predict what users like based on the similarity of their behaviors, activities and preferences to others.
[Figure: two Person nodes and a Movie node, with a RATED relationship (rating: 7) and a SIMILARITY relationship (value: .92).]
![Page 61: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/61.jpg)
Using Data Relationships for Recommendations
Content-based filtering: recommend items based on what users have liked in the past.
[Figure: two Person nodes and a Movie node, with a RATED relationship (rating: 7) and a SIMILARITY relationship (value: .92).]
![Page 62: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/62.jpg)
The article graph - data model
![Page 63: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/63.jpg)
Building the article graph
• Articles users have shared
• Extract keywords using newspaper3k Python library
• Insert in the graph
• Scrape additional articles
https://github.com/johnymontana/nlp-graph-notebooks
![Page 64: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/64.jpg)
The article graph - example
![Page 65: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/65.jpg)
What are the keywords of the articles I liked?
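One way to picture this question, and the content-based recommendation that follows from it, is with a small in-memory stand-in for the article graph. The articles, keywords, user name, and function names below are all invented for illustration; the real data model and queries live in the linked notebooks.

```python
from collections import Counter

# Hypothetical article -> keyword edges.
article_keywords = {
    "Intro to Neo4j": {"graph", "database", "cypher"},
    "TextRank explained": {"graph", "nlp", "keywords"},
    "Baking sourdough": {"bread", "flour"},
    "Cypher tips": {"cypher", "database", "query"},
}

# Hypothetical user -> liked article edges.
liked = {"will": {"Intro to Neo4j"}}


def liked_keywords(user):
    """What are the keywords of the articles this user liked?"""
    counts = Counter()
    for article in liked.get(user, ()):
        counts.update(article_keywords[article])
    return counts


def recommend(user, top_n=3):
    """Rank unseen articles by keyword overlap with the user's liked articles."""
    profile = liked_keywords(user)
    scores = {}
    for article, kws in article_keywords.items():
        if article in liked.get(user, ()):
            continue                       # skip articles already liked
        score = sum(profile[k] for k in kws)
        if score > 0:
            scores[article] = score
    return sorted(scores, key=lambda a: (-scores[a], a))[:top_n]


print(sorted(liked_keywords("will")))
print(recommend("will"))
```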
![Page 66: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/66.jpg)
![Page 67: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/67.jpg)
Summary
• Property graph model
• Represent text as a graph
• Word associations
• Opinion mining
• Content recommendation
![Page 68: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/68.jpg)
Resources
![Page 70: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/70.jpg)
Resources
• http://kavita-ganesan.com/opinosis
• http://jexp.de/blog/2015/01/natural-language-analytics-made-simple-and-visual-with-neo4j/
• https://github.com/johnymontana/nlp-graph-notebooks
![Page 71: Natural Language Processing with Graph Databases and Neo4j](https://reader035.vdocuments.mx/reader035/viewer/2022081722/58f9a907760da3da068b6a27/html5/thumbnails/71.jpg)
Opinion Mining
• “Opinosis: A Graph Based Approach to Abstractive Summarization of Highly Redundant Opinions”. Kavita Ganesan, ChengXiang Zhai, Jiawei Han. University of Illinois at Urbana-Champaign.
• “Multi-sentence compression: Finding shortest paths in word graphs”. Katja Filippova. Proceedings of the 23rd International Conference on Computational Linguistics (COLING 2010), Beijing, China, Aug 23-27, 2010.