big data & cloud | an introduction to neo4j (and doctor who) | ian robinson
DESCRIPTION
2011-11-01 | 10:40 AM - 11:40 AM
Doctor Who is the world’s longest running science-fiction TV series. Battling Daleks, Cybermen and Sontarans, and always accompanied by his trusted human companions, the last Time Lord has saved Earth from destruction more times than you’ve cursed Maven. Neo4j is the world’s leading open source graph database. Designed to interrogate densely connected data with lightning speed, it lets you traverse millions of nodes in a fraction of the time it takes to run a multi-join SQL query. When these two meet, the result is an entertaining introduction to the complex history of a complex hero, and a rapid survey of the elegant APIs of a delightfully simple graph database. Armed only with a data store packed full of geeky Doctor Who facts, by the end of this session we’ll have you answering questions about the Doctor Who universe like a die-hard fan.
TRANSCRIPT
An Introduction to Doctor Who (and Neo4j)
@iansrobinson
#neo4j
Neo4j is a Graph Database
Property Graph
Neo4j
32 billion nodes, 32 billion relationships, 64 billion properties
Community
Advanced
Enterprise
[Diagram: a Doctor Who property graph. Relationships shown: stole from, loves, enemy, appeared in, companion. Episodes shown: A Good Man Goes to War, Victory of the Daleks.]
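import org.neo4j.graphdb.DynamicRelationshipType;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Transaction;
import org.neo4j.kernel.EmbeddedGraphDatabase;

GraphDatabaseService db = new EmbeddedGraphDatabase("/data/drwho");
// All writes happen inside a transaction.
Transaction tx = db.beginTx();
try {
    // Nodes, each with a "name" property
    Node theDoctor = db.createNode();
    theDoctor.setProperty("name", "The Doctor");
    Node daleks = db.createNode();
    daleks.setProperty("name", "Daleks");
    Node cybermen = db.createNode();
    cybermen.setProperty("name", "Cybermen");
    // Typed, directed relationships from The Doctor to each enemy
    theDoctor.createRelationshipTo(daleks, DynamicRelationshipType.withName("enemy"));
    theDoctor.createRelationshipTo(cybermen, DynamicRelationshipType.withName("enemy"));
    tx.success(); // mark the transaction as successful
} finally {
    tx.finish(); // commits if success() was called, otherwise rolls back
}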
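The code above builds three nodes and two enemy relationships. To make the property-graph model behind it concrete without a Neo4j dependency, here is a minimal plain-Java sketch of the same data. The names PropertyGraphSketch, SketchNode, SketchRel, relateTo and namesOf are hypothetical and not part of the Neo4j API; the sketch illustrates the model only, not how Neo4j stores or indexes it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not the Neo4j API.
class PropertyGraphSketch {

    // A node is a bag of key/value properties plus its outgoing relationships.
    static class SketchNode {
        final Map<String, Object> properties = new LinkedHashMap<>();
        final List<SketchRel> outgoing = new ArrayList<>();

        SketchNode(String name) {
            properties.put("name", name);
        }

        // Creates a typed, directed relationship from this node to 'end'.
        SketchRel relateTo(SketchNode end, String type) {
            SketchRel rel = new SketchRel(this, end, type);
            outgoing.add(rel);
            return rel;
        }
    }

    // A relationship has a type and a direction, and may carry properties too.
    static class SketchRel {
        final SketchNode start;
        final SketchNode end;
        final String type;
        final Map<String, Object> properties = new LinkedHashMap<>();

        SketchRel(SketchNode start, SketchNode end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    // Names of the nodes reached from 'start' over outgoing relationships of 'type'.
    static List<String> namesOf(SketchNode start, String type) {
        List<String> names = new ArrayList<>();
        for (SketchRel rel : start.outgoing) {
            if (rel.type.equals(type)) {
                names.add((String) rel.end.properties.get("name"));
            }
        }
        return names;
    }

    // Mirrors the slide: The Doctor with two "enemy" relationships.
    static SketchNode buildDoctor() {
        SketchNode theDoctor = new SketchNode("The Doctor");
        theDoctor.relateTo(new SketchNode("Daleks"), "enemy");
        theDoctor.relateTo(new SketchNode("Cybermen"), "enemy");
        return theDoctor;
    }

    public static void main(String[] args) {
        System.out.println(namesOf(buildDoctor(), "enemy")); // prints [Daleks, Cybermen]
    }
}
```

The point of the sketch is that relationships are first-class values with their own type and direction, which is what makes following them cheap compared with joining tables.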
How do I query the data?
http://opfm.jpl.nasa.gov/
http://news.xinhuanet.com
Dalek Props http://www.dalek6388.co.uk/
[Diagram: the Dalek props graph. Node properties shown: species:Dalek; title:Power of the Daleks, title:The Daleks, title:The Dalek Invasion of Earth; props:Daleks; name:Dalek 1, Dalek 2, Dalek 5, Dalek 6, Dalek 7, Dalek One-5, Dalek Two-1, Dalek Six-5, Dalek Six-7; type:shoulders, type:skirt. Relationships shown: APPEARED_IN, USED_IN, MEMBER_OF, ORIGINAL_PROP.]
Supply Chain Traceability
Traversal Framework
• Visits (and returns) nodes based on traversal description
• Powerful; can customize:
  – Relationships followed
  – Branch selection policy
  – Node/relationship uniqueness constraints
  – Path evaluators
// Index Lookup: find the starting node in the "species" index
Node theDaleks = database.index().forNodes("species").get("name", "Dalek").getSingle();
// Description: declare how the graph should be walked
// (Rels is the talk's own set of relationship types; its definition is not shown on the slides)
Traverser traverser = Traversal.description()
    // Branching: depth-first branch selection
    .depthFirst()
    // Relationships: the types and directions to follow
    .relationships(Rels.APPEARED_IN, Direction.OUTGOING)
    .relationships(Rels.USED_IN, Direction.INCOMING)
    .relationships(Rels.MEMBER_OF, Direction.INCOMING)
    .relationships(Rels.COMPOSED_OF, Direction.OUTGOING)
    .relationships(Rels.ORIGINAL_PROP, Direction.OUTGOING)
    // Evaluator: return only paths whose last relationship is ORIGINAL_PROP
    .evaluator(new Evaluator() {
        @Override
        public Evaluation evaluate(Path path) {
            if (path.lastRelationship() != null
                    && path.lastRelationship().isType(Rels.ORIGINAL_PROP)) {
                return Evaluation.INCLUDE_AND_PRUNE;
            }
            return Evaluation.EXCLUDE_AND_CONTINUE;
        }
    })
    // Uniqueness: no uniqueness constraint, so a node may appear more than once
    .uniqueness(Uniqueness.NONE)
    // Traverse: start walking from the Daleks species node
    .traverse(theDaleks);
Iterable<Path> path.startNode() path.endNode()
Relationships: APPEARED_IN, USED_IN, COMPOSED_OF, MEMBER_OF, ORIGINAL_PROP
species:Dalek | title:Power of the Daleks | props:Daleks | name:Dalek 7 | type:skirt | name:Dalek 7
species:Daleks | title:Power of the Daleks | props:Daleks | name:Dalek 7 | type:shoulder | name:Dalek 7
species:Daleks | title:Power of the Daleks | props:Daleks | name:Dalek Six-5 | type:skirt | name:Dalek 5
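The evaluator logic in the traversal above (keep walking until the last relationship is ORIGINAL_PROP, then include the path and stop) can be sketched in plain Java without Neo4j. Everything here is an assumption made for illustration: the class TraversalSketch, its helper types, and the toy graph, which is loosely modelled on two of the result paths shown on the slide and is not the real data set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: a toy depth-first traversal, not the Neo4j API.
class TraversalSketch {

    enum Dir { OUTGOING, INCOMING }

    enum Decision { INCLUDE_AND_PRUNE, EXCLUDE_AND_CONTINUE }

    // A directed, typed relationship between two node ids.
    static class Rel {
        final String start, type, end;
        Rel(String start, String type, String end) {
            this.start = start; this.type = type; this.end = end;
        }
    }

    // One relationship type and the direction in which it may be followed.
    static class Expansion {
        final String type; final Dir dir;
        Expansion(String type, Dir dir) { this.type = type; this.dir = dir; }
    }

    static final List<Expansion> EXPANSIONS = Arrays.asList(
        new Expansion("APPEARED_IN", Dir.OUTGOING),
        new Expansion("USED_IN", Dir.INCOMING),
        new Expansion("MEMBER_OF", Dir.INCOMING),
        new Expansion("COMPOSED_OF", Dir.OUTGOING),
        new Expansion("ORIGINAL_PROP", Dir.OUTGOING));

    // Same decision as the slide's evaluator, based on the last relationship type.
    static Decision evaluate(String lastType) {
        return "ORIGINAL_PROP".equals(lastType)
            ? Decision.INCLUDE_AND_PRUNE : Decision.EXCLUDE_AND_CONTINUE;
    }

    // Node reached by following 'rel' from 'current', or null if not allowed.
    static String follow(Rel rel, String current) {
        for (Expansion e : EXPANSIONS) {
            if (!e.type.equals(rel.type)) continue;
            if (e.dir == Dir.OUTGOING && rel.start.equals(current)) return rel.end;
            if (e.dir == Dir.INCOMING && rel.end.equals(current)) return rel.start;
        }
        return null;
    }

    // Depth-first walk with no uniqueness check: a node may recur within a path.
    static void traverse(List<Rel> graph, List<String> path, String lastType,
                         List<List<String>> results) {
        if (evaluate(lastType) == Decision.INCLUDE_AND_PRUNE) {
            results.add(new ArrayList<>(path));
            return; // prune: do not expand beyond the original prop
        }
        String current = path.get(path.size() - 1);
        for (Rel rel : graph) {
            String next = follow(rel, current);
            if (next == null) continue;
            path.add(next);
            traverse(graph, path, rel.type, results);
            path.remove(path.size() - 1);
        }
    }

    // Toy data (assumed): two props, each with a skirt traced to an original prop.
    static List<List<String>> originalPropPaths() {
        List<Rel> graph = Arrays.asList(
            new Rel("species:Dalek", "APPEARED_IN", "episode:Power of the Daleks"),
            new Rel("props:Daleks", "USED_IN", "episode:Power of the Daleks"),
            new Rel("prop:Dalek 7", "MEMBER_OF", "props:Daleks"),
            new Rel("prop:Dalek Six-5", "MEMBER_OF", "props:Daleks"),
            new Rel("prop:Dalek 7", "COMPOSED_OF", "part:skirt A"),
            new Rel("prop:Dalek Six-5", "COMPOSED_OF", "part:skirt B"),
            new Rel("part:skirt A", "ORIGINAL_PROP", "prop:Dalek 7"),
            new Rel("part:skirt B", "ORIGINAL_PROP", "prop:Dalek 5"));
        List<List<String>> results = new ArrayList<>();
        List<String> path = new ArrayList<>(Arrays.asList("species:Dalek"));
        traverse(graph, path, null, results);
        return results;
    }

    public static void main(String[] args) {
        for (List<String> p : originalPropPaths()) {
            System.out.println(String.join(" | ", p));
        }
    }
}
```

In this toy graph the first path visits prop:Dalek 7 twice, once as the prop and once as the original prop of its own skirt. A traversal that insisted on visiting each node only once would drop that path, so the sketch, like the slide's traversal, applies no uniqueness constraint.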
Cypher
• Declarative graph pattern matching language
  – Humane regular expressions for graphs
• New in 1.4
  – Continuing, rapid feature development
• Supports queries
  – Including aggregation, ordering and limits
  – Mutating operations coming in 1.6
Cypher Query
start daleks=node:species(species='Dalek')
match daleks-[:APPEARED_IN]->episode<-[:USED_IN]-()<-[:MEMBER_OF]-()-[:COMPOSED_OF]->part-[:ORIGINAL_PROP]->originalprop
return originalprop.name, part.type, count(episode.title)
order by count(episode.title) desc
limit 1
Index Lookup
start daleks=node:species(species='Dalek')
Match Nodes & Relationships
match daleks-[:APPEARED_IN]->episode<-[:USED_IN]-()<-[:MEMBER_OF]-()-[:COMPOSED_OF]->part-[:ORIGINAL_PROP]->originalprop
[Diagram: the match pattern drawn as a graph, with nodes daleks, episode, part and originalprop joined by APPEARED_IN, USED_IN, MEMBER_OF, COMPOSED_OF and ORIGINAL_PROP.]
Return Values
return originalprop.name, part.type, count(episode.title) order by count(episode.title) desc limit 1
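The return, order by and limit clauses of the query above amount to: group the matched paths by original prop and part type, count the episodes in each group, and keep the group with the highest count. A minimal plain-Java sketch of that aggregation follows. The class AggregationSketch, its Row type and the sample rows are all invented for illustration; they are not the talk's data set or Cypher's implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: what the query's aggregation computes.
class AggregationSketch {

    // One matched path, reduced to the values the query returns or counts.
    static class Row {
        final String originalProp, partType, episodeTitle;
        Row(String originalProp, String partType, String episodeTitle) {
            this.originalProp = originalProp;
            this.partType = partType;
            this.episodeTitle = episodeTitle;
        }
    }

    // Groups by (originalProp, partType), counts rows, returns the top group.
    static String hardestWorking(List<Row> rows) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Row row : rows) {
            String key = row.originalProp + " / " + row.partType;
            counts.merge(key, 1, Integer::sum); // count(episode.title) per group
        }
        String best = null;
        int bestCount = -1;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) { // order by count desc, limit 1
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best == null ? null : best + " / " + bestCount;
    }

    // Invented sample rows, not real episode data.
    static List<Row> sampleRows() {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("Dalek 1", "shoulders", "The Daleks"));
        rows.add(new Row("Dalek 1", "shoulders", "The Dalek Invasion of Earth"));
        rows.add(new Row("Dalek 1", "shoulders", "Power of the Daleks"));
        rows.add(new Row("Dalek 7", "skirt", "Power of the Daleks"));
        rows.add(new Row("Dalek 5", "skirt", "Power of the Daleks"));
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(hardestWorking(sampleRows())); // prints Dalek 1 / shoulders / 3
    }
}
```

In Cypher the grouping is implicit: the non-aggregated return values (originalprop.name and part.type) act as the grouping key for count.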
In Code
// Setup: create the parser and the execution engine
CypherParser parser = new CypherParser();
ExecutionEngine engine = new ExecutionEngine(universe.getDatabase());
String cql = "start daleks=node:species(species='Dalek')"
    + " match daleks-[:APPEARED_IN]->episode<-[:USED_IN]-()<-[:MEMBER_OF]-()"
    + "-[:COMPOSED_OF]->part-[:ORIGINAL_PROP]->originalprop"
    + " return originalprop.name, part.type, count(episode.title)"
    + " order by count(episode.title) desc limit 1";
// Parse & Execute
Query query = parser.parse(cql);
ExecutionResult result = engine.execute(query);
// Results: read each returned column from the first row
String originalProp = result.javaColumnAs("originalprop.name").next().toString();
String part = result.javaColumnAs("part.type").next().toString();
int episodeCount = (Integer) result.javaColumnAs("count(episode.title)").next();
In Webadmin
The Hardest Working Prop Part
http://www.dalek6388.co.uk/
Dalek One’s shoulders
Download
http://neo4j.org/download/
http://www.springsource.org/spring-data/neo4j
Tutorial
https://github.com/jimwebber/neo4j-tutorial
[Diagram: the match pattern drawn as a graph, with nodes daleks, episode, part and originalprop joined by APPEARED_IN, USED_IN, MEMBER_OF, COMPOSED_OF and ORIGINAL_PROP.]
match (daleks)-[:APPEARED_IN]->(episode)<-[:USED_IN]-()<-[:MEMBER_OF]-()-[:COMPOSED_OF]->(part)-[:ORIGINAL_PROP]->(originalprop)