intro to graphs for fedict
TRANSCRIPT
Intro to Graph Databases in a NOSQL world
19th of May 2015
Agenda
• About Graphs • About Graph Databases – About Neo4j
• Graph Querying – Short demonstration
• Case Studies • Q&A
Introduction: about Graphs
Meet Leonhard Euler (again?)
• Swiss mathematician • Inventor of Graph Theory (1736)
Königsberg (Prussia) - 1736
[Figure: Königsberg and its graph abstraction; land masses A, B, C, D; bridges 1 to 7]
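Euler's argument about the Königsberg bridges can be checked in a few lines. The sketch below is an illustration and not part of the original slides. It assumes the usual layout of the seven bridges, and the mapping of land masses to the letters A to D is an assumption, since the figure itself is not reproduced here. In a connected graph, a walk that crosses every bridge exactly once exists only if zero or two land masses have an odd number of bridges.

```python
from collections import Counter

# Seven bridges as (land mass, land mass) pairs. The letter assignment is
# an assumption: A = island, B = north bank, C = south bank, D = east side.
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"),
    ("B", "D"),
    ("C", "D"),
]


def degrees(edges):
    """Count how many bridges touch each land mass."""
    count = Counter()
    for a, b in edges:
        count[a] += 1
        count[b] += 1
    return count


def has_euler_walk(edges):
    """True if a connected multigraph has a walk using every edge once."""
    odd = sum(1 for d in degrees(edges).values() if d % 2 == 1)
    return odd in (0, 2)


if __name__ == "__main__":
    print(dict(degrees(BRIDGES)))
    print(has_euler_walk(BRIDGES))
```

With this layout every land mass touches an odd number of bridges, so no such walk exists.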
About Graph Databases
Complementing
Relational Databases
VOLUME COMPLEXITY
NOT ONLY SQL
RDBMS
Living in a NOSQL World
Complexity
Column Family
Size
Key-Value Store
Document Databases
Graph Databases
90% of Use cases
Relational Databases
Navigational Databases
So what is a graph database?
• OLTP database • “end-user” transactions
• Model, store, manage data as a graph
What is a graph?
Vertex
Edge
What is a graph?
Node
Relationship
Contrast with Relational
Graphs are often referred to as “Whiteboard Friendly”. The data model reflects the way a domain expert would naturally draw their data on a whiteboard.
“The schema is the data”. Schema flexibility allows the system to change in response to a changing environment.
Neo4j is a Graph Database
• JVM based • ACID transactions • Rich Java APIs • Query language • Using the Labeled Property Graph model
Cypher: THE graph query language
• Learning from RDBMS’ evolution
– Introduction of SQL!
• Key characteristics
– Declarative: tell it what you want, not how to get it
– Expressive: optimize for reading
– Pattern matching: easy on your brain!
– Idempotent: state change expressed idempotently
Labeled Property Graph Model
[Figure: example graph with Author, Book and Reader nodes]
Labeled Property Graph Summary
• Nodes
– Containers for properties
– Grouped together in subgraphs by “Labels”
• Properties
– Key-value pairs
– Primitive and array values
• Relationships
– Name
– Direction
– May also contain properties
• Relationships (ctd.)
– Must have a start node and an end node (no dangling relationships)
– Start node and end node can be the same (e.g. ‘self’ relationships)
– Nodes can be connected by more than one relationship
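The summary above can be mirrored in a small in-memory model. This is a minimal sketch for illustration only, not Neo4j's API: the class names, the relationship names (WROTE, READ, RECOMMENDED, KNOWS) and all property values are hypothetical. Only the Author, Book and Reader labels come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                                      # e.g. {"Author"}
    properties: dict = field(default_factory=dict)   # key-value pairs


@dataclass
class Relationship:
    name: str
    start: Node                                      # direction: start -> end
    end: Node
    properties: dict = field(default_factory=dict)   # relationships may carry properties

    def __post_init__(self):
        # No dangling relationships: both ends must exist.
        if self.start is None or self.end is None:
            raise ValueError("a relationship needs a start node and an end node")


# Hypothetical example data.
author = Node({"Author"}, {"name": "Ann"})
book = Node({"Book"}, {"title": "Graphs", "tags": ["nosql", "intro"]})
reader = Node({"Reader"}, {"name": "Bob"})

relationships = [
    Relationship("WROTE", author, book),
    Relationship("READ", reader, book, {"rating": 5}),
    Relationship("RECOMMENDED", reader, book),   # second relationship, same node pair
    Relationship("KNOWS", reader, reader),       # start and end can be the same node
]
```

Keeping direction as an explicit start and end, instead of an unordered pair, is what lets the same two nodes carry several distinct relationships.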
What are graphs good for?
Complexity
Data Complexity
complexity = f(size, semi-structure, connectedness)
The Real Complexity
Semi-Structure
Email: [email protected] Email: [email protected] Twitter: @rvanbruggen Skype: rvanbruggen
USER
CONTACT
CONTACT_TYPE
FIRST_NAME | LAST_NAME | USER_ID | EMAIL_1 | EMAIL_2 | TWITTER | FACEBOOK | SKYPE
Rik | Van Bruggen | 315 | [email protected] | [email protected] | @rvanbruggen | NULL | rvanbruggen
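One way to picture the semi-structure problem: the USER row above needs a column for every possible contact channel and a NULL wherever one is missing. The sketch below turns such a row into a user node plus one contact node per value that actually exists. It is an illustration only; the HAS_CONTACT relationship name and the dictionary layout are assumptions, and the email values are placeholders because the originals are redacted in the transcript.

```python
# The sparse row; None stands in for SQL NULL.
row = {
    "FIRST_NAME": "Rik", "LAST_NAME": "Van Bruggen", "USER_ID": 315,
    "EMAIL_1": "<email 1>", "EMAIL_2": "<email 2>",
    "TWITTER": "@rvanbruggen", "FACEBOOK": None, "SKYPE": "rvanbruggen",
}

# Which columns hold contact details, and what kind of contact each one is.
CONTACT_COLUMNS = {
    "EMAIL_1": "Email", "EMAIL_2": "Email",
    "TWITTER": "Twitter", "FACEBOOK": "Facebook", "SKYPE": "Skype",
}


def to_graph(row):
    """Split one sparse row into a user node, contact nodes and edges."""
    user = {
        "label": "User",
        "user_id": row["USER_ID"],
        "first_name": row["FIRST_NAME"],
        "last_name": row["LAST_NAME"],
    }
    # Only contacts that exist become nodes: no NULL placeholders.
    contacts = [
        {"label": "Contact", "type": kind, "value": row[column]}
        for column, kind in CONTACT_COLUMNS.items()
        if row[column] is not None
    ]
    # One hypothetical HAS_CONTACT edge per contact, by position in the list.
    edges = [(user["user_id"], "HAS_CONTACT", i) for i in range(len(contacts))]
    return user, contacts, edges


if __name__ == "__main__":
    user, contacts, edges = to_graph(row)
    for contact in contacts:
        print(contact["type"], contact["value"])
```

Adding a third email address or a new channel then means adding a node, not altering the table.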
Examples of Connectedness
When Should I Use Graph Databases?
• Densely-connected, semi-structured domains
– Lots of join tables? Connectedness
– Lots of sparse tables? Semi-structure
• Data Model Volatility
– Easy to evolve
• “Graphy” Query patterns
– Deep join complexity and performance
– Pathfinding operations
– Millions of ‘joins’ per second
– Consistent query times as dataset grows
Graph Querying
Querying a Graph
• “Graph local” vs “Graph global”
– Contextualized “ego-centric” queries
• “Parachute” into graph
– Start node(s)
– Found through Index lookups
• Crawl the surrounding graph
– 2 million+ joins per second
– No more Index lookups: Index-free adjacency
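The "parachute, then crawl" idea can be sketched as follows. This is a toy in-memory illustration, not how Neo4j is implemented: the `Node` class, the `connect` and `crawl` helpers and the sample names are all hypothetical. The point it shows is that the index is consulted once to find the start node, and after that each hop follows a direct reference held by the node itself.

```python
from collections import deque


class Node:
    def __init__(self, name):
        self.name = name
        self.neighbours = []   # direct references: following them needs no index


def connect(a, b):
    """Link two nodes in both directions."""
    a.neighbours.append(b)
    b.neighbours.append(a)


def crawl(start, max_depth):
    """Breadth-first walk around `start`; returns hop distance per node name."""
    seen = {start.name: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        depth = seen[node.name]
        if depth == max_depth:
            continue           # do not expand beyond the requested radius
        for other in node.neighbours:
            if other.name not in seen:
                seen[other.name] = depth + 1
                queue.append(other)
    return seen


# A tiny chain ann - bob - cat - dan, plus the index used only for the lookup.
index = {name: Node(name) for name in ["ann", "bob", "cat", "dan"]}
connect(index["ann"], index["bob"])
connect(index["bob"], index["cat"])
connect(index["cat"], index["dan"])

start = index["ann"]           # "parachute" in via a single index lookup

if __name__ == "__main__":
    print(crawl(start, 2))
```

The cost of the crawl depends on the size of the neighbourhood visited, not on the total number of nodes in the index.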
Queries: Pattern Matching
Pattern
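To make pattern matching concrete, take a pattern that Cypher would write roughly as `(reader)-[:READ]->(book)<-[:WROTE]-(author)`. The sketch below matches that shape by hand over a list of triples. It is an illustration of the idea only: the data, the relationship names and the `readers_of_author` helper are hypothetical, and a real graph database would plan and execute such a query itself.

```python
# Relationships as (start, name, end) triples; all data is made up.
TRIPLES = [
    ("ann", "WROTE", "graphs"),
    ("bob", "READ", "graphs"),
    ("cat", "READ", "graphs"),
    ("dan", "WROTE", "tables"),
    ("bob", "READ", "tables"),
]


def readers_of_author(triples, author):
    """Match reader -READ-> book <-WROTE- author for one given author."""
    # Step 1: bind `book` to everything the author wrote.
    books = {
        end for start, name, end in triples
        if name == "WROTE" and start == author
    }
    # Step 2: bind `reader` to anyone with a READ relationship to those books.
    readers = {
        start for start, name, end in triples
        if name == "READ" and end in books
    }
    return sorted(readers)


if __name__ == "__main__":
    print(readers_of_author(TRIPLES, "ann"))
```

A declarative query states only the shape to find; the two explicit steps here are what the database would otherwise work out on its own.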
Short demo
Case Studies
www.neo4j.com www.meetup.com/graphdb-belgium [email protected] or +32 478 686800
Q&A, Conclusion, Next Steps