Intro to Graphs for Fedict

Intro to Graph Databases in a NOSQL world, 19th of May 2015

Upload: rik-van-bruggen

Post on 29-Jul-2015


TRANSCRIPT

Page 1: Intro to Graphs for Fedict

Intro to Graph Databases in a NOSQL world

19th of May 2015

Page 2: Intro to Graphs for Fedict

Agenda

• About Graphs
• About Graph Databases
  – About Neo4j
• Graph Querying
  – Short demonstration
• Case Studies
• Q&A

Page 3: Intro to Graphs for Fedict

Introduction: about Graphs

Page 4: Intro to Graphs for Fedict
Page 5: Intro to Graphs for Fedict

Meet Leonhard Euler (again?)

• Swiss mathematician
• Inventor of Graph Theory (1736)

Page 6: Intro to Graphs for Fedict

Königsberg (Prussia) - 1736

Page 7: Intro to Graphs for Fedict

[Figure: four areas labelled A, B, C and D]

Page 8: Intro to Graphs for Fedict

[Figure: the areas A, B, C and D with connections numbered 1 to 7]
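The figure on this slide is the classic "seven bridges" puzzle that Euler answered in 1736. The sketch below is one way to check his result in code. It assumes the textbook layout in which A is the island; the slide transcript does not say which numbered bridge joins which pair of land masses, so the edge list is an illustration and not a reading of the figure.

```python
from collections import Counter

# Assumed layout (textbook version): A is the island with five bridges,
# B, C and D have three each. The pairing of bridge numbers to land
# masses on the slide is unknown, so this list is illustrative.
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"),
    ("B", "D"),
    ("C", "D"),
]


def degrees(edges):
    """Count how many bridge ends touch each land mass."""
    count = Counter()
    for start, end in edges:
        count[start] += 1
        count[end] += 1
    return dict(count)


def has_euler_walk(edges):
    """For a connected multigraph, a walk that uses every edge exactly
    once exists exactly when zero or two nodes have odd degree."""
    odd = [node for node, degree in degrees(edges).items() if degree % 2 == 1]
    return len(odd) in (0, 2)


print(degrees(BRIDGES))
print(has_euler_walk(BRIDGES))
```

With this layout all four nodes have odd degree, so no walk crosses every bridge exactly once.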

Page 9: Intro to Graphs for Fedict

About Graph Databases

Page 10: Intro to Graphs for Fedict

Complementing

Relational Databases

VOLUME, COMPLEXITY

Page 11: Intro to Graphs for Fedict

NOT ONLY SQL

Page 12: Intro to Graphs for Fedict

Living in a NOSQL World

[Figure: chart with axes Size and Complexity. Labels: Key-Value Store, Column Family, Document Databases, Graph Databases, RDBMS, Relational Databases, Navigational Databases, "90% of Use cases"]

Page 13: Intro to Graphs for Fedict

So what is a graph database?

• OLTP database
  • “end-user” transactions
• Model, store, manage data as a graph

Page 14: Intro to Graphs for Fedict

What is a graph?

Vertex

Edge

Page 15: Intro to Graphs for Fedict

What is a graph?

Node

Relationship

Page 16: Intro to Graphs for Fedict

Contrast with Relational

Graphs are often referred to as “Whiteboard Friendly”. The data model reflects the way a domain expert would naturally draw their data on a whiteboard.

“The schema is the data”. Schema flexibility allows the system to change in response to a changing environment.

Page 17: Intro to Graphs for Fedict

Neo4j is a Graph Database

• JVM based
• ACID transactions
• Rich Java APIs
• Query language
• Using the Labeled Property Graph model

Page 18: Intro to Graphs for Fedict

Cypher: THE graph query language

• Learning from RDBMS’ evolution
  • Introduction of SQL!
• Key characteristics
  • Declarative: tell it what you want, not how to get it
  • Expressive: Optimize for reading
  • Pattern matching: easy on your brain!
  • Idempotent: state change expressed idempotently

Page 19: Intro to Graphs for Fedict

Labeled Property Graph Model

[Figure: example graph with three Author nodes, two Book nodes and two Reader nodes]

Page 20: Intro to Graphs for Fedict

Labeled Property Graph Summary

• Nodes
  • Containers for properties
  • Grouped together in subgraphs by “Labels”
• Properties
  • Key-value pairs
  • Primitive and array values
• Relationships
  • Name
  • Direction
  • May also contain properties
• Relationships (ctd.)
  • Must have a start node and an end node (no dangling relationships)
  • Start node and end node can be the same (e.g. ‘self’ relationships)
  • Nodes can be connected by more than one relationship
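The rules in this summary can be written down as a small in-memory model. The sketch below is a plain-Python illustration of the labeled property graph idea, not Neo4j's API. The class names, the relationship names (WROTE, READ, REVIEWED, KNOWS) and all property values are made up for the example, borrowing only the Author, Book and Reader labels from the previous slide.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A container for properties, grouped by one or more labels."""
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A named, directed link from start to end that may carry properties."""
    name: str
    start: Node
    end: Node
    properties: dict = field(default_factory=dict)

    def __post_init__(self):
        # No dangling relationships: both ends must be real nodes.
        if not isinstance(self.start, Node) or not isinstance(self.end, Node):
            raise ValueError("a relationship needs both a start node and an end node")


# Hypothetical example data; property values are primitives or arrays.
author = Node({"Author"}, {"name": "Example Author"})
book = Node({"Book"}, {"title": "Example Book", "tags": ["graphs", "nosql"]})
reader = Node({"Reader"}, {"name": "Example Reader"})

relationships = [
    Relationship("WROTE", author, book),
    Relationship("READ", reader, book, {"rating": 5}),
    Relationship("REVIEWED", reader, book),   # second link between the same pair
    Relationship("KNOWS", reader, reader),    # start and end can be the same node
]
```

Direction is carried by which node is `start` and which is `end`; a traversal can still walk a relationship in either direction.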

Page 21: Intro to Graphs for Fedict

What are graphs good for?

Complexity

Page 22: Intro to Graphs for Fedict

Data Complexity

complexity = f(size, semi-structure, connectedness)

Page 23: Intro to Graphs for Fedict

complexity = f(size, semi-structure, connectedness)

The Real Complexity

Page 24: Intro to Graphs for Fedict

Semi-Structure

Page 25: Intro to Graphs for Fedict

Semi-Structure

Email: [email protected]
Email: [email protected]
Twitter: @rvanbruggen
Skype: rvanbruggen

USER  

CONTACT  

CONTACT_TYPE  

FIRST_NAME  LAST_NAME    USER_ID  EMAIL_1            EMAIL_2            TWITTER       FACEBOOK  SKYPE
Rik         Van Bruggen  315      [email protected]  [email protected]  @rvanbruggen  NULL      rvanbruggen
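One way to read this slide: a wide table needs a column for every possible contact type and a NULL wherever a user has none, while a graph stores only the contacts that exist. The sketch below turns the row above into a user node plus one contact entry per non-NULL value. The dictionary layout and the `row_to_graph` helper are assumptions made for illustration, and the e-mail addresses stay redacted as they are in the source.

```python
# The wide row from the slide; None stands for NULL.
user_row = {
    "FIRST_NAME": "Rik",
    "LAST_NAME": "Van Bruggen",
    "USER_ID": 315,
    "EMAIL_1": "[email protected]",   # redacted in the source
    "EMAIL_2": "[email protected]",   # redacted in the source
    "TWITTER": "@rvanbruggen",
    "FACEBOOK": None,
    "SKYPE": "rvanbruggen",
}

IDENTITY_COLUMNS = {"FIRST_NAME", "LAST_NAME", "USER_ID"}


def row_to_graph(row):
    """Split a wide row into a user node and one contact per non-NULL column."""
    user = {key: value for key, value in row.items() if key in IDENTITY_COLUMNS}
    contacts = []
    for column, value in row.items():
        if column in IDENTITY_COLUMNS or value is None:
            continue  # skip identity columns and empty cells
        # EMAIL_1 and EMAIL_2 both become contact type EMAIL.
        contact_type = column.rstrip("_0123456789")
        contacts.append({"type": contact_type, "value": value})
    return user, contacts


user, contacts = row_to_graph(user_row)
for contact in contacts:
    print(user["FIRST_NAME"], "-[HAS_CONTACT]->", contact["type"], contact["value"])
```

A third e-mail address or a new contact type then becomes one more entry, with no schema change and no extra NULL column.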

Page 26: Intro to Graphs for Fedict

complexity = f(size, semi-structure, connectedness)

The Real Complexity

Page 27: Intro to Graphs for Fedict

Examples of Connectedness

Page 28: Intro to Graphs for Fedict

When Should I Use Graph Databases?

• Densely-connected, semi-structured domains
  • Lots of join tables? Connectedness
  • Lots of sparse tables? Semi-structure
• Data Model Volatility
  • Easy to evolve
• “Graphy” Query patterns
  • Deep Join Complexity and Performance
  • Pathfinding operations
  • Millions of ‘joins’ per second
  • Consistent query times as dataset grows

Page 29: Intro to Graphs for Fedict

Graph Querying

Page 30: Intro to Graphs for Fedict

Querying a Graph

• “Graph local” vs “Graph global”
  • Contextualized “ego-centric” queries
• “Parachute” into graph
  • Start node(s)
  • Found through Index lookups
• Crawl the surrounding graph
  • 2 million+ joins per second
  • No more Index lookups: Index-free adjacency
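The "parachute, then crawl" idea can be sketched in a few lines. In the example below a dictionary plays the role of the index and is consulted once to find the start node. After that the crawl follows object references from node to node, which is a rough stand-in for index-free adjacency. The node names and the `neighbourhood` helper are hypothetical, and this is not how Neo4j is implemented internally.

```python
from collections import deque


class Node:
    """A node that holds direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.neighbours = []

    def connect(self, other):
        self.neighbours.append(other)
        other.neighbours.append(self)


def neighbourhood(start, max_depth):
    """Breadth-first crawl around start, following node references only."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for neighbour in node.neighbours:
            if neighbour not in seen:
                seen.add(neighbour)
                found.append((neighbour.name, depth + 1))
                queue.append((neighbour, depth + 1))
    return found


# Hypothetical data: a chain Ann - Bob - Cleo - Dan.
index = {name: Node(name) for name in ["Ann", "Bob", "Cleo", "Dan"]}
index["Ann"].connect(index["Bob"])
index["Bob"].connect(index["Cleo"])
index["Cleo"].connect(index["Dan"])

start = index["Ann"]               # the single index lookup
print(neighbourhood(start, 2))     # the crawl never touches the index again
```

The cost of the crawl depends on how much of the graph surrounds the start node, not on how many nodes exist overall, which is the intuition behind "consistent query times as dataset grows".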

Page 31: Intro to Graphs for Fedict

Queries: Pattern Matching

Pattern
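To make "pattern matching" concrete, the sketch below looks for one fixed shape, a reader who read a book that some author wrote, in a list of relationships. In Cypher such a shape would be stated declaratively, along the lines of `MATCH (r:Reader)-[:READ]->(b:Book)<-[:WROTE]-(a:Author)`. The Python version spells out the matching by hand. All node identifiers and relationship names are hypothetical.

```python
# Hypothetical relationships as (start, type, end) triples.
RELATIONSHIPS = [
    ("author_1", "WROTE", "book_1"),
    ("author_2", "WROTE", "book_2"),
    ("reader_1", "READ", "book_1"),
    ("reader_2", "READ", "book_1"),
    ("reader_2", "READ", "book_2"),
]


def match_reader_book_author(relationships):
    """Find every (reader)-[READ]->(book)<-[WROTE]-(author) occurrence."""
    # First pass: remember who wrote each book.
    authors_of = {}
    for start, rel_type, end in relationships:
        if rel_type == "WROTE":
            authors_of.setdefault(end, []).append(start)

    # Second pass: extend each READ relationship with the book's authors.
    matches = []
    for start, rel_type, end in relationships:
        if rel_type == "READ":
            for author in authors_of.get(end, []):
                matches.append((start, end, author))
    return matches


for reader, book, author in match_reader_book_author(RELATIONSHIPS):
    print(reader, "read", book, "written by", author)
```

The hand-written version has to decide how to find the matches; a declarative query language leaves that choice to the database.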

Page 32: Intro to Graphs for Fedict

Short demo

Page 33: Intro to Graphs for Fedict

Case Studies

Page 34: Intro to Graphs for Fedict

www.neo4j.com
www.meetup.com/graphdb-belgium
[email protected] or +32 478 686800

Q&A, Conclusion, Next Steps