The football graph - Neo4j and the Premier League

Post on 14-Sep-2014

In a League of their Own: Neo4j and Premiership Football

Mark Needham @markhneedham

Outline

• Intro to graphs
• When do we need a graph?
• Property graph model
• Neo4j’s query language
• The football graph
• Using Neo4j from .NET

Let’s talk graphs

You mean these?

Eating Brains

Dancing With Michael Jackson

Nope!

These are Charts! NOT Graphs!

Ok so what’s a graph then?

Node  

Relationship

The tube

The social network (graph)

Complexity

What are graphs good for?

complexity = f(size, semi-structure, connectedness)

Data Complexity

Size

complexity = f(size, semi-structure, connectedness)

The Real Complexity

Semi-Structure

Email: mark.needham@neotechnology.com
Email: m.h.needham@gmail.com
Twitter: @markhneedham
Skype: mk_jnr1984

USER

CONTACT

CONTACT_TYPE

FIRST_NAME  LAST_NAME  USER_ID  EMAIL_1                         EMAIL_2                TWITTER        FACEBOOK  SKYPE
Mark        Needham    315      mark.needham@neotechnology.com  m.h.needham@gmail.com  @markhneedham  NULL      mk_jnr1984

Semi-Structure

complexity = f(size, semi-structure, connectedness)

The Real Complexity

Connectedness

When do we need a graph?

Densely Connected

Semi Structured

Densely connected?

Lots of join tables

Semi-Structured?

Lots of sparse tables

Properties of graph databases

• Millions of ‘joins’ per second
• Consistent query times as dataset grows
• Join Complexity and Performance
• Easy to evolve data model
• Easy to ‘layer’ different types of data together

Property Graph Data Model

Nodes

Nodes can have properties

• Used to represent entity attributes and/or metadata (e.g. timestamps, version)
• Key-value pairs
• Java primitives
• Arrays
• null is not a valid value
• Every node can have different properties

What’s a node?

Relationships

• Relationships are first class citizens
• Every relationship has a name and a direction
  – Add structure to the graph
  – Provide semantic context for nodes
• Properties used to represent quality or weight of relationship, or metadata
• Every relationship must have a start node and end node

Relationships

Nodes can have more than one relationship

Self relationships are allowed

Nodes can be connected by more than one relationship

Labels

Think Gmail labels

• Nodes
  – Entities
• Relationships
  – Connect entities and structure domain
• Properties
  – Entity attributes, relationship qualities, and metadata
• Labels
  – Group nodes by role

Four Building Blocks
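
The four building blocks can be sketched as a tiny in-memory model. This is only an illustration in Python, not how Neo4j stores data: the class names, the `played_for` relationship type, the `since` property and the example player are all assumptions made up for the sketch. It encodes the rules from the slides: properties are key-value pairs, null is rejected, and every relationship has a type, a start node and an end node.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class Node:
    """An entity with labels and arbitrary key-value properties."""
    labels: Set[str] = field(default_factory=set)
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A named, directed connection that always has a start and an end node."""
    rel_type: str
    start: Node
    end: Node
    properties: Dict[str, Any] = field(default_factory=dict)


class Graph:
    """Holds nodes and relationships; None (null) property values are rejected."""

    def __init__(self) -> None:
        self.nodes: List[Node] = []
        self.relationships: List[Relationship] = []

    def add_node(self, *labels: str, **properties: Any) -> Node:
        self._check(properties)
        node = Node(set(labels), dict(properties))
        self.nodes.append(node)
        return node

    def relate(self, start: Node, rel_type: str, end: Node, **properties: Any) -> Relationship:
        self._check(properties)
        rel = Relationship(rel_type, start, end, dict(properties))
        self.relationships.append(rel)
        return rel

    @staticmethod
    def _check(properties: Dict[str, Any]) -> None:
        # Mirrors the slide: null is not a valid property value.
        if any(value is None for value in properties.values()):
            raise ValueError("null is not a valid property value")


graph = Graph()
arsenal = graph.add_node("Team", name="Arsenal")
player = graph.add_node("Player", name="Example Player")  # hypothetical player
played_for = graph.relate(player, "played_for", arsenal, since=2012)  # hypothetical type and property
```

Because relationships are kept in a plain list, the sketch also allows self relationships and several relationships between the same pair of nodes, as the slides describe.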

Models

Purposeful abstraction of a domain designed to satisfy particular application/end-user goals

Design for Queryability

Model, Query

Introducing Cypher

• Declarative Pattern-Matching language
• SQL-like syntax
• Designed for graphs

Patterns, patterns, everywhere

It’s all about the ASCII art!

(a) --> (b)

The most basic query

MATCH (a)-->(b)
RETURN a, b

Adding in a relationship type

(a)-[:ACTED_IN]->(m)

MATCH (a)-[:ACTED_IN]->(m)
RETURN a.name, m.name

The football graph

Find Arsenal’s away matches

MATCH (team:Team)<-[:away_team]-(game)
WHERE team.name = "Arsenal"
RETURN game

// Graph Pattern
MATCH (team:Team)<-[:away_team]-(game)
// Anchor pattern in graph
WHERE team.name = "Arsenal"
// Create projection of results
RETURN game.name
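
To make the three steps of the query concrete (match the pattern, anchor it, project the result), here is a rough Python imitation over a hand-built graph. It is a sketch under stated assumptions, not how Neo4j evaluates Cypher: the sample teams and games, the game names and the `home_team` relationship type are invented for illustration. Only `away_team`, the `Team` label and the `name` property come from the slides.

```python
# Hypothetical sample data: node id -> (labels, properties).
nodes = {
    1: ({"Team"}, {"name": "Arsenal"}),
    2: ({"Team"}, {"name": "Chelsea"}),
    3: ({"Game"}, {"name": "Chelsea vs Arsenal"}),
    4: ({"Game"}, {"name": "Arsenal vs Chelsea"}),
}

# Directed relationships as (start node id, type, end node id).
# "home_team" is an assumed type; only "away_team" appears in the slides.
relationships = [
    (3, "home_team", 2),
    (3, "away_team", 1),
    (4, "home_team", 1),
    (4, "away_team", 2),
]


def away_matches(team_name):
    """Imitate: MATCH (team:Team)<-[:away_team]-(game)
    WHERE team.name = team_name RETURN game.name"""
    results = []
    for start, rel_type, end in relationships:
        if rel_type != "away_team":
            continue  # graph pattern: only away_team relationships qualify
        labels, props = nodes[end]
        # Label check plus the anchor on team.name.
        if "Team" in labels and props.get("name") == team_name:
            results.append(nodes[start][1]["name"])  # projection: game.name
    return results


print(away_matches("Arsenal"))  # prints ['Chelsea vs Arsenal']
```

The direction matters: the game node is the start of the `away_team` relationship and the team is its end, which is what the `<-[:away_team]-` arrow expresses in the Cypher pattern.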

Evolving the football graph

Find the top away goal scorers

// Multiple graph patterns
MATCH (team)<-[:away_team]-(game:Game),
      (game)<-[:contains_match]-(season:Season),
      (team)<-[:for]-(stats)<-[:played]-(player),
      (stats)-[:in]->(game)
// Anchor pattern in the graph
WHERE season.name = "2012-2013"
// Group by player
RETURN player.name,
       COLLECT(DISTINCT team.name),
       SUM(stats.goals) AS goals
ORDER BY goals DESC
LIMIT 10

Other football queries

• Goals scored in each month by Michu
• Tottenham results when Gareth Bale scores
• What did Wayne Rooney do in April?
• Which players only score when a game is televised?

Graph Query Design

The relational version

Graph vs Relational

Relational

• Tables
  – assume records all have the same structure
• Foreign keys between tables
  – joins calculated at run time
  – the more tables you join to a query the slower the query gets

Graphs

• Nodes
  – no need to set a property if it doesn’t exist
• Relationships
  – stored as a ‘Pre-computed index’ at write time
  – very easy to do lots of ‘hops’ between relationships
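
The difference between joins worked out at run time and relationships stored at write time can be illustrated with a toy Python comparison. This is a deliberate simplification, and every name and row in it is hypothetical: a real relational database would use indexes rather than scanning a whole join table, and Neo4j's storage is more involved than two dictionaries.

```python
from collections import defaultdict

# Hypothetical join table rows: (player_id, team_id).
player_team = [(1, 10), (2, 10), (3, 20)]


def teammates_relational(player_id):
    """Resolve the connection at query time by scanning the join table twice."""
    team_ids = {t for p, t in player_team if p == player_id}
    return sorted(p for p, t in player_team if t in team_ids and p != player_id)


# Graph style: build adjacency sets once, when the data is written.
plays_for = defaultdict(set)  # player -> teams
squad = defaultdict(set)      # team -> players
for p, t in player_team:
    plays_for[p].add(t)
    squad[t].add(p)


def teammates_graph(player_id):
    """Follow stored relationships: two hops, no scan of unrelated rows."""
    found = set()
    for team in plays_for.get(player_id, ()):
        found.update(squad.get(team, ()))
    found.discard(player_id)
    return sorted(found)


print(teammates_relational(1), teammates_graph(1))  # prints [2] [2]
```

Both functions return the same answer. The point of the sketch is where the work happens: the first walks every row for each query, while the second only touches the nodes reachable from the starting player.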

.NET and Neo4j

Application

REST Client

HTTP

Neo4j Server

Neo4jClient

Thinking in graphs

Graphs should be fun!

Ask for help if you get stuck

Last Wednesday of the month

Come take a copy, it’s free!

Ian Robinson, Jim Webber & Emil Eifrem
Graph Databases
Compliments of Neo Technology
www.graphdatabases.com

Questions?

Mark Needham @markhneedham mark.needham@neotechnology.com
