Intro to Cypher for the SQL Developer

Cypher for SQL Developers Mark Needham @markhneedham [email protected]

Upload: neo4j-the-fastest-and-most-scalable-native-graph-database

Post on 22-Jan-2017

TRANSCRIPT

Page 1: Intro to Cypher for the SQL Developer

Cypher for SQL Developers

Mark Needham @markhneedham [email protected]

Page 2: Intro to Cypher for the SQL Developer

Talk structure

‣ Introduce data set
‣ Modeling
‣ Import
‣ Data Integrity
‣ Queries
‣ Migration/Refactoring
‣ Query optimisation

Page 3: Intro to Cypher for the SQL Developer

Introducing our data set...

Page 4: Intro to Cypher for the SQL Developer

Exploring transfermarkt

Page 5: Intro to Cypher for the SQL Developer

Exploring transfermarkt

|---------+--------------------+-----------------------------------------+--------------------+------------|
| season  | playerName         | playerUri                               | playerPosition     | playerAge  |
|---------+--------------------+-----------------------------------------+--------------------+------------|
| 90/91   | Aldair             | /aldair/profil/spieler/4151             | Centre Back        | 24         |
| 90/91   | Thomas Häßler      | /thomas-hassler/profil/spieler/553      | Attacking Midfield | 24         |
| 90/91   | Roberto Baggio     | /roberto-baggio/profil/spieler/4153     | Secondary Striker  | 23         |
| 90/91   | Karl-Heinz Riedle  | /karl-heinz-riedle/profil/spieler/13806 | Centre Forward     | 24         |
| 90/91   | Henrik Larsen      | /henrik-larsen/profil/spieler/101330    | Attacking Midfield | 24         |
| 90/91   | Gheorghe Hagi      | /gheorghe-hagi/profil/spieler/7939      | Attacking Midfield | 25         |
| 90/91   | Hristo Stoichkov   | /hristo-stoichkov/profil/spieler/7938   | Left Wing          | 24         |
| 90/91   | Brian Laudrup      | /brian-laudrup/profil/spieler/39667     | Centre Forward     | 21         |
| 90/91   | Miguel Ángel Nadal | /miguel-angel-nadal/profil/spieler/7676 | Centre Back        | 23         |
|---------+--------------------+-----------------------------------------+--------------------+------------|

Page 6: Intro to Cypher for the SQL Developer

Exploring transfermarkt

|-------------------+---------------------+-------------------------------------+--------------------|
| sellerClubName    | sellerClubNameShort | sellerClubUri                       | sellerClubCountry  |
|-------------------+---------------------+-------------------------------------+--------------------|
| SL Benfica        | Benfica             | /benfica/startseite/verein/294      | Portugal           |
| 1. FC Köln        | 1. FC Köln          | /1-fc-koln/startseite/verein/3      | Germany            |
| ACF Fiorentina    | Fiorentina          | /fiorentina/startseite/verein/430   | Italy              |
| SV Werder Bremen  | Werder Bremen       | /werder-bremen/startseite/verein/86 | Germany            |
| Lyngby BK         | Lyngby BK           | /lyngby-bk/startseite/verein/369    | Denmark            |
| Steaua Bucharest  | Steaua              | /steaua/startseite/verein/301       | Romania            |
| CSKA Sofia        | CSKA Sofia          | /cska-sofia/startseite/verein/208   | Bulgaria           |
| KFC Uerdingen 05  | KFC Uerdingen       | /kfc-uerdingen/startseite/verein/95 | Germany            |
| RCD Mallorca      | RCD Mallorca        | /rcd-mallorca/startseite/verein/237 | Spain              |
|-------------------+---------------------+-------------------------------------+--------------------|

Page 7: Intro to Cypher for the SQL Developer

Exploring transfermarkt

|----------------+--------------------+-------------------------------------+-------------------|
| buyerClubName  | buyerClubNameShort | buyerClubUri                        | buyerClubCountry  |
|----------------+--------------------+-------------------------------------+-------------------|
| AS Roma        | AS Roma            | /as-roma/startseite/verein/12       | Italy             |
| Juventus FC    | Juventus           | /juventus/startseite/verein/506     | Italy             |
| Juventus FC    | Juventus           | /juventus/startseite/verein/506     | Italy             |
| SS Lazio       | Lazio              | /lazio/startseite/verein/398        | Italy             |
| AC Pisa 1909   | AC Pisa            | /ac-pisa/startseite/verein/4172     | Italy             |
| Real Madrid    | Real Madrid        | /real-madrid/startseite/verein/418  | Spain             |
| FC Barcelona   | FC Barcelona       | /fc-barcelona/startseite/verein/131 | Spain             |
| Bayern Munich  | Bayern Munich      | /bayern-munich/startseite/verein/27 | Germany           |
| FC Barcelona   | FC Barcelona       | /fc-barcelona/startseite/verein/131 | Spain             |
|----------------+--------------------+-------------------------------------+-------------------|

Page 8: Intro to Cypher for the SQL Developer

Exploring transfermarkt

|--------------------------------------------------------+-------------+---------------|
| transferUri                                            | transferFee | transferRank  |
|--------------------------------------------------------+-------------+---------------|
| /jumplist/transfers/spieler/4151/transfer_id/6993      | £6.75m      | 1             |
| /jumplist/transfers/spieler/553/transfer_id/2405       | £5.85m      | 2             |
| /jumplist/transfers/spieler/4153/transfer_id/84533     | £5.81m      | 3             |
| /jumplist/transfers/spieler/13806/transfer_id/19054    | £5.63m      | 4             |
| /jumplist/transfers/spieler/101330/transfer_id/275067  | £5.03m      | 5             |
| /jumplist/transfers/spieler/7939/transfer_id/19343     | £3.23m      | 6             |
| /jumplist/transfers/spieler/7938/transfer_id/11563     | £2.25m      | 7             |
| /jumplist/transfers/spieler/39667/transfer_id/90285    | £2.25m      | 8             |
| /jumplist/transfers/spieler/7676/transfer_id/11828     | £2.10m      | 9             |
|--------------------------------------------------------+-------------+---------------|

Page 9: Intro to Cypher for the SQL Developer

Relational Model

players

id

name

position

clubs

id

name

country

transfers

id

fee

player_age

player_id

from_club_id

to_club_id

season

Page 10: Intro to Cypher for the SQL Developer

Graph model

Page 11: Intro to Cypher for the SQL Developer

Nodes

Page 12: Intro to Cypher for the SQL Developer

Relationships

Page 13: Intro to Cypher for the SQL Developer

Properties

Page 14: Intro to Cypher for the SQL Developer

Labels

Page 15: Intro to Cypher for the SQL Developer

Relational vs Graph

Relational: Records in tables. "Soft" relationships computed at query time.

Graph: Nodes. "Hard" relationships built into the data store.

Page 16: Intro to Cypher for the SQL Developer

Relational Import

Page 17: Intro to Cypher for the SQL Developer

Create players table

CREATE TABLE players (

"id" character varying(100)

NOT NULL PRIMARY KEY,

"name" character varying(150) NOT NULL,

"position" character varying(20)

);

Page 18: Intro to Cypher for the SQL Developer

Insert players

INSERT INTO players

VALUES('/aldair/profil/spieler/4151', 'Aldair', 'Centre Back');

INSERT INTO players

VALUES('/thomas-hassler/profil/spieler/553', 'Thomas Häßler',

'Attacking Midfield');

INSERT INTO players

VALUES('/roberto-baggio/profil/spieler/4153', 'Roberto Baggio',

'Secondary Striker');

Page 19: Intro to Cypher for the SQL Developer

Create clubs table

CREATE TABLE clubs (

"id" character varying(100)

NOT NULL PRIMARY KEY,

"name" character varying(50) NOT NULL,

"country" character varying(50)

);

Page 20: Intro to Cypher for the SQL Developer

Insert clubs

INSERT INTO clubs VALUES('/hertha-bsc/startseite/verein/44',

'Hertha BSC', 'Germany');

INSERT INTO clubs VALUES('/cfr-cluj/startseite/verein/7769',

'CFR Cluj', 'Romania');

INSERT INTO clubs VALUES('/real-sociedad/startseite/verein/681',

'Real Sociedad', 'Spain');

Page 21: Intro to Cypher for the SQL Developer

Create transfers table

CREATE TABLE transfers (

"id" character varying(100) NOT NULL PRIMARY KEY,

"fee" character varying(50) NOT NULL,

"numericFee" integer NOT NULL,

"player_age" smallint NOT NULL,

"season" character varying(5) NOT NULL,

"player_id" character varying(100) NOT NULL REFERENCES players (id),

"from_club_id" character varying(100) NOT NULL REFERENCES clubs (id),

"to_club_id" character varying(100) NOT NULL REFERENCES clubs (id)

);

Page 22: Intro to Cypher for the SQL Developer

Insert transfers

INSERT INTO transfers VALUES('/jumplist/transfers/spieler/4151/transfer_id/6993',
'£6.75m', 6750000, 24, '90/91', '/aldair/profil/spieler/4151',
'/benfica/startseite/verein/294', '/as-roma/startseite/verein/12');

INSERT INTO transfers VALUES('/jumplist/transfers/spieler/553/transfer_id/2405',
'£5.85m', 5850000, 24, '90/91', '/thomas-hassler/profil/spieler/553',
'/1-fc-koln/startseite/verein/3', '/juventus/startseite/verein/506');

INSERT INTO transfers VALUES('/jumplist/transfers/spieler/4153/transfer_id/84533',
'£5.81m', 5810000, 23, '90/91', '/roberto-baggio/profil/spieler/4153',
'/fiorentina/startseite/verein/430', '/juventus/startseite/verein/506');
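The inserts above store each fee twice: once as the display string (for example '£6.75m') and once as a whole number of pounds (6750000). A minimal Python sketch of that conversion follows. The helper name `parse_fee` is hypothetical, and the handling of a 'k' suffix is an assumption, since only 'm' fees appear in the sample data.

```python
from decimal import Decimal


def parse_fee(fee: str) -> int:
    """Convert a fee string such as '£6.75m' into whole pounds."""
    # 'k' is an assumed suffix; the sample rows only show 'm'.
    multipliers = {"m": 1_000_000, "k": 1_000}
    text = fee.strip().lstrip("£")
    if not text or text[-1].lower() not in multipliers:
        raise ValueError(f"unrecognised fee format: {fee!r}")
    # Decimal avoids binary floating-point rounding on values like 5.81.
    return int(Decimal(text[:-1]) * multipliers[text[-1].lower()])
```

Doing the conversion once at import time is what lets the later queries sort on the numeric fee rather than on the display string.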

Page 23: Intro to Cypher for the SQL Developer

Graph Import

Page 24: Intro to Cypher for the SQL Developer

LOAD CSV

‣ Tool for importing CSV files
‣ Intended for data sets of ~10M records
‣ Works against live database
‣ Use Cypher constructs to define graph

Page 25: Intro to Cypher for the SQL Developer

LOAD CSV

[USING PERIODIC COMMIT [1000]]

LOAD CSV WITH HEADERS FROM "(file|http)://" AS row

MATCH (:Label {property: row.header})

CREATE (:Label {property: row.header})

MERGE (:Label {property: row.header})

Page 31: Intro to Cypher for the SQL Developer

Exploring the data

LOAD CSV WITH HEADERS

FROM "file:///transfers.csv"

AS row

RETURN COUNT(*)

Page 33: Intro to Cypher for the SQL Developer

Exploring the data

LOAD CSV WITH HEADERS

FROM "file:///transfers.csv"

AS row

RETURN row

LIMIT 1

Page 36: Intro to Cypher for the SQL Developer

Import players

USING PERIODIC COMMIT

LOAD CSV WITH HEADERS FROM "file:///transfers.csv" AS row

CREATE (player:Player {

id: row.playerUri,

name: row.playerName,

position: row.playerPosition

})

Not so fast!

Page 37: Intro to Cypher for the SQL Developer

Ensure uniqueness of players

CREATE CONSTRAINT ON (player:Player)

ASSERT player.id IS UNIQUE

Page 38: Intro to Cypher for the SQL Developer

Import players

USING PERIODIC COMMIT

LOAD CSV WITH HEADERS FROM "file:///transfers.csv" AS row

CREATE (player:Player {

id: row.playerUri,

name: row.playerName,

position: row.playerPosition

})

Node 25 already exists with label Player and property "id"=[/peter-lux/profil/spieler/84682]

Page 39: Intro to Cypher for the SQL Developer

Import players

USING PERIODIC COMMIT

LOAD CSV WITH HEADERS FROM "file:///transfers.csv" AS row

MERGE (player:Player {id: row.playerUri})

ON CREATE SET player.name = row.playerName,

player.position = row.playerPosition
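To see why MERGE succeeds where CREATE failed, here is a rough Python model of the two behaviours under a uniqueness constraint on id. It is only an analogy built on a dictionary keyed by id, not a description of how Neo4j works internally. The function names are hypothetical, and the second row is a made-up repeat of the first, standing in for a player who appears in more than one transfer.

```python
def create_player(store: dict, row: dict) -> None:
    """Like CREATE under a uniqueness constraint: a repeated id is an error."""
    if row["playerUri"] in store:
        raise ValueError(f"Player with id {row['playerUri']} already exists")
    store[row["playerUri"]] = {"name": row["playerName"],
                               "position": row["playerPosition"]}


def merge_player(store: dict, row: dict) -> None:
    """Like MERGE ... ON CREATE SET: properties are only set on first sight."""
    if row["playerUri"] not in store:
        store[row["playerUri"]] = {"name": row["playerName"],
                                   "position": row["playerPosition"]}


rows = [
    {"playerUri": "/aldair/profil/spieler/4151", "playerName": "Aldair",
     "playerPosition": "Centre Back"},
    # Hypothetical second transfer row for the same player.
    {"playerUri": "/aldair/profil/spieler/4151", "playerName": "Aldair",
     "playerPosition": "Centre Back"},
]

players: dict = {}
for row in rows:
    merge_player(players, row)  # the second row is a no-op, not an error
```

Calling `create_player` for both rows instead would raise on the second one, which mirrors the "already exists" error shown earlier.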

Page 40: Intro to Cypher for the SQL Developer

Import clubs

CREATE CONSTRAINT ON (club:Club)

ASSERT club.id IS UNIQUE

Page 41: Intro to Cypher for the SQL Developer

Import selling clubs

USING PERIODIC COMMIT

LOAD CSV WITH HEADERS FROM "file:///transfers.csv" AS row

MERGE (club:Club {id: row.sellerClubUri})

ON CREATE SET club.name = row.sellerClubName,

club.country = row.sellerClubCountry

Page 42: Intro to Cypher for the SQL Developer

Import buying clubs

USING PERIODIC COMMIT

LOAD CSV WITH HEADERS FROM "file:///transfers.csv" AS row

MERGE (club:Club {id: row.buyerClubUri})

ON CREATE SET club.name = row.buyerClubName,

club.country = row.buyerClubCountry

Page 43: Intro to Cypher for the SQL Developer

Import transfers

CREATE CONSTRAINT ON (transfer:Transfer)

ASSERT transfer.id IS UNIQUE

Page 44: Intro to Cypher for the SQL Developer

Import transfers

LOAD CSV WITH HEADERS FROM "file:///transfers.csv" AS row

MATCH (player:Player {id: row.playerUri})

MATCH (source:Club {id: row.sellerClubUri})

MATCH (destination:Club {id: row.buyerClubUri})

MERGE (t:Transfer {id: row.transferUri})

ON CREATE SET t.season = row.season, t.rank = row.transferRank,

t.fee = row.transferFee

MERGE (t)-[:OF_PLAYER { age: row.playerAge }]->(player)

MERGE (t)-[:FROM_CLUB]->(source)

MERGE (t)-[:TO_CLUB]->(destination)

Page 45: Intro to Cypher for the SQL Developer

Schema

Page 46: Intro to Cypher for the SQL Developer

Optional Schema

‣ Unique node property constraint

Page 47: Intro to Cypher for the SQL Developer

Optional Schema

‣ Unique node property constraint

CREATE CONSTRAINT ON (club:Club)

ASSERT club.id IS UNIQUE

Page 48: Intro to Cypher for the SQL Developer

Optional Schema

‣ Unique node property constraint
‣ Node property existence constraint

Page 49: Intro to Cypher for the SQL Developer

Optional Schema

‣ Unique node property constraint
‣ Node property existence constraint

CREATE CONSTRAINT ON (club:Club)

ASSERT EXISTS(club.name)

Page 50: Intro to Cypher for the SQL Developer

Optional Schema

‣ Unique node property constraint
‣ Node property existence constraint
‣ Relationship property existence constraint

Page 51: Intro to Cypher for the SQL Developer

Optional Schema

‣ Unique node property constraint
‣ Node property existence constraint
‣ Relationship property existence constraint

CREATE CONSTRAINT ON ()-[player:OF_PLAYER]-()

ASSERT exists(player.age)

Page 52: Intro to Cypher for the SQL Developer

SQL vs Cypher

Page 53: Intro to Cypher for the SQL Developer

Find player by name

Page 54: Intro to Cypher for the SQL Developer

SELECT *

FROM players

WHERE players.name = 'Cristiano Ronaldo'

Page 55: Intro to Cypher for the SQL Developer

SELECT *

FROM players

WHERE players.name = 'Cristiano Ronaldo'

MATCH (player:Player { name: "Cristiano Ronaldo" })

RETURN player

Page 58: Intro to Cypher for the SQL Developer
Page 59: Intro to Cypher for the SQL Developer

Find transfers between clubs

Page 60: Intro to Cypher for the SQL Developer

SELECT players.name, t."numericFee", t.season

FROM transfers AS t

JOIN clubs AS clubFrom ON t.from_club_id = clubFrom.id

JOIN clubs AS clubTo ON t.to_club_id = clubTo.id

JOIN players ON t.player_id = players.id

WHERE clubFrom.name = 'Tottenham Hotspur'

AND clubTo.name = 'Manchester United'

Page 61: Intro to Cypher for the SQL Developer

SELECT players.name, t."numericFee", t.season

FROM transfers AS t

JOIN clubs AS clubFrom ON t.from_club_id = clubFrom.id

JOIN clubs AS clubTo ON t.to_club_id = clubTo.id

JOIN players ON t.player_id = players.id

WHERE clubFrom.name = 'Tottenham Hotspur'

AND clubTo.name = 'Manchester United'

MATCH (from:Club)<-[:FROM_CLUB]-(transfer:Transfer)-[:TO_CLUB]->(to:Club),

(transfer)-[:OF_PLAYER]->(player)

WHERE from.name = "Tottenham Hotspur" AND to.name = "Manchester United"

RETURN player.name, transfer.numericFee, transfer.season

Page 63: Intro to Cypher for the SQL Developer
Page 64: Intro to Cypher for the SQL Developer

How does Neo4j use indexes?

Indexes are only used to find the starting point for queries.

Relational: Use index scans to look up rows in tables and join them with rows from other tables.

Graph: Use indexes to find the starting points for a query.
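The graph side of that contrast can be sketched in Python: an index (here a plain dictionary from club name to node) is consulted once to find the starting node, and the rest of the query follows direct references between nodes. The `Node` class, the `relate` helper, and the sample player and season are illustrative assumptions, not Neo4j internals or data from the talk.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    label: str
    props: dict
    out: dict = field(default_factory=dict)  # rel type -> target nodes
    inc: dict = field(default_factory=dict)  # rel type -> source nodes


def relate(src: Node, rel: str, dst: Node) -> None:
    """Store the relationship on both endpoints so it can be followed either way."""
    src.out.setdefault(rel, []).append(dst)
    dst.inc.setdefault(rel, []).append(src)


spurs = Node("Club", {"name": "Tottenham Hotspur"})
united = Node("Club", {"name": "Manchester United"})
player = Node("Player", {"name": "Example Player"})  # made-up player
transfer = Node("Transfer", {"season": "08/09"})     # made-up season
relate(transfer, "FROM_CLUB", spurs)
relate(transfer, "TO_CLUB", united)
relate(transfer, "OF_PLAYER", player)

# The "index": used only to locate the starting node.
club_index = {club.props["name"]: club for club in (spurs, united)}


def transfers_between(src_name: str, dst_name: str) -> list:
    start = club_index[src_name]  # single index lookup
    found = []
    # Everything below follows stored references; no further lookups by key.
    for t in start.inc.get("FROM_CLUB", []):
        if any(c.props["name"] == dst_name for c in t.out.get("TO_CLUB", [])):
            found.extend(p.props["name"] for p in t.out.get("OF_PLAYER", []))
    return found
```

In the relational version, each of the three joins would be a separate lookup against another table, keyed on an id column.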

Page 65: Intro to Cypher for the SQL Developer

How does Neo4j use indexes?

Page 66: Intro to Cypher for the SQL Developer

Migrating/refactoring the model

Page 67: Intro to Cypher for the SQL Developer

Player nationality

|------------------------------------------+--------------------+--------------------|

| playerUri | playerName | playerNationality |

|------------------------------------------+--------------------+--------------------|

| /aldair/profil/spieler/4151 | Aldair | Brazil |

| /thomas-hassler/profil/spieler/553 | Thomas Häßler | Germany |

| /roberto-baggio/profil/spieler/4153 | Roberto Baggio | Italy |

| /karl-heinz-riedle/profil/spieler/13806 | Karl-Heinz Riedle | Germany |

| /henrik-larsen/profil/spieler/101330 | Henrik Larsen | Denmark |

| /gheorghe-hagi/profil/spieler/7939 | Gheorghe Hagi | Romania |

| /hristo-stoichkov/profil/spieler/7938 | Hristo Stoichkov | Bulgaria |

| /brian-laudrup/profil/spieler/39667 | Brian Laudrup | Denmark |

| /miguel-angel-nadal/profil/spieler/7676 | Miguel Ángel Nadal | Spain |

|------------------------------------------+--------------------+--------------------|

Page 68: Intro to Cypher for the SQL Developer

Relational migration

Page 69: Intro to Cypher for the SQL Developer

Relational Model

players

id

name

position

nationality

clubs

id

name

country

transfers

id

fee

player_age

player_id

from_club_id

to_club_id

season

Page 70: Intro to Cypher for the SQL Developer

Add column to players table

ALTER TABLE players

ADD COLUMN nationality character varying(30);

Page 71: Intro to Cypher for the SQL Developer

Update players table

UPDATE players

SET nationality = 'Brazil'

WHERE players.id = '/aldair/profil/spieler/4151';

UPDATE players

SET nationality = 'Germany'

WHERE players.id ='/ulf-kirsten/profil/spieler/74';

UPDATE players

SET nationality = 'England'

WHERE players.id ='/john-lukic/profil/spieler/28241';

Page 72: Intro to Cypher for the SQL Developer

Graph refactoring

Page 73: Intro to Cypher for the SQL Developer

Graph model

Page 74: Intro to Cypher for the SQL Developer

Add property to player nodes

USING PERIODIC COMMIT

LOAD CSV WITH HEADERS FROM "file:///transfers.csv" AS row

MATCH (player:Player {id: row.playerUri})

SET player.nationality = row.playerNationality

Page 75: Intro to Cypher for the SQL Developer

Find transfers of English players

Page 76: Intro to Cypher for the SQL Developer

SELECT players.name, clubFrom.name, clubTo.name, t."numericFee", t.season

FROM transfers AS t

JOIN clubs AS clubFrom ON t.from_club_id = clubFrom.id

JOIN clubs AS clubTo ON t.to_club_id = clubTo.id

JOIN players ON t.player_id = players.id

WHERE clubFrom.country = 'England' AND clubTo.country = 'England'

AND players.nationality = 'England'

ORDER BY t."numericFee" DESC

LIMIT 10

Page 77: Intro to Cypher for the SQL Developer

SELECT players.name, clubFrom.name, clubTo.name, t."numericFee", t.season

FROM transfers AS t

JOIN clubs AS clubFrom ON t.from_club_id = clubFrom.id

JOIN clubs AS clubTo ON t.to_club_id = clubTo.id

JOIN players ON t.player_id = players.id

WHERE clubFrom.country = 'England' AND clubTo.country = 'England'

AND players.nationality = 'England'

ORDER BY t."numericFee" DESC

LIMIT 10

MATCH (to:Club)<-[:TO_CLUB]-(t:Transfer)-[:FROM_CLUB]-(from:Club),

(t)-[:OF_PLAYER]->(player:Player)

WHERE to.country = "England" AND from.country = "England"

AND player.nationality = "England"

RETURN player.name, from.name, to.name, t.numericFee, t.season

ORDER BY t.numericFee DESC

LIMIT 10

Page 79: Intro to Cypher for the SQL Developer
Page 80: Intro to Cypher for the SQL Developer

Countries and confederations

|----------------------+----------------|
| country              | confederation  |
|----------------------+----------------|
| Afghanistan          | afc            |
| Albania              | uefa           |
| Algeria              | caf            |
| American Samoa       | ofc            |
| Andorra              | uefa           |
| Angola               | caf            |
| Anguilla             | concacaf       |
| Antigua and Barbuda  | concacaf       |
| Argentina            | conmebol       |
|----------------------+----------------|

|-----------+-----------+-------------------------------------------------|
| urlName   | shortName | region                                          |
|-----------+-----------+-------------------------------------------------|
| afc       | AFC       | Asia                                            |
| uefa      | UEFA      | Europe                                          |
| ofc       | OFC       | Oceania                                         |
| conmebol  | CONMEBOL  | South America                                   |
| concacaf  | CONCACAF  | North American, Central American and Caribbean  |
| caf       | CAF       | Africa                                          |
|-----------+-----------+-------------------------------------------------|

Page 81: Intro to Cypher for the SQL Developer

Relational migration

Page 82: Intro to Cypher for the SQL Developer

Relational Model

players

id

name

position

country_id

clubs

id

name

country_id

transfers

id

fee

player_age

player_id

from_club_id

to_club_id

season

countries

id

name

confederation_id

confederations

id

shortName

name

region

Page 83: Intro to Cypher for the SQL Developer

Create confederations table

CREATE TABLE confederations (

"id" character varying(10)

NOT NULL PRIMARY KEY,

"shortName" character varying(50) NOT NULL,

"name" character varying(100) NOT NULL,

"region" character varying(100) NOT NULL

);

Page 84: Intro to Cypher for the SQL Developer

Populate confederations

INSERT INTO confederations VALUES('afc', 'AFC',
'Asian Football Confederation', 'Asia');

INSERT INTO confederations VALUES('uefa', 'UEFA',
'Union of European Football Associations', 'Europe');

INSERT INTO confederations VALUES('ofc', 'OFC',
'Oceania Football Confederation', 'Oceania');

Page 85: Intro to Cypher for the SQL Developer

Create countries table

CREATE TABLE countries (

"code" character varying(3)

NOT NULL PRIMARY KEY,

"name" character varying(50)

NOT NULL,

"federation" character varying(10) NOT NULL

REFERENCES confederations (id)

);

Page 86: Intro to Cypher for the SQL Developer

Populate countries

INSERT INTO countries VALUES('MNE', 'Montenegro', 'uefa');

INSERT INTO countries VALUES('LTU', 'Lithuania', 'uefa');

INSERT INTO countries VALUES('CAM', 'Cambodia', 'afc');

INSERT INTO countries VALUES('SUI', 'Switzerland', 'uefa');

INSERT INTO countries VALUES('ETH', 'Ethiopia', 'caf');

INSERT INTO countries VALUES('ARU', 'Aruba', 'concacaf');

INSERT INTO countries VALUES('SWZ', 'Swaziland', 'caf');

INSERT INTO countries VALUES('PLE', 'Palestine', 'afc');

Page 87: Intro to Cypher for the SQL Developer

Add column to clubs table

ALTER TABLE clubs

ADD COLUMN country_id character varying(3)

REFERENCES countries(code);

Page 88: Intro to Cypher for the SQL Developer

Update clubs

UPDATE clubs AS cl

SET country_id = c.code

FROM clubs

INNER JOIN countries AS c

ON c.name = clubs.country

WHERE cl.id = clubs.id;

Page 89: Intro to Cypher for the SQL Developer

Update clubs

# select * from clubs limit 5;

id | name | country | country_id

----------------------------------------+-----------------------------+---------------+------------

/san-jose-clash/startseite/verein/4942 | San Jose Clash | United States | USA

/chicago/startseite/verein/432 | Chicago Fire | United States | USA

/gz-evergrande/startseite/verein/10948 | Guangzhou Evergrande Taobao | China | CHN

/as-vita-club/startseite/verein/2225 | AS Vita Club Kinshasa | Congo DR | CGO

/vicenza/startseite/verein/2655 | Vicenza Calcio | Italy | ITA

(5 rows)

Page 90: Intro to Cypher for the SQL Developer

Remove country

ALTER TABLE clubs

DROP COLUMN country;

Page 91: Intro to Cypher for the SQL Developer

Add column to players table

ALTER TABLE players

ADD COLUMN country_id character varying(3)

REFERENCES countries(code);

Page 92: Intro to Cypher for the SQL Developer

Update players

UPDATE players AS p

SET country_id = c.code

FROM players

INNER JOIN countries AS c

ON c.name = players.nationality

WHERE p.id = players.id;

Page 93: Intro to Cypher for the SQL Developer

Update players

# select * from players limit 5;

id | name | position | nationality | country_id

-----------------------------------------+-------------------+--------------------+-------------+------------

/dalian-atkinson/profil/spieler/200738 | Dalian Atkinson | Attacking Midfield | England | ENG

/steve-redmond/profil/spieler/177056 | Steve Redmond | Centre Back | England | ENG

/bert-konterman/profil/spieler/6252 | Bert Konterman | Centre Back | Netherlands | NED

/lee-philpott/profil/spieler/228030 | Lee Philpott | Midfield | England | ENG

/tomasz-frankowski/profil/spieler/14911 | Tomasz Frankowski | Centre Forward | Poland | POL

(5 rows)

Page 94: Intro to Cypher for the SQL Developer

Remove nationality

ALTER TABLE players

DROP COLUMN nationality;

Page 95: Intro to Cypher for the SQL Developer

Graph refactoring

Page 96: Intro to Cypher for the SQL Developer

Graph model

Page 97: Intro to Cypher for the SQL Developer

Import confederations

LOAD CSV WITH HEADERS

FROM "file:///confederations.csv" AS row

MERGE (c:Confederation {id: row.urlName})

ON CREATE

SET c.shortName = row.shortName,

c.region = row.region,

c.name = row.name

Page 98: Intro to Cypher for the SQL Developer

Import countries

LOAD CSV WITH HEADERS FROM "file:///countries.csv"

AS row

MERGE (country:Country {id: row.countryCode})

ON CREATE SET country.name = row.country

WITH country, row

MATCH (conf:Confederation {id: row.confederation })

MERGE (country)-[:PART_OF]->(conf)

Page 99: Intro to Cypher for the SQL Developer

Refactor clubs

MATCH (club:Club)

MATCH (country:Country {name: club.country})

MERGE (club)-[:PART_OF]->(country)

REMOVE club.country

Page 100: Intro to Cypher for the SQL Developer

Refactor players

MATCH (player:Player)

MATCH (country:Country {name: player.nationality})

MERGE (player)-[:PLAYS_FOR]->(country)

REMOVE player.nationality

Page 101: Intro to Cypher for the SQL Developer

Recap: Find transfers of English players

Page 102: Intro to Cypher for the SQL Developer

SELECT players.name, clubFrom.name, clubTo.name, t."numericFee", t.season

FROM transfers AS t

JOIN clubs AS clubFrom ON t.from_club_id = clubFrom.id

JOIN clubs AS clubTo ON t.to_club_id = clubTo.id

JOIN players ON t.player_id = players.id

WHERE clubFrom.country = 'England' AND clubTo.country = 'England'

AND players.nationality = 'England'

ORDER BY t."numericFee" DESC

LIMIT 10

MATCH (to:Club)<-[:TO_CLUB]-(t:Transfer)-[:FROM_CLUB]-(from:Club),

(t)-[:OF_PLAYER]->(player:Player)

WHERE to.country = "England" AND from.country = "England"

AND player.nationality = "England"

RETURN player.name, from.name, to.name, t.numericFee, t.season

ORDER BY t.numericFee DESC

LIMIT 10

Page 103: Intro to Cypher for the SQL Developer

SELECT players.name, clubFrom.name, clubTo.name, t."numericFee", t.season

FROM transfers AS t

JOIN clubs AS clubFrom ON t.from_club_id = clubFrom.id

JOIN clubs AS clubTo ON t.to_club_id = clubTo.id

JOIN players ON t.player_id = players.id

JOIN countries AS fromCount ON clubFrom.country_id = fromCount.code

JOIN countries AS toCount ON clubTo.country_id = toCount.code

JOIN countries AS playerCount ON players.country_id = playerCount.code

WHERE fromCount.name = 'England' AND toCount.name = 'England' AND playerCount.name = 'England'

ORDER BY t."numericFee" DESC

LIMIT 10

MATCH (to:Club)<-[:TO_CLUB]-(t:Transfer)-[:FROM_CLUB]-(from:Club),

(t)-[:OF_PLAYER]->(player:Player)-[:PLAYS_FOR]->(country:Country),

(to)-[:PART_OF]->(country)<-[:PART_OF]-(from)

WHERE country.name = "England"

RETURN player.name, from.name, to.name, t.numericFee, t.season

ORDER BY t.numericFee DESC

LIMIT 10

Page 106: Intro to Cypher for the SQL Developer

Find transfers between different confederations

Page 107: Intro to Cypher for the SQL Developer

SELECT * FROM transfers AS t
JOIN clubs AS clubFrom ON t.from_club_id = clubFrom.id
JOIN clubs AS clubTo ON t.to_club_id = clubTo.id
JOIN players ON t.player_id = players.id
JOIN countries AS fromCountry ON clubFrom.country_id = fromCountry.code
JOIN countries AS toCountry ON clubTo.country_id = toCountry.code
JOIN confederations AS fromConfederation ON fromCountry.federation = fromConfederation.id
JOIN confederations AS toConfederation ON toCountry.federation = toConfederation.id
WHERE fromConfederation.id = 'afc' AND toConfederation.id = 'uefa'
ORDER BY t."numericFee" DESC
LIMIT 10

Page 108: Intro to Cypher for the SQL Developer

SELECT * FROM transfers AS t
JOIN clubs AS clubFrom ON t.from_club_id = clubFrom.id
JOIN clubs AS clubTo ON t.to_club_id = clubTo.id
JOIN players ON t.player_id = players.id
JOIN countries AS fromCountry ON clubFrom.country_id = fromCountry.code
JOIN countries AS toCountry ON clubTo.country_id = toCountry.code
JOIN confederations AS fromConfederation ON fromCountry.federation = fromConfederation.id
JOIN confederations AS toConfederation ON toCountry.federation = toConfederation.id
WHERE fromConfederation.id = 'afc' AND toConfederation.id = 'uefa'
ORDER BY t."numericFee" DESC
LIMIT 10

MATCH (to:Club)<-[:TO_CLUB]-(t:Transfer)-[:FROM_CLUB]-(from:Club),
      (t)-[:OF_PLAYER]->(player:Player),
      (from)-[:PART_OF*2]->(:Confederation {id: "afc"}),
      (to)-[:PART_OF*2]->(:Confederation {id: "uefa"})
RETURN player.name, from.name, to.name, t.numericFee, t.season
ORDER BY t.numericFee DESC
LIMIT 10

Page 109: Intro to Cypher for the SQL Developer
Page 110: Intro to Cypher for the SQL Developer

What’s in my database?

Page 111: Intro to Cypher for the SQL Developer

Tables

# \dt
             List of relations
 Schema |      Name      | Type  |    Owner
--------+----------------+-------+-------------
 public | clubs          | table | markneedham
 public | confederations | table | markneedham
 public | countries      | table | markneedham
 public | players        | table | markneedham
 public | transfers      | table | markneedham
(5 rows)

Page 112: Intro to Cypher for the SQL Developer

Node labels

Page 113: Intro to Cypher for the SQL Developer

Node labels

CALL db.labels()

+=============+
|label        |
+=============+
|Player       |
+-------------+
|Club         |
+-------------+
|Transfer     |
+-------------+
|Loan         |
+-------------+
|Confederation|
+-------------+
|Country      |
+-------------+

Page 114: Intro to Cypher for the SQL Developer

Table schema

# \d+ countries
                              Table "public.countries"
   Column   |         Type          | Modifiers | Storage  | Stats target | Description
------------+-----------------------+-----------+----------+--------------+-------------
 code       | character varying(3)  | not null  | extended |              |
 name       | character varying(50) | not null  | extended |              |
 federation | character varying(10) | not null  | extended |              |
Indexes:
    "pk_countries" PRIMARY KEY, btree (code)
Foreign-key constraints:
    "countries_federation_fkey" FOREIGN KEY (federation) REFERENCES confederations(id)
Referenced by:
    TABLE "players" CONSTRAINT "playersfk" FOREIGN KEY (country_id) REFERENCES countries(code) MATCH FULL

Page 115: Intro to Cypher for the SQL Developer

Graph schema

:schema

Indexes
  ON :Club(name)   ONLINE
  ON :Club(id)     ONLINE (for uniqueness constraint)
  ON :Player(name) ONLINE
  ON :Player(id)   ONLINE (for uniqueness constraint)

Constraints
  ON (player:Player) ASSERT player.id IS UNIQUE
  ON (club:Club) ASSERT exists(club.name)
  ON (club:Club) ASSERT club.id IS UNIQUE
  ON ()-[of_player:OF_PLAYER]-() ASSERT exists(of_player.age)

Page 116: Intro to Cypher for the SQL Developer

Graph schema

MATCH (country:Country)
RETURN keys(country), COUNT(*) AS times

+-----------------------+
| keys(country) | times |
+-----------------------+
| ["id","name"] | 198   |
+-----------------------+

Page 117: Intro to Cypher for the SQL Developer

Graph schema

MATCH (club:Club)
RETURN keys(club), COUNT(*) AS times

+---------------------------------+
| keys(club)              | times |
+---------------------------------+
| ["id","name"]           | 806   |
| ["name","country","id"] | 1     |
+---------------------------------+
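Grouping nodes by their property keys is a quick way to spot records that do not match the rest. A minimal Python sketch of the same idea, assuming each node is a plain dict (three of the clubs shown on these slides, typed in by hand) and with no Neo4j involved. Unlike keys() in Cypher, this sketch sorts the key names so that key order never splits a group; that is a choice made here, not something the slides state.

```python
from collections import Counter

# Three clubs copied by hand from the slides; one carries a stray "country" property.
SAMPLE_CLUBS = [
    {"id": "/unknown/startseite/verein/75", "name": "Unknown"},
    {"id": "/pohang/startseite/verein/311", "name": "Pohang Steelers"},
    {"id": "/monaco/startseite/verein/162", "name": "AS Monaco", "country": "Monaco"},
]


def property_key_counts(nodes):
    """Count how many nodes share each (sorted) set of property keys."""
    return Counter(tuple(sorted(node)) for node in nodes)


if __name__ == "__main__":
    for keys, times in property_key_counts(SAMPLE_CLUBS).items():
        print(list(keys), times)
```

Any key set with a very low count is a candidate for clean-up.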

Page 118: Intro to Cypher for the SQL Developer

Entity/Relationship diagram

Page 119: Intro to Cypher for the SQL Developer

Meta graph

Page 120: Intro to Cypher for the SQL Developer

Meta graph

MATCH (a)-[r]->(b)
WITH head(labels(a)) AS l, head(labels(b)) AS l2, type(r) AS rel_type, count(*) AS count
CALL apoc.create.vNode([l], {name: l}) YIELD node AS a
CALL apoc.create.vNode([l2], {name: l2}) YIELD node AS b
CALL apoc.create.vRelationship(a, rel_type, {name: rel_type, count: count}, b) YIELD rel
RETURN *;
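The meta graph query groups every relationship by the first label of its start node, its type, and the first label of its end node, then counts each group. A minimal Python sketch of that grouping step, assuming a small hand-written relationship list (hypothetical sample data, not the real transfermarkt graph) and leaving out the APOC virtual nodes:

```python
from collections import Counter

# Hypothetical sample: (start node labels, relationship type, end node labels).
SAMPLE_RELATIONSHIPS = [
    (["Transfer"], "TO_CLUB", ["Club"]),
    (["Transfer"], "FROM_CLUB", ["Club"]),
    (["Transfer"], "OF_PLAYER", ["Player"]),
    (["Transfer"], "TO_CLUB", ["Club"]),
    (["Club"], "PART_OF", ["Country"]),
]


def meta_graph(relationships):
    """Count relationships per (first start label, type, first end label)."""
    counts = Counter()
    for start_labels, rel_type, end_labels in relationships:
        # head(labels(x)) in the Cypher query corresponds to labels[0] here.
        counts[(start_labels[0], rel_type, end_labels[0])] += 1
    return counts


if __name__ == "__main__":
    for (start, rel_type, end), count in sorted(meta_graph(SAMPLE_RELATIONSHIPS).items()):
        print(f"(:{start})-[:{rel_type} x{count}]->(:{end})")
```

Each distinct key becomes one edge of the meta graph, with the count as a property.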

Page 121: Intro to Cypher for the SQL Developer

Data Integrity

Page 122: Intro to Cypher for the SQL Developer

Clubs without country

# SELECT * FROM clubs where country_id is null;
                  id                   |          name           |    country    | country_id
---------------------------------------+-------------------------+---------------+------------
 /unknown/startseite/verein/75         | Unknown                 |               |
 /pohang/startseite/verein/311         | Pohang Steelers         | Korea, South  |
 /bluewings/startseite/verein/3301     | Suwon Samsung Bluewings | Korea, South  |
 /ulsan/startseite/verein/3535         | Ulsan Hyundai           | Korea, South  |
 /africa-sports/startseite/verein/2936 | Africa Sports           | Cote d'Ivoire |
 /monaco/startseite/verein/162         | AS Monaco               | Monaco        |
 /jeonbuk/startseite/verein/6502       | Jeonbuk Hyundai Motors  | Korea, South  |
 /busan/startseite/verein/2582         | Busan IPark             | Korea, South  |
(8 rows)

Page 123: Intro to Cypher for the SQL Developer

Clubs without country

MATCH (club:Club)
WHERE NOT (club)-[:PART_OF]->()
RETURN club

+=====================================================================+
|club                                                                 |
+=====================================================================+
|{name: Unknown, id: /unknown/startseite/verein/75}                   |
+---------------------------------------------------------------------+
|{country: Monaco, name: AS Monaco, id: /monaco/startseite/verein/162}|
+---------------------------------------------------------------------+

Page 124: Intro to Cypher for the SQL Developer

Deleting data - SQL

# drop table countries;
ERROR: cannot drop table countries because other objects depend on it
DETAIL: constraint playersfk on table players depends on table countries
HINT: Use DROP ... CASCADE to drop the dependent objects too.
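Foreign keys guard row deletes in the same way, which is the closer SQL analogue of deleting every Country node. A minimal sketch using Python's built-in sqlite3 module in place of PostgreSQL (an assumption made so the example runs on its own; the two-table schema and the player row are made up for illustration):

```python
import sqlite3


def delete_all_countries():
    """Try to delete every country while a player still references one."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE countries (code TEXT PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE players ("
        "id TEXT PRIMARY KEY, name TEXT NOT NULL, "
        "country_id TEXT REFERENCES countries(code))"
    )
    conn.execute("INSERT INTO countries VALUES ('ENG', 'England')")
    conn.execute("INSERT INTO players VALUES ('p1', 'Example Player', 'ENG')")
    try:
        conn.execute("DELETE FROM countries")
        return "deleted"
    except sqlite3.IntegrityError as error:
        # The referencing player row makes the delete fail.
        return f"blocked: {error}"
    finally:
        conn.close()


if __name__ == "__main__":
    print(delete_all_countries())
```

The database refuses the delete until the referencing rows are removed or the constraint is declared with a cascading action.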

Page 125: Intro to Cypher for the SQL Developer

Deleting data - Cypher

MATCH (country:Country)
DELETE country

org.neo4j.kernel.api.exceptions.TransactionFailureException: Node record Node[11306,used=false,rel=24095,prop=-1,labels=Inline(0x0:[]),light] still has relationships

Page 126: Intro to Cypher for the SQL Developer

Deleting data - Cypher

MATCH (country:Country)
DETACH DELETE country

Deleted 198 nodes, deleted 5071 relationships, statement executed in 498 ms.

Page 127: Intro to Cypher for the SQL Developer

Query Optimisation

Page 128: Intro to Cypher for the SQL Developer

Optimising queries

‣ Use EXPLAIN/PROFILE to see what your queries are doing under the covers
‣ Index the starting points of queries
‣ Reduce work in progress of intermediate parts of the query where possible
‣ Look at the warnings in the Neo4j browser - they are often helpful!
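The first two tips carry over directly from the relational world. A rough sketch of the idea using SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module, as a stand-in only: Neo4j's EXPLAIN/PROFILE output looks different, and the clubs table and the index name idx_clubs_name are made up for this illustration.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


def plans_before_and_after_index():
    """Compare the plan for a name lookup without and with an index."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clubs (id TEXT, name TEXT)")
    lookup = "SELECT id FROM clubs WHERE name = ?"
    before = query_plan(conn, lookup, ("AS Monaco",))
    # Index the column the query starts from.
    conn.execute("CREATE INDEX idx_clubs_name ON clubs (name)")
    after = query_plan(conn, lookup, ("AS Monaco",))
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = plans_before_and_after_index()
    print("before:", before)
    print("after: ", after)
```

Without the index the plan is a full scan of the table; with it, the plan names the index. Checking the plan before and after a change is the habit the first bullet recommends.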

Page 129: Intro to Cypher for the SQL Developer

Optimising queries - useful links

‣ Tuning Your Cypher
  https://www.youtube.com/watch?v=tYtyoYcd_e8
‣ Neo4j 2.2 Query Tuning
  http://neo4j.com/blog/neo4j-2-2-query-tuning/
‣ Ask for help on Stack Overflow/Neo4j Slack
  http://neo4j-users-slack-invite.herokuapp.com

Page 130: Intro to Cypher for the SQL Developer

One more thing...

Page 131: Intro to Cypher for the SQL Developer

Procedures

‣ New in Neo4j 3.0.0!

Page 132: Intro to Cypher for the SQL Developer

Procedures

‣ New in Neo4j 3.0.0!
‣ We’ve already seen an example!

CALL db.labels()

‣ Michael Hunger has created a set of procedures (APOC) at:
  https://github.com/jexp/neo4j-apoc-procedures

Page 133: Intro to Cypher for the SQL Developer

Querying github

WITH "https://api.github.com/search/repositories?q=neo4j" AS githubUri
CALL apoc.load.json(githubUri) YIELD value AS document
UNWIND document.items AS item
RETURN item.full_name, item.watchers_count, item.forks
ORDER BY item.forks DESC

Page 135: Intro to Cypher for the SQL Developer

Querying github

+------------------------------------------------------------------------+
| item.full_name                      | item.watchers_count | item.forks |
+------------------------------------------------------------------------+
| "neo4j/neo4j"                       | 2472                | 872        |
| "spring-projects/spring-data-neo4j" | 403                 | 476        |
| "neo4j-contrib/developer-resources" | 106                 | 295        |
| "neo4jrb/neo4j"                     | 1014                | 190        |
| "jadell/neo4jphp"                   | 507                 | 140        |
| "thingdom/node-neo4j"               | 780                 | 127        |
| "aseemk/node-neo4j-template"        | 176                 | 91         |
| "jimwebber/neo4j-tutorial"          | 268                 | 87         |
| "rickardoberg/neo4j-jdbc"           | 33                  | 68         |
| "FaKod/neo4j-scala"                 | 194                 | 64         |
+------------------------------------------------------------------------+
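The GitHub query does three things: it loads a JSON document, turns its items array into rows with UNWIND, and sorts them with ORDER BY. A rough Python sketch of the same three steps, run against a small inline JSON string (made-up repository names and numbers, using only the field names the query reads) instead of the live GitHub API:

```python
import json

# Made-up payload with the same field names the Cypher query reads.
SAMPLE_RESPONSE = """
{"items": [
  {"full_name": "example/alpha", "watchers_count": 12, "forks": 3},
  {"full_name": "example/beta", "watchers_count": 40, "forks": 9},
  {"full_name": "example/gamma", "watchers_count": 7, "forks": 5}
]}
"""


def repositories_by_forks(response_text):
    """Return (full_name, watchers_count, forks) rows, most-forked first."""
    document = json.loads(response_text)  # CALL apoc.load.json(...) YIELD value
    rows = [
        (item["full_name"], item["watchers_count"], item["forks"])
        for item in document["items"]  # UNWIND document.items AS item
    ]
    # ORDER BY item.forks DESC
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for row in repositories_by_forks(SAMPLE_RESPONSE):
        print(row)
```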

Page 136: Intro to Cypher for the SQL Developer

Questions? :-)

Mark Needham [email protected] @markhneedham

https://github.com/neo4j-meetups/cypher-for-sql-developers