Intro to Cypher for the SQL Developer

Posted on 22-Jan-2017

Cypher for SQL Developers

Mark Needham @markhneedham mark@neo4j.com

Talk structure

‣ Introduce data set
‣ Modeling
‣ Import
‣ Data Integrity
‣ Queries
‣ Migration/Refactoring
‣ Query optimisation

Introducing our data set...

Exploring transfermarkt

|--------+--------------------+-----------------------------------------+--------------------+-----------|
| season | playerName         | playerUri                               | playerPosition     | playerAge |
|--------+--------------------+-----------------------------------------+--------------------+-----------|
| 90/91  | Aldair             | /aldair/profil/spieler/4151             | Centre Back        | 24        |
| 90/91  | Thomas Häßler      | /thomas-hassler/profil/spieler/553      | Attacking Midfield | 24        |
| 90/91  | Roberto Baggio     | /roberto-baggio/profil/spieler/4153     | Secondary Striker  | 23        |
| 90/91  | Karl-Heinz Riedle  | /karl-heinz-riedle/profil/spieler/13806 | Centre Forward     | 24        |
| 90/91  | Henrik Larsen      | /henrik-larsen/profil/spieler/101330    | Attacking Midfield | 24        |
| 90/91  | Gheorghe Hagi      | /gheorghe-hagi/profil/spieler/7939      | Attacking Midfield | 25        |
| 90/91  | Hristo Stoichkov   | /hristo-stoichkov/profil/spieler/7938   | Left Wing          | 24        |
| 90/91  | Brian Laudrup      | /brian-laudrup/profil/spieler/39667     | Centre Forward     | 21        |
| 90/91  | Miguel Ángel Nadal | /miguel-angel-nadal/profil/spieler/7676 | Centre Back        | 23        |
|--------+--------------------+-----------------------------------------+--------------------+-----------|

|------------------+---------------------+-------------------------------------+-------------------|
| sellerClubName   | sellerClubNameShort | sellerClubUri                       | sellerClubCountry |
|------------------+---------------------+-------------------------------------+-------------------|
| SL Benfica       | Benfica             | /benfica/startseite/verein/294      | Portugal          |
| 1. FC Köln       | 1. FC Köln          | /1-fc-koln/startseite/verein/3      | Germany           |
| ACF Fiorentina   | Fiorentina          | /fiorentina/startseite/verein/430   | Italy             |
| SV Werder Bremen | Werder Bremen       | /werder-bremen/startseite/verein/86 | Germany           |
| Lyngby BK        | Lyngby BK           | /lyngby-bk/startseite/verein/369    | Denmark           |
| Steaua Bucharest | Steaua              | /steaua/startseite/verein/301       | Romania           |
| CSKA Sofia       | CSKA Sofia          | /cska-sofia/startseite/verein/208   | Bulgaria          |
| KFC Uerdingen 05 | KFC Uerdingen       | /kfc-uerdingen/startseite/verein/95 | Germany           |
| RCD Mallorca     | RCD Mallorca        | /rcd-mallorca/startseite/verein/237 | Spain             |
|------------------+---------------------+-------------------------------------+-------------------|

|---------------+--------------------+-------------------------------------+------------------|
| buyerClubName | buyerClubNameShort | buyerClubUri                        | buyerClubCountry |
|---------------+--------------------+-------------------------------------+------------------|
| AS Roma       | AS Roma            | /as-roma/startseite/verein/12       | Italy            |
| Juventus FC   | Juventus           | /juventus/startseite/verein/506     | Italy            |
| Juventus FC   | Juventus           | /juventus/startseite/verein/506     | Italy            |
| SS Lazio      | Lazio              | /lazio/startseite/verein/398        | Italy            |
| AC Pisa 1909  | AC Pisa            | /ac-pisa/startseite/verein/4172     | Italy            |
| Real Madrid   | Real Madrid        | /real-madrid/startseite/verein/418  | Spain            |
| FC Barcelona  | FC Barcelona       | /fc-barcelona/startseite/verein/131 | Spain            |
| Bayern Munich | Bayern Munich      | /bayern-munich/startseite/verein/27 | Germany          |
| FC Barcelona  | FC Barcelona       | /fc-barcelona/startseite/verein/131 | Spain            |
|---------------+--------------------+-------------------------------------+------------------|

|-------------------------------------------------------+-------------+--------------|
| transferUri                                           | transferFee | transferRank |
|-------------------------------------------------------+-------------+--------------|
| /jumplist/transfers/spieler/4151/transfer_id/6993     | £6.75m      | 1            |
| /jumplist/transfers/spieler/553/transfer_id/2405      | £5.85m      | 2            |
| /jumplist/transfers/spieler/4153/transfer_id/84533    | £5.81m      | 3            |
| /jumplist/transfers/spieler/13806/transfer_id/19054   | £5.63m      | 4            |
| /jumplist/transfers/spieler/101330/transfer_id/275067 | £5.03m      | 5            |
| /jumplist/transfers/spieler/7939/transfer_id/19343    | £3.23m      | 6            |
| /jumplist/transfers/spieler/7938/transfer_id/11563    | £2.25m      | 7            |
| /jumplist/transfers/spieler/39667/transfer_id/90285   | £2.25m      | 8            |
| /jumplist/transfers/spieler/7676/transfer_id/11828    | £2.10m      | 9            |
|-------------------------------------------------------+-------------+--------------|

Relational Model

players: id, name, position

clubs: id, name, country

transfers: id, fee, player_age, player_id, from_club_id, to_club_id, season

Graph model

Nodes

Relationships

Properties

Labels

Relational vs Graph

Relational: records in tables; "soft" relationships computed at query time

Graph: nodes; "hard" relationships built into the data store
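A rough way to picture the difference, sketched in plain Python rather than in either database. The `Node` class and both helper functions are hypothetical names made up for this illustration; the Aldair to AS Roma transfer comes from the data set above.

```python
# Relational style: records in tables, linked only by matching key values.
players = {"/aldair/profil/spieler/4151": {"name": "Aldair"}}
clubs = {"/as-roma/startseite/verein/12": {"name": "AS Roma"}}
transfers = [
    {"player_id": "/aldair/profil/spieler/4151",
     "to_club_id": "/as-roma/startseite/verein/12"},
]


def destinations_relational(player_name):
    """The 'soft' relationship is computed at query time by comparing keys."""
    return [
        clubs[t["to_club_id"]]["name"]
        for t in transfers
        if players[t["player_id"]]["name"] == player_name
    ]


# Graph style: each node stores direct references to its neighbours.
class Node:
    def __init__(self, label, **properties):
        self.label = label
        self.properties = properties
        self.outgoing = []  # list of (relationship type, target node)

    def relate(self, rel_type, target):
        self.outgoing.append((rel_type, target))


aldair = Node("Player", name="Aldair")
roma = Node("Club", name="AS Roma")
transfer = Node("Transfer", season="90/91")
transfer.relate("OF_PLAYER", aldair)
transfer.relate("TO_CLUB", roma)


def destinations_graph(transfer_node):
    """The 'hard' relationship is already stored, so we only follow it."""
    return [
        target.properties["name"]
        for rel_type, target in transfer_node.outgoing
        if rel_type == "TO_CLUB"
    ]
```

Both functions return the same club name; the difference is whether the link is looked up by value on every query or followed as a stored reference.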

Relational Import

Create players table

CREATE TABLE players (
  "id" character varying(100) NOT NULL PRIMARY KEY,
  "name" character varying(150) NOT NULL,
  "position" character varying(20)
);

Insert players

INSERT INTO players
VALUES('/aldair/profil/spieler/4151', 'Aldair', 'Centre Back');

INSERT INTO players
VALUES('/thomas-hassler/profil/spieler/553', 'Thomas Häßler', 'Attacking Midfield');

INSERT INTO players
VALUES('/roberto-baggio/profil/spieler/4153', 'Roberto Baggio', 'Secondary Striker');

Create clubs table

CREATE TABLE clubs (
  "id" character varying(100) NOT NULL PRIMARY KEY,
  "name" character varying(50) NOT NULL,
  "country" character varying(50)
);

Insert clubs

INSERT INTO clubs VALUES('/hertha-bsc/startseite/verein/44',

'Hertha BSC', 'Germany');

INSERT INTO clubs VALUES('/cfr-cluj/startseite/verein/7769',

'CFR Cluj', 'Romania');

INSERT INTO clubs VALUES('/real-sociedad/startseite/verein/681',

'Real Sociedad', 'Spain');

Create transfers table

CREATE TABLE transfers (

"id" character varying(100) NOT NULL PRIMARY KEY,

"fee" character varying(50) NOT NULL,

"numericFee" integer NOT NULL,

"player_age" smallint NOT NULL,

"season" character varying(5) NOT NULL,

"player_id" character varying(100) NOT NULL REFERENCES players (id),

"from_club_id" character varying(100) NOT NULL REFERENCES clubs (id),

"to_club_id" character varying(100) NOT NULL REFERENCES clubs (id)

);

Insert transfers

INSERT INTO transfers VALUES('/jumplist/transfers/spieler/4151/transfer_id/6993',
  '£6.75m', 6750000, 24, '90/91', '/aldair/profil/spieler/4151',
  '/benfica/startseite/verein/294', '/as-roma/startseite/verein/12');

INSERT INTO transfers VALUES('/jumplist/transfers/spieler/553/transfer_id/2405',
  '£5.85m', 5850000, 24, '90/91', '/thomas-hassler/profil/spieler/553',
  '/1-fc-koln/startseite/verein/3', '/juventus/startseite/verein/506');

INSERT INTO transfers VALUES('/jumplist/transfers/spieler/4153/transfer_id/84533',
  '£5.81m', 5810000, 23, '90/91', '/roberto-baggio/profil/spieler/4153',
  '/fiorentina/startseite/verein/430', '/juventus/startseite/verein/506');

Graph Import

LOAD CSV

‣ Tool for importing CSV files
‣ Intended for data sets of ~10M records
‣ Works against live database
‣ Use Cypher constructs to define graph

LOAD CSV

[USING PERIODIC COMMIT [1000]]

LOAD CSV WITH HEADERS FROM "(file|http)://" AS row

MATCH (:Label {property: row.header})

CREATE (:Label {property: row.header})

MERGE (:Label {property: row.header})

Exploring the data

LOAD CSV WITH HEADERS

FROM "file:///transfers.csv"

AS row

RETURN COUNT(*)

Exploring the data

LOAD CSV WITH HEADERS

FROM "file:///transfers.csv"

AS row

RETURN row

LIMIT 1

Import players

USING PERIODIC COMMIT

LOAD CSV WITH HEADERS FROM "file:///transfers.csv" AS row

CREATE (player:Player {

id: row.playerUri,

name: row.playerName,

position: row.playerPosition

})

Not so fast!

Ensure uniqueness of players

CREATE CONSTRAINT ON (player:Player)

ASSERT player.id IS UNIQUE

Import players

USING PERIODIC COMMIT

LOAD CSV WITH HEADERS FROM "file:///transfers.csv" AS row

CREATE (player:Player {

id: row.playerUri,

name: row.playerName,

position: row.playerPosition

})

Node 25 already exists with label Player and property "id"=[/peter-lux/profil/spieler/84682]

Import players

USING PERIODIC COMMIT

LOAD CSV WITH HEADERS FROM "file:///transfers.csv" AS row

MERGE (player:Player {id: row.playerUri})

ON CREATE SET player.name = row.playerName,

player.position = row.playerPosition
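Why CREATE fails and MERGE succeeds can be sketched outside the database. The Python below is only an analogy, not how Neo4j implements either clause: a dict keyed by `id` stands in for the uniqueness constraint, and `create_player`, `merge_player`, `load`, `read_rows` and the three-row sample CSV are hypothetical names and data made up for the illustration. The duplicated Aldair row mimics a player who appears in more than one transfer.

```python
import csv
import io

SAMPLE_CSV = """playerUri,playerName,playerPosition
/aldair/profil/spieler/4151,Aldair,Centre Back
/aldair/profil/spieler/4151,Aldair,Centre Back
/thomas-hassler/profil/spieler/553,Thomas Häßler,Attacking Midfield
"""


class ConstraintViolation(Exception):
    """Raised when a second node with the same id would be created."""


def create_player(store, row):
    # Like CREATE under a uniqueness constraint: a duplicate id is an error.
    key = row["playerUri"]
    if key in store:
        raise ConstraintViolation(f"Player with id {key} already exists")
    store[key] = {"name": row["playerName"], "position": row["playerPosition"]}


def merge_player(store, row):
    # Like MERGE ... ON CREATE SET: match on id, set properties only on creation.
    key = row["playerUri"]
    if key not in store:
        store[key] = {"name": row["playerName"], "position": row["playerPosition"]}
    return store[key]


def load(rows, write):
    """Apply `write` to every row, starting from an empty store."""
    store = {}
    for row in rows:
        write(store, row)
    return store


def read_rows():
    return list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
```

Calling `load(read_rows(), merge_player)` yields two players, while `load(read_rows(), create_player)` raises on the second Aldair row.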

Import clubs

CREATE CONSTRAINT ON (club:Club)

ASSERT club.id IS UNIQUE

Import selling clubs

USING PERIODIC COMMIT

LOAD CSV WITH HEADERS FROM "file:///transfers.csv" AS row

MERGE (club:Club {id: row.sellerClubUri})

ON CREATE SET club.name = row.sellerClubName,

club.country = row.sellerClubCountry

Import buying clubs

USING PERIODIC COMMIT

LOAD CSV WITH HEADERS FROM "file:///transfers.csv" AS row

MERGE (club:Club {id: row.buyerClubUri})

ON CREATE SET club.name = row.buyerClubName,

club.country = row.buyerClubCountry

Import transfers

CREATE CONSTRAINT ON (transfer:Transfer)

ASSERT transfer.id IS UNIQUE

Import transfers

LOAD CSV WITH HEADERS FROM "file:///transfers.csv" AS row

MATCH (player:Player {id: row.playerUri})

MATCH (source:Club {id: row.sellerClubUri})

MATCH (destination:Club {id: row.buyerClubUri})

MERGE (t:Transfer {id: row.transferUri})

ON CREATE SET t.season = row.season, t.rank = row.transferRank,

t.fee = row.transferFee

MERGE (t)-[:OF_PLAYER { age: row.playerAge }]->(player)

MERGE (t)-[:FROM_CLUB]->(source)

MERGE (t)-[:TO_CLUB]->(destination)
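The import above stores the fee as text ("£6.75m"), while the later queries sort on a numeric fee. The deck does not show how that number is derived, so the following is only one possible conversion, sketched in Python. `parse_fee` is a hypothetical helper, and the "k" suffix is an assumption; only the "m" form appears in the sample data.

```python
from decimal import Decimal, InvalidOperation

# Assumed suffixes: only "m" is visible in the sample rows.
MULTIPLIERS = {"k": 1_000, "m": 1_000_000}


def parse_fee(fee):
    """Convert a fee string such as '£6.75m' into a whole number of pounds.

    Returns None when the text is not in the '£<number><k|m>' shape,
    for example when the fee is missing or not a monetary amount.
    """
    text = fee.strip().lower()
    if len(text) < 3 or not text.startswith("£"):
        return None
    suffix = text[-1]
    if suffix not in MULTIPLIERS:
        return None
    try:
        # Decimal avoids binary floating point rounding on values like 5.85.
        amount = Decimal(text[1:-1])
    except InvalidOperation:
        return None
    return int(amount * MULTIPLIERS[suffix])
```

With the first row of the data set, `parse_fee("£6.75m")` gives 6750000, matching the `numericFee` value used in the relational inserts.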

Schema

Optional Schema

‣ Unique node property constraint

CREATE CONSTRAINT ON (club:Club)
ASSERT club.id IS UNIQUE

‣ Node property existence constraint

CREATE CONSTRAINT ON (club:Club)
ASSERT exists(club.name)

‣ Relationship property existence constraint

CREATE CONSTRAINT ON ()-[player:OF_PLAYER]-()
ASSERT exists(player.age)

SQL vs Cypher

Find player by name

SELECT *

FROM players

WHERE players.name = 'Cristiano Ronaldo'

MATCH (player:Player { name: "Cristiano Ronaldo" })

RETURN player

Find transfers between clubs

SELECT players.name, t."numericFee", t.season

FROM transfers AS t

JOIN clubs AS clubFrom ON t.from_club_id = clubFrom.id

JOIN clubs AS clubTo ON t.to_club_id = clubTo.id

JOIN players ON t.player_id = players.id

WHERE clubFrom.name = 'Tottenham Hotspur'

AND clubTo.name = 'Manchester United'

MATCH (from:Club)<-[:FROM_CLUB]-(transfer:Transfer)-[:TO_CLUB]->(to:Club),

(transfer)-[:OF_PLAYER]->(player)

WHERE from.name = "Tottenham Hotspur" AND to.name = "Manchester United"

RETURN player.name, transfer.numericFee, transfer.season

How does Neo4j use indexes?

Indexes are only used to find the starting point for queries.

Relational: use index scans to look up rows in tables and join them with rows from other tables.

Graph: use indexes to find the starting points for a query.

Migrating/refactoring the model

Player nationality

|------------------------------------------+--------------------+--------------------|

| playerUri | playerName | playerNationality |

|------------------------------------------+--------------------+--------------------|

| /aldair/profil/spieler/4151 | Aldair | Brazil |

| /thomas-hassler/profil/spieler/553 | Thomas Häßler | Germany |

| /roberto-baggio/profil/spieler/4153 | Roberto Baggio | Italy |

| /karl-heinz-riedle/profil/spieler/13806 | Karl-Heinz Riedle | Germany |

| /henrik-larsen/profil/spieler/101330 | Henrik Larsen | Denmark |

| /gheorghe-hagi/profil/spieler/7939 | Gheorghe Hagi | Romania |

| /hristo-stoichkov/profil/spieler/7938 | Hristo Stoichkov | Bulgaria |

| /brian-laudrup/profil/spieler/39667 | Brian Laudrup | Denmark |

| /miguel-angel-nadal/profil/spieler/7676 | Miguel Ángel Nadal | Spain |

|------------------------------------------+--------------------+--------------------|

Relational migration

Relational Model

players: id, name, position, nationality

clubs: id, name, country

transfers: id, fee, player_age, player_id, from_club_id, to_club_id, season

Add column to players table

ALTER TABLE players

ADD COLUMN nationality character varying(30);

Update players table

UPDATE players

SET nationality = 'Brazil'

WHERE players.id = '/aldair/profil/spieler/4151';

UPDATE players

SET nationality = 'Germany'

WHERE players.id ='/ulf-kirsten/profil/spieler/74';

UPDATE players

SET nationality = 'England'

WHERE players.id ='/john-lukic/profil/spieler/28241';

Graph refactoring

Graph model

Add property to player nodes

USING PERIODIC COMMIT

LOAD CSV WITH HEADERS FROM "file:///transfers.csv" AS row

MATCH (player:Player {id: row.playerUri})

SET player.nationality = row.playerNationality

Find transfers of English players

SELECT players.name, clubFrom.name, clubTo.name, t."numericFee", t.season

FROM transfers AS t

JOIN clubs AS clubFrom ON t.from_club_id = clubFrom.id

JOIN clubs AS clubTo ON t.to_club_id = clubTo.id

JOIN players ON t.player_id = players.id

WHERE clubFrom.country = 'England' AND clubTo.country = 'England'

AND players.nationality = 'England'

ORDER BY t."numericFee" DESC

LIMIT 10

MATCH (to:Club)<-[:TO_CLUB]-(t:Transfer)-[:FROM_CLUB]->(from:Club),

(t)-[:OF_PLAYER]->(player:Player)

WHERE to.country = "England" AND from.country = "England"

AND player.nationality = "England"

RETURN player.name, from.name, to.name, t.numericFee, t.season

ORDER BY t.numericFee DESC

LIMIT 10

Countries and confederations

|---------------------+---------------|
| country             | confederation |
|---------------------+---------------|
| Afghanistan         | afc           |
| Albania             | uefa          |
| Algeria             | caf           |
| American Samoa      | ofc           |
| Andorra             | uefa          |
| Angola              | caf           |
| Anguilla            | concacaf      |
| Antigua and Barbuda | concacaf      |
| Argentina           | conmebol      |
|---------------------+---------------|

|----------+-----------+------------------------------------------------|
| urlName  | shortName | region                                         |
|----------+-----------+------------------------------------------------|
| afc      | AFC       | Asia                                           |
| uefa     | UEFA      | Europe                                         |
| ofc      | OFC       | Oceania                                        |
| conmebol | CONMEBOL  | South America                                  |
| concacaf | CONCACAF  | North American, Central American and Caribbean |
| caf      | CAF       | Africa                                         |
|----------+-----------+------------------------------------------------|

Relational migration

Relational Model

players: id, name, position, country_id

clubs: id, name, country_id

transfers: id, fee, player_age, player_id, from_club_id, to_club_id, season

countries: id, name, confederation_id

confederations: id, shortName, name, region

Create confederations table

CREATE TABLE confederations (
  "id" character varying(10) NOT NULL PRIMARY KEY,
  "shortName" character varying(50) NOT NULL,
  "name" character varying(100) NOT NULL,
  "region" character varying(100) NOT NULL
);

Populate confederations

INSERT INTO confederations
VALUES('afc', 'AFC', 'Asian Football Confederation', 'Asia');

INSERT INTO confederations
VALUES('uefa', 'UEFA', 'Union of European Football Associations', 'Europe');

INSERT INTO confederations
VALUES('ofc', 'OFC', 'Oceania Football Confederation', 'Oceania');

Create countries table

CREATE TABLE countries (
  "code" character varying(3) NOT NULL PRIMARY KEY,
  "name" character varying(50) NOT NULL,
  "federation" character varying(10) NOT NULL REFERENCES confederations (id)
);

Populate countries

INSERT INTO countries VALUES('MNE', 'Montenegro', 'uefa');

INSERT INTO countries VALUES('LTU', 'Lithuania', 'uefa');

INSERT INTO countries VALUES('CAM', 'Cambodia', 'afc');

INSERT INTO countries VALUES('SUI', 'Switzerland', 'uefa');

INSERT INTO countries VALUES('ETH', 'Ethiopia', 'caf');

INSERT INTO countries VALUES('ARU', 'Aruba', 'concacaf');

INSERT INTO countries VALUES('SWZ', 'Swaziland', 'caf');

INSERT INTO countries VALUES('PLE', 'Palestine', 'afc');

Add column to clubs table

ALTER TABLE clubs

ADD COLUMN country_id character varying(3)

REFERENCES countries(code);

Update clubs

UPDATE clubs AS cl

SET country_id = c.code

FROM clubs

INNER JOIN countries AS c

ON c.name = clubs.country

WHERE cl.id = clubs.id;

Update clubs

# select * from clubs limit 5;

id | name | country | country_id

----------------------------------------+-----------------------------+---------------+------------

/san-jose-clash/startseite/verein/4942 | San Jose Clash | United States | USA

/chicago/startseite/verein/432 | Chicago Fire | United States | USA

/gz-evergrande/startseite/verein/10948 | Guangzhou Evergrande Taobao | China | CHN

/as-vita-club/startseite/verein/2225 | AS Vita Club Kinshasa | Congo DR | CGO

/vicenza/startseite/verein/2655 | Vicenza Calcio | Italy | ITA

(5 rows)

Remove country

ALTER TABLE clubs

DROP COLUMN country;
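The add-column, backfill, check sequence can be tried end to end in a scratch database. This sketch uses Python's built-in `sqlite3` module with an in-memory SQLite database instead of the PostgreSQL instance the slides appear to use, so the types are simplified to TEXT and the backfill is written as a correlated subquery. That form is an assumption chosen for portability, not the statement from the slides. The three clubs and two country codes are taken from the outputs shown elsewhere in the deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (code TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE clubs (id TEXT PRIMARY KEY, name TEXT NOT NULL, country TEXT);

    INSERT INTO countries VALUES ('ITA', 'Italy');
    INSERT INTO countries VALUES ('USA', 'United States');

    INSERT INTO clubs VALUES ('/vicenza/startseite/verein/2655', 'Vicenza Calcio', 'Italy');
    INSERT INTO clubs VALUES ('/chicago/startseite/verein/432', 'Chicago Fire', 'United States');
    INSERT INTO clubs VALUES ('/monaco/startseite/verein/162', 'AS Monaco', 'Monaco');

    -- Step 1: add the new foreign key column (NULL for every row at first).
    ALTER TABLE clubs ADD COLUMN country_id TEXT REFERENCES countries(code);

    -- Step 2: backfill it by matching on the old free-text country name.
    UPDATE clubs
    SET country_id = (SELECT code FROM countries WHERE countries.name = clubs.country);
""")

# Step 3: before dropping the old column, see which clubs found no match.
unmatched = [row[0] for row in conn.execute(
    "SELECT name FROM clubs WHERE country_id IS NULL ORDER BY name")]
matched = dict(conn.execute(
    "SELECT name, country_id FROM clubs WHERE country_id IS NOT NULL"))
```

Running the check before `DROP COLUMN country` matters: once the text column is gone, an unmatched club such as AS Monaco has no record of which country it belonged to.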

Add column to players table

ALTER TABLE players

ADD COLUMN country_id character varying(3)

REFERENCES countries(code);

Update players

UPDATE players AS p

SET country_id = c.code

FROM players

INNER JOIN countries AS c

ON c.name = players.nationality

WHERE p.id = players.id;

Update players

# select * from players limit 5;

id | name | position | nationality | country_id

-----------------------------------------+-------------------+--------------------+-------------+------------

/dalian-atkinson/profil/spieler/200738 | Dalian Atkinson | Attacking Midfield | England | ENG

/steve-redmond/profil/spieler/177056 | Steve Redmond | Centre Back | England | ENG

/bert-konterman/profil/spieler/6252 | Bert Konterman | Centre Back | Netherlands | NED

/lee-philpott/profil/spieler/228030 | Lee Philpott | Midfield | England | ENG

/tomasz-frankowski/profil/spieler/14911 | Tomasz Frankowski | Centre Forward | Poland | POL

(5 rows)

Remove nationality

ALTER TABLE players

DROP COLUMN nationality;

Graph refactoring

Graph model

Import confederations

LOAD CSV WITH HEADERS

FROM "file:///confederations.csv" AS row

MERGE (c:Confederation {id: row.urlName})

ON CREATE

SET c.shortName = row.shortName,

c.region = row.region,

c.name = row.name

Import countries

LOAD CSV WITH HEADERS FROM "file:///countries.csv"

AS row

MERGE (country:Country {id: row.countryCode})

ON CREATE SET country.name = row.country

WITH country, row

MATCH (conf:Confederation {id: row.confederation })

MERGE (country)-[:PART_OF]->(conf)

Refactor clubs

MATCH (club:Club)

MATCH (country:Country {name: club.country})

MERGE (club)-[:PART_OF]->(country)

REMOVE club.country

Refactor players

MATCH (player:Player)

MATCH (country:Country {name: player.nationality})

MERGE (player)-[:PLAYS_FOR]->(country)

REMOVE player.nationality

Recap: Find transfers of English players

SELECT players.name, clubFrom.name, clubTo.name, t."numericFee", t.season

FROM transfers AS t

JOIN clubs AS clubFrom ON t.from_club_id = clubFrom.id

JOIN clubs AS clubTo ON t.to_club_id = clubTo.id

JOIN players ON t.player_id = players.id

WHERE clubFrom.country = 'England' AND clubTo.country = 'England'

AND players.nationality = 'England'

ORDER BY t."numericFee" DESC

LIMIT 10

MATCH (to:Club)<-[:TO_CLUB]-(t:Transfer)-[:FROM_CLUB]->(from:Club),

(t)-[:OF_PLAYER]->(player:Player)

WHERE to.country = "England" AND from.country = "England"

AND player.nationality = "England"

RETURN player.name, from.name, to.name, t.numericFee, t.season

ORDER BY t.numericFee DESC

LIMIT 10

SELECT players.name, clubFrom.name, clubTo.name, t."numericFee", t.season

FROM transfers AS t

JOIN clubs AS clubFrom ON t.from_club_id = clubFrom.id

JOIN clubs AS clubTo ON t.to_club_id = clubTo.id

JOIN players ON t.player_id = players.id

JOIN countries AS fromCount ON clubFrom.country_id = fromCount.code

JOIN countries AS toCount ON clubTo.country_id = toCount.code

JOIN countries AS playerCount ON players.country_id = playerCount.code

WHERE fromCount.name = 'England' AND toCount.name = 'England' AND playerCount.name = 'England'

ORDER BY t."numericFee" DESC

LIMIT 10

MATCH (to:Club)<-[:TO_CLUB]-(t:Transfer)-[:FROM_CLUB]->(from:Club),

(t)-[:OF_PLAYER]->(player:Player)-[:PLAYS_FOR]->(country:Country),

(to)-[:PART_OF]->(country)<-[:PART_OF]-(from)

WHERE country.name = "England"

RETURN player.name, from.name, to.name, t.numericFee, t.season

ORDER BY t.numericFee DESC

LIMIT 10

Find transfers between different confederations

SELECT * FROM transfers AS t

JOIN clubs AS clubFrom ON t.from_club_id = clubFrom.id

JOIN clubs AS clubTo ON t.to_club_id = clubTo.id

JOIN players ON t.player_id = players.id

JOIN countries AS fromCountry ON clubFrom.country_id = fromCountry.code

JOIN countries AS toCountry ON clubTo.country_id = toCountry.code

JOIN confederations AS fromConfederation ON fromCountry.federation = fromConfederation.id

JOIN confederations AS toConfederation ON toCountry.federation = toConfederation.id

WHERE fromConfederation.id = 'afc' AND toConfederation.id = 'uefa'

ORDER BY t."numericFee" DESC

LIMIT 10

MATCH (to:Club)<-[:TO_CLUB]-(t:Transfer)-[:FROM_CLUB]->(from:Club),

(t)-[:OF_PLAYER]->(player:Player),

(from)-[:PART_OF*2]->(:Confederation {id: "afc"}),

(to)-[:PART_OF*2]->(:Confederation {id: "uefa"})

RETURN player.name, from.name, to.name, t.numericFee, t.season

ORDER BY t.numericFee DESC

LIMIT 10
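The `[:PART_OF*2]` pattern means "follow exactly two PART_OF relationships": club to country, then country to confederation. A small Python sketch of that idea follows. The `PART_OF` dict, `follow` and `clubs_in` are hypothetical names for this illustration, and each node has at most one outgoing PART_OF here, which is a simplification.

```python
# Each key is PART_OF its value: club -> country -> confederation.
PART_OF = {
    "Vicenza Calcio": "Italy",
    "Guangzhou Evergrande Taobao": "China",
    "Italy": "uefa",
    "China": "afc",
}

CLUBS = ["Vicenza Calcio", "Guangzhou Evergrande Taobao"]


def follow(start, hops):
    """Follow exactly `hops` PART_OF relationships; None if the path is shorter."""
    node = start
    for _ in range(hops):
        node = PART_OF.get(node)
        if node is None:
            return None
    return node


def clubs_in(confederation):
    """Clubs whose two-hop PART_OF path ends at the given confederation."""
    return [club for club in CLUBS if follow(club, 2) == confederation]
```

The SQL version needs one JOIN per hop on each side; in the graph pattern the hop count is a single number in the relationship.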

What’s in my database?

Tables

# \dt

List of relations

Schema | Name | Type | Owner

--------+----------------+-------+-------------

public | clubs | table | markneedham

public | confederations | table | markneedham

public | countries | table | markneedham

public | players | table | markneedham

public | transfers | table | markneedham

(5 rows)

Node labels

CALL db.labels()

+=============+

|label |

+=============+

|Player |

+-------------+

|Club |

+-------------+

|Transfer |

+-------------+

|Loan |

+-------------+

|Confederation|

+-------------+

|Country |

+-------------+

Table schema

# \d+ countries

Table "public.countries"

Column | Type | Modifiers | Storage | Stats target | Description

------------+-----------------------+-----------+----------+--------------+-------------

code | character varying(3) | not null | extended | |

name | character varying(50) | not null | extended | |

federation | character varying(10) | not null | extended | |

Indexes:

"pk_countries" PRIMARY KEY, btree (code)

Foreign-key constraints:

"countries_federation_fkey" FOREIGN KEY (federation) REFERENCES confederations(id)

Referenced by:

TABLE "players" CONSTRAINT "playersfk" FOREIGN KEY (country_id) REFERENCES countries(code) MATCH FULL

:schema

Indexes

ON :Club(name) ONLINE

ON :Club(id) ONLINE (for uniqueness constraint)

ON :Player(name) ONLINE

ON :Player(id) ONLINE (for uniqueness constraint)

Constraints

ON (player:Player) ASSERT player.id IS UNIQUE

ON (club:Club) ASSERT exists(club.name)

ON (club:Club) ASSERT club.id IS UNIQUE

ON ()-[of_player:OF_PLAYER]-() ASSERT exists(of_player.age)

Graph schema

MATCH (country:Country)

RETURN keys(country), COUNT(*) AS times

+-----------------------+

| keys(country) | times |

+-----------------------+

| ["id","name"] | 198 |

+-----------------------+

MATCH (club:Club)

RETURN keys(club), COUNT(*) AS times

+---------------------------------+

| keys(club) | times |

+---------------------------------+

| ["id","name"] | 806 |

| ["name","country","id"] | 1 |

+---------------------------------+
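Because the schema is optional, grouping nodes by their property keys is a quick way to spot stragglers such as the one club that still carries a `country` property. The same grouping, sketched in Python on three club records taken from the outputs in this deck (`key_shapes` is a hypothetical helper name):

```python
from collections import Counter

clubs = [
    {"id": "/unknown/startseite/verein/75", "name": "Unknown"},
    {"id": "/vicenza/startseite/verein/2655", "name": "Vicenza Calcio"},
    {"id": "/monaco/startseite/verein/162", "name": "AS Monaco", "country": "Monaco"},
]


def key_shapes(nodes):
    """Count how many nodes share each (sorted) set of property keys."""
    return Counter(tuple(sorted(node)) for node in nodes)
```

Any shape with a very small count is a candidate for a data fix or a missed refactoring step.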

Entity/Relationship diagram

Meta graph

MATCH (a)-[r]->(b)
WITH head(labels(a)) AS l, head(labels(b)) AS l2, type(r) AS rel_type, count(*) AS count
CALL apoc.create.vNode([l], {name: l}) YIELD node AS a
CALL apoc.create.vNode([l2], {name: l2}) YIELD node AS b
CALL apoc.create.vRelationship(a, rel_type, {name: rel_type, count: count}, b) YIELD rel
RETURN *;

Data Integrity

Clubs without country

# SELECT * FROM clubs where country_id is null;

id | name | country | country_id

---------------------------------------+-------------------------+---------------+------------

/unknown/startseite/verein/75 | Unknown | |

/pohang/startseite/verein/311 | Pohang Steelers | Korea, South |

/bluewings/startseite/verein/3301 | Suwon Samsung Bluewings | Korea, South |

/ulsan/startseite/verein/3535 | Ulsan Hyundai | Korea, South |

/africa-sports/startseite/verein/2936 | Africa Sports | Cote d'Ivoire |

/monaco/startseite/verein/162 | AS Monaco | Monaco |

/jeonbuk/startseite/verein/6502 | Jeonbuk Hyundai Motors | Korea, South |

/busan/startseite/verein/2582 | Busan IPark | Korea, South |

(8 rows)

Clubs without country

MATCH (club:Club)

WHERE NOT (club)-[:PART_OF]->()

RETURN club

+=====================================================================+

|club |

+=====================================================================+

|{name: Unknown, id: /unknown/startseite/verein/75} |

+---------------------------------------------------------------------+

|{country: Monaco, name: AS Monaco, id: /monaco/startseite/verein/162}|

+---------------------------------------------------------------------+

Deleting data - SQL

# drop table countries;

ERROR: cannot drop table countries because other objects depend on it

DETAIL: constraint playersfk on table players depends on table countries

HINT: Use DROP ... CASCADE to drop the dependent objects too.

Deleting data - Cypher

MATCH (country:Country)

DELETE country

org.neo4j.kernel.api.exceptions.TransactionFailureException: Node record Node[11306,used=false,rel=24095,prop=-1,labels=Inline(0x0:[]),light] still has relationships

MATCH (country:Country)

DETACH DELETE country

Deleted 198 nodes, deleted 5071 relationships, statement executed in 498 ms.
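The two behaviours can be mimicked with a toy in-memory graph. This is only an illustration of the rule (a node cannot be deleted while relationships are attached, unless they are removed with it); the `Graph` class and its method names are hypothetical and say nothing about Neo4j's internals.

```python
class Graph:
    def __init__(self):
        self.nodes = set()
        self.relationships = set()  # (start node, type, end node)

    def add_relationship(self, start, rel_type, end):
        self.nodes.update((start, end))
        self.relationships.add((start, rel_type, end))

    def _attached(self, node):
        return {r for r in self.relationships if node in (r[0], r[2])}

    def delete(self, node):
        # Like DELETE: refuse while the node still has relationships.
        if self._attached(node):
            raise ValueError(f"{node} still has relationships")
        self.nodes.discard(node)

    def detach_delete(self, node):
        # Like DETACH DELETE: remove the relationships first, then the node.
        attached = self._attached(node)
        self.relationships -= attached
        self.nodes.discard(node)
        return len(attached)


graph = Graph()
graph.add_relationship("AS Roma", "PART_OF", "Italy")
graph.add_relationship("Juventus FC", "PART_OF", "Italy")
```

Here `graph.delete("Italy")` raises, while `graph.detach_delete("Italy")` removes the country together with its two relationships and leaves the clubs in place.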

Query Optimisation

Optimising queries

‣ Use EXPLAIN/PROFILE to see what your queries are doing under the covers
‣ Index the starting points of queries
‣ Reduce work in progress of intermediate parts of the query where possible
‣ Look at the warnings in the Neo4j browser - they are often helpful!
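For SQL developers, the habit of checking a plan before and after adding an index carries over directly. The sketch below shows the relational side of that habit using Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN`. SQLite is an assumption made so the example runs anywhere; its plan text differs from both PostgreSQL's and Neo4j's, and `plan` and `idx_players_name` are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE players (id TEXT PRIMARY KEY, name TEXT NOT NULL, position TEXT)")
conn.execute(
    "INSERT INTO players VALUES ('/aldair/profil/spieler/4151', 'Aldair', 'Centre Back')")

QUERY = "SELECT * FROM players WHERE name = ?"


def plan(connection, sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


# Without an index on name the whole table is scanned.
before = plan(conn, QUERY, ("Aldair",))

# Index the property used to find the starting point, then look again.
conn.execute("CREATE INDEX idx_players_name ON players (name)")
after = plan(conn, QUERY, ("Aldair",))
```

The first plan reports a table scan and the second a search using `idx_players_name`, which is the same before/after comparison EXPLAIN and PROFILE allow for a Cypher query.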

Optimising queries - useful links

‣ Tuning Your Cypher: https://www.youtube.com/watch?v=tYtyoYcd_e8

‣ Neo4j 2.2 Query Tuning: http://neo4j.com/blog/neo4j-2-2-query-tuning/

‣ Ask for help on Stack Overflow/Neo4j Slack: http://neo4j-users-slack-invite.herokuapp.com

One more thing...

Procedures

‣ New in Neo4j 3.0.0!
‣ We’ve already seen an example!

CALL db.labels()

‣ Michael Hunger has created a set of procedures (APOC) at: https://github.com/jexp/neo4j-apoc-procedures

Querying github

WITH "https://api.github.com/search/repositories?q=neo4j"

AS githubUri

CALL apoc.load.json(githubUri)

YIELD value AS document

UNWIND document.items AS item

RETURN item.full_name, item.watchers_count, item.forks

ORDER BY item.forks DESC

+------------------------------------------------------------------------+

| item.full_name | item.watchers_count | item.forks |

+------------------------------------------------------------------------+

| "neo4j/neo4j" | 2472 | 872 |

| "spring-projects/spring-data-neo4j" | 403 | 476 |

| "neo4j-contrib/developer-resources" | 106 | 295 |

| "neo4jrb/neo4j" | 1014 | 190 |

| "jadell/neo4jphp" | 507 | 140 |

| "thingdom/node-neo4j" | 780 | 127 |

| "aseemk/node-neo4j-template" | 176 | 91 |

| "jimwebber/neo4j-tutorial" | 268 | 87 |

| "rickardoberg/neo4j-jdbc" | 33 | 68 |

| "FaKod/neo4j-scala" | 194 | 64 |

+------------------------------------------------------------------------+
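What the query does to the JSON document (load it, unwind `items` into one row per repository, project three fields, sort by forks) can be followed offline. The Python below is a sketch of those steps only: `SAMPLE_RESPONSE` is a cut-down stand-in for the GitHub response that reuses three rows from the result table above, and `repositories_by_forks` is a hypothetical name.

```python
import json

# Stand-in for the search response; values copied from the result table.
SAMPLE_RESPONSE = """
{"items": [
  {"full_name": "neo4jrb/neo4j", "watchers_count": 1014, "forks": 190},
  {"full_name": "neo4j/neo4j", "watchers_count": 2472, "forks": 872},
  {"full_name": "jadell/neo4jphp", "watchers_count": 507, "forks": 140}
]}
"""


def repositories_by_forks(raw_json):
    """Unwind document.items into rows and order them by forks, descending."""
    document = json.loads(raw_json)
    rows = [
        (item["full_name"], item["watchers_count"], item["forks"])
        for item in document["items"]
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)
```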

Questions? :-)

Mark Needham mark@neo4j.com @markhneedham

https://github.com/neo4j-meetups/cypher-for-sql-developers
