an overview · neo4j graph data science library an overview max kießling - open source neo4j...

12
Neo4j Graph Data Science Library An Overview Max Kießling

Upload: others

Post on 22-Sep-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: An Overview · Neo4j Graph Data Science Library An Overview Max Kießling - Open Source Neo4j Add-On for graph analytics - Provides a set of high performance graph algorithms - Community

Neo4j Graph Data Science LibraryAn Overview

Max Kießling

Page 2: An Overview · Neo4j Graph Data Science Library An Overview Max Kießling - Open Source Neo4j Add-On for graph analytics - Provides a set of high performance graph algorithms - Community

- Open Source Neo4j Add-On for graph analytics- Provides a set of high performance graph algorithms

- Community Detection / Clustering (e.g. Label Propagation)

- Similarity Calculation (e.g. NodeSimilarity)- Centrality Algorithms (e.g. PageRank)- PathFinding (e.g. Dijkstra)- Link Prediction (e.g. Adamic Adar)- and more

- APIs for implementing custom algorithms (e.g. Pregel)2

What is the Graph Data Science Library?

Page 3: An Overview · Neo4j Graph Data Science Library An Overview Max Kießling - Open Source Neo4j Add-On for graph analytics - Provides a set of high performance graph algorithms - Community

3

Neo4j GDS - Timeline

Development started as

Neo4j Contrib - Graph Algorithms

organized by Neo4j Labs, developed by AVGL

Q1 2017

Q1 2019

Neo4j Product Engineering

takes over the project

Productization of the library

Open Source Preview Release

Q1 2020

Q2 2020

Neo4j Graph Data Science Library

Release 1.0

Page 4: An Overview · Neo4j Graph Data Science Library An Overview Max Kießling - Open Source Neo4j Add-On for graph analytics - Provides a set of high performance graph algorithms - Community

Local Patterns to Global Computation

4

Query (e.g. Cypher/SQL) Real-time, local decisioning

and pattern matching

Graph Algorithms LibrariesGlobal analysis and iterations

You know what you’re looking for and making a

decision

You’re learning the overall structure of a network, updating data, and

predicting

Local Patterns

Global Computation

Page 5: An Overview · Neo4j Graph Data Science Library An Overview Max Kießling - Open Source Neo4j Add-On for graph analytics - Provides a set of high performance graph algorithms - Community

Workflow

5

RAM RAM

Load graph projection into main memory

Run algorithm via Cypher procedure

Consume result

Page 6: An Overview · Neo4j Graph Data Science Library An Overview Max Kießling - Open Source Neo4j Add-On for graph analytics - Provides a set of high performance graph algorithms - Community

• Parallel Breadth First Search• Parallel Depth First Search• Shortest Path• Minimum Spanning Tree• A* Shortest Path• Yen’s K Shortest Path• K-Spanning Tree (MST)• Random Walk

Pathfinding & Search

• PageRank• Personalized PageRank• Degree Centrality• Closeness Centrality• Betweenness Centrality• ArticleRank • Eigenvector Centrality

Centrality / Importance

• Label Propagation• Louvain• Weakly Connected Components• Triangle Count• Clustering Coefficients• Strongly Connected Components• Balanced Triad (identification)

Community Detection

• Node Similarity• Euclidean Distance• Cosine Similarity • Overlap Similarity• Pearson Similarity

Similarity

Link Prediction

• Adamic Adar• Common Neighbors• Preferential Attachment• Resource Allocations• Same Community• Total Neighbors

6

Available Algorithms

Page 7: An Overview · Neo4j Graph Data Science Library An Overview Max Kießling - Open Source Neo4j Add-On for graph analytics - Provides a set of high performance graph algorithms - Community

Demo Time!

7

Page 8: An Overview · Neo4j Graph Data Science Library An Overview Max Kießling - Open Source Neo4j Add-On for graph analytics - Provides a set of high performance graph algorithms - Community

GDS - Algo Syntax

8

CALL gds.<algo-name>.<mode>(

graphName: STRING,

configuration: MAP)

Available Modes:

● write: writes results to the Neo4j database and returns a summary of the results.

● stats: runs the algorithm and only reportsstatistics.

● stream: streams results back to the user.

CALL gds.wcc.write(

"got-interactions",

{

writeProperty: "component",

consecutiveIds: true

}) YIELDS writeMillis, componentCount

CALL gds.wcc.stream(

"got-interactions",

{}) YIELDS nodeId, componentId

Page 9: An Overview · Neo4j Graph Data Science Library An Overview Max Kießling - Open Source Neo4j Add-On for graph analytics - Provides a set of high performance graph algorithms - Community

Take a look!

9

The Neo4j Graph Data Science Library is Open Source

https://github.com/neo4j/graph-data-science

Page 10: An Overview · Neo4j Graph Data Science Library An Overview Max Kießling - Open Source Neo4j Add-On for graph analytics - Provides a set of high performance graph algorithms - Community

Section Slide

10

Page 11: An Overview · Neo4j Graph Data Science Library An Overview Max Kießling - Open Source Neo4j Add-On for graph analytics - Provides a set of high performance graph algorithms - Community

Section Slide

11

Page 12: An Overview · Neo4j Graph Data Science Library An Overview Max Kießling - Open Source Neo4j Add-On for graph analytics - Provides a set of high performance graph algorithms - Community

Section Slide

12