Give sense to your Big Data with Apache TinkerPop™ and property-graph databases

DuyHai DOAN, Apache Cassandra™ evangelist

Upload: datastax

Post on 22-Jan-2018


Page 1: Give sense to your Big Data with Apache TinkerPop™ and property-graph databases

DuyHai DOAN

Apache Cassandra™ evangelist

Page 2: Who am I?

• Technical advocate for Apache Cassandra™ at DataStax

• Committer for Apache Zeppelin™ and maintainer of the Zeppelin/Cassandra interpreter

• [email protected]

• @doanduyhai

Page 3: Who is DataStax?

• Company offering DataStax Enterprise, a commercial distribution of Apache Cassandra™

• DataStax Enterprise = Apache Cassandra™ + additional features

Page 4: Why graph databases?

Page 5: As of 2017

Page 6: Who is not using any of those apps?

Page 7: Needle in a haystack

Page 8: Finding patterns

Page 9: Root-cause analysis

Impact propagation

Page 10: Everything is connected

Page 11: Graph databases are trending

Page 12: Graph vs Relational

Page 13: Relational databases

Users: personId, firstname, lastname, …

Movies: movieId, title, country, …

View: personId, movieId, view_time, …

Page 14: Relational databases

• Define the relationships between entities

• Store the entities and relationships in a normalized fashion (normal forms)
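
As an illustration of the normalized layout above, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names come from the slide; the sample rows, the view_time values, and the helper name movies_viewed_by are assumptions made up for this example. It shows that answering "which movies did this user view?" means joining through the View table.

```python
import sqlite3

# In-memory database holding the three tables from the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users  (personId INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
CREATE TABLE Movies (movieId  INTEGER PRIMARY KEY, title TEXT, country TEXT);
-- "View" is quoted because VIEW is an SQL keyword.
CREATE TABLE "View" (personId  INTEGER REFERENCES Users(personId),
                     movieId   INTEGER REFERENCES Movies(movieId),
                     view_time TEXT);
-- Sample rows below are invented for illustration only.
INSERT INTO Users  VALUES (1, 'Alice', 'Martin'), (2, 'Bob', 'Durand');
INSERT INTO Movies VALUES (10, 'Movie A', 'US'), (11, 'Movie B', 'FR');
INSERT INTO "View" VALUES (1, 10, '2017-01-01'),
                          (1, 11, '2017-01-02'),
                          (2, 10, '2017-01-03');
""")


def movies_viewed_by(firstname):
    """Return the titles viewed by a user, resolved through two joins."""
    rows = conn.execute(
        """
        SELECT m.title
        FROM Users u
        JOIN "View" v ON v.personId = u.personId
        JOIN Movies m ON m.movieId = v.movieId
        WHERE u.firstname = ?
        ORDER BY m.title
        """,
        (firstname,),
    ).fetchall()
    return [title for (title,) in rows]
```

Each additional hop in the question (for example, "movies viewed by people who viewed the same movies as Alice") adds more joins to the query.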

Page 15: Graph databases

[Diagram: a User vertex and a Movie vertex connected by a view edge]

Page 16: Graph databases

• Define the relationships between entities

• Store the entities and relationships

• Allow end-users to explore the relationships

• Allow end-users to discover unexpected relations between entities
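
To make "discovering unexpected relations" concrete, here is a minimal sketch in plain Python (not the TinkerPop API). It runs a breadth-first search over a tiny edge list to find how two entities are connected. The entity names, the edges, and the function name connection are all hypothetical, and edges are followed in both directions for simplicity.

```python
from collections import deque

# Hypothetical edges: (source, label, target).
edges = [
    ("alice", "knows", "bob"),
    ("bob", "view", "Movie A"),
    ("carol", "view", "Movie A"),
]


def connection(start, goal):
    """Return one shortest chain of entities linking start to goal, or None."""
    # Build an adjacency map that ignores edge direction.
    adjacency = {}
    for source, _label, target in edges:
        adjacency.setdefault(source, []).append(target)
        adjacency.setdefault(target, []).append(source)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in adjacency.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None
```

Here alice and carol have no direct relationship, yet the search finds that they are linked through bob and a movie both bob and carol viewed.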

Page 17: The value of data is proportional to the number of meaningful relationships

Page 18: When to use graph databases?

Page 19: Apache TinkerPop™ introduction

Page 20: What is TinkerPop?

Page 21: Apache TinkerPop™

• Open-source graph computing framework

• Started in 2009 by Marko A. Rodriguez, Josh Shinavier, and Peter Neubauer

• Joined the Apache Software Foundation (ASF) in January 2015

• Current version at the time of the talk: 3.2.4

[Figure labels: Frame, Furnace, Pipe, BluePrint, Rexster, Gremlin]

Page 22: TinkerPop stack

[Diagram labels: Real-time, Batch]

Page 23: Graph databases family

• RDF (Resource Description Framework)

  • AllegroGraph, BlazeGraph, OntoText, OpenLink Virtuoso …

• Property-graph

  • Neo4j, Titan, DataStax Enterprise (DSE) Graph, OrientDB …

Page 24: Property Graph

Page 25: A graph is

• A set of vertices (nodes) and edges (arcs)

• Formal definition: G = (V, E)

[Diagram: two vertices, User and Movie, connected by an edge]

Page 26: A property-graph is

• A directed

Page 27: A property-graph is

• A directed, binary,

Page 28: A property-graph is

• A directed, binary, attributed multi-graph

[Diagram: a User vertex (name: DuyHai, age: 35) and a Movie vertex (title: The Jedi Return, categories: [SF, action, space]); edges labeled view (view_time: xxx) and knows]

Page 29: Some definitions

[Diagram: vertex properties highlighted. User: name: DuyHai, age: 35. Movie: title: The Jedi Return, categories: [SF, action, space]]

Page 30: Some definitions

[Diagram: edges and edge labels highlighted. Edge labels: view, knows]

Page 31: Some definitions

[Diagram: edge properties highlighted. view: view_time: xxx]
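
The definitions above (labeled vertices with properties, directed labeled edges with properties, several edges allowed between the same pair) can be sketched in plain Python. This is an illustrative data model only, not how TinkerPop or any graph database stores data. The class names, the second User vertex, the second view edge, and the helper out_edges are assumptions added for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: int
    label: str                                       # e.g. "User", "Movie"
    properties: dict = field(default_factory=dict)   # vertex properties


@dataclass
class Edge:
    label: str                                       # edge label, e.g. "view"
    out_v: int                                       # id of the source vertex
    in_v: int                                        # id of the target vertex
    properties: dict = field(default_factory=dict)   # edge properties


# Vertices from the slide, plus one hypothetical second user.
user = Vertex(1, "User", {"name": "DuyHai", "age": 35})
friend = Vertex(2, "User", {"name": "Alice"})        # hypothetical
movie = Vertex(3, "Movie", {"title": "The Jedi Return",
                            "categories": ["SF", "action", "space"]})

# Directed, binary edges. Two "view" edges between the same pair of
# vertices illustrate the multi-graph aspect (second one is hypothetical).
edges = [
    Edge("view", user.id, movie.id, {"view_time": "xxx"}),
    Edge("view", user.id, movie.id, {"view_time": "yyy"}),
    Edge("knows", user.id, friend.id),
]


def out_edges(vertex, label=None):
    """Outgoing edges of a vertex, optionally filtered by edge label."""
    return [e for e in edges
            if e.out_v == vertex.id and (label is None or e.label == label)]
```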

Page 32: Gremlin graph traversal

Page 33: Graph vs Hardware allegory

Page 34: Graph Traversal

Page 35: Example of graph traversal

Give me all movies in which "Harrison Ford" has played as actor

[Schema diagram. Vertex labels: User, Movie, Person. Edge labels: friendWith, like, actor, director. Properties shown: name, gender, title, year, id, name, rating]

Page 36: Example of graph traversal

Give me all movies in which "Harrison Ford" has played as actor

g.V()

[Same schema diagram as on Page 35]

Page 37: Example of graph traversal

Give me all movies in which "Harrison Ford" has played as actor

g.V().hasLabel("Person")

[Same schema diagram as on Page 35]

Page 38: Example of graph traversal

Give me all movies in which "Harrison Ford" has played as actor

g.V().hasLabel("Person")
    .has("name", "Harrison Ford")

[Same schema diagram as on Page 35]

Page 39: Example of graph traversal

Give me all movies in which "Harrison Ford" has played as actor

g.V().hasLabel("Person")
    .has("name", "Harrison Ford")
    .out("actor")

[Same schema diagram as on Page 35]

Page 40: Example of graph traversal

Give me all movies in which "Harrison Ford" has played as actor with mean rating > 7

g.V().hasLabel("Person")
    .has("name", "Harrison Ford")
    .out("actor")
    .where(inE("like").values("rating").mean().is(gt(7)))

[Same schema diagram as on Page 35]
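
To show what each step of the traversal above does, here is a rough plain-Python equivalent over a tiny in-memory graph. It is a sketch of the logic only and does not use the TinkerPop API. The movie titles, users, and ratings are invented sample data, and the function name movies_with_actor is hypothetical.

```python
from statistics import mean

# Invented sample data: vertices keyed by id, edges as plain dicts.
vertices = {
    1: {"label": "Person", "name": "Harrison Ford"},
    2: {"label": "Movie", "title": "Movie A"},
    3: {"label": "Movie", "title": "Movie B"},
    4: {"label": "User", "name": "u1"},
    5: {"label": "User", "name": "u2"},
}
edges = [
    {"label": "actor", "out": 1, "in": 2},
    {"label": "actor", "out": 1, "in": 3},
    {"label": "like", "out": 4, "in": 2, "rating": 9},
    {"label": "like", "out": 5, "in": 2, "rating": 8},
    {"label": "like", "out": 4, "in": 3, "rating": 5},
]


def movies_with_actor(name, min_mean_rating=None):
    # g.V().hasLabel("Person").has("name", name)
    persons = [vid for vid, v in vertices.items()
               if v["label"] == "Person" and v.get("name") == name]

    # .out("actor"): follow outgoing "actor" edges to the movies
    movies = [e["in"] for e in edges
              if e["label"] == "actor" and e["out"] in persons]

    if min_mean_rating is not None:
        # .where(inE("like").values("rating").mean().is(gt(min_mean_rating)))
        kept = []
        for movie_id in movies:
            ratings = [e["rating"] for e in edges
                       if e["label"] == "like" and e["in"] == movie_id]
            # Movies without any rating are dropped in this sketch.
            if ratings and mean(ratings) > min_mean_rating:
                kept.append(movie_id)
        movies = kept

    return [vertices[m]["title"] for m in movies]
```

With the sample data, the unfiltered call returns both movies, while a threshold of 7 keeps only the movie whose incoming like edges average above 7.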

Page 41: More Examples

Page 42: Demo

Page 43: Q & A

Page 44: Thank You

@doanduyhai

[email protected]

https://academy.datastax.com/