Gluecon InfiniteGraph/DB

The following is an excerpt of a presentation delivered at Gluecon 2010 in Broomfield, Colorado. It is not a presentation on InfiniteGraph/DB itself, but an overview of managing distributed graph data in a graph database. Copyright © InfiniteGraph

Upload: wdavidson16

Post on 14-Jan-2015


DESCRIPTION

Distributed graph database management

TRANSCRIPT

Page 1: Gluecon InfiniteGraph/DB

The following is an excerpt of a presentation delivered at Gluecon 2010 in Broomfield, Colorado.

It is not a presentation on InfiniteGraph/DB itself, but an overview of managing distributed graph data in a graph database.

Copyright © InfiniteGraph

Page 2: Gluecon InfiniteGraph/DB

Scaling the [Social] Graph in the [Cloud]

Darren Wood, Lead Architect, InfiniteGraph

Page 3: Gluecon InfiniteGraph/DB

Graph Databases (Quickly)

• Optimized around data relationships

• Small focused API (typically not SQL)

• Typical Use Cases:

– Social Graph Analysis

– Catching Bad Guys (see Booth 16)

– Fraud / Financial (more bad guys)

– Data Intensive Science

– Web / Advertising Analytics


Page 4: Gluecon InfiniteGraph/DB

Graph Databases (Almost Done)


Vertex alice = myGraph.addVertex(new Person("Alice"));
Vertex bob = myGraph.addVertex(new Person("Bob"));
Vertex carlos = myGraph.addVertex(new Person("Carlos"));
Vertex charlie = myGraph.addVertex(new Person("Charlie"));

alice.addEdge(new Meeting("Denver", "5-27-10"), bob);
bob.addEdge(new Call(timestamp), carlos);
carlos.addEdge(new Payment(100000.00), charlie);
bob.addEdge(new Call(timestamp), charlie);

[Diagram: Alice -Meets-> Bob -Calls-> Carlos -Pays-> Charlie, plus a second Calls edge from Bob to Charlie]
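To make the point of the example concrete, here is a minimal sketch of the kind of question such a graph answers: "how is Alice connected to Charlie?". It uses a hypothetical in-memory class, `TinyGraph`, with plain strings for vertices and edge labels. It is not the InfiniteGraph API, only an illustration of navigating the same four edges.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the slide's graph; not the InfiniteGraph API.
final class TinyGraph {
    // vertex name -> outgoing edges, each stored as {label, target}
    private final Map<String, List<String[]>> out = new LinkedHashMap<>();

    void addEdge(String from, String label, String to) {
        out.computeIfAbsent(from, k -> new ArrayList<>()).add(new String[] {label, to});
    }

    // Lists every cycle-free path from start to goal, depth first.
    List<String> paths(String start, String goal) {
        List<String> found = new ArrayList<>();
        List<String> onPath = new ArrayList<>();
        onPath.add(start);
        walk(start, goal, onPath, start, found);
        return found;
    }

    private void walk(String at, String goal, List<String> onPath,
                      String trail, List<String> found) {
        if (at.equals(goal)) {
            found.add(trail);
            return;
        }
        for (String[] edge : out.getOrDefault(at, Collections.emptyList())) {
            String label = edge[0];
            String next = edge[1];
            if (onPath.contains(next)) {
                continue; // already on this path, skip to avoid cycles
            }
            onPath.add(next);
            walk(next, goal, onPath, trail + " -" + label + "-> " + next, found);
            onPath.remove(onPath.size() - 1); // backtrack
        }
    }

    // The four edges from the slide above.
    static TinyGraph slideExample() {
        TinyGraph g = new TinyGraph();
        g.addEdge("Alice", "Meets", "Bob");
        g.addEdge("Bob", "Calls", "Carlos");
        g.addEdge("Carlos", "Pays", "Charlie");
        g.addEdge("Bob", "Calls", "Charlie");
        return g;
    }
}
```

Calling `TinyGraph.slideExample().paths("Alice", "Charlie")` returns two paths: one through Carlos (Meets, Calls, Pays) and one where Bob calls Charlie directly.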

Page 5: Gluecon InfiniteGraph/DB

What’s So Difficult, Then?

• Graphs grow quickly

– Billions of phone calls per day in the US

– Emails, social media events, IP traffic

– Financial transactions

• Some analytics require navigation of large sections of the graph

• Each step (often) depends on the last

• Must distribute data and go parallel


Page 6: Gluecon InfiniteGraph/DB

First Some Good News…

• Graph algorithms naturally branch

• Can be automated or guided


[Diagram: a larger graph with vertices Alice, Bob, Carlos, Charlie, Dave, Eve and Chuck, connected by edges labeled Meets, Calls, Pays and Lives With]
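The branching idea can be sketched as a level-by-level traversal: all branches leaving the current frontier are independent and can be expanded in parallel, while each level still has to wait for the previous one. The class `BranchingTraversal`, its `reach` method and the `follow` predicate are hypothetical names for this illustration, and a Java parallel stream stands in for real distributed processors. Passing `(a, b) -> true` gives the "automated" walk; a stricter predicate gives the "guided" one.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiPredicate;
import java.util.stream.Collectors;

// Hypothetical sketch; a parallel stream stands in for distributed processors.
final class BranchingTraversal {
    // Returns every vertex reachable from start within maxDepth hops.
    // follow decides per (from, to) pair whether a branch is taken.
    static Set<String> reach(Map<String, List<String>> adjacency, String start,
                             int maxDepth, BiPredicate<String, String> follow) {
        Set<String> seen = ConcurrentHashMap.newKeySet();
        seen.add(start);
        Set<String> frontier = Collections.singleton(start);
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            // Branches within one level are independent and run in parallel;
            // the next level cannot start until this one has finished.
            frontier = frontier.parallelStream()
                    .flatMap(v -> adjacency.getOrDefault(v, Collections.emptyList()).stream()
                            .filter(n -> follow.test(v, n)))
                    .filter(seen::add) // true only for the first thread to claim a vertex
                    .collect(Collectors.toSet());
        }
        return new TreeSet<>(seen); // sorted copy for stable output
    }
}
```

The sequential loop over levels reflects the earlier point that each step often depends on the last; only the work inside a level is spread out.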

Page 7: Gluecon InfiniteGraph/DB

Big Distributed Data (Traditional - Huge Generalization)

[Diagram: Application(s) on top of a Distributed API, fanning out to Partition 1, Partition 2, Partition 3, ... Partition n, each with its own Processor]


Page 8: Gluecon InfiniteGraph/DB

Big Distributed Data (Graph)

[Diagram: Application(s) on top of a Distributed API, fanning out to Partition 1, Partition 2, Partition 3, ... Partition n, each with its own Processor]


Page 9: Gluecon InfiniteGraph/DB

So What Are The Answers? Best Effort Partitioning

[Diagram: a Distributed API over Partition 1 and Partition 2, each with its own Processor]
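The slide names best effort partitioning without spelling out a procedure. One common reading, sketched below as an assumption rather than as InfiniteGraph's actual method, is a greedy rule: put each new vertex on the partition that already holds most of its neighbors, unless that partition is full. `BestEffortPlacer`, the per-partition `capacity` limit and the tie-breaking rule are all hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical greedy placer; an assumed reading of "best effort", not a product API.
final class BestEffortPlacer {
    private final int partitions;
    private final int capacity; // assumed limit on vertices per partition
    private final int[] load;
    private final Map<String, Integer> home = new HashMap<>();

    BestEffortPlacer(int partitions, int capacity) {
        this.partitions = partitions;
        this.capacity = capacity;
        this.load = new int[partitions];
    }

    // Places a vertex next to as many already-placed neighbors as possible.
    // Ties, and vertices with no placed neighbors, go to the least-loaded partition.
    int place(String vertex, List<String> neighbors) {
        Integer existing = home.get(vertex);
        if (existing != null) {
            return existing; // already placed, nothing to do
        }
        int[] votes = new int[partitions];
        for (String n : neighbors) {
            Integer p = home.get(n);
            if (p != null) {
                votes[p]++;
            }
        }
        int best = -1;
        for (int p = 0; p < partitions; p++) {
            if (load[p] >= capacity) {
                continue; // full partitions are skipped: this is the "best effort" part
            }
            if (best < 0 || votes[p] > votes[best]
                    || (votes[p] == votes[best] && load[p] < load[best])) {
                best = p;
            }
        }
        if (best < 0) {
            throw new IllegalStateException("all partitions are full");
        }
        home.put(vertex, best);
        load[best]++;
        return best;
    }

    Integer partitionOf(String vertex) {
        return home.get(vertex);
    }
}
```

With two partitions of capacity two, a vertex whose only neighbor sits on a full partition is placed elsewhere, which creates exactly the kind of cross-partition edge the following slides deal with.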


Page 10: Gluecon InfiniteGraph/DB

So What Are The Answers? The Look Ahead Example

[Diagram: an Application using a Distributed API over Partition 1 and Partition 2, each with its own Processor; graph vertices labeled A, B, C, D, E, X and Y]

Page 11: Gluecon InfiniteGraph/DB

Which of These Work?

• A carefully orchestrated combination of various options

• Can be tuned (degree of look ahead)

• Healing the graph can be expensive (write cost)

• This can also be tuned/configured (external edge thresholds)
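The slides do not define "external edge threshold". A plausible reading, used here purely as an assumption, is the share of a vertex's edges that point to other partitions: below the threshold the vertex stays put and no write cost is paid; above it, the vertex becomes a candidate to move to the partition holding most of its neighbors. `HealingPolicy` and `suggestMove` are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.OptionalInt;
import java.util.TreeMap;

// Hypothetical healing rule; assumes every vertex and neighbor has an entry in home.
final class HealingPolicy {
    // Suggests a new partition for vertex when its share of external edges
    // (edges to neighbors stored elsewhere) is above threshold; otherwise empty.
    static OptionalInt suggestMove(String vertex, List<String> neighbors,
                                   Map<String, Integer> home, double threshold) {
        int current = home.get(vertex);
        // partition id -> number of this vertex's neighbors stored there
        Map<Integer, Integer> perPartition = new TreeMap<>();
        for (String n : neighbors) {
            perPartition.merge(home.get(n), 1, Integer::sum);
        }
        int local = perPartition.getOrDefault(current, 0);
        int external = neighbors.size() - local;
        if (neighbors.isEmpty() || (double) external / neighbors.size() <= threshold) {
            return OptionalInt.empty(); // cheap case: leave the vertex where it is
        }
        int target = current;
        int best = local;
        for (Map.Entry<Integer, Integer> e : perPartition.entrySet()) {
            if (e.getValue() > best) { // strictly better only; lowest id wins ties
                best = e.getValue();
                target = e.getKey();
            }
        }
        return target == current ? OptionalInt.empty() : OptionalInt.of(target);
    }
}
```

Raising the threshold trades more cross-partition hops at read time for fewer moves at write time, which is the tuning knob the last bullet refers to.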


Page 12: Gluecon InfiniteGraph/DB

Thank you!


[email protected]

twitter.com/infinitegraph