Titan @ Gitpro Conference 2014
DESCRIPTION
Presents Titan and Faunus at the Gitpro conference held on April 12, 2014.
TRANSCRIPT
![Page 1: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/1.jpg)
AURELIUS THINKAURELIUS.COM
TITAN Scalable Graph Database
Matthias Broecheler @mbroecheler April 12th, MMXIV
![Page 2: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/2.jpg)
Database
real time
high throughput
transactional
![Page 3: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/3.jpg)
Graph Database
![Page 4: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/4.jpg)
Graph Database
scalable
integrated
open source
![Page 5: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/5.jpg)
name: Newton type: user
name: Hercules type: user
title: “How to deal with Father issues” type: book
title: “Muscle building for beginners” type: book
title: “Dancing with the Stars” type: DVD
title: “Friends forever bracelet” type: Accessory
![Page 6: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/6.jpg)
name: Newton type: user
name: Hercules type: user
bought
bought
bought
viewed
in-Cart
title: “How to deal with Father issues” type: book
title: “Muscle building for beginners” type: book
title: “Dancing with the Stars” type: DVD
title: “Friends forever bracelet” type: Accessory
![Page 7: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/7.jpg)
name: Newton type: user
name: Hercules type: user
bought
time:24
bought
bought
time:22
time:20
viewed
in-Cart
time:05
time:09
title: “How to deal with Father issues” type: book
title: “Muscle building for beginners” type: book
title: “Dancing with the Stars” type: DVD
title: “Friends forever bracelet” type: Accessory
![Page 8: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/8.jpg)
name: Newton type: user
name: Hercules type: user
bought
time:24
bought
bought
time:22
time:20
viewed
in-Cart
time:05
time:09
title: “How to deal with Father issues” type: book
title: “Muscle building for beginners” type: book
title: “Dancing with the Stars” type: DVD
title: “Friends forever bracelet” type: Accessory
![Page 9: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/9.jpg)
1. Home-grown solution
2. Relational Database
3. Graph Database
![Page 10: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/10.jpg)
Home-grown Solution
- Start with your favorite NoSQL database
  - Cassandra, MongoDB, HBase, etc.
1. Error-prone
2. Data model moves into application code
3. Maintainability hazard
4. No query language support
5. No performance optimization
![Page 11: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/11.jpg)
Relational Database
- Relationship tables, SQL and joins
1. Join processing is expensive
2. Join processing on large tables does not scale
3. Cumbersome query language
4. Inflexible data model
![Page 12: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/12.jpg)
SELECT P.title FROM User U1
JOIN Purchase P1 ON P1.buyerid = U1.userid
JOIN Purchase P2 ON P1.productid = P2.productid
JOIN Purchase P3 ON P2.buyerid = P3.buyerid
JOIN Product P ON P3.productid = P.productid
WHERE U1.name = 'xyz' AND P1.time > T1 AND P2.time > T1
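To make the cost of the four-way join concrete, here is a minimal sketch that runs the query above against an in-memory SQLite database. The table definitions, the sample rows, and the helper name `recommend_sql` are assumptions made for illustration; the slides only show the query itself.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User (userid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Product (productid INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE Purchase (buyerid INTEGER, productid INTEGER, time INTEGER);
INSERT INTO User VALUES (1, 'xyz'), (2, 'abc');
INSERT INTO Product VALUES (10, 'Book A'), (11, 'Book B');
INSERT INTO Purchase VALUES (1, 10, 24), (2, 10, 22), (2, 11, 20);
""")

# The query from the slide, with the user name and T1 as bound parameters.
QUERY = """
SELECT P.title FROM User U1
JOIN Purchase P1 ON P1.buyerid = U1.userid
JOIN Purchase P2 ON P1.productid = P2.productid
JOIN Purchase P3 ON P2.buyerid = P3.buyerid
JOIN Product P ON P3.productid = P.productid
WHERE U1.name = ? AND P1.time > ? AND P2.time > ?
"""

def recommend_sql(name, t1):
    """Titles bought by people who recently bought what `name` recently bought."""
    rows = conn.execute(QUERY, (name, t1, t1)).fetchall()
    return sorted(title for (title,) in rows)
```

Note that the query as written also returns the user's own purchases, because nothing excludes `U1` from the co-buyers matched by `P2`.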
![Page 13: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/13.jpg)
Relational Database
- Relationship tables, joins, and SQL
1. Join processing is expensive
2. Join processing on large tables does not scale
3. Cumbersome query language
4. Inflexible data model
![Page 14: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/14.jpg)
name: Newton type: user
name: Hercules type: user
bought
friends
time:24
bought
bought
time:22
time:20
viewed
in-Cart
time:05 duration: 60
time:09
name: Saturn type: author author
author
title: “How to deal with Father issues” type: book
title: “Muscle building for beginners” type: book
title: “Dancing with the Stars” type: DVD
title: “Friends forever bracelet” type: Accessory
![Page 15: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/15.jpg)
1. Home-grown solution
2. Relational Database
3. Graph Database
![Page 16: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/16.jpg)
UML
Entity Relationship Model
![Page 17: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/17.jpg)
name: Hercules type: user
bought
time:24
Vertex
Edge Label
Edge
Property: key → value
title: “Muscle building for beginners” type: book
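The property graph model on this slide can be sketched in a few lines of Python. The class and field names below are assumptions chosen for illustration and are not Titan's API; the sample data is the vertex pair and edge shown on the slide.

```python
from dataclasses import dataclass, field

@dataclass
class Vertex:
    id: int
    # Properties are key -> value pairs attached to the vertex.
    properties: dict = field(default_factory=dict)

@dataclass
class Edge:
    label: str            # the edge label, e.g. "bought"
    out_vertex: Vertex    # tail of the directed edge
    in_vertex: Vertex     # head of the directed edge
    # Edges carry key -> value properties too, e.g. time.
    properties: dict = field(default_factory=dict)

hercules = Vertex(1, {"name": "Hercules", "type": "user"})
book = Vertex(2, {"title": "Muscle building for beginners", "type": "book"})
bought = Edge("bought", hercules, book, {"time": 24})
```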
![Page 18: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/18.jpg)
name: Newton type: user
name: Hercules type: user
bought
friends
time:24
bought
bought
time:22
time:20
viewed
in-Cart
time:05 duration: 60
time:09
name: Saturn type: author author
author
title: “How to deal with Father issues” type: book
title: “Muscle building for beginners” type: book
title: “Dancing with the Stars” type: DVD
title: “Friends forever bracelet” type: Accessory
![Page 19: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/19.jpg)
g.V.has('name','xyz').outE('bought').has('time',gt,T1).inV
.inE('bought').has('time',gt,T1).outV
.out('bought').title
http://gremlindocs.com/
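For readers who do not know Gremlin, the traversal above can be read step by step as the following plain-Python sketch. The edge list, the titles, and the function name `recommend` are invented sample data and names, not part of the talk; a graph database would follow stored adjacency instead of scanning a list.

```python
# 'bought' edges as (buyer, product, time) triples; invented sample data.
BOUGHT = [("xyz", "p1", 24), ("abc", "p1", 22), ("abc", "p2", 20)]
TITLES = {"p1": "Book A", "p2": "Book B"}

def recommend(name, t1):
    # g.V.has('name', name).outE('bought').has('time', gt, T1).inV
    products = [p for (b, p, t) in BOUGHT if b == name and t > t1]
    # .inE('bought').has('time', gt, T1).outV
    buyers = [b for prod in products
              for (b, p, t) in BOUGHT if p == prod and t > t1]
    # .out('bought').title
    return [TITLES[p] for buyer in buyers
            for (b, p, t) in BOUGHT if b == buyer]
```

Each Gremlin step maps to one comprehension, which is the readability argument the slide makes against the equivalent join query.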
![Page 20: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/20.jpg)
Architecture Analogy
MyISAM
![Page 21: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/21.jpg)
Flexible Persistence
Partitionability
Availability Consistency
![Page 22: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/22.jpg)
Vertex-Centric Indices
- Sort and index edges per vertex by sort key
  - Sort key can be composite
- Enables efficient focused traversals
  - Only retrieve edges that matter
- Uses push-down predicates for quick, index-driven retrieval
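The idea behind a vertex-centric index can be sketched as follows: keep each vertex's edges sorted by a composite key, here (label, time), so a predicate such as "bought after T1" becomes a binary search instead of a scan over all incident edges. The class `VertexEdges` and its methods are hypothetical and only illustrate the principle; they do not reflect Titan's storage layout.

```python
import bisect

class VertexEdges:
    """Edges of a single vertex, kept sorted by the composite key (label, time)."""

    def __init__(self):
        self._keys = []     # sorted list of (label, time)
        self._targets = []  # neighbor ids, parallel to _keys

    def add(self, label, time, target):
        key = (label, time)
        i = bisect.bisect_right(self._keys, key)
        self._keys.insert(i, key)
        self._targets.insert(i, target)

    def newer_than(self, label, t1):
        """Neighbors reached over `label` edges with time > t1, oldest first."""
        lo = bisect.bisect_right(self._keys, (label, t1))
        hi = bisect.bisect_left(self._keys, (label, float("inf")))
        return self._targets[lo:hi]

edges = VertexEdges()
edges.add("bought", 24, "A")
edges.add("viewed", 5, "B")
edges.add("bought", 20, "C")
edges.add("bought", 22, "D")
```

Only the slice between the two binary-search positions is touched, which is what "only retrieve edges that matter" amounts to in this toy version.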
![Page 23: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/23.jpg)
Token Ring
Graph Partitioning
assigns ids to map vertices into “optimal” token range
Lots of interesting questions for future work
uses BOP
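One way to picture id-based placement under an order-preserving partitioner is sketched below: if the partition number sits in the high bits of the vertex id, all vertices of a partition fall into one contiguous key range, and the owning node is found by a range lookup. The bit split, the function names, and the range table are assumptions for illustration only; the slides do not describe Titan's actual id layout.

```python
import bisect

COUNT_BITS = 32  # assumed split of the id, not Titan's actual layout

def make_vertex_id(partition, counter):
    """Put the partition in the high bits so ids sort by partition first."""
    return (partition << COUNT_BITS) | counter

def owner(vertex_id, range_starts):
    """Index of the node whose token range contains vertex_id.

    range_starts is the sorted list of the first token owned by each node.
    """
    return bisect.bisect_right(range_starts, vertex_id) - 1

# Three hypothetical nodes owning partitions 0-1, 2-3, and 4 upward.
RANGE_STARTS = [0, make_vertex_id(2, 0), make_vertex_id(4, 0)]
```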
![Page 24: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/24.jpg)
![Page 25: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/25.jpg)
Educating the Planet
![Page 26: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/26.jpg)
Person
Person Student Teacher
Course
Institution
Concept
Discussion
Comment
Share
enrolledIn
teaches
relatesTo
hasCourse
belongsTo
follows
author
references
hasComment relatesTo
author
partOf
relatesTo
![Page 27: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/27.jpg)
121 Billion Edges 6.2 Billion Vertices
1 Million Universities, 3.5 Billion Students
![Page 28: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/28.jpg)
Placement Group
hi1.4xl
Setup
![Page 29: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/29.jpg)
1.1 million edges / sec
using batch mode
Data Ingestion
![Page 30: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/30.jpg)
80 m1.medium
![Page 31: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/31.jpg)
10,200 transactions / sec
16 randomly chosen complex traversal templates
Throughput
![Page 32: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/32.jpg)
| Transaction Description | Avg (ms) | Stdev (ms) |
| --- | --- | --- |
| Student retrieves all content for a single course in their course list | 279.32 | 81.83 |
| Student follows another student | 193.72 | 22.77 |
| Student is recommended people to follow | 241.33 | 256.48 |
| Student reads their stream and shares an item with followers | 284.07 | 68.20 |
| Student retrieves their profile | 53.740 | 22.61 |
| Student reads the most recent comments for their courses | 211.07 | 45.56 |
![Page 33: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/33.jpg)
x = [] as Set; m = [:]
m = user.out('follows').aggregate(x)[0..(num*2)]
    .out('follows').except(x)[0..limit]
    .groupCount(m);
m.sort{-it.value}[0..num]._()
    .transform{ [userid: it.key.id,
                 points: it.value]};
Follow Recommendation
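The follow recommendation above counts how often each not-yet-followed person is reached over two `follows` hops. A simplified Python sketch of the same idea follows; the function name and the dictionary representation of the graph are assumptions. It omits the range limits `[0..(num*2)]` and `[0..limit]` and additionally excludes the user themself, which the Gremlin code does not show.

```python
from collections import Counter

def follow_recommendation(follows, user, num):
    """Rank people followed by the people `user` follows.

    follows maps a user id to the list of user ids they follow.
    """
    followed = set(follows.get(user, ()))
    counts = Counter(
        candidate
        for friend in sorted(followed)          # sorted for a stable tie order
        for candidate in follows.get(friend, ())
        if candidate not in followed and candidate != user
    )
    return [{"userid": uid, "points": pts}
            for uid, pts in counts.most_common(num)]

# Invented sample graph.
FOLLOWS = {"a": ["b", "c"], "b": ["d", "e", "a"], "c": ["d", "b"]}
```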
![Page 34: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/34.jpg)
AURELIUS THINKAURELIUS.COM
Faunus Batch Graph Analytics
![Page 35: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/35.jpg)
- Hadoop-based Graph Computing Framework
- Graph Analytics
- Breadth-first Traversals
- Global Graph Computations
- Batch Big Graph Data
Faunus Features
![Page 36: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/36.jpg)
Faunus Architecture
g._()
![Page 37: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/37.jpg)
Faunus Work Flow
hdfs://user/ubuntu/
output/job-0/
output/job-1/
output/job-2/ { graph*, sideeffect* }
g.V.out.out.count()
Compressed HDFS Graphs
- stored in sequence files
- variable length encoding
- prefix compression
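A breadth-first engine can answer a query like `g.V.out.out.count()` without enumerating paths: every vertex holds a counter of how many paths currently end there, and each `out` step pushes those counters along the out-edges. The sketch below shows that idea in plain Python; it is an illustration under that assumption, not Faunus's MapReduce implementation, and the function names are hypothetical.

```python
from collections import Counter

def step_out(edges, counters):
    """One breadth-first `out` step: push each vertex's path count along its out-edges."""
    nxt = Counter()
    for src, dst in edges:
        if counters[src]:
            nxt[dst] += counters[src]
    return nxt

def count_out_out(vertices, edges):
    counters = Counter({v: 1 for v in vertices})  # g.V
    counters = step_out(edges, counters)          # .out
    counters = step_out(edges, counters)          # .out
    return sum(counters.values())                 # .count()

# Invented sample graph with five two-step paths.
SAMPLE_VERTICES = ["a", "b", "c"]
SAMPLE_EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")]
```

Each step is a single pass over the edge list, which is why this style of traversal fits a batch framework.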
![Page 38: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/38.jpg)
Degree Distribution
GitHub Network
g.V.sideEffect{ it.degree = it.out('follows').count()
}.degree.groupCount
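The same computation, out-degree per vertex followed by a count of how many vertices share each degree, looks like this in plain Python. The function name and the edge-list input are assumptions for illustration.

```python
from collections import Counter

def degree_distribution(vertices, follows_edges):
    """Map each out-degree k to the number of vertices with that out-degree."""
    out_degree = Counter(src for src, _dst in follows_edges)
    # Vertices without outgoing 'follows' edges count as degree 0.
    return Counter(out_degree[v] for v in vertices)

# Invented sample data.
DIST = degree_distribution(["a", "b", "c", "d"],
                           [("a", "b"), ("a", "c"), ("b", "c")])
```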
![Page 39: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/39.jpg)
Degree Distribution
$P(k) \sim k^{-\gamma}$
$\gamma = 2.2$
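The slides do not say how the exponent was obtained. One simple way to estimate it from a degree distribution is a least-squares line fit in log-log space, where a power law becomes a straight line with slope -γ. The sketch below assumes that method; it is known to be a rough estimator on real, noisy data, and the function name is hypothetical.

```python
import math

def fit_power_law_exponent(distribution):
    """Estimate gamma from {degree k: count} by a log-log least-squares fit."""
    points = [(math.log(k), math.log(c))
              for k, c in distribution.items() if k > 0 and c > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return -sxy / sxx  # slope is -gamma

# Synthetic, noise-free data that follows the power law exactly.
SYNTHETIC = {k: 1e6 * k ** -2.2 for k in range(1, 50)}
```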
![Page 40: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/40.jpg)
Global Recommendations
gremlin> g.E.has('label','pushed','to').keep.
         V.out('pushed').out('to').
         in('to').in('pushed').
         sideEffect('{it.score = it.pathCounter}').
         score.order(F.decr,'name')
# Top 5:
Jippi            60892182927
garbear          30095282886
FakeHeal         30038040349
brianchandotcom  24684133382
nyarla           15230275746
![Page 41: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/41.jpg)
Aurelius Graph Cluster
OLTP OLAP
Hadoop MapReduce
Analysis results back into Titan
Apache 2
g.V.label.groupCount
g.v(101).out
titan.thinkaurelius.com
faunus.thinkaurelius.com
![Page 42: Titan @ Gitpro Conference 2014](https://reader033.vdocuments.mx/reader033/viewer/2022061218/54b7a3724a795993718b4780/html5/thumbnails/42.jpg)
AURELIUS THINKAURELIUS.COM
@AURELIUSGRAPHS