Introducing VenmoPlus.com 6/27 version
TRANSCRIPT
![Page 1: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/1.jpg)
Introducing VenmoPlus.com
Explore your Venmo network
Qingpeng “Q.P.” Zhang
Insight Data Engineering Fellow
![Page 2: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/2.jpg)
Venmo ~= Facebook + PayPal
![Page 3: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/3.jpg)
Demo VenmoPlus.com
http://venmoplus.com:8999/#/
![Page 4: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/4.jpg)
Pipeline
Historical transactions
![Page 5: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/5.jpg)
Pipeline
![Page 6: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/6.jpg)
Historical transactions
Real time transactions
Pipeline
![Page 7: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/7.jpg)
2013
Biggest Challenge:
● Calculate/Query graph distance in real time
![Page 8: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/8.jpg)
● Cache of 2nd degree friends list
● Partitioned GraphDB
● Good for LinkedIn (hundreds of millions of users, with higher degree)

● 5 million vertices (users)
● 32 million distinct edges (transactions)
● 88 million total edges (transactions)
![Page 9: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/9.jpg)
No cache (precalculation)?
No GraphDB?
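The graph sizes quoted on the slide allow a quick back-of-the-envelope check of why skipping the cache might be feasible. The sketch below is only an estimate: it assumes edges are undirected and ignores the skew of real social graphs, and the variable names are made up for illustration.

```python
# Figures quoted on the slide.
vertices = 5_000_000
distinct_edges = 32_000_000
total_edges = 88_000_000

# Assumption: each distinct edge is undirected, so it touches two users.
avg_degree = 2 * distinct_edges / vertices

# Average number of transactions per connected pair of users.
avg_tx_per_pair = total_edges / distinct_edges

# Crude size of a 2nd-degree friend list if every user had the average degree.
# Real graphs are skewed, so treat this as an order-of-magnitude figure only.
rough_second_degree = avg_degree ** 2

print(f"average degree: {avg_degree:.1f}")
print(f"transactions per distinct pair: {avg_tx_per_pair:.2f}")
print(f"rough 2nd-degree list size: {rough_second_degree:.0f}")
```

Under these assumptions a typical user has on the order of ten direct friends, which is small enough that friend lists could plausibly be intersected at query time.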
![Page 10: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/10.jpg)
Historical transactions
Real time transactions
Two Databases
![Page 11: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/11.jpg)
Two Databases
420890 Graham Hadley
1630476 Leon Tang
810029 Harminder Toor
1371353 Ephraim Park
562884 Paul Min
420890 set(14935158, 562884)
1630476 set(1371353)
810029 set(190230,14935158)
1371353 set(810029,971156)
562884 set(196371,1371353)
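The two listings above read like key-value mappings: user ID to user name, and user ID to a set of friend IDs. A minimal in-memory sketch of that layout follows, using plain Python dicts as a stand-in for the actual store. The names `user_names`, `friends` and `friend_names` are hypothetical.

```python
# Vertices: userID -> userName (values copied from the slide).
user_names = {
    420890: "Graham Hadley",
    1630476: "Leon Tang",
    810029: "Harminder Toor",
    1371353: "Ephraim Park",
    562884: "Paul Min",
}

# Edges: userID -> set of userIDs (values copied from the slide).
friends = {
    420890: {14935158, 562884},
    1630476: {1371353},
    810029: {190230, 14935158},
    1371353: {810029, 971156},
    562884: {196371, 1371353},
}


def friend_names(user_id):
    """Return the sorted names of a user's direct friends that have a known name."""
    return sorted(
        user_names[f] for f in friends.get(user_id, set()) if f in user_names
    )


print(friend_names(420890))
print(friend_names(1371353))
```

Friend IDs without an entry in the name mapping (for example 14935158 in this excerpt) are simply skipped by the lookup.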
![Page 12: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/12.jpg)
Two Databases
![Page 13: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/13.jpg)
This, or that? - to build graph
![Page 14: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/14.jpg)
This, or that? - for fast searching
![Page 15: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/15.jpg)
Historical transactions
Real time transactions
Two Databases
![Page 16: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/16.jpg)
Lesson learned
![Page 17: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/17.jpg)
![Page 18: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/18.jpg)
![Page 19: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/19.jpg)
![Page 20: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/20.jpg)
VenmoPlus.com
m4.xlarge
m4.large
m4.xlarge
m4.large
t2.micro
$29.11/day
![Page 21: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/21.jpg)
About Me
● PhD in Computer Science
● BS in Physics
Volunteers:
● Software Carpentry
● Data Carpentry
● American Red Cross
Christmas Eve 2014, ice storm, Michigan
![Page 22: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/22.jpg)
Algorithm Optimization
Shortest distance -> intersection of sets (friend lists)
● 1st degree friends of A ∩ 1st degree friends of B == [] ?
● 2nd degree friends of A ∩ 1st degree friends of B == [] ?
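The set-intersection idea can be sketched as follows. This is an illustration rather than the project's code: it assumes the friend lists are symmetric (if A lists B, B lists A) and only distinguishes distances up to 3. The function name and the example graph are made up.

```python
def friend_distance(friends, a, b):
    """Return 0-3 for the degree of separation, or None if it is more than 3."""
    if a == b:
        return 0
    first_a = friends.get(a, set())
    first_b = friends.get(b, set())
    if b in first_a or a in first_b:
        return 1
    # 1st degree friends of A ∩ 1st degree friends of B is non-empty -> distance 2.
    if first_a & first_b:
        return 2
    # Build A's 2nd-degree friends by unioning the friend lists of A's friends.
    second_a = set()
    for f in first_a:
        second_a |= friends.get(f, set())
    # 2nd degree friends of A ∩ 1st degree friends of B is non-empty -> distance 3.
    if second_a & first_b:
        return 3
    return None


# Hypothetical chain graph: 1 - 2 - 3 - 4 - 5
example = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
print(friend_distance(example, 1, 4))
```

The appeal of this formulation is that each check is a single set operation on lists that are already stored per user, so no path needs to be precomputed.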
![Page 23: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/23.jpg)
Algorithms Design -2
Query the distance between vertices at a historic moment in a constantly changing graph (because we don’t pre-calculate the distance...)
● A recent transaction for a user is history and has already changed the graph
● Query the distance between the two users at that moment
○ (not considering that specific transaction)
○ Remove the influence of that specific transaction temporarily, then restore it
■ Test if that transaction is the first between the pair of users.
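One reading of the steps above is sketched below. It assumes a symmetric friend-list dict and a per-pair transaction counter, both hypothetical, and interprets the "first transaction" test as: if the pair had transacted before, they were already direct friends. Only a first transaction requires removing the edge temporarily.

```python
from collections import Counter


def distance_up_to_3(friends, a, b):
    """Set-intersection distance check; returns 1, 2, 3 or None."""
    first_a, first_b = friends.get(a, set()), friends.get(b, set())
    if b in first_a:
        return 1
    if first_a & first_b:
        return 2
    second_a = set().union(*(friends.get(f, set()) for f in first_a))
    return 3 if second_a & first_b else None


def distance_before_transaction(friends, pair_counts, a, b):
    """Distance between a and b as it was just before their latest transaction.

    Assumes the latest transaction has already been applied to `friends`.
    """
    if pair_counts[frozenset((a, b))] > 1:
        # Not their first transaction: the edge already existed.
        return 1
    # First transaction: remove the edge temporarily, query, then restore it.
    friends[a].discard(b)
    friends[b].discard(a)
    try:
        return distance_up_to_3(friends, a, b)
    finally:
        friends[a].add(b)
        friends[b].add(a)


# Hypothetical triangle 1-2-3; users 1 and 2 have just transacted for the first time.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
counts = Counter({frozenset((1, 2)): 1, frozenset((1, 3)): 2, frozenset((2, 3)): 1})
print(distance_before_transaction(graph, counts, 1, 2))
```

The `try`/`finally` guarantees the edge is restored even if the distance query raises, so the live graph is never left in the temporary state.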
![Page 24: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/24.jpg)
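| Node | Service | Instance type | USD/hour | USD/day |
|------|---------|---------------|----------|---------|
| 1 | Spark | m4.large | 0.12 | 2.88 |
| 2 | Spark | m4.large | 0.12 | 2.88 |
| 3 | Redis | m4.xlarge | 0.24 | 5.76 |
| 4 | Elasticsearch | m4.xlarge | 0.24 | 5.76 |
| 5 | Elasticsearch | m4.xlarge | 0.24 | 5.76 |
| 6 | Kafka, producer | m4.large | 0.12 | 2.88 |
| 7 | Kafka | m4.large | 0.12 | 2.88 |
| 8 | Webserver | t2.micro | 0.013 | 0.312 |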
https://github.com/qingpeng/VenmoPlus for more details!
$29.11/24hours
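The daily total can be reproduced from the per-node figures. The short check below assumes the two numeric columns are the hourly price and the 24-hour price in US dollars, which the arithmetic supports; the variable names are illustrative.

```python
from decimal import Decimal

# (node, service, instance type, assumed USD per hour) as listed on the slide.
NODES = [
    (1, "Spark", "m4.large", Decimal("0.12")),
    (2, "Spark", "m4.large", Decimal("0.12")),
    (3, "Redis", "m4.xlarge", Decimal("0.24")),
    (4, "Elasticsearch", "m4.xlarge", Decimal("0.24")),
    (5, "Elasticsearch", "m4.xlarge", Decimal("0.24")),
    (6, "Kafka, producer", "m4.large", Decimal("0.12")),
    (7, "Kafka", "m4.large", Decimal("0.12")),
    (8, "Webserver", "t2.micro", Decimal("0.013")),
]
HOURS_PER_DAY = 24

# Daily cost per node, using Decimal to avoid floating-point rounding noise.
daily_costs = {node: rate * HOURS_PER_DAY for node, _, _, rate in NODES}
total_per_day = sum(daily_costs.values())

print(total_per_day)  # 29.112
```

The exact sum is 29.112, which the slide quotes as $29.11.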
![Page 25: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/25.jpg)
Algorithms
Distance detection between vertices in graph (1st, 2nd, 3rd friends?)
● 1st degree friends of A ∩ 1st degree friends of B == [] ?
● 2nd degree friends of A ∩ 1st degree friends of B == [] ?
![Page 26: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/26.jpg)
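[Architecture diagram. Labels with bracketed numbers as shown on the slide: Producer [10], Backend/API [9], Frontend [9]; unlabeled components: [7,8], [1-6], [1-6], [4,5,6], [1], [2,3]]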
![Page 27: Introducing VenmoPlus.com 6/27 version](https://reader031.vdocuments.mx/reader031/viewer/2022030307/58e6b1061a28abfd418b65e3/html5/thumbnails/27.jpg)
Redis:
● Graph Edges: userID -> userID
● Graph Vertices: userID -> userName
In memory DB -> Fast graph updating, graph traversal, in real time
Elasticsearch:
● Everything about the transactions
Distributed -> Data storage and full text search, in real time
Big Challenge:
● Graph distance + Common connections in real time
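The "common connections" half of the challenge reduces to one set intersection over two friend lists. The sketch below uses plain Python sets with two entries from the earlier sample data; the function name is hypothetical. Redis provides a `SINTER` command for intersecting stored sets, though whether the project uses it is not stated in the slides.

```python
def common_connections(friends, a, b):
    """Return the users that appear in both a's and b's direct friend lists."""
    return friends.get(a, set()) & friends.get(b, set())


# Two friend lists copied from the sample data shown earlier in the deck.
sample = {
    420890: {14935158, 562884},
    810029: {190230, 14935158},
}

print(common_connections(sample, 420890, 810029))  # {14935158}
```

A non-empty result also means the two users are at most two steps apart, so the same lookup serves both the distance check and the list of mutual connections.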