presto: distributed machine learning and graph processing with sparse matrices
DESCRIPTION
TRANSCRIPT
![Page 1: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/1.jpg)
Distributed Machine Learning and Graph Processing with Sparse Matrices
Speaker: LIN Qianhttp://www.comp.nus.edu.sg/~linqian/
![Page 2: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/2.jpg)
Big Data, Complex Algorithms
PageRank(Dominant eigenvector)
Recommendations(Matrix factorization)
Anomaly detection(Top-K eigenvalues)
User Importance(Vertex Centrality)
Machine learning + Graph algorithms
![Page 3: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/3.jpg)
Large-Scale Processing Frameworks
Data-parallel frameworks – MapReduce/Dryad (2004)– Process each record in parallel – Use case: Computing sufficient statistics, analytics queries
Graph-centric frameworks – Pregel/GraphLab (2010)– Process each vertex in parallel– Use case: Graphical models
Array-based frameworks – MadLINQ (2012)– Process blocks of array in parallel– Use case: Linear Algebra Operations
![Page 4: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/4.jpg)
PageRank using Matrices
Power Method Dominant
eigenvector
Mp
M = web graph matrixp = PageRank vector
Simplified algorithm repeat { p = M*p }
Linear Algebra Operations on Sparse Matrices
p
![Page 5: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/5.jpg)
Statistical software
moderately-sized datasetssingle server, entirely in memory
![Page 6: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/6.jpg)
Work-around for massive dataset
Vertical scalabilitySampling
![Page 7: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/7.jpg)
MapReduce
Limited to aggregation processing
![Page 8: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/8.jpg)
Data analytics
Deep vs. Scalable
Statistical software(R, MATLAB, SPASS, SAS) MapReduce
![Page 9: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/9.jpg)
Improvement ways
1. Statistical sw. += large-scale data mgnt2. MapReduce += statistical functionality3. Combining both existing technologies
![Page 10: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/10.jpg)
Parallel MATLAB, pR
![Page 11: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/11.jpg)
HAMA, SciHadoop
![Page 12: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/12.jpg)
MadLINQ [EuroSys’12]
Linear algebra platform on DryadNot efficient for sparse matrix comp.
![Page 13: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/13.jpg)
Ricardo [SIGMOD’10]
But ends up inheriting the inefficiencies of the MapReduce
interface
R Hadoopaggregation-processing queries
aggregated data
![Page 14: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/14.jpg)
Array-basedSingle-threaded
Limited support for scaling
![Page 15: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/15.jpg)
Challenge 1: Sparse Matrices
![Page 16: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/16.jpg)
Challenge 1 – Sparse Matrices
1 7 13 19 25 31 37 43 49 55 61 67 73 79 85 91 971
10
100
1000
10000
LiveJournal Netflix ClueWeb-1B
Block ID
Blo
ck d
ensi
ty (n
orm
aliz
ed )
1000x more data Computation imbalance
![Page 17: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/17.jpg)
Challenge 2 – Data Sharing
Sharing data through pipes/network
Time-inefficient (sending copies)Space-inefficient (extra copies)
Process
copy of data
local copy
Process
data
Process
copy of data
Process
copy of data
Server 1
network copy
network copy
Server 2
Sparse matrices Communication overhead
![Page 18: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/18.jpg)
Extend R – make it scalable, distributedLarge-scale machine learning and
graph processing on sparse matrices
![Page 19: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/19.jpg)
Presto architecture
![Page 20: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/20.jpg)
Presto architecture
WorkerWorker
Master
R instanceR instance
DRA
M
R instance R instanceR instance
DRA
M
R instance
![Page 21: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/21.jpg)
Distributed array (darray)PartitionedSharedDynamic
![Page 22: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/22.jpg)
foreach
Parallel execution of the loop body
f (x)
Barrier
Call Update to publish changes
![Page 23: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/23.jpg)
PageRank Using Presto
M darray(dim=c(N,N),blocks=(s,N))P darray(dim=c(N,1),blocks=(s,1))while(..){ foreach(i,1:len,
calculate(m=splits(M,i), x=splits(P), p=splits(P,i)) { p m*x
} )}
Create Distributed Array
M p
P1
P2
PN/s
![Page 24: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/24.jpg)
PageRank Using Presto
M darray(dim=c(N,N),blocks=(s,N))P darray(dim=c(N,1),blocks=(s,1))while(..){ foreach(i,1:len,
calculate(m=splits(M,i), x=splits(P), p=splits(P,i)) { p m*x
} )}
Execute function in a cluster
Pass array partitions
p
P1
P2
PN/s
M
![Page 25: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/25.jpg)
Dynamic repartitioning
To address load imbalanceCorrectness
![Page 26: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/26.jpg)
Repartitioning Matrices
Profile execution
Repartition
Partition if
![Page 27: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/27.jpg)
Invariants
compatibility in array sizes
![Page 28: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/28.jpg)
Maintaining Size Invariants
invariant(mat, vec, type=ROW)
![Page 29: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/29.jpg)
Data sharing for multi-core
Zero-copy sharing across cores
![Page 30: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/30.jpg)
Data sharing challenges
1. Garbage collection2. Header conflict
R object data partR object header
R instance R instanceGarbage collect
ReadWrite Write
![Page 31: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/31.jpg)
Overriding R’s allocator
Allocate process-local headersMap data in shared memory
page
Shared R object data partLocal R object header
page boundary page boundary
![Page 32: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/32.jpg)
Immutable partitions Safe sharing
Only share read-only data
![Page 33: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/33.jpg)
Versioning arrays
To ensure correctness when arrays are shared across machines
![Page 34: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/34.jpg)
Fault tolerance
Master: primary-backup replicationWorker: heartbeat-based failure detection
![Page 35: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/35.jpg)
Presto applications
Presto doubles LOC w.r.t. purely programming in R.
![Page 36: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/36.jpg)
Evaluation
Faster than Spark and Hadoop using in-memory data
![Page 37: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/37.jpg)
![Page 38: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/38.jpg)
Multi-core support benefits
![Page 39: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/39.jpg)
Data sharing benefits
10
20
40
4.45
2.49
1.63
0.71
0.7
0.72
Core
S
10
20
40
4.38
2.21
1.22
1.22
2.12
4.16
Compute TransferCO
RES
No sharing
Sharing
![Page 40: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/40.jpg)
Repartitioning benefits
0 20 40 60 80 100 120 140 160
Transfer ComputeW
orke
rs
0 20 40 60 80 100 120 140 160
Wor
kers
No Repartition
Repartition
![Page 41: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/41.jpg)
Repartitioning benefits
0 2 4 6 8 10 12 14 16 18 202000
3000
4000
5000
6000
7000
8000
0
50
100
150
200
250
300
350
400Convergence TimeTime spent partitioning
Number of Repartitions
Tim
e to
con
verg
ence
(s)
Cum
ulati
ve p
artiti
onin
g tim
e (s
)
![Page 42: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/42.jpg)
Limitations
1. In-memory computation2. One writer per partition
3. Array-based programming
![Page 43: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/43.jpg)
• Presto: Large scale array-based framework extends R
• Challenges with Sparse matrices• Repartitioning, sharing versioned arrays
Conclusion
![Page 44: Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices](https://reader035.vdocuments.mx/reader035/viewer/2022070304/54b92a634a795970598b4613/html5/thumbnails/44.jpg)
IMDb Rating: 8.5Release Date: 27 June 2008Director: Doug SweetlandStudio: PixarRuntime: 5 min
Brief: A stage magician’s rabbit gets into a magical onstage brawl against his neglectful guardian with two magic hats.