CaffeOnSpark: Deep Learning on Spark Cluster
![Page 1: CaffeOnSpark: Deep Learning On Spark Cluster](https://reader031.vdocuments.mx/reader031/viewer/2022030310/58f9a936760da3da068b6c3b/html5/thumbnails/1.jpg)
CaffeOnSpark: Deep Learning on Spark Cluster
Andy Feng, Jun Shi and Mridul Jain
Yahoo! Inc.
![Page 2: CaffeOnSpark: Deep Learning On Spark Cluster](https://reader031.vdocuments.mx/reader031/viewer/2022030310/58f9a936760da3da068b6c3b/html5/thumbnails/2.jpg)
Agenda
• Why Deep Learning on Spark?
• CaffeOnSpark
  – Architecture
  – API: Scala + Python
• Demo
  – CaffeOnSpark on Python Notebook
![Page 3: CaffeOnSpark: Deep Learning On Spark Cluster](https://reader031.vdocuments.mx/reader031/viewer/2022030310/58f9a936760da3da068b6c3b/html5/thumbnails/3.jpg)
Deep Learning
Figure: handwritten digits (MNIST) and a deep neural network with forward and backward passes
![Page 4: CaffeOnSpark: Deep Learning On Spark Cluster](https://reader031.vdocuments.mx/reader031/viewer/2022030310/58f9a936760da3da068b6c3b/html5/thumbnails/4.jpg)
Flickr Magic View: https://flickr.com/cameraroll
• Photos organized according to 70 categories
• Empowered by deep learning & machine learning
![Page 5: CaffeOnSpark: Deep Learning On Spark Cluster](https://reader031.vdocuments.mx/reader031/viewer/2022030310/58f9a936760da3da068b6c3b/html5/thumbnails/5.jpg)
Flickr DL/ML Pipeline
(1) Prepare Datasets @ Scale
(2) Deep Learning @ Scale
(3) Non-deep Learning @ Scale
(4) Apply ML Model @ Scale
* http://bit.ly/1KIDfof by Pierre Garrigues, Deep Learning Summit 2015
![Page 6: CaffeOnSpark: Deep Learning On Spark Cluster](https://reader031.vdocuments.mx/reader031/viewer/2022030310/58f9a936760da3da068b6c3b/html5/thumbnails/6.jpg)
Deep Learning vs. Spark
![Page 7: CaffeOnSpark: Deep Learning On Spark Cluster](https://reader031.vdocuments.mx/reader031/viewer/2022030310/58f9a936760da3da068b6c3b/html5/thumbnails/7.jpg)
Deep Learning Frameworks
• Theano
• Torch
• Caffe
  – Popular choice for vision community
  – Widely used in Yahoo
• TensorFlow
• …
![Page 8: CaffeOnSpark: Deep Learning On Spark Cluster](https://reader031.vdocuments.mx/reader031/viewer/2022030310/58f9a936760da3da068b6c3b/html5/thumbnails/8.jpg)
Deep Learning on Spark
![Page 9: CaffeOnSpark: Deep Learning On Spark Cluster](https://reader031.vdocuments.mx/reader031/viewer/2022030310/58f9a936760da3da068b6c3b/html5/thumbnails/9.jpg)
Related Work: SparkNet & DL4J
1) [driver] sc.broadcast(model) to executors
2) [executor] apply DL training against a mini-batch of dataset to update models locally
3) [driver] aggregate(models) to produce a new model
REPEAT
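The broadcast / local-update / aggregate loop above can be simulated without a cluster. The sketch below is not SparkNet or DL4J code: plain Python lists stand in for Spark partitions, a two-parameter linear model stands in for the network, and the aggregation step is assumed to be a simple parameter average. The names `local_sgd`, `average`, `train` and `make_partitions` are hypothetical.

```python
import random


def local_sgd(weights, batch, lr=0.1):
    """[executor] Run SGD over one mini-batch, starting from the broadcast weights."""
    w, b = weights
    for x, y in batch:
        err = (w * x + b) - y  # derivative of 0.5 * err**2 w.r.t. the prediction
        w -= lr * err * x
        b -= lr * err
    return (w, b)


def average(models):
    """[driver] Aggregate per-executor models by averaging each parameter."""
    n = len(models)
    return (sum(m[0] for m in models) / n, sum(m[1] for m in models) / n)


def train(partitions, rounds=200, batch_size=8, seed=0):
    rng = random.Random(seed)
    model = (0.0, 0.0)
    for _ in range(rounds):  # REPEAT
        # 1) "broadcast": every executor starts the round from the same model
        # 2) each executor updates its copy on a mini-batch of its own partition
        local_models = [
            local_sgd(model, rng.sample(part, batch_size)) for part in partitions
        ]
        # 3) the driver combines the local models into a new global model
        model = average(local_models)
    return model


def make_partitions(num_partitions=4, rows=32, seed=1):
    """Synthetic, noise-free data for y = 2x + 1, split into partitions."""
    rng = random.Random(seed)
    return [
        [(x, 2.0 * x + 1.0) for x in (rng.uniform(-1.0, 1.0) for _ in range(rows))]
        for _ in range(num_partitions)
    ]


if __name__ == "__main__":
    w, b = train(make_partitions())
    print(f"w={w:.3f} b={b:.3f}")
```

In this scheme every round passes the full model through the driver twice, once out and once back.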
![Page 10: CaffeOnSpark: Deep Learning On Spark Cluster](https://reader031.vdocuments.mx/reader031/viewer/2022030310/58f9a936760da3da068b6c3b/html5/thumbnails/10.jpg)
CaffeOnSpark Open Sourced
github.com/yahoo/CaffeOnSpark
• Apache 2.0 license
• Distributed deep learning
  – GPU or CPU
  – Ethernet or InfiniBand
• Easily deployed on public cloud or private cloud
![Page 11: CaffeOnSpark: Deep Learning On Spark Cluster](https://reader031.vdocuments.mx/reader031/viewer/2022030310/58f9a936760da3da068b6c3b/html5/thumbnails/11.jpg)
CaffeOnSpark: Scalable Architecture
![Page 12: CaffeOnSpark: Deep Learning On Spark Cluster](https://reader031.vdocuments.mx/reader031/viewer/2022030310/58f9a936760da3da068b6c3b/html5/thumbnails/12.jpg)
CaffeOnSpark: Deployment Options
• Single node
  – spark-submit --master local
• Multiple nodes w/ Ethernet connection
  – spark-submit --master URL -connection ethernet
  – Ex., EC2
• Multiple nodes w/ InfiniBand connection
  – spark-submit --master URL -connection infiniband
  – Ex., Yahoo Hadoop cluster
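The three deployment options differ only in the master URL and the connection type, so one helper can assemble all three command lines. The sketch below is a hypothetical helper, not part of CaffeOnSpark. It uses only option names that appear on these slides (including the class and jar names from the Spark CLI slide); the single-dash spelling of `-connection` and its placement after the jar are assumptions.

```python
import shlex


def caffe_on_spark_command(master, mode="train", connection=None, num_executors=None,
                           devices=1, solver="solver_config_file", model="model_file"):
    """Assemble a spark-submit argument list from the options named on the slides."""
    if mode not in ("train", "test", "feature"):
        raise ValueError(f"unknown mode: {mode}")
    if connection not in (None, "ethernet", "infiniband"):
        raise ValueError(f"unknown connection: {connection}")

    argv = ["spark-submit", "--master", master]
    if num_executors is not None:
        argv += ["--num-executors", str(num_executors)]
    # Class and jar names are copied from the Spark CLI slide.
    argv += ["--class", "com.yahoo.ml.CaffeOnSpark", "caffe-on-spark.jar",
             "-devices", str(devices), "-conf", solver, "-model", model, f"-{mode}"]
    if connection is not None:
        # Assumption: -connection is an application argument, so it follows the jar.
        argv += ["-connection", connection]
    return argv


if __name__ == "__main__":
    for options in (
        dict(master="local"),
        dict(master="URL", connection="ethernet", num_executors=2),
        dict(master="URL", connection="infiniband", num_executors=2),
    ):
        print(" ".join(shlex.quote(a) for a in caffe_on_spark_command(**options)))
```

Returning a list rather than a string lets the result be passed directly to `subprocess.run` without shell quoting issues.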
![Page 13: CaffeOnSpark: Deep Learning On Spark Cluster](https://reader031.vdocuments.mx/reader031/viewer/2022030310/58f9a936760da3da068b6c3b/html5/thumbnails/13.jpg)
Deep Learning: 19x Speedup (est.)
Figure: top-5 validation error vs. training latency (hours)
![Page 14: CaffeOnSpark: Deep Learning On Spark Cluster](https://reader031.vdocuments.mx/reader031/viewer/2022030310/58f9a936760da3da068b6c3b/html5/thumbnails/14.jpg)
CaffeOnSpark: DL Made Easy
Spark CLI
• spark-submit
    --num-executors #_Processes
    --class com.yahoo.ml.CaffeOnSpark
    caffe-on-spark.jar
    -devices #_gpus_per_proc
    -conf solver_config_file
    -model model_file
    -train | -test | -feature
Caffe Configuration
layer {
  name: "data"
  type: "MemoryData"
  source_class: "com.yahoo.ml.caffe.LMDB"
  memory_data_param {
    source: "hdfs:///mnist/trainingdata/"
    batch_size: 64
    channels: 1
    height: 28
    width: 28
  }
  …
}
![Page 15: CaffeOnSpark: Deep Learning On Spark Cluster](https://reader031.vdocuments.mx/reader031/viewer/2022030310/58f9a936760da3da068b6c3b/html5/thumbnails/15.jpg)
CaffeOnSpark: One Program (Scala) http://bit.ly/21ZY1c2
import org.apache.spark.ml.classification.LogisticRegression
import com.yahoo.ml.caffe.{CaffeOnSpark, Config, DataSource}

val cos = new CaffeOnSpark(ctx)
val conf = new Config(ctx, args).init()

// Deep learning
// (1) train the DL model
val dl_train_source = DataSource.getSource(conf, true)
cos.train(dl_train_source)
// (2) extract features via DL
val lr_raw_source = DataSource.getSource(conf, false)
val ext_df = cos.features(lr_raw_source)

// Non-deep learning
// (3) apply ML: convert the float columns to doubles, then fit logistic regression
val lr_input_df = ext_df
  .withColumn("L", cos.floats2doubleUDF(ext_df(conf.label)))
  .withColumn("F", cos.floats2doublesUDF(ext_df(conf.features(0))))
val lr = new LogisticRegression().setLabelCol("L").setFeaturesCol("F")
val lr_model = lr.fit(lr_input_df)
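The hand-off between steps (2) and (3) can be illustrated in plain Python. The sketch below does not use CaffeOnSpark or Spark: `extract_features` is a hypothetical stand-in for the trained network (a fixed brightness feature), the four "images" are made-up data, and the logistic regression is a small batch gradient-descent loop rather than Spark ML's implementation.

```python
import math

# Made-up raw rows: (label, pixel intensities in [0, 1]).
RAW_ROWS = [
    (0, [0.1, 0.2, 0.0, 0.1]),
    (0, [0.2, 0.3, 0.1, 0.2]),
    (1, [0.9, 0.8, 1.0, 0.9]),
    (1, [0.8, 0.7, 0.9, 0.8]),
]


def extract_features(rows):
    """Stand-in for step (2): map each raw row to (label, feature vector)."""
    return [(label, [sum(pixels) / len(pixels) - 0.5]) for label, pixels in rows]


def fit_logistic_regression(samples, lr=0.5, epochs=200):
    """Step (3): fit weights and bias by batch gradient descent on the log loss."""
    dim = len(samples[0][1])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for label, feats in samples:
            z = sum(wi * fi for wi, fi in zip(w, feats)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            for i, fi in enumerate(feats):
                grad_w[i] += (p - label) * fi
            grad_b += p - label
        w = [wi - lr * gi / len(samples) for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b / len(samples)
    return w, b


def predict(model, feats):
    w, b = model
    return 1 if sum(wi * fi for wi, fi in zip(w, feats)) + b > 0 else 0


if __name__ == "__main__":
    samples = extract_features(RAW_ROWS)
    model = fit_logistic_regression(samples)
    print([predict(model, feats) for _, feats in samples])
```

The classifier only ever sees the extracted feature vectors, which is why the feature extractor and the downstream model can be trained and swapped independently.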
![Page 16: CaffeOnSpark: Deep Learning On Spark Cluster](https://reader031.vdocuments.mx/reader031/viewer/2022030310/58f9a936760da3da068b6c3b/html5/thumbnails/16.jpg)
CaffeOnSpark: One Notebook (Python) http://bit.ly/1REZ0cN
![Page 17: CaffeOnSpark: Deep Learning On Spark Cluster](https://reader031.vdocuments.mx/reader031/viewer/2022030310/58f9a936760da3da068b6c3b/html5/thumbnails/17.jpg)
CaffeOnSpark: UI & Logs
![Page 18: CaffeOnSpark: Deep Learning On Spark Cluster](https://reader031.vdocuments.mx/reader031/viewer/2022030310/58f9a936760da3da068b6c3b/html5/thumbnails/18.jpg)
Demo: CaffeOnSpark on EC2
• https://github.com/yahoo/CaffeOnSpark/wiki
  – Get started on EC2
  – Python for CaffeOnSpark
![Page 19: CaffeOnSpark: Deep Learning On Spark Cluster](https://reader031.vdocuments.mx/reader031/viewer/2022030310/58f9a936760da3da068b6c3b/html5/thumbnails/19.jpg)
Summary
• CaffeOnSpark open sourced
  – https://github.com/yahoo/CaffeOnSpark
  – Empower Flickr and other Yahoo services
  – Scalable DL made easy
![Page 20: CaffeOnSpark: Deep Learning On Spark Cluster](https://reader031.vdocuments.mx/reader031/viewer/2022030310/58f9a936760da3da068b6c3b/html5/thumbnails/20.jpg)
THANK YOU
[email protected]