guagua an iterative computing framework on hadoop

41

Upload: pengshanzhang

Post on 14-Jul-2015

121 views

Category:

Software


0 download

TRANSCRIPT

Page 1: Guagua an iterative computing framework on hadoop
Page 2: Guagua an iterative computing framework on hadoop

Guagua: An Iterative Computing Framework on Hadoop

Zhang Pengshan(David), PayPal

Page 3: Guagua an iterative computing framework on hadoop

AGENDA

• Introduction

• Distributed Neural Network Algorithm

• What is Guagua?

• Guagua Advanced Features

• Shifu on Guagua

• Future Plans

Page 4: Guagua an iterative computing framework on hadoop

AGENDA

• Introduction

• Distributed Neural Network Algorithm

• What is Guagua?

• Guagua Advanced Features

• Shifu on Guagua

• Future Plans

Page 5: Guagua an iterative computing framework on hadoop

ALIPAY vs. PAYPAL

Q: Where is risk control in PayPal?

A: Risk control is everywhere in paypal.com.

Page 6: Guagua an iterative computing framework on hadoop

FRAUD TYPES IN PAYPAL

Fraud Types in PayPal

Account Take Over

Stolen Financials

INR/SNAD

Credit Cards

INR: Item Not ReceivedSNAD: Significantly Not as Described

Page 7: Guagua an iterative computing framework on hadoop

RISK CONTROL IN PAYPAL

Models Rules Agents

Page 8: Guagua an iterative computing framework on hadoop

RISK MODELING IN PAYPAL

MODELING CHALLENGES

Thousands of Features

Algorithms(LR, NN, DT)

Big Training

Data

SLA(Online)

Simulation

Page 9: Guagua an iterative computing framework on hadoop

RISK MODELING IN PAYPAL

MODELING CHALLENGES

Thousands of Features

Algorithms(LR, NN, DT)

Big Training

Data

SLA(Online)

Simulation

Q: How to train models with TB data and thousands of features?

Page 10: Guagua an iterative computing framework on hadoop

AGENDA

• Introduction

• Distributed Neural Network Algorithm

• What is Guagua?

• Guagua Advanced Features

• Shifu on Guagua

• Future Plans

Page 11: Guagua an iterative computing framework on hadoop

DISTRIBUTED NEURAL NETWORK ALGORITHM*

Worker Worker Worker

Master

Worker Worker Worker

Master

1st iteration

2nd iteration

GRADIENTS: DOUBLE []

WEIGHTS: DOUBLE [] WEIGHTS: DOUBLE [] WEIGHTS: DOUBLE []

WEIGHTS: DOUBLE [] WEIGHTS: DOUBLE [] WEIGHTS: DOUBLE []

ACCUMULATE GRADIENTS

UPDATE WEIGHTS

ACCUMULATE GRADIENTS

UPDATE WEIGHTS

GRADIENTS: DOUBLE [] GRADIENTS: DOUBLE []

GRADIENTS: DOUBLE [] GRADIENTS: DOUBLE [] GRADIENTS: DOUBLE []

* Distributed batch gradient descent algorithm.

Page 12: Guagua an iterative computing framework on hadoop

DISTRIBUTED NEURAL NETWORK ALGORITHM

Worker Worker Worker

Master

Worker Worker Worker

Master

1st iteration

2nd iteration

GRADIENTS: DOUBLE []

WEIGHTS: DOUBLE [] WEIGHTS: DOUBLE [] WEIGHTS: DOUBLE []

WEIGHTS: DOUBLE [] WEIGHTS: DOUBLE [] WEIGHTS: DOUBLE []

ACCUMULATE GRADIENTS

UPDATE WEIGHTS

ACCUMULATE GRADIENTS

UPDATE WEIGHTS

GRADIENTS: DOUBLE [] GRADIENTS: DOUBLE []

GRADIENTS: DOUBLE [] GRADIENTS: DOUBLE [] GRADIENTS: DOUBLE []

Q: How to implement it?

Page 13: Guagua an iterative computing framework on hadoop

WHY NOT MAHOUT OR SPARK?

Mahout

• No distributed logistic regression & neural network.

• Iterative through Hadoop jobs, bad performance.

Spark• No independent Spark cluster.

• Hadoop cluster is still 1.0 based, not YARN.

Q: How to implement it in Hadoop?

Page 14: Guagua an iterative computing framework on hadoop

POSSIBLE SOLUTIONS

Hadoop YARN Hadoop MapReduce

Pros

Flexible framework for framework Works well on all Hadoop versions

Self resource management Mature computing model

Internal fault tolerance, splits, UI …

Cons

2.0.3-Alpha Different computing model

PayPal Clusters: Hadoop 0.20.2 How to do iterative coordination?

Extra fault tolerance, splits, UI …

Q: How to implement it in Hadoop MapReduce?

Page 15: Guagua an iterative computing framework on hadoop

AGENDA

• Introduction

• Distributed Neural Network Algorithm

• What is Guagua?

• Guagua Advanced Features

• Shifu on Guagua

• Future Plans

Page 16: Guagua an iterative computing framework on hadoop

ITERATIVE COMPUTING MODEL IN GUAGUA

Worker Worker Worker

Master

Worker Worker Worker

Master

1st iteration

2nd iteration

WORKER RESULT

MASTER RESULT MASTER RESULT MASTER RESULT

MASTER RESULT MASTER RESULT MASTER RESULT

WORKER RESULT WORKER RESULT

WORKER RESULT WORKER RESULT WORKER RESULT

Guagua is a framework over such iterative computing model, compared with Hadoop 1.0 over MapReduce.

Page 17: Guagua an iterative computing framework on hadoop

GUAGUA APIMasterComputable

WorkerComputable

Page 18: Guagua an iterative computing framework on hadoop

GUAGUA OVERVIEW

IterativeComputingFramework

CORE

MapReduce Adapter(For Hadoop 1.0)

YARN Adapter(For Hadoop 2.0)

Consistent Client

Distributed Neural Network Application

Master-Workers Core

Fault Tolerance

Coordination

Page 19: Guagua an iterative computing framework on hadoop

PLUGGABLE, SCALABLE INTERCEPTORS

Master

Fault Tolerance Interceptor

ZooKeeper Coordinator

Perf Interceptor

Timer

Master Computation

User Defined Interceptors

Worker

Fault Tolerance Interceptor

ZooKeeper Coordinator

Perf Interceptor

Timer

Worker Computation

User Defined Interceptors

* These two graphs are aspects for each iteration.

Page 20: Guagua an iterative computing framework on hadoop

GUAGUA RUNTIME

Master: Mapper (Container)

Worker: Mapper (Container)

Worker: Mapper (Container)

Worker: Mapper (Container)

Worker: Mapper (Container)

Worker: Mapper (Container)

Page 21: Guagua an iterative computing framework on hadoop

GUAGUA RUNTIME

Master: Mapper (Container)

Worker: Mapper (Container)

Worker: Mapper (Container)

Worker: Mapper (Container)

Worker: Mapper (Container)

Worker: Mapper (Container)

ZooKeeperCluster

REGISTER

REGISTER

REGISTER

REGISTER

REGISTER

REGISTER

1. Master is listening znodes of workers.2. Workers are listening znode of master.

Page 22: Guagua an iterative computing framework on hadoop

GUAGUA RUNTIME

Master: Mapper (Container)

Worker: Mapper (Container)

Worker: Mapper (Container)

Worker: Mapper (Container)

Worker: Mapper (Container)

Worker: Mapper (Container)

ZooKeeperCluster

UPDATE ITER

UPDATE ITER

UPDATE IITER

UPDATE ITER

UPDATE ITER

UPDATE ITER

1. Data is loaded in worker memory in the first iteration.2. Whole process is done when reaches maximal iteration

or halt condition is triggered.

Page 23: Guagua an iterative computing framework on hadoop

AGENDA

• Introduction

• Distributed Neural Network Algorithm

• What is Guagua?

• Guagua Advanced Features

• Shifu on Guagua

• Future Plans

Page 24: Guagua an iterative computing framework on hadoop

FAULT TOLERANCE

Master

Worker

Worker

Worker

Worker

Worker

Master

Worker

Worker

Worker

Worker

Worker

Master

Worker

Worker

Worker

Worker

Worker

Master

Worker

Worker

Worker

Worker

Worker

Master

Worker

Worker

Worker

Worker

Worker

1 2 3 4 … n

Worker

Worker

Worker

Page 25: Guagua an iterative computing framework on hadoop

FAULT TOLERANCE

Master

Worker

Worker

Worker

Worker

Worker

Worker

Worker

Worker

Worker

Worker

Worker

Worker

Worker

Worker

Worker

Worker

Worker

Worker

Worker

Worker

Master

Worker

Worker

Worker

Worker

Worker

1 2 3 4 … n

* The same on workers.

Page 26: Guagua an iterative computing framework on hadoop

STRAGGLER MITIGATION

Master

Worker

Worker

Worker

Worker

Worker

1 2 3

Master

Worker

Worker

Worker

Worker

Worker

Master

Worker

Worker

Worker

Worker

Worker

Page 27: Guagua an iterative computing framework on hadoop

STRAGGLER MITIGATION

Master

Worker

Worker

Worker

Worker

1 2 3

Master

Worker

Worker

Worker

Worker

Master

Worker

Worker

Worker

Worker

Worker Worker Worker

Page 28: Guagua an iterative computing framework on hadoop

STRAGGLER MITIGATION

Master

Worker

Worker

Worker

Worker

1 2 3

Master

Worker

Worker

Worker

Worker

Master

Worker

Worker

Worker

Worker

Worker Worker

Page 29: Guagua an iterative computing framework on hadoop

PROGRESS AND STATE REPORT

0.86 = 432/501 (Current Iteration) / (Total Iteration)

Page 30: Guagua an iterative computing framework on hadoop

GUAGUA UNIT

Page 31: Guagua an iterative computing framework on hadoop

AGENDA

• Introduction

• Distributed Neural Network Algorithm

• What is Guagua?

• Guagua Advanced Features

• Shifu on Guagua

• Future Plans

Page 32: Guagua an iterative computing framework on hadoop

WHAT IS SHIFU?

NEW INIT STATS VARSELECT

NORMALIZE

TRAINPOSTTRAIN

EVAL

STATS VARSELECT

TRAINPOSTTRAIN

EVAL

Shifu* is an open-source, end-to-end machine learning and data mining framework built on top of Hadoop.

Built on Guagua

*Want to try Shifu? Please visit http://shifu.ml.

Page 33: Guagua an iterative computing framework on hadoop

SHIFU ON GUAGUA (TRAIN STEP)

NNMaster

NNWorker NNOutput

MasterComputable

WorkerComputable

AbstractWorkerComputable BasicMasterInterceptor

MasterInterceptor

Gradients

Weights

GUAGUA API SHIFU CODE ENCOG CODE

Page 34: Guagua an iterative computing framework on hadoop

SHIFU NN vs. SPARK LR

Shifu-NN: 1102*20*1 Network, 319 Mappers * 1G

Spark-LR: 1102 features, 120 executors * 3G

0

5

10

15

20

25

30

35

40

45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

Time

Iteration

Run Time Comparison

Shifu-NN

Spark-LR

Page 35: Guagua an iterative computing framework on hadoop

SHIFU NN BENCHMARK RESULTS

All data are located in memory. At most we used 2400 mappers. 20 epochs are used.

0

200

400

600

800

1000

1200

1400

125G 375G 500G 625G 750G 875G 1000G

Run Time(Seconds)

Size of Input

Time(Seconds)

Page 36: Guagua an iterative computing framework on hadoop

AGENDA

• Introduction

• Distributed Neural Network Algorithm

• What is Guagua?

• Guagua Advanced Features

• Shifu on Guagua

• Future Plans

Page 37: Guagua an iterative computing framework on hadoop

WHAT’S NEXT?

• More open source docs

• Support more (distributed) machine learning algorithms

• Improve YARN (Beta) implementation

• Support more input formats

• Big model support

• Deep learning support

Page 38: Guagua an iterative computing framework on hadoop

Q&A

Page 39: Guagua an iterative computing framework on hadoop

APPENDIX

• Website

• http://shifu.ml• http://shifu.ml/docs/guagua/

• Guagua issue website

• https://github.com/shifuml/shifu/issues• https://github.com/shifuml/guagua/issues

• Shifu & Guagua source code:

• https://github.com/shifuml/shifu/• https://github.com/shifuml/guagua/

Page 40: Guagua an iterative computing framework on hadoop