low latency geo-distributed data analytics qifan pu, ganesh ananthanarayanan, peter bodik, srikanth...

Low Latency Geo-distributed Data Analytics

Qifan Pu, Ganesh Ananthanarayanan, Peter Bodik, Srikanth Kandula, Aditya Akella, Paramvir Bahl, Ion Stoica

2

WAN

Geo-distributed Data Analytics

Seattle

Berkeley Beijing

London

Slow & Wasteful

Perf. countersUser activities

…

“Centralized” Data Analytics Paradigm

3

WANSeattle

Berkeley Beijing

London

A single logical analytics cluster across all sites.

44

WANSeattle

Berkeley Beijing

London

Incorporating WAN bandwidths is key to geo-distributed analytics performance.

A single logical analytics system across all sites.

5

Incorporating WAN bandwidths

• Task placement–Decides the destinations of network transfers

• Data placement–Decides the sources of network transfers

Example Analytics Job

SELECT time_window, percentile(latency, 99) GROUP BY time_window

Seattle 40GB 20GB

20GBLondon 40GB

800MB/s

200MB/s

WAN

0

30

60

90

120

150

180

0

0.2

0.4

0.6

0.8

1

0

10

20

30

40

50

60

Task

Fractions

Upload

Time (s)

Download

Time (s)Input Data

(GB)

Calculating Transfer TimeSeattle London

0.5 0.5

40GB40GB 12.5s 12.5s 12.5s

50s

0.2

0.8

20s 20s 20s

2.5s

2.5x

How to solve the general case, with more sites, BW heterogeneity and

data skew?

Seattle

40 20

20London

40

Task Placement (TP Solver)

Task 1 -> LondonTask 2 -> BeijingTask 5 -> London…

Sites MTasks N

Data Matrix (MxN)Upload BWs

Download BWs

8

TPSolver

Optimization Goal: Minimize the longest transfer of all links

0

30

60

90

120

150

180

0

0.2

0.4

0.6

0.8

1

0

10

20

30

40

50

60

Task

Fractions

Upload

Time (s)

Download

Time (s)Input Data

(GB)

London

0.2

0.8

Seattle

100GB 100GB

50s 50s

6.25s40GB

160GB

0.07

0.93

24s 24s 24s

6s

2x

50s

How to jointly optimize data and task placement?

Seattle

100 50

50London

100

Another example

Query Lag

10

Iridium

Jointly optimize data and task placementwith greedy heuristic

improve query response time

bandwidth, query arrivals, etc

Approach

Goal Constraints

11

Iridium with Single Dataset

Iterative heuristics for joint task-data placement.

1, Identify bottlenecksby solving task placement

2, assess:find amount of move data to alleviate current bottleneck

TPSolver

TPSolver

Until query arrivals, repeat.

12

Iridium with Multiple Datasets

• Prioritize high-value datasets:

score = value x urgency / cost - value = sum(timeReduction) for all queries - urgency = 1/avg(query_lag) - cost = amount of data moved

13

Iridium: putting together

• Placement of data– Before query arrival

– prioritize the move of high-value datasets

• Placement of tasks– During query execution:

– constrained solver TP

Solver

Not talked about: estimation of query arrivals, contention of move&query, etc

14

Evaluation

• Spark 1.1.0 and HDFS 2.4.1– Override Spark’s task scheduler with ours– Data placement creates copies in cross-site HDFS

• Geo-distributed EC2 deployment across 8 regions– Tokyo, Singapore, Sydney, Frankfurt, Ireland,

Sao Paulo, Virginia (US) and California (US).

15

• Spark jobs, SQL queries and streaming queries

– Conviva: video sessions paramters

– Bing Edge: running dashboard, streaming

– TPC-DS: decision support queries for retail

– AMP BDB: mix of Hive and Spark queries

• Baseline:– “In-place”: Leave data unmoved + Spark’s scheduling– “Centralized”: aggregate all data onto one site

How well does Iridium perform?

16

Iridium outperforms 4x-19x3x-4x

Conviva Bing-Edge TPC-DS Big-Data

vs. In-place

vs. Centralized

0

20

40

60

80

100 10x

19x7x

4x

Red

uct

ion (

%)

in

Query

Resp

onse

Tim

e

3x4x4

x3x

Iridium subsumes both baselines!

vs. Centralized: Data placement has higher contributionvs. In-place:Equal contributions from two techniques

MedianReduction (%) Vs. Centralized Vs. In-place

Task placementData placementIridium (both)

18%38%75%

24%30%63%

0 20 40 60 800

20

40

60

80

100

IridiumMinBW

Reduct

ion (

%)

in

Query

Resp

onse

Tim

e

Reduction (%) in WAN Usage

1.5xBmin 1.3xBmin

1xBmin

(64%, 19%)

better

MinBW: a scheme that minimizes bandwidth, to BminIridium: budget the bandwidth usage to be m*BminIridium can speed up queries while using

near-optimal bandwidth cost

Bandwidth Cost

19

Related work

• JetStream (NSDI’14)– Data aggregation and adaptive filtering– Does not support arbitrary queries, nor optimizes

task and data placement

• WANalytics (CIDR’15), Geode (NSDI’15)– Optimize BW usage for SQL & general DAG jobs– Can lead to poor query performance time

20

Low Latency Geo-distributed Data Analytics

Data is geographically distributed• Services with global footprintsAnalyze logs across DCs• “99 percentile movie rating”• “Median Skype call setup latency”

Abstraction: Single logical analytics cluster across all sites Incorporating WAN bandwidths Reduce response time over baselines by 3x – 19x

WANSeattle

Berkeley Beijing

London

low latency geo-distributed data analytics qifan pu, ganesh ananthanarayanan, peter bodik, srikanth...

Documents