UNIVERSITY OF ELECTRONIC SCIENCE & TECHNOLOGY OF CHINA IEEE INFOCOM 2015, Hong Kong

RAPIER: Integrating Routing and Scheduling for Coflow-aware Data Center Networks

Yangming Zhao (UESTC), Kai Chen (HKUST), Wei Bai (HKUST), Minlan Yu (USC), Chen Tian (HUST), Yanhui Geng (Huawei), Yiming Zhang (NUDT), Dan Li (Tsinghua), Sheng Wang (UESTC)

[email protected]

Coflow-aware Traffic Optimization

• Why traffic optimization in data center networks?
– Improve traffic scalability
– Improve QoS

• Why coflow-aware?
– Minimize average flow completion time
– Minimize average coflow completion time

• How to optimize network traffic?
– Routing (Hedera, Micro-TE)
– Scheduling (Varys, Baraat, pFabric)

In cluster computing frameworks, a stage cannot complete, or sometimes even start, before it receives all the flows in a coflow from the previous stage

An individual flow can be treated as a special coflow

Why not joint optimization?

Motivation Example

Two coflows: Coflow a: fa1=40Mb, fa2=100Mb; Coflow b: fb1=60Mb, fb2=100Mb

Link bandwidths are all 100Mbps

Case 1: ECMP + Scheduling

Traffic imbalance may occur due to route collisions caused by ECMP

Average CCT=1.5ms

Case 2: Coflow-agnostic load balancing + Scheduling

Average CCT=1.5ms

Considering routing and scheduling separately cannot optimize the average CCT

Routing should also take the dependence among the flows in a coflow into account

Case 3: Coflow-aware routing + Scheduling

Average CCT=1.3ms

Jointly optimizing routing and scheduling can minimize the average CCT

Desirable Properties of RAPIER

Main idea

• Coflow-level Routing
– Distribute all the flows in a coflow evenly across the network

• Coflow-level Scheduling
– Minimal remaining time first principle

• Starvation-free
– Schedule a coflow first if it has been waiting for a long time

• Work-conserving
– Allocate all the available bandwidth as long as there is demand to serve

• Coexistence
– Route mice flows with ECMP and give them the highest priority
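
The scheduling and starvation-free ideas above can be sketched as a simple ordering rule. This is only an illustration, not RAPIER's actual algorithm: the `Coflow` record, the `starvation_threshold` parameter, and the assumption that each coflow's remaining time is already known are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Coflow:
    name: str
    remaining_time: float  # assumed to be estimated elsewhere
    waiting_time: float    # time spent waiting since arrival


def schedule_order(coflows, starvation_threshold):
    """Return coflows in service order.

    Coflows that waited longer than the (hypothetical) threshold go first,
    longest wait first; the rest follow in minimal-remaining-time-first order.
    """
    starving = [c for c in coflows if c.waiting_time > starvation_threshold]
    normal = [c for c in coflows if c.waiting_time <= starvation_threshold]
    starving.sort(key=lambda c: -c.waiting_time)
    normal.sort(key=lambda c: c.remaining_time)
    return starving + normal


coflows = [
    Coflow("a", remaining_time=1.0, waiting_time=0.2),
    Coflow("b", remaining_time=1.6, waiting_time=0.1),
    Coflow("c", remaining_time=5.0, waiting_time=9.0),
]
order = [c.name for c in schedule_order(coflows, starvation_threshold=5.0)]
print(order)  # ['c', 'a', 'b']
```

Coflow c has the largest remaining time but is served first because it exceeded the waiting threshold; a and b then follow by remaining time.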

RAPIER in a Nutshell

For starvation-free

For minimal remaining time first

For work-conserving

Minimize single coflow completion time

Non-linear with integer variable

→ (let $a_i = 1/t_i$) → Non-linear with integer variable

→ (relax integer constraint) → Non-linear without integer variable

→ (let $m_{ij}^k = a_i x_{ij}^k$) → Linear programming

→ Route demand i to j on the path with the largest x and re-solve (2)
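
The rounding step (route each demand on the path with the largest x) can be sketched as follows. This is a minimal illustration under stated assumptions: the fractional weights `x` are taken as given rather than computed by an LP solver, and the names `round_routes`, `coflow_completion_time`, `paths`, and `capacity` are hypothetical. The CCT is evaluated for a single coflow that has the network to itself, where the most loaded link determines the completion time.

```python
def round_routes(x):
    """For each flow, pick the index of the candidate path with the largest weight."""
    return {flow: max(range(len(w)), key=lambda k: w[k]) for flow, w in x.items()}


def coflow_completion_time(volumes, paths, choice, capacity):
    """CCT of one coflow with fixed single-path routes.

    Each flow places its whole volume on every link of its chosen path;
    the CCT is the largest (load / capacity) over all used links.
    """
    load = {}
    for flow, k in choice.items():
        for link in paths[flow][k]:
            load[link] = load.get(link, 0.0) + volumes[flow]
    return max(load[link] / capacity[link] for link in load)


# Two flows, two candidate paths each (one link per path, for brevity).
volumes = {"f1": 40.0, "f2": 100.0}
paths = {"f1": [["L1"], ["L2"]], "f2": [["L1"], ["L2"]]}
capacity = {"L1": 100.0, "L2": 100.0}
x = {"f1": [0.7, 0.3], "f2": [0.2, 0.8]}  # assumed fractional LP output

choice = round_routes(x)
cct = coflow_completion_time(volumes, paths, choice, capacity)
print(choice, cct)  # {'f1': 0, 'f2': 1} 1.0
```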

Relaxation and Rounding

Problem (2)

Problem (4)

Theorem 1: Assume the minimum CCT is $t_{min}$ and $t_{alg}$ is the CCT obtained by Algorithm 2, then

$$t_{alg} \le K \, t_{min}$$

where K is the number of candidate paths for each flow
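
One way to see why a bound of the form $t_{alg} \le K\,t_{min}$ is plausible is the following sketch. It is not taken from the slides: the symbols $t_{LP}$ (the optimum of the relaxed problem), $d_{ij}$ (the volume of the flow from i to j), $c_e$ (the capacity of link e), and $k^*_{ij}$ (the path chosen by rounding) are notation introduced here for illustration.

```latex
\begin{align}
% Relaxing the integer constraint can only lower the optimum.
t_{LP} &\le t_{min} \\
% The K path weights of a flow sum to 1, so the largest is at least 1/K.
x_{ij}^{k^*_{ij}} = \max_k x_{ij}^k &\ge \frac{1}{K}
  \quad \text{since } \sum_{k=1}^{K} x_{ij}^k = 1 \\
% Hence rounding inflates the load on any link e by at most a factor K.
\sum_{(i,j):\, e \in k^*_{ij}} d_{ij}
  &\le K \sum_{(i,j)} \sum_{k \ni e} d_{ij}\, x_{ij}^k
  \le K\, c_e\, t_{LP} \\
% So finishing within K t_{LP} is feasible on the rounded routes.
t_{alg} &\le K\, t_{LP} \le K\, t_{min}
\end{align}
```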

Bandwidth Allocation

Large coflow first for starvation-free

Large flow first to reduce CCT

Implementation

• Central controller
– Algorithm 1

• End host enforcement modules
– OpenFlow based explicit routing
– Bandwidth enforcement

No device modification is required!!

Experiment on Testbed

• Pronto 3295 48-port Gigabit Ethernet switch with PicOS 2.04 system

• Each server has a 4-core Intel E5-1410 2.8GHz CPU, 8GB of memory, a 500GB hard disk, and 1 Gigabit Ethernet NICs

• The servers run 64-bit Debian 6.0 with the Linux 2.6.38.3 kernel

Experiment Results

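| Coflow ID | Flow ID | Source | Destination | Volume (GB) | CCT (s), RAPIER | CCT (s), Routing | CCT (s), Baseline |
|-----------|---------|--------|-------------|-------------|-----------------|------------------|-------------------|
| 1         | 1       | M1     | M4          | 3.17        | 50.6            | 84.1             | 107.1             |
|           | 2       | M2     | M5          | 5.29        |                 |                  |                   |
|           | 3       | M3     | M9          | 5.29        |                 |                  |                   |
| 2         | 4       | M8     | M6          | 10.6        | 100.9           | 203.0            | 289.5             |
|           | 5       | M6     | M5          | 5.29        |                 |                  |                   |
| 3         | 6       | M7     | M4          | 17.9        | 201.1           | 204.1            | 289.2             |
|           | 7       | M9     | M6          | 10.6        |                 |                  |                   |
| Average completion time |  |  |  |  | 117.5       | 163.7            | 228.6             |

RAPIER reduces the average CCT by 48.6% compared to the baseline scheme and by 28.22% compared to the routing-only scheme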
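
The averages and percentage reductions reported for the testbed can be re-derived from the per-coflow completion times. The short check below uses only the CCT values listed in the results; the dictionary layout and the `reduction` helper are illustrative names, not part of RAPIER.

```python
# Per-coflow completion times (seconds) for coflows 1, 2, 3.
cct = {
    "RAPIER": [50.6, 100.9, 201.1],
    "Routing": [84.1, 203.0, 204.1],
    "Baseline": [107.1, 289.5, 289.2],
}

# Average CCT per scheme.
avg = {name: sum(values) / len(values) for name, values in cct.items()}


def reduction(new, old):
    """Percentage by which `new` is smaller than `old`."""
    return 100.0 * (1.0 - new / old)


vs_baseline = reduction(avg["RAPIER"], avg["Baseline"])
vs_routing = reduction(avg["RAPIER"], avg["Routing"])

print({name: round(value, 1) for name, value in avg.items()})
print(round(vs_baseline, 1), round(vs_routing, 2))  # 48.6 28.22
```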

Simulation Settings

• C/C++ based flow level simulator

• CPLEX 10.0 for solving LP

• Fattree and VL2 topologies with 512 servers

• Flows in a coflow arrive simultaneously

• Inter-coflow arrival rate follows a Poisson distribution
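
A workload with these two arrival properties could be generated along the following lines. This is a sketch under assumptions, not the authors' simulator (which is C/C++ based): the function name, the `mean_interval` and `width` parameters, and the record layout are hypothetical. Poisson arrivals are modeled by drawing exponentially distributed gaps between consecutive coflows.

```python
import random


def generate_coflows(num_coflows, mean_interval, width, seed=0):
    """Generate coflow arrivals as a Poisson process.

    Gaps between consecutive coflows are exponential with the given mean;
    all flows in a coflow share the coflow's arrival time.
    """
    rng = random.Random(seed)
    now = 0.0
    coflows = []
    for coflow_id in range(num_coflows):
        now += rng.expovariate(1.0 / mean_interval)
        flows = [
            {"coflow": coflow_id, "flow": flow_id, "arrival": now}
            for flow_id in range(width)
        ]
        coflows.append(flows)
    return coflows


workload = generate_coflows(num_coflows=5, mean_interval=2.0, width=3)
for flows in workload:
    print(flows[0]["coflow"], round(flows[0]["arrival"], 3), len(flows))
```

Fixing the seed keeps runs reproducible, which makes it easier to compare schemes on the same arrival sequence.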

Impact of coflow width

• Reduce average CCT by up to 79.44% in Fattree, and 55.55% in VL2

• Routing-only scheme performs better when coflow width is small.

• Scheduling-only scheme performs better when coflow width is large.

Impact of coflow number

• RAPIER maintains relatively stable performance across different numbers of coflows.

• Scheduling-only scheme is more effective in VL2 than in Fattree

Impact of inter-coflow arrival interval

• The average CCT decreases as the average inter-coflow arrival interval increases

• RAPIER follows the same trend as the scheduling-only scheme when the inter-coflow arrival interval is small

• RAPIER follows the same trend as the routing-only scheme when the inter-coflow arrival interval is large

Simulation Results Summary

• In light-load scenarios, routing contributes more by resolving the flow path collisions caused by ECMP.

• In heavy-load scenarios, scheduling contributes more by determining the sending order of flows and coflows.

• RAPIER integrates both schemes and obtains the benefits of each.

Conclusion

• RAPIER is a system that optimizes the average coflow completion time in DCNs by integrating routing and scheduling.

• RAPIER follows the minimal-remaining-time-first principle to reduce the average coflow completion time.

• We implemented a prototype of RAPIER.

• Simulation results show that RAPIER can greatly reduce the average coflow completion time in DCNs.

• The end!

• Thanks for your attention!
