david g. andersen cmu guohui wang, t. s. eugene ng rice michael kaminsky, dina papagiannaki, michael...

19
David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through: Part-time Optics in Data Centers

Upload: myron-turner

Post on 30-Dec-2015

219 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through:

David G. Andersen

CMU

Guohui Wang,

T. S. Eugene Ng

Rice

Michael Kaminsky, Dina Papagiannaki,

Michael A. Kozuch, Michael Ryan

Intel Labs Pittsburgh

1

c-Through: Part-time Optics in Data Centers

Page 2: David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through:

Current solutions for increasing data center network bandwidth

2

1. Hard to construct 2. Hard to expand

FatTree BCube

Page 3: David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through:

An alternative: hybrid packet/circuit switched data center network

3

Goal of this work: – Feasibility: software design that enables efficient use of optical

circuits– Applicability: application performance over a hybrid network

Page 4: David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through:

Electrical packet switching

Optical circuit switching

Switching technology

Store and forward Circuit switching

Switching capacity

Switching time

Optical circuit switching v.s. Electrical packet switching

4

16x40Gbps at high end e.g. Cisco CRS-1

320x100Gbps on market, e.g. Calient FiberConnect

Packet granularity Less than 10ms

e.g. MEMS optical switch

Page 5: David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through:

5

Optical circuit switching is promising despite slow switching time

Full bisection bandwidth at packet granularitymay not be necessary

[WREN09]: “…we find that traffic at the five edge switches exhibit an ON/OFF pattern… ”

[IMC09][HotNets09]: “Only a few ToRs are hot and most their traffic goes to a few other ToRs. …”

Page 6: David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through:

Hybrid packet/circuit switched network architecture

Optical circuit-switched network for high capacity transfer

Electrical packet-switched network for low latency delivery

Optical paths are provisioned rack-to-rack– A simple and cost-effective choice – Aggregate traffic on per-rack basis to better utilize optical circuits

Page 7: David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through:

Design requirements

7

Control plane:– Traffic demand estimation – Optical circuit configuration

Data plane:– Dynamic traffic de-multiplexing– Optimizing circuit utilization

(optional)

Traffic demands

Page 8: David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through:

c-Through (a specific design)

8

No modification to applications and switches

Leverage end-hosts for traffic

management Centralized control for circuit configuration

Page 9: David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through:

c-Through - traffic demand estimation and traffic batching

9

Per-rack traffic demand vector

2. Packets are buffered per-flow to avoid HOL blocking.

1. Transparent to applications.

Applications

Accomplish two requirements: – Traffic demand estimation – Pre-batch data to improve optical circuit utilization

Socket buffers

Page 10: David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through:

c-Through - optical circuit configuration

10

Use Edmonds’ algorithm to compute optimal configuration

Many ways to reduce the control traffic overhead

Traffic demand

configuration

Controller

configuration

Page 11: David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through:

c-Through - traffic de-multiplexing

11

VLAN #1

Traffic de-multiplexer

VLAN #1 VLAN #2

circuit configuration

traffic

VLAN #2

VLAN-based network isolation:– No need to modify

switches– Avoid the instability

caused by circuit reconfiguration

Traffic control on hosts:– Controller informs hosts

about the circuit configuration

– End-hosts tag packets accordingly

Page 12: David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through:

12

Testbed setup

Ethernet switch

Emulated optical circuit switch

4Gbps links

100Mbps links

16 servers with 1Gbps NICs Emulate a hybrid network on

48-port Ethernet switch

Optical circuit emulation– Optical paths are available

only when hosts are notified – During reconfiguration, no

host can use optical paths– 10 ms reconfiguration delay

Page 13: David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through:

13

Evaluation

Basic system performance: – Can TCP exploit dynamic bandwidth quickly?

– Does traffic control on servers bring significant overhead?

– Does buffering unfairly increase delay of small flows?

Application performance:– Bulk transfer (VM migration)?

– Loosely synchronized all-to-all communication (MapReduce)?

– Tightly synchronized all-to-all communication (MPI-FFT) ?

Yes

No

No

Yes

Yes

Yes

Page 14: David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through:

14

TCP can exploit dynamic bandwidth quickly

Throughput ramps upwithin 10 ms

Throughput stabilizeswithin 100ms

Page 15: David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through:

15

MapReduce Overview

mapper

mapper

mapper

reducer

reducer

reducer

loadlocal write

data shuffling

outputfile

write

outputfile

outputfile

Split 0

Split 1

Split 2

Input file

Concentrated traffic in 64MB blocks Concentrated traffic

in 64MB blocksIndependent transfers:amenable to batching

Page 16: David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through:

16

MapReduce sort 10GB random data

128 KB

50 MB

100 MB

300 MB

500 MB

0

200

400

600

800

1000

c-Through varying socket buffer size limit (reconfiguration interval: 1 sec)

Electrical network

Full bisection

bandwidth

c-Through

Co

mp

leti

on

tim

e (

s)

153s 135s

Page 17: David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through:

MapReduce sort 10GB random data

17

c-Through varying reconfiguration interval (socket buffer size limit: 100MB)

0.3 Sec

0.5 Sec

1.0 Sec

3.0 Sec

5.0 Sec

0

200

400

600

800

Co

mp

leti

on

tim

e (

s)

Electrical network

Full bisection

bandwidth

c-Through

168s 135s

Page 18: David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through:

Yahoo Gridmix benchmark

18

3 runs of 100 mixed jobs such as web query, web scan and sorting 200GB of uncompressed data, 50 GB of compressed data

Page 19: David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through:

19

Summary

Hybrid packet/circuit switched data center network c-Through demonstrates its feasibility Good performance even for applications with all to all traffic

Future directions to explore: The scaling property of hybrid data center networks Making applications circuit aware Power efficient data centers with optical circuits

Picture from Internet websites.