programming the topology of networks - umd

Post on 08-Nov-2021

2 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Programming the Topology of NetworksTechnology and Algorithms

Pierre Blanche Klaus-Tycho Foerster

Jeff Cox Jamie Gaudette

Nikhil Devanur Phillipa Gill

Mark Filer Madeleine Glick

Daniel Kilper Gireeja Ranade

Janardhan Kulkarni Houman Rastegarfar

Ratul Mahajan Stefan Schmid

Amar Phanishayee Rachee Singh

Manya Ghobadi

The cloud infrastructure

Inefficiencies and waste in

the cloud infrastructure

Data centers Optical cables

2

Network

topology

Traffic engineeringCapacity provisioning

Technology and algorithms to optimize network topology

3

High-level idea

Networks with programmable topologies

A B

D C

1

A B

D C

1

1

11

Throughput: 2 units Throughput: 3 units

1

1

1

Dynamic topologyStatic topology

4

• Challenging:

• Requires reconfigurable hardware technology

• Requires revisiting networking layer algorithms

• Impactful:

• Cheaper networks

• Higher throughput

Programmable topologies

5

Talk outline

Data center networks

ProjecToR: Programming the

network topology [SIGCOMM’16]

Technology and algorithms to enable programmable

topologies in the cloud

Google data center

Wide-area networks

Programming the capacity of links[SIGCOMM’18, HotNets’17, OFC’16]

Level3 global backbone

6

Today’s data center interconnects

A B C D

0 3 3 3

3 0 3 3

3 3 0 3

3 3 3 0

Ideal demand matrix:

uniform and static

Static capacity between

Top-of-Rack (ToR) pairs

0 6 6 0

0 0 0 0

0 0 0 0

0 0 0 0

Non-ideal demand matrix:

skewed and dynamic

A

B

C

D

A B C D A B C D

A

B

C

D10Gbps

10Gbps

0 0 0 0

0 0 0 0

8 0 6 0

0 0 0 0

0 0 0 0

0 0 0 7

0 0 0 7

0 0 0 0

0 0 0 0

0 0 0 0

0 0 0 0

0 12 8 0

0 6 6 0

0 0 0 0

0 0 0 0

0 0 0 0

7

Need for a reconfigurable interconnect

Data:

• 200K servers across 4 production clusters

• Cluster sizes: 100 -- 2500 racks

Observation:

• Many rack pairs exchange little traffic

• Only some hot rack pairs are active

Implication:

• Static topology with uniform capacity:

• Over-provisioned for most rack pairs

• Under-provisioned for few others

Reconfigurable interconnect:

To dynamically provide additional capacity between hot rack pairs

8

Our proposal: ProjecToR interconnect

9

• Free-space topology (programmable)

• Digital micromirror device to redirect light

• Disco-ball shaped mirror assembly to magnify reachability

Laser Photodetector

Static topology 9

Digital Micromirror Device (DMD)

Array of micromirrors (10 um) Memory cell

10

A 3-ToR ProjecToR interconnect prototype

ToR1ToR2

ToR3

Source laser

DMD

Mirrors

reflecting to

ToR2 and ToR3

ToR1ToR2

ToR3

11

Routing algorithm

• We have a highly flexible topology allowing for millions of ways to connect lasers to photodetectors

• Ideal solution: fast changing topology to adapt to demand change

• Challenge: It takes 12μs to reprogram a link

ToR1 ToR1

ToR2

ToR3

ToR2

ToR3

lasers photodetectors

12

Routing algorithm

• Two topology approach:

• Slow switching topology or dedicated topology

• Fast switching links or opportunistic links

ToR1

ToR2

ToR3

ToR1

ToR2

ToR3

ToR1 ToR1

ToR2

ToR3

ToR2

ToR3

ToR1

ToR2

ToR3

ToR1

ToR2

ToR3

lasers photodetectors

dedicated topology opportunistic links

13

Routing packets

ToR1

ToR2

ToR3

ToR1

ToR2

ToR3

dedicated topology

2

3

3

3

Virtual output queues

K-shortest paths routing

opportunistic link

2

2

2

2

2

2

2

14

Scheduling opportunistic links

• Given a set of potential links and current traffic demand, find a set of active opportunistic links

ToR1

ToR2

ToR3

ToR1

ToR2

ToR3

0 0 100

100 0 0

0 0 0

s

o

u

r

c

e

d e s t i n a t i o n

15

Scheduling opportunistic links

• Standard switch scheduling problem

• Blossom matching

• BvN matrix decomposition

• Centralized scheduler

• Single tiered matching

0 100 0

0 0 100

100 0 0

s

o

u

r

c

e

d e s t i n a t i o n

input output

16

• Standard switch scheduling problem

• Blossom matching

• BvN matrix decomposition

• Centralized scheduler

• Single tiered matching

Scheduling opportunistic links

input outputSrc ToRs Dst ToRs

Two-tiered

Decentralized

Extended the Gale-Shapely algorithm for finding stable matches [GS-1962]

Constant competitive against an offline optimal allocation17

- Slow switching time

+ Reconfigurable

+ Switching time: 12μs

• Tail flow completion time

• Different traffic matrices

• Impact of switching time

• Impact of fan-out

Simulation results

0

5

10

15

20

25

30

35

40

20 30 40 50 60 70 80

Ave

rage

Flo

w C

om

ple

tio

n T

ime

(m

s)

Average Load (%)

- No reconfigurability

95%

ProjecToR

Fat tree [SIGCOMM’08]

FireFly

[SIGCOMM’14]

18

The key takeaway from this talk

Current assumption: Network topology is fixed

New world: Network topology is dynamicc3

c4

s

w

u v

y

tc7 c8

Problems to solve:

• Scheduling

• Capacity provisioning

• Traffic engineering

• Load-balancing

• Exciting: Unusual wealth of algorithms

• Challenging: Changes fundamental assumptions

• Impactful: Better efficiency ($/Gbps) 19

top related