W3D.2.pdf OFC 2017 © OSA 2017
Self-Adaptive, Multi-Rate Optical Network for Geographically Distributed Metro Data Centers
Payman Samadi1, Matteo Fiorani2, Yiwen Shen1, Lena Wosinska2, Keren Bergman1
1Lightwave Research Laboratory, Department of Electrical Engineering, Columbia University, New York, NY, USA; 2School of ICT, KTH Royal Institute of Technology, Stockholm, Sweden
[email protected], [email protected]
Abstract: We propose a self-adaptive, multi-rate converged architecture and control plane for metro-scale inter-data-center networks, enabling live autonomous bandwidth steering. Experimental and numerical evaluations demonstrate up to 5× and 25% improvements in transmission times and spectrum usage.
OCIS codes: (060.4250) Networks; (060.4510) Optical communications; (200.4650) Optical interconnects.
1. Introduction
Small to mid-sized Data Centers (DCs) at metro-scale distances are now widely used by enterprises and cloud providers. Trends show metro traffic surpassing long-haul traffic, i.e., most of the traffic generated in the metro network stays local and does not traverse the core network [1]. Furthermore, the upcoming 5th generation of mobile communications (5G) is expected to rely on general-purpose hardware and distributed DCs to bring services closer to the end-users. These emerging services, including those enabled by 5G, impose strict latency and bandwidth requirements that force next-generation metro networks to be flexible, dynamic and to support different levels of Quality of Service (QoS). As a result of this inevitable complexity, network management and provisioning needs to move from conventional human operation towards autonomous, self-adaptive and cognitive networks.
Several network architectures have been recently proposed to support dynamic inter-DC networking [2, 3] and provide bandwidth on demand [4]. Also, inside large-scale DCs, automated bandwidth management leveraging Software-Defined Networking (SDN) has been explored [5]. However, there is no comprehensive network architecture and control strategy that supports various QoS levels, dynamic optical connectivity, and autonomous bandwidth steering, using commodity optical and electrical components, for metro-scale inter-DC networks. In previous works, we introduced the concept of a converged inter/intra-DC network with background and dynamic connections supporting multiple QoS levels [6]. In this work, we (i) extend the control plane to enable self-adaptive bandwidth steering, (ii) support a multi-rate optical data plane, (iii) show a comparison (performed by simulations) with single-rate converged and conventional non-converged networks, and (iv) demonstrate a full network prototype with autonomous bandwidth steering. Results show 2–5× shorter transmission times and 20–25% lower wavelength usage compared with single-rate converged and conventional networks. This architecture enables DC scaling in distance, improves application performance reliability by enabling distribution over multiple DCs, and supports connections with strict bandwidth and latency requirements.
2. Hardware Architecture and Control Plane
The network architecture is shown in Fig. 1(a). In this converged architecture, Rack/Pod switches are aggregated using Electrical Packet Switches (EPS) for intra-DC connectivity. The optical gateway [7], a high-port-count Colorless, Directionless, Contentionless Reconfigurable Optical Add/Drop Multiplexer (CDC-ROADM), manages the optical inter-DC connectivity. Racks/Pods are connected to the gateway by optical metro transceivers with different rates (10G, 40G, 100G, etc.) to support dedicated dynamic Rack-to-Rack and/or Pod-to-Pod connections (shown in green). Dynamic connections are utilized for high-priority long-lived traffic. The EPS aggregation switch is also connected to the optical gateway with multiple transceivers for background connections (shown in red). Background connections carry low-priority short-lived traffic. The function of the SDN control plane is illustrated in the flowchart shown in Fig. 1(b). The control plane consists of the traffic monitoring, network optimizer and topology manager modules. Each DC is divided into different subnets based on its size. ToRs and/or Pods are equipped with OpenFlow switches and have permanent flow rules for global inter-DC connectivity through the background connections. The traffic monitoring module periodically receives the flow counters of the background and dynamic flow rules. The decision to add/drop background/dynamic connections is based on the average traffic over the last n seconds and the Standard Deviation (SD) of the traffic on the background connections. Machine learning tools can be applied to optimize the decision-making process using historical data.
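The monitoring decision described above can be sketched as follows; the 9 Gbps saturation threshold and SD limit of 1 match the prototype experiments in Sec. 3, while the function and variable names are our own illustrative choices, not part of the paper's implementation:

```python
import statistics

def classify_background_load(flow_rates_gbps, threshold_gbps=9.0, sd_limit=1.0):
    """Decide what to do with a background connection (sketch).

    flow_rates_gbps: per-flow average throughput (Gb/s) over the last n
    seconds, one entry per flow sharing the background connection.
    """
    total = sum(flow_rates_gbps)
    if total <= threshold_gbps:
        return "no-action"            # link not saturated
    sd = statistics.pstdev(flow_rates_gbps)
    if sd <= sd_limit:
        # traffic is spread evenly: add another background connection
        return "add-background"
    # a few flows dominate: move the heaviest flow to a dynamic connection
    return "add-dynamic"
```

With evenly loaded flows (low SD) a saturated link triggers a second background connection, while a single dominant flow (high SD) triggers a dedicated dynamic connection, mirroring the two experiments in Figs. 2(e) and 2(f).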
Fig. 2(a) shows the proposed network control algorithm. At the beginning, a set of static background connections is established to provide all-to-all connectivity among the metro DCs. The monitoring module measures the traffic on the background connections. If the traffic on background connection B between source and destination DCs (DCS and DCD) is higher than a specified threshold, the network optimizer runs a routing and wavelength assignment algorithm to identify a possibility to establish a new lightpath between DCS and DCD (in our work we use k-Shortest Path with First-Fit). In the positive case, the network optimizer calculates the SD of the traffic of the different flows in B. If the SD is ≤1, the network optimizer creates a new background connection between DCS and DCD. On the other hand, if the SD is >1, the network optimizer creates a new dynamic connection carrying the flows between the Racks/Pods generating the highest traffic. The network optimizer creates the new dynamic connection using the highest available data rate (in this work we assume 10G and 40G rates). If the new dynamic connection has high priority (Critical), the network optimizer can force an active dynamic connection with lower priority (Bulk) to move to a lower data rate.

Fig. 1. (a) Proposed converged architecture with background/dynamic connections, (b) Self-adaptive control plane workflow.
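The routing and wavelength assignment step (k-Shortest Path with First-Fit) can be sketched as below; the candidate paths are assumed to be precomputed shortest-first, and all identifiers are illustrative, not taken from the paper:

```python
def first_fit_lightpath(k_paths, wavelengths_in_use, num_wavelengths):
    """k-Shortest-Path routing with First-Fit wavelength assignment (sketch).

    k_paths: candidate paths (shortest first), each a tuple of link ids.
    wavelengths_in_use: dict mapping link id -> set of occupied wavelength
    indices. Returns (path, wavelength) or None if no free lightpath exists.
    """
    for path in k_paths:
        for w in range(num_wavelengths):  # First-Fit: try lowest index first
            if all(w not in wavelengths_in_use.get(link, set()) for link in path):
                return path, w
    return None
```

A quick usage example: with wavelength 0 occupied on link "A-B", the shortest path ("A-B", "B-C") is still chosen, but on wavelength 1, since First-Fit scans wavelengths in increasing index before falling back to the next candidate path.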
3. Prototype, Experimental and Numerical Results
We developed an event-driven simulator to evaluate the benefits of the proposed architecture and control algorithm in a realistic network scenario. The reference metro topology is composed of 38 nodes, 59 links and 100 wavelengths per fiber [6]. Each node represents a metro DC with 100 Racks/Pods. Each Rack/Pod switch is equipped with one 10G and one 40G WDM tunable transceiver connected to the optical gateway. The EPS that aggregates the Racks/Pods has 25 10G grey transceivers connected to the optical gateway as well. We assume that Rack/Pod switches generate traffic flows with a lognormal inter-arrival distribution. We vary the mean of the lognormal distribution to mimic different traffic loads. On average, half of the traffic flows have low priority (Bulk) and half have high priority (Critical). The flows represent data transfers with sizes uniformly distributed between 1 and 500 GB. We compared the performance of our Converged Multi-Rate (MR) network with a Converged Single-Rate (SR) and a Conventional network, all with the same overall capacity. In the Converged SR, only 10G transceivers are used for both background and dynamic connections, while the Conventional relies only on the background connections [6].
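The traffic model above can be sketched as follows. The lognormal shape parameter, seed, and deterministic alternation of priorities are our own simplifying assumptions; the paper specifies only the lognormal inter-arrivals, uniform 1–500 GB sizes, and the roughly even Bulk/Critical split:

```python
import math
import random

def generate_flows(n_flows, mean_iat_s, sigma=1.0, seed=7):
    """Generate simulated inter-DC transfer requests (sketch)."""
    rng = random.Random(seed)
    # Choose mu so the lognormal mean equals mean_iat_s:
    # E[lognormal(mu, sigma)] = exp(mu + sigma^2 / 2)
    mu = math.log(mean_iat_s) - sigma ** 2 / 2
    t, flows = 0.0, []
    for i in range(n_flows):
        t += rng.lognormvariate(mu, sigma)      # lognormal inter-arrival time
        flows.append({
            "arrival_s": t,
            "size_gb": rng.uniform(1, 500),     # uniform transfer size
            "priority": "Critical" if i % 2 else "Bulk",  # ~50/50 split
        })
    return flows
```

Sweeping `mean_iat_s` downward increases the offered load per DC, which is how the x-axes of Figs. 2(b) and 2(c) would be produced in this kind of simulator.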
Fig. 2(b) shows the average time required to complete a data transfer as a function of the load. It can be observed that the proposed Converged MR provides on average 2.5× faster Critical transfers and 2× faster Bulk transfers with respect to the Converged SR. This is due to the use of multi-rate transmission and to effective QoS management. In addition, the Converged MR provides 5× faster Critical and Bulk transfers compared to the Conventional network. This is due to the combined use of multi-rate transmission and dynamic connections. Fig. 2(c) shows the average number of wavelengths required to carry different loads. The Converged MR requires at least 20% fewer wavelengths than the Converged SR and 25% fewer wavelengths than the Conventional, i.e., it offers more efficient resource usage.
We built a 3-node DC network prototype as shown in Fig. 2(d), an experimental implementation of the architecture shown in Fig. 1(a). DC1 and DC2 are each emulated with 4 ToRs, each connected to one server. ToR switches are implemented using Pica8 OpenFlow switches with 10 Gbps server ports and 10 Gbps and 40 Gbps uplinks. They are aggregated with a 10G EPS switch for intra-DC connectivity. Two optical gateways are implemented using Calient and Polatis OSS, Nistica WSS, and DWDM Mux/Demux. From each ToR, one 10G and one 40G transceiver, and from the EPS two 10G transceivers, are connected to the optical gateway. The 10G optical transceivers are DWDM SFP+ with a 24 dB power budget, while for 40G, due to limitations in single-wavelength 40G DWDM transceivers, we used (4×10G) QSFP+ with an 18 dB power budget. DC3 is implemented only in the control plane, and the distances between the DCs are 5 to 25 km. We assigned the 10.1.x subnet to DC1 and the 10.2.x subnet to DC2. The controller server is connected to the ToR switches via a 1 Gbps campus Internet network. Once the network is initialized, DC1 and DC2 are connected through one 10 Gbps background connection with permanent flow rules on all electronic switches. The control plane
Fig. 2. (a) Network control algorithm, (b) Average transfer times, (c) Average network resource usage, (d) Experimental setup, (e) Autonomous background link establishment between DC1 and DC2 after link saturation by low-SD traffic, (f) Autonomous dynamic Rack-to-Rack link establishment between DC1 and DC2 after an increase in only Rack 1 traffic.
receives the flow counters from the electronic switches every 2 s and averages them over every 4 measurements. In the first set of experiments, we evaluated automated bandwidth steering on the background connections. Racks 1–4 between DC1 and DC2 transmit data at rates from 0.8 to 1.2 Gbps, for a total of 4 Gbps on the background connection. At 25 s, the throughput increases to 4 Gbps on all four Racks, 16 Gbps overall. The background link saturates at 30 s at the total link capacity of 10 Gbps. At this point, the monitoring module of the controller detects the link saturation (>9 Gbps) with an SD of ≤1. In this case, a new background link is established (32 s) and half of the traffic is randomly moved to the new background connection (Racks 1 and 3). Each background connection now carries 8 Gbps of traffic. Fig. 2(e) shows the throughput as a function of time. Next, we demonstrate autonomous bandwidth adjustment for a dynamic Rack-to-Rack connection. Fig. 2(f) shows the results. Racks 1, 2 and 3 between DC1 and DC2 transmit data at rates of 0.8, 1, and 1.2 Gbps, respectively. At 23 s, Rack 1 requires more bandwidth (18 Gbps of traffic) and starts saturating the background connection (30 s). At this point, the controller has measured high throughput (>9 Gbps) on the background connection and, since the SD of the 3 traffic flows is larger than 1, a dedicated Rack-to-Rack connection is established for Rack 1 of each DC (32 s). A dedicated dynamic 40 Gbps link now carries Rack 1's data, while the 10 Gbps background link carries the traffic of Racks 2 and 3.
4. Conclusion
We proposed a self-adaptive, multi-rate optical network architecture and provisioning strategy for geographically distributed metro DCs. Based on the traffic characteristics and the requested QoS, the SDN control plane detects, grooms and autonomously provisions bandwidth resources across the network. Simulation results in a realistic scenario show 2.5–5× shorter transmission times and 20–25% lower wavelength usage compared with converged and conventional single-rate networks. The architecture and control plane were experimentally validated on a prototype.
Acknowledgment
This work was supported in part by CIAN NSF ERC (EEC-0812072), NSF NeTS (CNS-1423105), the DoE ASCR Turbo Project (DE-SC0015867) and the Swedish Research Council (VR). We would also like to thank AT&T, Calient and Polatis for generous donations to our testbed.
References
1. Cisco White Paper, "Cisco Visual Networking Index: Forecast and Methodology, 2014–2019 White Paper," Aug. 2015.
2. S. Yan et al., "Archon: A Function Programmable Optical Interconnect Architecture for Transparent Intra and Inter Data Center... ," JLT 2015.
3. G. Chen et al., "First Demonstration of Holistically-Organized Metro-Embedded Cloud Platform with All-Optical... ," OECC 2015.
4. R. Doverspike et al., "Using SDN Technology to Enable Cost-Effective Bandwidth-on-Demand for Cloud Services," JOCN 2015.
5. D. Adami et al., "Cloud and Network Service Orchestration in Software Defined Data Centers," SPECTS 2015.
6. M. Fiorani et al., "Flexible Architecture & Control Strategy for Metro-Scale Networking of Geographically Distributed DCs," ECOC 2016.
7. P. Samadi et al., "Software-Defined Optical Network for Metro-Scale Geographically Distributed Data Centers," OE 2016.