Handout # 9: Packet Forwarding in Data Plane

Professor Yashar Ganjali
Department of Computer Science, University of Toronto
[email protected]
http://www.cs.toronto.edu/~yganjali
CSC 2229 – Software-Defined Networking



TRANSCRIPT

Announcements
- In-class presentation schedule: sent via email. If you haven't emailed me yet, please do so ASAP.

- Need to change your presentation time? Contact your classmates via the mailing list, and let me know once you agree on a switch, at least one week before your presentation.

- Final project: start early, set weekly goals, and stick to them.

- No office hours next week.

Data Plane Functionalities
- So far we have focused on the control plane: it is what distinguishes SDN from traditional networks, and it is the source of many (perceived) challenges and opportunities.

- What about the data plane? Which features should be provided in the data plane?

- What defines the boundary between the control and data planes?

OpenFlow and Data Plane
- Historically, OpenFlow defined the data plane's role as forwarding; other functions are left to middleboxes and edge nodes in this model.

- Question: Why only forwarding? Other data-path functionalities exist in today's routers and switches too. What distinguishes forwarding from other functions?

Adding New Functionalities
- Consider a new functionality we want to implement in a software-defined network. Where is the best place to add that function? The controller? The switches? Middleboxes? End hosts?

- What metrics or criteria do we have for deciding where new functionalities should be implemented?

- Example: elephant flow detection. Data plane: change the forwarding elements (DevoFlow). Control plane: push functionality close to the path (Kandoo).

- What does the development process look like? Changing an ASIC is expensive and time-consuming; changing software is fast.

Alternative View
- Decouple based on the development cycle: fast-changing functions mostly in software, slow-changing functions mostly in hardware.

- Development process: start in software; as the function matures, move it towards hardware.

- Not a new idea: this model has been used in industry for a long time.

Back to Data Plane
- The most important functionality of the data plane is packet forwarding.

- Packet forwarding: receive the packet on an ingress line, transfer it to the appropriate egress line, and transmit it (see the sketch below).

- Later: other functionalities. Update the header? Other processing?
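The slides stop at this level of abstraction; the following is a minimal sketch of the receive, look up, transmit loop in Python. The table entries, port numbers, and the longest-prefix-match helper are illustrative assumptions, not part of the handout.

```python
# Minimal sketch of the per-packet forwarding step: receive on an ingress
# line, look up the egress line, transmit.  All names and values are made up.
forwarding_table = {      # destination prefix -> egress port (hypothetical)
    "10.0.0.0/24": 1,
    "10.0.1.0/24": 2,
}

def ip_to_int(addr):
    a, b, c, d = (int(x) for x in addr.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def longest_prefix_match(dst, table):
    """Return the egress port of the most specific matching prefix, or None."""
    best_len, best_port = -1, None
    for prefix, port in table.items():
        net, plen = prefix.split("/")
        plen = int(plen)
        shift = 32 - plen
        if ip_to_int(dst) >> shift == ip_to_int(net) >> shift and plen > best_len:
            best_len, best_port = plen, port
    return best_port

def transmit(egress_port, packet):
    print(f"egress {egress_port}: {packet['dst']}")

def forward(ingress_port, packet):
    egress = longest_prefix_match(packet["dst"], forwarding_table)
    if egress is None:
        return                      # no matching entry: drop the packet
    transmit(egress, packet)

forward(3, {"dst": "10.0.1.7"})     # -> egress 2: 10.0.1.7
```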

Output-Queued Switching
[Figure: an output-queued switch. Each of the N inputs performs header processing (IP address lookup against an address table, header update); packets then go straight into the packet buffer memory of their output queue. The output buffer memory must run at N times the line rate.]

Properties of OQ Switches
- Require a speedup of N: the fabric and output memory must be N times faster than the line rate.
- Work conserving: if there is a packet for a specific output port, the output link will not stay idle.

- Ideal in theory: achieves 100% throughput. Very hard to build in practice.

Input-Queued Switching
[Figure: an input-queued switch. Each input performs header processing and buffers its packets in its own packet buffer memory; a scheduler decides which inputs are connected to which outputs.]

Properties of Input-Queued Switches
- Need for a scheduler: choose which input port to connect to which output port during each time slot.

- No speedup necessary, but a speedup is sometimes helpful to ensure 100% throughput.

- Throughput might be limited due to poor scheduling.

Head-of-Line Blocking
[Figure: with a single FIFO per input, the packet at the head of the line can be blocked, and it then blocks the packets queued behind it even when their outputs are free. The switch is NOT work-conserving.]

VOQs: How Packets Move
[Figure: each input keeps one virtual output queue (VOQ) per output; the scheduler matches inputs to outputs every time slot.]

HoL Blocking vs. OQ Switch
[Figure: average delay vs. offered load (0% to 100%) for an OQ switch and an IQ switch with HoL blocking; the IQ switch saturates at a much lower load.]

Scheduling in Input-Queued Switches
- Input: the scheduler can use average rates, queue sizes, and packet delays.

- Output: which ingress port to connect to which egress port, i.e., a matching between input and output ports.

- Goal: achieve 100% throughput.

Common Definitions of 100% Throughput
- Work-conserving.

- For all n, i, j: Q_ij(n) < C, i.e., every VOQ occupancy stays below some constant C at all times.
- For all n, i, j: E[Q_ij(n)] < C, i.e., the expected VOQ occupancies stay bounded.
- Departure rate = arrival rate, i.e., lim_{n→∞} D_ij(n)/n = λ_ij. This is the weakest of the definitions, and it is the one we usually focus on.

- Other definitions? Why 100% throughput?

Scheduling in Input-Queued Switches: Outline
- Uniform traffic: uniform cyclic, random permutation, wait-until-full.
- Non-uniform traffic, known traffic matrix: Birkhoff-von Neumann.
- Unknown traffic matrix: Maximum Size Matching, Maximum Weight Matching.

100% Throughput for Uniform Traffic
- Nearly all algorithms in the literature can give 100% throughput when traffic is uniform. For example: uniform cyclic, random permutation, wait-until-full [simulations], maximum size matching (MSM) [simulations], and maximal size matching (e.g., WFA, PIM, iSLIP) [simulations].

Uniform Cyclic Scheduling
[Figure: each input steps through the outputs in a fixed cyclic order, so the N inputs always form a permutation.]
- Each VOQ is served at rate 1/N, while with uniform traffic its arrival rate is λ/N < 1/N.
- Each (i, j) pair is served every N time slots, so each VOQ behaves like a Geom/D/1 queue: stable for λ < 1.

Wait Until Full
- We don't have to do much at all to achieve 100% throughput when arrivals are Bernoulli IID uniform. Simulation suggests that the following algorithm leads to 100% throughput.

- Wait-until-full: if any VOQ is empty, do nothing (i.e., serve no queues); if no VOQ is empty, pick a random permutation.
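A minimal simulation sketch of wait-until-full on an N x N input-queued switch with VOQs and Bernoulli IID uniform arrivals. N, the load, and the horizon are illustrative choices, not values from the handout.

```python
# Simulate wait-until-full: serve a random permutation only when every VOQ is non-empty.
import random

N, load, slots = 4, 0.9, 100_000
voq = [[0] * N for _ in range(N)]        # voq[i][j]: packets at input i destined to output j
arrived = departed = 0

for _ in range(slots):
    # Bernoulli IID uniform arrivals: each input receives a packet w.p. `load`,
    # destined to a uniformly chosen output.
    for i in range(N):
        if random.random() < load:
            voq[i][random.randrange(N)] += 1
            arrived += 1
    # Wait-until-full: do nothing unless no VOQ is empty, then serve a random permutation.
    if all(voq[i][j] > 0 for i in range(N) for j in range(N)):
        outputs = list(range(N))
        random.shuffle(outputs)          # input i sends to output outputs[i] this slot
        for i in range(N):
            voq[i][outputs[i]] -= 1
            departed += 1

print(f"departures/arrivals = {departed / arrived:.3f}")  # stays near 1.0 if the switch keeps up
```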

Simple Algorithms with 100% Throughput
[Figure: simple schedulers that achieve 100% throughput under uniform traffic: wait-until-full, maximal matching algorithms (iSLIP), MSM, and uniform cyclic.]

Outline
- Uniform traffic: uniform cyclic, random permutation, wait-until-full.
- Non-uniform traffic, known traffic matrix: Birkhoff-von Neumann.
- Unknown traffic matrix: Maximum Size Matching, Maximum Weight Matching.

Non-Uniform Traffic
- Assume the traffic matrix is Λ (a specific matrix is shown on the slide).

- Λ is admissible (rates < 1) and non-uniform.

Uniform Schedule?
- A uniform schedule cannot work:

- Each VOQ would be serviced at rate μ_ij = 1/N.

- For a non-uniform Λ, some VOQ has arrival rate λ_ij > 1/N, i.e., arrival rate > departure rate, so the switch is unstable!
- We need to adapt the schedule to the traffic matrix.

Example 1: Scheduling (Trivial)
- Assume we know the traffic matrix, it is admissible, and it follows a permutation (one non-zero entry per row and column):

- Then we can simply choose the schedule to be that same permutation, served in every time slot.

Example 2: Birkhoff-von Neumann (BvN)

- Consider the traffic matrix shown on the slide.
- Step 1: Increase entries so that each row and each column adds up to 1 (i.e., make the matrix doubly stochastic).

- Step 2: Find permutation matrices P_k and weights a_k such that the padded matrix equals Σ_k a_k·P_k, and schedule these permutation matrices in proportion to their weights. For details, please see the references.
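A minimal sketch of the two BvN steps in Python. The 3 x 3 matrix below is an illustrative admissible matrix, not the one on the slide, and the decomposition routine is a straightforward augmenting-path construction rather than an optimized one.

```python
# BvN scheduling sketch: pad an admissible matrix to doubly stochastic,
# then peel off (weight, permutation) pairs to use as the schedule.
import numpy as np

def pad_to_doubly_stochastic(L):
    """Step 1: increase entries until every row and column sums to 1."""
    L = L.astype(float).copy()
    n = L.shape[0]
    for i in range(n):
        for j in range(n):
            L[i, j] += max(min(1 - L[i].sum(), 1 - L[:, j].sum()), 0.0)
    return L

def find_permutation(L, eps=1e-9):
    """Find a permutation inside the support of L via augmenting paths."""
    n = L.shape[0]
    match = [-1] * n                         # match[j] = input assigned to output j
    def augment(i, seen):
        for j in range(n):
            if L[i, j] > eps and not seen[j]:
                seen[j] = True
                if match[j] == -1 or augment(match[j], seen):
                    match[j] = i
                    return True
        return False
    for i in range(n):
        augment(i, [False] * n)
    return [(match[j], j) for j in range(n)]

def bvn_decompose(L, eps=1e-9):
    """Step 2: peel off permutation matrices and their weights."""
    schedule, L = [], L.copy()
    while L.max() > eps:
        perm = find_permutation(L)
        weight = min(L[i, j] for i, j in perm)
        schedule.append((weight, perm))
        for i, j in perm:
            L[i, j] -= weight
    return schedule

Lam = np.array([[0.5, 0.3, 0.1],
                [0.2, 0.4, 0.2],
                [0.2, 0.2, 0.5]])            # illustrative admissible matrix
for w, perm in bvn_decompose(pad_to_doubly_stochastic(Lam)):
    print(round(w, 3), perm)                 # serve `perm` for a fraction w of the slots
```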

Unknown Traffic Matrix
- We want to maximize throughput, but the traffic matrix is unknown, so we cannot use BvN.
- Idea: maximize instantaneous throughput; in other words, transfer as many packets as possible in each time slot.
- This is the Maximum Size Matching (MSM) algorithm.

Maximum Size Matching (MSM)
- MSM maximizes instantaneous throughput.

- The MSM algorithm: among all maximum size matches, pick a random one.
[Figure: the request graph has an edge (i, j) whenever Q_ij(n) > 0; from it we form a bipartite match, and in particular a maximum size match.]
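A minimal sketch of computing a maximum size matching on the request graph with simple augmenting paths (Kuhn's algorithm). Unlike the slide's algorithm it returns one maximum size match rather than a uniformly random one, and the VOQ occupancies are hypothetical.

```python
# Maximum size matching on the request graph: edge (i, j) exists iff Q[i][j] > 0.
def msm(Q):
    n = len(Q)
    match = [-1] * n                        # match[j] = input currently matched to output j

    def augment(i, seen):
        for j in range(n):
            if Q[i][j] > 0 and not seen[j]:
                seen[j] = True
                if match[j] == -1 or augment(match[j], seen):
                    match[j] = i
                    return True
        return False

    for i in range(n):                      # try to add each input via an augmenting path
        augment(i, [False] * n)
    return {match[j]: j for j in range(n) if match[j] != -1}

Q = [[2, 1, 0],
     [3, 0, 0],
     [0, 4, 1]]                             # hypothetical VOQ occupancies
print(msm(Q))                               # {1: 0, 0: 1, 2: 2}: all three inputs are served
```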

Question: Is the Intuition Right?
- Answer: NO!
- There is a counter-example in which every VOQ (i, j) has arrival rate λ_ij below its service rate μ_ij, yet MSM does not provide 100% throughput.

Counter-example
- Consider a non-uniform traffic pattern with Bernoulli IID arrivals on a 3 x 3 switch in which the only non-zero rates are λ11, λ12, λ21, and λ32, each equal to 1/2 − δ (rates reconstructed from the derivation below; the slide shows the full matrix).

- Consider the case when Q21 and Q32 both have arrivals, which happens with probability (1/2 − δ)². In this case, input 1 is served with probability at most 2/3.

- Overall, the service rate μ1 for input 1 is at most
  (2/3)·(1/2 − δ)² + 1·[1 − (1/2 − δ)²],
  i.e., μ1 ≤ 1 − (1/3)·(1/2 − δ)².

- The switch is unstable for δ < 0.0358.
[Figure: the three possible maximum size matches S(n) in this case.]

Simulation of Simple 3x3 Example
[Figure: simulated behaviour of MSM on the 3x3 counter-example.]
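As a quick numerical check (not on the slides): with the rates reconstructed above, input 1's arrival rate is 2·(1/2 − δ) = 1 − 2δ, so instability sets in when (1/3)·(1/2 − δ)² > 2δ. Solving the boundary equation recovers the quoted threshold up to rounding.

```python
# Boundary of instability: (1/3)*(1/2 - d)^2 = 2*d  <=>  d^2 - 7*d + 1/4 = 0.
import math

d = (7 - math.sqrt(49 - 1)) / 2       # smaller root of the quadratic
print(round(d, 4))                    # 0.0359 -- matches the slide's ~0.0358 threshold
```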

Outline
- Uniform traffic: uniform cyclic, random permutation, wait-until-full.
- Non-uniform traffic, known traffic matrix: Birkhoff-von Neumann.
- Unknown traffic matrix: Maximum Size Matching, Maximum Weight Matching.

Scheduling in Input-Queued Switches
- Problem: maximum size matching maximizes instantaneous throughput but does not take VOQ backlogs into account.
- Solution: give higher priority to VOQs that have more packets.

[Figure: an N x N input-queued switch with arrival processes A_ij(n), VOQ occupancies Q_ij(n), departures D_i(n), and the selected matching S*(n).]

Maximum Weight Matching (MWM)
- Assign weights to the edges of the request graph.

- Find the matching with maximum weight.
[Figure: the request graph (an edge wherever Q_ij(n) > 0) is turned into a weighted request graph by assigning a weight w_ij to each edge.]

MWM Scheduling
1. Create the request graph.
2. Find the associated link weights.
3. Find the matching with maximum weight. (How?)
4. Transfer packets from the ingress lines to the egress lines based on the matching.
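The slides leave step 3 open; as a minimal sketch, the brute-force search below finds the maximum weight matching when the edge weights are the VOQ lengths (the LQF rule defined on the next slide). Enumerating all N! permutations only illustrates the definition for small N; the queue values are hypothetical.

```python
# Maximum weight matching with longest-queue-first weights, by brute force.
from itertools import permutations

def mwm_lqf(Q):
    """Return the input -> output permutation with the largest total queue length."""
    n = len(Q)
    best_weight, best_perm = -1, None
    for perm in permutations(range(n)):
        weight = sum(Q[i][perm[i]] for i in range(n))
        if weight > best_weight:
            best_weight, best_perm = weight, perm
    return best_perm

Q = [[3, 0, 1],
     [0, 2, 0],
     [4, 0, 0]]                 # hypothetical VOQ occupancies
print(mwm_lqf(Q))               # (2, 1, 0): input 0 -> output 2, 1 -> 1, 2 -> 0, weight 7
```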

- Question: How often do we need to calculate the MWM?

Weights
- Longest Queue First (LQF): the weight associated with each link is the length of the corresponding VOQ. MWM with these weights tends to give priority to long queues, but it does not necessarily serve the longest queue.

- Oldest Cell First (OCF): the weight of each link is the waiting time of the HoL packet in the corresponding queue.

Recap
- Uniform traffic: uniform cyclic, random permutation, wait-until-full.
- Non-uniform traffic, known traffic matrix: Birkhoff-von Neumann.
- Unknown traffic matrix: Maximum Size Matching, Maximum Weight Matching.

Summary of MWM Scheduling
- MWM-LQF scheduling provides 100% throughput, but it can starve some packets.
- MWM-OCF scheduling also gives 100% throughput, with no starvation.

- Question: Are these fast enough to implement in real switches?

Complexity of Maximum Matchings
- Maximum size matchings: typical complexity O(N^2.5).
- Maximum weight matchings: typical complexity O(N^3).

- In general: hard to implement in hardware. Slooooow.

- Can we find a faster algorithm?

Maximal Matching
- A maximal matching is a matching built by adding one edge at a time, where an edge, once added, is never removed from the matching.

- No augmenting paths allowed (they would remove edges added earlier).

- Consequence: no input and output are left unnecessarily idle.

Example of Maximal Matching
[Figure: a bipartite request graph with inputs A-F and outputs 1-6, showing a maximal size matching next to a maximum size matching.]

Properties of Maximal Matchings
- In general, maximal matching is much simpler to implement and has a much faster running time.

- A maximal size matching is at least half the size of a maximum size matching. (Why?)

- We'll study the following algorithms: Greedy LQF, WFA, PIM, and iSLIP.

Greedy LQF
- Greedy LQF (Greedy Longest Queue First) is defined as follows: pick the VOQ with the largest number of packets (if there are ties, pick at random among the tied VOQs); say it is VOQ(i1, j1). Then, among all VOQs on still-free inputs and outputs, again pick the one with the largest number of packets, say VOQ(i2, j2) with i2 ≠ i1 and j2 ≠ j1. Continue likewise until the algorithm converges.
- Greedy LQF is also called iLQF (iterative LQF) and Greedy Maximal Weight Matching.

Properties of Greedy LQF
- The algorithm converges in at most N iterations. (Why?)
- Greedy LQF results in a maximal size matching. (Why?)
- Greedy LQF produces a matching that has at least half the size and half the weight of a maximum weight matching. (Why?)
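A minimal sketch of Greedy LQF as defined above. Ties are broken arbitrarily here rather than at random, and the VOQ occupancies are hypothetical.

```python
# Greedy LQF (iLQF): repeatedly add the largest remaining VOQ over free inputs/outputs.
def greedy_lqf(Q):
    n = len(Q)
    free_in, free_out = set(range(n)), set(range(n))
    match = {}
    while True:
        candidates = [(Q[i][j], i, j) for i in free_in for j in free_out if Q[i][j] > 0]
        if not candidates:
            break                             # nothing left to add: the match is maximal
        _, i, j = max(candidates)             # longest queue first (ties broken arbitrarily)
        match[i] = j
        free_in.discard(i)
        free_out.discard(j)
    return match

Q = [[3, 0, 1],
     [0, 2, 0],
     [4, 0, 0]]                               # hypothetical VOQ occupancies
print(greedy_lqf(Q))                          # {2: 0, 1: 1, 0: 2}: picks 4, then 2, then 1
```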

Wave Front Arbiter (WFA) [Tamir and Chi, 1993]
[Figure: the request matrix and the resulting match; diagonal waves sweep across the requests.]

Wave Front Arbiter: Implementation
[Figure: an N x N array of simple combinational logic blocks, one per (input, output) pair.]

Wrapped Wave Front Arbiter (WWFA)
[Figure: wrapping the waves lets the arbiter complete in N steps instead of 2N − 1.]

Properties of Wave Front Arbiters
- The feed-forward (i.e., non-iterative) design lends itself to pipelining.
- Always finds a maximal match.
- Usually requires a mechanism to prevent Q11 from getting preferential service.
- In principle, it can be distributed over multiple chips.

Parallel Iterative Matching (PIM) [Anderson et al., 1993]
[Figure: each iteration has three phases: 1: Requests; 2: Grant, where each output selects one of its requests uniformly at random (uar); 3: Accept/Match, where each input selects one of its grants uniformly at random. Iterations #1 and #2 are shown.]

PIM Properties
- Guaranteed to find a maximal match in at most N iterations. (Why?)
- In each phase, each input and output arbiter can make decisions independently.
- In general, it converges to a maximal match in fewer than N iterations. How many iterations should we run?

Parallel Iterative Matching: Convergence Time
- Expected number of iterations to converge: E[iterations] ≤ log2(N) + 4/3 (Anderson et al., High-Speed Switch Scheduling for Local Area Networks, 1993).
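A minimal sketch of PIM's request, grant, accept phases with uniform-random selection, iterated a few times. The VOQ occupancies and iteration count are hypothetical.

```python
# Parallel Iterative Matching: unmatched inputs request, outputs grant one
# request uniformly at random, inputs accept one grant uniformly at random.
import random

def pim(Q, iterations=4):
    n = len(Q)
    match_in, match_out = {}, {}                       # input -> output, output -> input
    for _ in range(iterations):
        # 1: Requests from unmatched inputs to unmatched outputs with queued cells.
        requests = {j: [i for i in range(n)
                        if i not in match_in and j not in match_out and Q[i][j] > 0]
                    for j in range(n)}
        # 2: Grant -- each output picks one requesting input uniformly at random.
        grants = {}
        for j, reqs in requests.items():
            if reqs:
                grants.setdefault(random.choice(reqs), []).append(j)
        # 3: Accept -- each input picks one granting output uniformly at random.
        for i, outs in grants.items():
            j = random.choice(outs)
            match_in[i], match_out[j] = j, i
        if len(match_in) == n:
            break
    return match_in

Q = [[1, 1, 0],
     [1, 0, 1],
     [0, 1, 1]]                                        # hypothetical VOQ occupancies
print(pim(Q))                                          # a maximal match, e.g. {0: 1, 1: 0, 2: 2}
```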


Parallel Iterative Matching
[Figure: PIM with a single iteration.]

[Figure: PIM with 4 iterations.]

iSLIP [McKeown et al., 1993]
[Figure: an iSLIP example showing 1: Requests, 2: Grant, and 3: Accept/Match over iterations #1 and #2.]

63CSC 2229 - Software-Defined Networking64University of Toronto Fall 2015iSLIP


PIM and iSLIP are rarely run to completion (i.e. they are sub-maximal).

A maximal match with a speedup of 2 is stable for non-uniform traffic.CSC 2229 - Software-Defined Networking66University of Toronto Fall 201566CSC 2229 - Software-Defined Networking67University of Toronto Fall 2015ReferencesAchieving 100% Throughput in an Input-queued Switch (Extended Version). Nick McKeown, Adisak Mekkittikul, Venkat Anantharam and Jean Walrand. IEEE Transactions on Communications, Vol.47, No.8, August 1999.A Practical Scheduling Algorithm to Achieve 100% Throughput in Input-Queued Switches.. Adisak Mekkittikul and Nick McKeown. IEEE Infocom 98, Vol 2, pp. 792-799, April 1998, San Francisco. 67ReferencesA. Schrijver, Combinatorial Optimization - Polyhedra and Efficiency, Springer-Verlag, 2003.

T. Anderson, S. Owicki, J. Saxe, and C. Thacker, High-Speed Switch Scheduling for Local-Area Networks, ACM Transactions on Computer Systems, II (4):319-352, November 1993.

Y. Tamir and H.-C. Chi, Symmetric Crossbar Arbiters for VLSI Communication Switches, IEEE Transactions on Parallel and Distributed Systems, 4(j):13-27, 1993.

N. McKeown, The iSLIP Scheduling Algorithm for Input-Queued Switches, IEEE/ACM Transactions on Networking, 7(2):188-201, April 1999. CSC 2229 - Software-Defined Networking68University of Toronto Fall 201568