Chapter 3 outline: Principles of congestion control (qyang/ele543/2020/Lecture4.pdf)



Transport Layer 3-1

Chapter 3 outline

3.1 transport-layer services
3.2 principles of reliable data transfer
3.3 connection-oriented transport: TCP
• segment structure
• reliable data transfer
• flow control
• connection management
3.4 principles of congestion control
3.5 TCP congestion control

Transport Layer 3-2

Principles of congestion control

congestion:
§ informally: "too many sources sending too much data too fast for the network to handle"
§ different from flow control!
§ manifestations:
• lost packets (buffer overflow at routers)
• long delays (queueing in router buffers)
§ a top-10 problem!

Transport Layer 3-3

Causes/costs of congestion: scenario 1
§ two senders, two receivers
§ one router, infinite buffers
§ output link capacity: R
§ no retransmission
§ maximum per-connection throughput: R/2
§ large delays as arrival rate, λin, approaches capacity

[figure: Host A and Host B send original data λin into a router with unlimited shared output link buffers; plots of throughput λout vs. λin (flattening at R/2) and delay vs. λin (blowing up near R/2)]
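The slides only show the qualitative delay curve. As a rough illustration (my assumption, not the model used in the lecture), a simple M/M/1-style approximation, where average delay grows like 1/(capacity − total arrival rate), reproduces the blow-up as λin approaches R/2:

```python
R = 1.0                                   # output link capacity (packets/sec, toy units)

def avg_delay(lam_in, capacity=R):
    """Illustrative M/M/1-style delay with two identical senders (total load 2*lam_in)."""
    total = 2 * lam_in
    assert total < capacity, "queue is unstable at or above capacity"
    return 1.0 / (capacity - total)       # grows without bound as lam_in -> R/2

for lam in (0.10, 0.30, 0.45, 0.49):
    print(f"lam_in = {lam:.2f}*R  ->  delay ~ {avg_delay(lam):6.1f}")
```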

Transport Layer 3-4

Causes/costs of congestion: scenario 2
§ one router, finite buffers
§ sender retransmission of timed-out packet
• application-layer input = application-layer output: λin = λout
• transport-layer input includes retransmissions: λ'in ≥ λin

[figure: Host A sends λin (original data) and λ'in (original data plus retransmitted data) into a router with finite shared output link buffers; Host B receives λout]

Transport Layer 3-5

Causes/costs of congestion: scenario 2

idealization: perfect knowledge
§ sender sends only when router buffers available

[figure: Host A sends λin (original data) and λ'in (original data plus retransmitted data, a copy kept until free buffer space is known); Host B receives λout; plot of λout vs. λin reaching R/2]

Transport Layer 3-6

Causes/costs of congestion: scenario 2

idealization: known loss
packets can be lost, dropped at router due to full buffers
§ sender only resends if packet known to be lost

[figure: Host A sends λin (original data) and λ'in (original data plus retransmitted data); a copy is retransmitted when there is no buffer space at the router; Host B receives λout]


Transport Layer 3-7

Causes/costs of congestion: scenario 2

idealization: known loss
packets can be lost, dropped at router due to full buffers
§ sender only resends if packet known to be lost
§ when sending at R/2, some packets are retransmissions, but asymptotic goodput is still R/2 (why?)

[figure: Host A sends λin (original data) and λ'in (original data plus retransmitted data, resending only when buffer space is free); Host B receives λout; plot of λout vs. λin approaching R/2]

Transport Layer 3-8

Causes/costs of congestion: scenario 2

realistic: duplicates
§ packets can be lost, dropped at router due to full buffers
§ sender times out prematurely, sending two copies, both of which are delivered
§ when sending at R/2, some packets are retransmissions, including duplicates that are delivered

[figure: Host A sends λin and λ'in (a copy retransmitted on timeout even though buffer space was free); Host B receives λout; plot of λout vs. λin falling below R/2]

Transport Layer 3-9

Causes/costs of congestion: scenario 2

realistic: duplicates
§ packets can be lost, dropped at router due to full buffers
§ sender times out prematurely, sending two copies, both of which are delivered
§ when sending at R/2, some packets are retransmissions, including duplicates that are delivered

"costs" of congestion:
§ more work (retransmissions) for given "goodput"
§ unneeded retransmissions: link carries multiple copies of a packet
• decreasing goodput

[figure: plot of λout vs. λ'in, with λout below R/2]

Transport Layer 3-10

Causes/costs of congestion: scenario 3
§ four senders
§ multihop paths
§ timeout/retransmit

Q: what happens as λin and λ'in increase?
A: as red λ'in increases, all arriving blue packets at the upper queue are dropped, blue throughput → 0

[figure: Hosts A, B, C, D with multihop paths through routers with finite shared output link buffers; λin (original data), λ'in (original data plus retransmitted data), λout]

Transport Layer 3-11

Causes/costs of congestion: scenario 3

another "cost" of congestion:
§ when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted

[figure: plot of λout vs. λ'in, both axes up to C/2; throughput collapses toward 0 as λ'in grows]

Transport Layer 3-12

Chapter 3 outline

3.1 transport-layer services
3.2 multiplexing and demultiplexing
3.3 connectionless transport: UDP
3.4 principles of reliable data transfer
3.5 connection-oriented transport: TCP
• segment structure
• reliable data transfer
• flow control
• connection management
3.6 principles of congestion control
3.7 TCP congestion control


Transport Layer 3-13

TCP congestion control: additive increase, multiplicative decrease
§ approach: sender increases transmission rate (window size), probing for usable bandwidth, until loss occurs
• additive increase: increase cwnd by 1 MSS every RTT until loss detected
• multiplicative decrease: cut cwnd in half after loss

[figure: TCP sender congestion window size (cwnd) vs. time; AIMD saw tooth behavior, probing for bandwidth: additively increase window size ... until loss occurs (then cut window in half)]

Transport Layer 3-14

TCP congestion control: details

§ sender limits transmission: LastByteSent - LastByteAcked ≤ cwnd
§ cwnd is a dynamic function of perceived network congestion

TCP sending rate:
§ roughly: send cwnd bytes, wait RTT for ACKs, then send more bytes

rate ≈ cwnd / RTT  bytes/sec

[figure: sender sequence number space; last byte ACKed, bytes sent but not-yet ACKed ("in-flight", at most cwnd), last byte sent]
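A minimal Python sketch of the AIMD bookkeeping and the rate ≈ cwnd/RTT rule described above (illustrative only; the class name and constants are my own, and real TCP tracks much more state):

```python
MSS = 1460                                   # bytes per segment (a typical value)

class AIMDSender:
    """Toy AIMD congestion-window model (congestion avoidance only)."""
    def __init__(self, cwnd=10 * MSS):
        self.cwnd = cwnd                     # congestion window, in bytes

    def on_ack(self):
        # additive increase: +1 MSS per RTT, spread across the ACKs of one window
        self.cwnd += MSS * MSS / self.cwnd

    def on_loss(self):
        # multiplicative decrease: cut the window in half
        self.cwnd = max(MSS, self.cwnd / 2)

    def rate(self, rtt):
        # sending rate is roughly cwnd / RTT bytes per second
        return self.cwnd / rtt

s = AIMDSender()
print(f"rate ~ {s.rate(rtt=0.1) / 1e3:.0f} kB/s")    # 10-MSS window, 100 ms RTT
s.on_loss()
print(f"after loss: cwnd = {s.cwnd / MSS:.1f} MSS")  # halved to 5 MSS
```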

Transport Layer 3-15

TCP Slow Start
§ when connection begins, increase rate exponentially until first loss event:
• initially cwnd = 1 MSS
• double cwnd every RTT
• done by incrementing cwnd for every ACK received
§ summary: initial rate is slow, but ramps up exponentially fast

[figure: Host A and Host B exchange one segment in the first RTT, then two segments, then four segments]
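A tiny sketch showing why "+1 MSS per ACK" doubles cwnd every RTT during slow start (assumes one ACK per segment and no loss):

```python
MSS = 1
cwnd = 1 * MSS                        # slow start begins at cwnd = 1 MSS
for rtt in range(4):
    segments = cwnd // MSS            # segments sent (and ACKed) this RTT
    print(f"RTT {rtt}: {segments} segment(s), cwnd = {cwnd} MSS")
    cwnd += segments * MSS            # +1 MSS per ACK received -> doubling per RTT
# prints 1, 2, 4, 8 segments per RTT
```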

Transport Layer 3-16

TCP: detecting, reacting to loss
§ loss indicated by timeout:
• cwnd set to 1 MSS
• window then grows exponentially (as in slow start) to threshold, then grows linearly
§ loss indicated by 3 duplicate ACKs: TCP Reno
• dup ACKs indicate network capable of delivering some segments
• cwnd is cut in half, window then grows linearly
§ TCP Tahoe always sets cwnd to 1 (timeout or 3 duplicate ACKs)

Transport Layer 3-17

TCP: switching from slow start to CA

Q: when should the exponential increase switch to linear?
A: when cwnd gets to 1/2 of its value before timeout

Implementation:
§ variable ssthresh
§ on loss event, ssthresh is set to 1/2 of cwnd just before the loss event

Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive

Transport Layer 3-18

Summary: TCP congestion control (FSM)

initialization (on connection start):
• cwnd = 1 MSS, ssthresh = 64 KB, dupACKcount = 0; enter slow start

slow start:
• new ACK: cwnd = cwnd + MSS, dupACKcount = 0, transmit new segment(s) as allowed
• duplicate ACK: dupACKcount++
• cwnd > ssthresh: enter congestion avoidance
• timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit missing segment
• dupACKcount == 3: ssthresh = cwnd/2, cwnd = ssthresh + 3, retransmit missing segment, enter fast recovery

congestion avoidance:
• new ACK: cwnd = cwnd + MSS·(MSS/cwnd), dupACKcount = 0, transmit new segment(s) as allowed
• duplicate ACK: dupACKcount++
• timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit missing segment, enter slow start
• dupACKcount == 3: ssthresh = cwnd/2, cwnd = ssthresh + 3, retransmit missing segment, enter fast recovery

fast recovery:
• duplicate ACK: cwnd = cwnd + MSS, transmit new segment(s) as allowed
• new ACK: cwnd = ssthresh, dupACKcount = 0, enter congestion avoidance
• timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit missing segment, enter slow start
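A hedged Python sketch of the Reno FSM summarized above (slow start / congestion avoidance / fast recovery); it models only the window arithmetic from the transition labels, not timers or actual retransmission:

```python
MSS = 1460

class RenoSender:
    """Illustrative model of the TCP Reno congestion-control FSM above."""
    def __init__(self):
        self.state = "slow_start"
        self.cwnd = 1 * MSS
        self.ssthresh = 64 * 1024
        self.dup_acks = 0

    def on_new_ack(self):
        if self.state == "slow_start":
            self.cwnd += MSS                       # exponential growth
            if self.cwnd > self.ssthresh:
                self.state = "congestion_avoidance"
        elif self.state == "congestion_avoidance":
            self.cwnd += MSS * MSS / self.cwnd     # roughly +1 MSS per RTT
        else:                                      # fast_recovery
            self.cwnd = self.ssthresh              # deflate the window
            self.state = "congestion_avoidance"
        self.dup_acks = 0

    def on_dup_ack(self):
        if self.state == "fast_recovery":
            self.cwnd += MSS                       # window inflation per dup ACK
            return
        self.dup_acks += 1
        if self.dup_acks == 3:                     # triple duplicate ACK
            self.ssthresh = self.cwnd / 2
            self.cwnd = self.ssthresh + 3 * MSS
            self.state = "fast_recovery"
            # retransmit missing segment here

    def on_timeout(self):
        self.ssthresh = self.cwnd / 2
        self.cwnd = 1 * MSS
        self.dup_acks = 0
        self.state = "slow_start"
        # retransmit missing segment here
```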


Transport Layer 3-19

TCP throughput
§ avg. TCP throughput as a function of window size, RTT?
• ignore slow start, assume there is always data to send
§ W: window size (measured in bytes) where loss occurs
• avg. window size (# in-flight bytes) is 3/4 W
• avg. throughput is 3/4 W per RTT

avg TCP throughput = (3/4) · W / RTT  bytes/sec

[figure: sawtooth of window size oscillating between W/2 and W]

Transport Layer 3-20

TCP futures: TCP over "long, fat pipes"
§ example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
§ requires W = 83,333 in-flight segments
§ throughput in terms of segment loss probability, L [Mathis 1997]:

TCP throughput = 1.22 · MSS / (RTT · √L)

→ to achieve 10 Gbps throughput, need a loss rate of L = 2·10⁻¹⁰, a very small loss rate!
§ new versions of TCP for high-speed
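A quick sanity check of the slide's numbers, using throughput ≈ W·MSS/RTT and the [Mathis 1997] formula rearranged for L (a sketch, with the slide's 1500-byte segments and 100 ms RTT):

```python
MSS    = 1500 * 8          # segment size in bits
RTT    = 0.100             # seconds
target = 10e9              # desired throughput: 10 Gbps

# window needed: throughput ~ W * MSS / RTT   =>   W = target * RTT / MSS
W = target * RTT / MSS
print(f"required in-flight window W ~ {W:,.0f} segments")     # ~ 83,333

# Mathis et al.: throughput ~ 1.22 * MSS / (RTT * sqrt(L))   =>   solve for L
L = (1.22 * MSS / (RTT * target)) ** 2
print(f"required segment loss rate L ~ {L:.1e}")              # ~ 2e-10
```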

Transport Layer 3-21

TCP fairness

fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Transport Layer 3-22

Why is TCP fair?
two competing sessions:
§ additive increase gives slope of 1 as throughput increases
§ multiplicative decrease decreases throughput proportionally

[figure: Connection 1 throughput vs. Connection 2 throughput, both axes up to R; the trajectory alternates congestion-avoidance additive increase with loss-triggered halving of the window, converging toward the equal bandwidth share line]
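A minimal simulation sketch of this diagram, under the simplifying assumption that both connections detect loss (and halve their rates) whenever their combined rate exceeds R:

```python
R = 10.0                          # bottleneck link bandwidth (arbitrary units)
x1, x2 = 8.0, 1.0                 # two connections start with very unequal rates

for _ in range(60):
    if x1 + x2 > R:               # bottleneck overloaded: both connections see loss
        x1, x2 = x1 / 2, x2 / 2   # multiplicative decrease
    else:
        x1, x2 = x1 + 1, x2 + 1   # additive increase (one step per "RTT")

print(f"x1 ~ {x1:.2f}, x2 ~ {x2:.2f}")   # the two rates end up (nearly) equal
```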

Transport Layer 3-23

Fairness (more)

Fairness and UDP:
§ multimedia apps often do not use TCP
• do not want rate throttled by congestion control
§ instead use UDP:
• send audio/video at constant rate, tolerate packet loss

Fairness, parallel TCP connections:
§ application can open multiple parallel connections between two hosts
§ web browsers do this
§ e.g., link of rate R with 9 existing connections:
• new app asks for 1 TCP, gets rate R/10
• new app asks for 11 TCPs, gets R/2 (11 of the 20 connections)
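The parallel-connection arithmetic, spelled out (a toy calculation, assuming the link is split evenly per connection):

```python
R = 1.0
existing = 9                                        # connections already sharing the link

# new app opens 1 connection: it holds 1 of 10 equal shares
print(f"1 connection : {R / (existing + 1):.2f} R")          # 0.10 R

# new app opens 11 connections: it holds 11 of the 20 equal shares
print(f"11 connections: {11 * R / (existing + 11):.2f} R")   # 0.55 R, roughly R/2
```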

Transport Layer 3-24

Explicit Congestion Notification (ECN)

network-assisted congestion control:
§ two bits in the IP header (ToS field) marked by a network router to indicate congestion
§ congestion indication carried to the receiving host
§ receiver (seeing the congestion indication in the IP datagram) sets the ECE bit on the receiver-to-sender ACK segment to notify the sender of congestion

[figure: source and destination protocol stacks (application / transport / network / link / physical); IP datagrams marked ECN=00, then ECN=11 at a congested router; TCP ACK segment returned with ECE=1]


Transport Layer 3-25

Chapter 3 summary
§ principles behind transport-layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control
§ instantiation, implementation in the Internet:
• UDP
• TCP

next:
§ leaving the network "edge" (application, transport layers)
§ into the network "core"
§ two network-layer chapters:
• data plane
• control plane

Quiz 2

Consider the following network. Host A wants to simultaneously send messages to hosts B and C. A is connected to B and C via a broadcast channel: a packet sent by A is carried by the channel to both B and C. Suppose that the broadcast channel connecting A, B, and C can independently lose and corrupt messages (and so, for example, a message sent from A might be correctly received by B, but not by C). A stop-and-wait-like error-control protocol is used for reliably transferring packets from A to B and C, such that A will not send a new packet from the upper layer until it knows that both B and C have correctly received the current packet. Packets from upper layers may be queued in a sufficiently large buffer. Give FSM descriptions of A and C. (Hint: the FSM for B should be the same as for C.)

Transport Layer 3-26

Quiz 3

Consider the GBN protocol with a sender window size of 4 and a sequence number range of 1024. Suppose that at time t the next in-order packet that the receiver is expecting has a sequence number of k. Assume that the medium does not reorder messages. Answer the following questions:
a. What are the possible sets of sequence numbers inside the sender's window at time t?
b. What are all possible values of the ACK field in all possible messages currently propagating back to the sender at time t?

Transport Layer 3-27
Transport Layer 3-28

a. Here we have a window size of N = 3. Suppose the receiver has received packet k-1 and has ACKed that and all other preceding packets. If all of these ACKs have been received by the sender, then the sender's window is [k, k+N-1]. Suppose next that none of the ACKs have been received at the sender. In this second case, the sender's window contains k-1 and the N packets up to and including k-1; the sender's window is thus [k-N, k-1]. By these arguments, the sender's window is of size 3 and begins somewhere in the range [k-N, k].
b. If the receiver is waiting for packet k, then it has received (and ACKed) packet k-1 and the N-1 packets before that. If none of those N ACKs have yet been received by the sender, then ACK messages with values in [k-N, k-1] may still be propagating back. Because the sender has sent packets [k-N, k-1], it must be the case that the sender has already received an ACK for k-N-1. Once the receiver has sent an ACK for k-N-1, it will never send an ACK that is less than k-N-1. Thus the range of in-flight ACK values can range from k-N-1 to k-1.
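A small brute-force check of part (a), under the same assumptions as the answer (receiver has ACKed everything up to k-1, and some prefix of those N cumulative ACKs has reached the sender); the helper variable j below is my own, not something from the slides:

```python
N, k = 3, 100          # window size and the receiver's next expected sequence number

# j = how many of the receiver's last N cumulative ACKs have reached the sender
windows = set()
for j in range(N + 1):
    send_base = k - N + j                       # sender's oldest unACKed packet
    windows.add((send_base, send_base + N - 1))

for lo, hi in sorted(windows):
    print(f"possible sender window: [{lo}, {hi}]")
# the window base ranges over [k-N, k], as argued above
```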

Transport Layer 3-29

Chapter 4: network layer

chapter goals:
§ understand principles behind network-layer services:
• network-layer service models
• forwarding versus routing
• how a router works
• generalized forwarding
§ instantiation, implementation in the Internet

4-30  Network Layer: Data Plane


Network layer
§ transport segment from sending to receiving host
§ on sending side: encapsulates segments into datagrams
§ on receiving side: delivers segments to transport layer
§ network-layer protocols in every host, router
§ router examines header fields in all IP datagrams passing through it

[figure: end-to-end path; full protocol stack (application / transport / network / data link / physical) at the end hosts, and network / data link / physical layers at each router along the path]

4-31  Network Layer: Data Plane

Two key network-layer functions

network-layer functions:
§ forwarding: move packets from a router's input to the appropriate router output
§ routing: determine the route taken by packets from source to destination
• routing algorithms

analogy: taking a trip
§ forwarding: process of getting through a single interchange
§ routing: process of planning the trip from source to destination

4-32  Network Layer: Data Plane

Network service model
Q: What service model for the "channel" transporting datagrams from sender to receiver?

example services for individual datagrams:
§ guaranteed delivery
§ guaranteed delivery with less than 40 msec delay

example services for a flow of datagrams:
§ in-order datagram delivery
§ guaranteed minimum bandwidth to flow
§ restrictions on changes in inter-packet spacing

4-33  Network Layer: Data Plane

Router architecture overview
§ high-level view of generic router architecture

[figure: router input ports and output ports connected by a high-speed switching fabric, plus a routing processor; the forwarding data plane (hardware) operates on a nanosecond timeframe, while the routing/management control plane (software) operates on a millisecond timeframe]

4-34  Network Layer: Data Plane

Input port functions

[figure: input-port pipeline: line termination (physical layer: bit-level reception), then link-layer protocol, receive (data link layer, e.g. Ethernet; see chapter 5), then lookup, forwarding, queueing, then the switch fabric]

decentralized switching:
§ using header field values, look up output port using forwarding table in input-port memory ("match plus action")
§ goal: complete input-port processing at 'line speed'
§ queuing: if datagrams arrive faster than forwarding rate into switch fabric

4-35  Network Layer: Data Plane

Input port functions

[figure: input-port pipeline: line termination (physical layer: bit-level reception), then link-layer protocol, receive (data link layer, e.g. Ethernet; see chapter 5), then lookup, forwarding, queueing, then the switch fabric]

decentralized switching:
§ using header field values, look up output port using forwarding table in input-port memory ("match plus action")
§ destination-based forwarding: forward based only on destination IP address (traditional)
§ generalized forwarding: forward based on any set of header field values

4-36  Network Layer: Data Plane


Destination-based forwarding

forwarding table:

Destination Address Range                                Link Interface
11001000 00010111 00010000 00000000
  through
11001000 00010111 00010111 11111111                      0

11001000 00010111 00011000 00000000
  through
11001000 00010111 00011000 11111111                      1

11001000 00010111 00011001 00000000
  through
11001000 00010111 00011111 11111111                      2

otherwise                                                3

Q: but what happens if ranges don't divide up so nicely?

4-37  Network Layer: Data Plane

Longest prefix matching

longest prefix matching: when looking for a forwarding table entry for a given destination address, use the longest address prefix that matches the destination address.

Destination Address Range                     Link interface
11001000 00010111 00010* ********             0
11001000 00010111 00011000 ********           1
11001000 00010111 00011* ********             2
otherwise                                     3

examples:
DA: 11001000 00010111 00010110 10100001  → which interface?
DA: 11001000 00010111 00011000 10101010  → which interface?

4-38  Network Layer: Data Plane
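A small sketch of the longest-prefix-match rule applied to this table (prefixes written as raw bit strings; purely illustrative, since real routers use TCAMs or trie structures, as the next slide notes):

```python
# forwarding table from the slide: (bit-string prefix, link interface)
TABLE = [
    ("110010000001011100010",    0),
    ("110010000001011100011000", 1),
    ("110010000001011100011",    2),
]

def lookup(dest_bits: str) -> int:
    """Return the interface of the longest matching prefix (3 = 'otherwise')."""
    best_len, best_iface = -1, 3
    for prefix, iface in TABLE:
        if dest_bits.startswith(prefix) and len(prefix) > best_len:
            best_len, best_iface = len(prefix), iface
    return best_iface

# the two example destination addresses from the slide
print(lookup("11001000000101110001011010100001"))   # -> 0
print(lookup("11001000000101110001100010101010"))   # -> 1 (24-bit match beats 21-bit)
```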

Longest prefix matching
§ we'll see why longest prefix matching is used shortly, when we study addressing
§ longest prefix matching: often performed using ternary content addressable memories (TCAMs)
• content addressable: present address to TCAM, retrieve address in one clock cycle, regardless of table size
• Cisco Catalyst: can hold ~1M routing table entries in TCAM

4-39  Network Layer: Data Plane

Switching fabrics
§ transfer packet from input buffer to appropriate output buffer
§ switching rate: rate at which packets can be transferred from inputs to outputs
• often measured as a multiple of input/output line rate
• N inputs: switching rate N times line rate desirable
§ three types of switching fabrics: memory, bus, crossbar

4-40  Network Layer: Data Plane

Switching via memory

first generation routers:
§ traditional computers with switching under direct control of CPU
§ packet copied to system's memory
§ speed limited by memory bandwidth (2 bus crossings per datagram)

[figure: input port (e.g., Ethernet), memory, output port (e.g., Ethernet), connected over the system bus]

4-41  Network Layer: Data Plane

Switching via a bus
§ datagram from input port memory to output port memory via a shared bus
§ bus contention: switching speed limited by bus bandwidth
§ 32 Gbps bus, Cisco 5600: sufficient speed for access and enterprise routers

4-42  Network Layer: Data Plane


Switching via interconnection network
§ overcome bus bandwidth limitations
§ banyan networks, crossbars, and other interconnection nets were initially developed to connect processors in multiprocessors
§ advanced design: fragment datagram into fixed-length cells, switch cells through the fabric
§ Cisco 12000: switches 60 Gbps through the interconnection network

4-43  Network Layer: Data Plane

Page 2: Chapter 3 outline Principles of congestion controlqyang/ele543/2020/Lecture4.pdf · 3 Transport Layer3-13 TCP congestion control: additive increase multiplicative decrease §approach:senderincreases

2

Transport Layer 3-7

lin original dataloutlin original data plus

retransmitted data

free buffer space

Causescosts of congestion scenario 2Idealization known loss

packets can be lost dropped at router due to full buffers

sect sender only resends if packet known to be lost

R2

R2lin

lout

when sending at R2 some packets are retransmissions but asymptotic goodput is still R2 (why)

A

Host BTransport Layer 3-8

A

linloutlincopy

free buffer space

timeout

R2

R2lin

lout

when sending at R2 some packets are retransmissions including duplicated that are delivered

Host B

Realistic duplicatessect packets can be lost dropped at

router due to full bufferssect sender times out prematurely

sending two copies both of which are delivered

Causescosts of congestion scenario 2

Transport Layer 3-9

R2

lout

when sending at R2 some packets are retransmissions including duplicated that are delivered

ldquocostsrdquo of congestionsect more work (retrans) for given ldquogoodputrdquosect unneeded retransmissions link carries multiple copies of pkt

bull decreasing goodput

R2lin

Causescosts of congestion scenario 2Realistic duplicatessect packets can be lost dropped at

router due to full bufferssect sender times out prematurely

sending two copies both of which are delivered

Transport Layer 3-10

sect four senderssect multihop pathssect timeoutretransmit

Q what happens as lin and linrsquo

increase

finite shared output link buffers

Host A lout

Causescosts of congestion scenario 3

Host B

Host CHost D

lin original datalin original data plus

retransmitted data

A as red linrsquo increases all arriving blue pkts at upper queue are dropped blue throughput g 0

Transport Layer 3-11

another ldquocostrdquo of congestionsect when packet dropped any ldquoupstream

transmission capacity used for that packet was wasted

Causescosts of congestion scenario 3

C2

C2

l out

linrsquo

Transport Layer 3-12

Chapter 3 outline

31 transport-layer services

32 multiplexing and demultiplexing

33 connectionless transport UDP

34 principles of reliable data transfer

35 connection-oriented transport TCPbull segment structurebull reliable data transferbull flow controlbull connection management

36 principles of congestion control

37 TCP congestion control

3

Transport Layer 3-13

TCP congestion control additive increase multiplicative decrease

sect approach sender increases transmission rate (window size) probing for usable bandwidth until loss occursbull additive increase increase cwnd by 1 MSS every

RTT until loss detectedbull multiplicative decrease cut cwnd in half after loss

cwnd

TC

P s

ende

r co

nges

tion

win

dow

siz

e

AIMD saw toothbehavior probing

for bandwidth

additively increase window size helliphellip until loss occurs (then cut window in half)

timeTransport Layer 3-14

TCP Congestion Control details

sect sender limits transmission

sect cwnd is dynamic function of perceived network congestion

TCP sending ratesect roughly send cwnd

bytes wait RTT for ACKS then send more bytes

last byteACKed sent not-

yet ACKed(ldquoin-flightrdquo)

last byte sent

cwnd

LastByteSent-LastByteAcked

lt cwnd

sender sequence number space

rate ~~cwndRTT

bytessec

Transport Layer 3-15

TCP Slow Start sect when connection begins

increase rate exponentially until first loss eventbull initially cwnd = 1 MSSbull double cwnd every RTTbull done by incrementing cwnd for every ACK received

sect summary initial rate is slow but ramps up exponentially fast

Host A

one segment

RTT

Host B

time

two segments

four segments

Transport Layer 3-16

TCP detecting reacting to loss

sect loss indicated by timeoutbull cwnd set to 1 MSS bull window then grows exponentially (as in slow start)

to threshold then grows linearlysect loss indicated by 3 duplicate ACKs TCP RENObull dup ACKs indicate network capable of delivering

some segments bull cwnd is cut in half window then grows linearly

sect TCP Tahoe always sets cwnd to 1 (timeout or 3 duplicate acks)

Transport Layer 3-17

Q when should the exponential increase switch to linear

A when cwnd gets to 12 of its value before timeout

Implementationsect variable ssthreshsect on loss event ssthresh

is set to 12 of cwnd just before loss event

TCP switching from slow start to CA

Check out the online interactive exercises for more examples httpgaiacsumassedukurose_rossinteractive Transport Layer 3-18

Summary TCP Congestion Control

timeoutssthresh = cwnd2

cwnd = 1 MSSdupACKcount = 0

retransmit missing segment

Lcwnd gt ssthresh

congestionavoidance

cwnd = cwnd + MSS (MSScwnd)dupACKcount = 0

transmit new segment(s) as allowed

new ACK

dupACKcount++duplicate ACK

fastrecovery

cwnd = cwnd + MSStransmit new segment(s) as allowed

duplicate ACK

ssthresh= cwnd2cwnd = ssthresh + 3

retransmit missing segment

dupACKcount == 3

timeoutssthresh = cwnd2cwnd = 1 dupACKcount = 0retransmit missing segment

ssthresh= cwnd2cwnd = ssthresh + 3retransmit missing segment

dupACKcount == 3cwnd = ssthreshdupACKcount = 0

New ACK

slow start

timeoutssthresh = cwnd2

cwnd = 1 MSSdupACKcount = 0

retransmit missing segment

cwnd = cwnd+MSSdupACKcount = 0transmit new segment(s) as allowed

new ACKdupACKcount++duplicate ACK

Lcwnd = 1 MSS

ssthresh = 64 KBdupACKcount = 0

NewACK

NewACK

NewACK

4

Transport Layer 3-19

TCP throughputsect avg TCP thruput as function of window size RTTbull ignore slow start assume always data to send

sect W window size (measured in bytes) where loss occursbull avg window size ( in-flight bytes) is frac34 Wbull avg thruput is 34W per RTT

W

W2

avg TCP thruput = 34

WRTT bytessec

Transport Layer 3-20

TCP Futures TCP over ldquolong fat pipesrdquo

sect example 1500 byte segments 100ms RTT want 10 Gbps throughput

sect requires W = 83333 in-flight segmentssect throughput in terms of segment loss probability L

[Mathis 1997]

to achieve 10 Gbps throughput need a loss rate of L = 210-10 ndash a very small loss rate

sect new versions of TCP for high-speed

TCP throughput = 122 MSSRTT L

Transport Layer 3-21

fairness goal if K TCP sessions share same bottleneck link of bandwidth R each should have average rate of RK

TCP connection 1

bottleneckrouter

capacity R

TCP Fairness

TCP connection 2

Transport Layer 3-22

Why is TCP fairtwo competing sessionssect additive increase gives slope of 1 as throughout increasessect multiplicative decrease decreases throughput proportionally

R

R

equal bandwidth share

Connection 1 throughput

Con

nect

ion

2 th

roug

hput

congestion avoidance additive increaseloss decrease window by factor of 2

congestion avoidance additive increaseloss decrease window by factor of 2

Transport Layer 3-23

Fairness (more)Fairness and UDPsect multimedia apps often

do not use TCPbull do not want rate

throttled by congestion control

sect instead use UDPbull send audiovideo at

constant rate tolerate packet loss

Fairness parallel TCP connections

sect application can open multiple parallel connections between two hosts

sect web browsers do this sect eg link of rate R with 9

existing connectionsbull new app asks for 1 TCP gets

rate R10bull new app asks for 11 TCPs

gets R2

Transport Layer 3-24

network-assisted congestion controlsect two bits in IP header (ToS field) marked by network router

to indicate congestionsect congestion indication carried to receiving hostsect receiver (seeing congestion indication in IP datagram) )

sets ECE bit on receiver-to-sender ACK segment to notify sender of congestion

Explicit Congestion Notification (ECN)

sourceapplicationtransportnetworklink

physical

destinationapplicationtransportnetworklink

physical

ECN=00 ECN=11

ECE=1

IP datagram

TCP ACK segment

5

Transport Layer 3-25

Chapter 3 summarysect principles behind transport

layer servicesbull multiplexing

demultiplexingbull reliable data transferbull flow controlbull congestion control

sect instantiation implementation in the Internetbull UDPbull TCP

nextsect leaving the network ldquoedgerdquo (application transport layers)

sect into the network ldquocorerdquo

sect two network layer chaptersbull data planebull control plane

Quiz 2

Consider the following network Host A wants tosimultaneously send messages to hosts B and C A isconnected to B and C via a broadcast channelmdasha packet sentby A is carried by the channel to both B and C Suppose thatthe broadcast channel connecting A B and C canindependently lose and corrupt messages (and so forexample a message sent from A might be correctly receivedby B but not by C) The stop-and-wait-like error-controlprotocol is used for reliably transferring packets from A to Band C such that A will send a new packet from the upperlayer until it knows that both B and C have correctlyreceived the current packet Packets from upper layers maybe queued in a sufficiently large buffer Give FSM descriptionsof A and C (Hint The FSM for B should be the same as forC)

Transport Layer 3-26

Quiz 3

Consider the GBN protocol with a sender window size of 4and sequence number range of 1024 Suppose that at time tthe next in-order packet that the receiver is expecting has asequence number of k Assume that the medium does notreorder message Answer the following questionsaWhat are the possible sets of sequence numbers inside thesenderrsquos window at time tbWhat are all possible values of the ACK field in all possiblemessages currently propagating back to the sender at time t

Transport Layer 3-27 Transport Layer 3-28

aHere we have a window size of N=3 Suppose the receiver has received packet k-1 and has ACKed that and all other preceding packets If all of these ACKs have been received by sender then senders window is [k k+N-1] Suppose next that none of the ACKs have been received at the sender In this second case the senders window contains k-1 and the N packets up to and including k-1 The senders window is thus [k-Nk-1] By these arguments the senders window is of size 3 and begins somewhere in the range [k-Nk]bIf the receiver is waiting for packet k then it has received (and ACKed) packet k-1 and the N-1 packets before that If none of those N ACKs have been yet received by the sender then ACK messages with values of [k-Nk-1] may still be propagating backBecause the sender has sent packets [k-N k-1] it must be the case that the sender has already received an ACK for k-N-1 Once the receiver has sent an ACK for k-N-1 it will never send an ACK that is less that k-N-1 Thus the range of in-flight ACK values can range from k-N-1 to k-1

Transport Layer 3-29

Chapter 4 network layer

chapter goalssect understand principles behind network layer

services bull network layer service modelsbull forwarding versus routingbull how a router worksbull generalized forwarding

sect instantiation implementation in the Internet

4-30

Network Layer Data Plane

6

Network layersect transport segment from

sending to receiving host sect on sending side

encapsulates segments into datagrams

sect on receiving side delivers segments to transport layer

sect network layer protocols in every host router

sect router examines header fields in all IP datagrams passing through it

applicationtransportnetworkdata linkphysical

applicationtransportnetworkdata linkphysical

networkdata linkphysical network

data linkphysical

networkdata linkphysical

networkdata linkphysical

networkdata linkphysical

networkdata linkphysical

networkdata linkphysical

networkdata linkphysical

networkdata linkphysical

networkdata linkphysicalnetwork

data linkphysical

4-31

Network Layer Data Plane

Two key network-layer functions

network-layer functionssectforwarding move packets from routerrsquos input to appropriate router outputsectrouting determine route taken by packets from source to destinationbull routing algorithms

analogy taking a tripsect forwarding process of

getting through single interchange

sect routing process of planning trip from source to destination

4-32

Network Layer Data Plane

Network service modelQ What service model for ldquochannelrdquo transporting datagrams from sender to receiver

example services for individual datagrams

sect guaranteed deliverysect guaranteed delivery with

less than 40 msec delay

example services for a flow of datagrams

sect in-order datagram deliverysect guaranteed minimum

bandwidth to flowsect restrictions on changes in

inter-packet spacing

4-33

Network Layer Data Plane

Router architecture overview

high-seed switching

fabric

routing processor

router input ports router output ports

forwarding data plane (hardware) operttes in

nanosecond timeframe

routing managementcontrol plane (software)operates in millisecond

time frame

sect high-level view of generic router architecture

4-34Network Layer Data Plane

linetermination

link layer

protocol(receive)

lookupforwarding

queueing

Input port functions

decentralized switchingsect using header field values lookup output

port using forwarding table in input port memory (ldquomatch plus actionrdquo)

sect goal complete input port processing at lsquoline speedrsquo

sect queuing if datagrams arrive faster than forwarding rate into switch fabric

physical layerbit-level reception

data link layereg Ethernetsee chapter 5

switchfabric

4-35Network Layer Data Plane

linetermination

link layer

protocol(receive)

lookupforwarding

queueing

Input port functions

decentralized switchingsect using header field values lookup output

port using forwarding table in input port memory (ldquomatch plus actionrdquo)

sect destination-based forwarding forward based only on destination IP address (traditional)

sect generalized forwarding forward based on any set of header field values

physical layerbit-level reception

data link layereg Ethernetsee chapter 5

switchfabric

4-36Network Layer Data Plane

7

DestinationAddress Range

11001000 00010111 00010000 00000000through11001000 00010111 00010111 11111111

11001000 00010111 00011000 00000000through11001000 00010111 00011000 11111111

11001000 00010111 00011001 00000000through11001000 00010111 00011111 11111111

otherwise

Link Interface

0

1

2

3

Q but what happens if ranges donrsquot divide up so nicely

Destination-based forwardingforwarding table

4-37Network Layer Data Plane

Longest prefix matching

Destination Address Range11001000 00010111 00010 11001000 00010111 00011000 11001000 00010111 00011 otherwise

DA 11001000 00010111 00011000 10101010

examplesDA 11001000 00010111 00010110 10100001 which interface

which interface

when looking for forwarding table entry for given destination address use longest address prefix that matches destination address

longest prefix matching

Link interface01

23

4-38Network Layer Data Plane

Longest prefix matching

sect wersquoll see why longest prefix matching is used shortly when we study addressing

sect longest prefix matching often performed using ternary content addressable memories (TCAMs)bull content addressable present address to TCAM

retrieve address in one clock cycle regardless of table sizebull Cisco Catalyst can up ~1M routing table

entries in TCAM

4-39Network Layer Data Plane

Switching fabricssect transfer packet from input buffer to appropriate

output buffersect switching rate rate at which packets can be

transfer from inputs to outputsbull often measured as multiple of inputoutput line ratebull N inputs switching rate N times line rate desirable

sect three types of switching fabrics

memory

memory

bus crossbar

4-40Network Layer Data Plane

Switching via memory

first generation routerssect traditional computers with switching under direct control

of CPUsect packet copied to systemrsquos memorysect speed limited by memory bandwidth (2 bus crossings per

datagram)

inputport(eg

Ethernet)

memoryoutputport(eg

Ethernet)

system bus

4-41Network Layer Data Plane

Switching via a bus

sect datagram from input port memoryto output port memory via a

shared bussect bus contention switching speed

limited by bus bandwidthsect 32 Gbps bus Cisco 5600

sufficient speed for access and enterprise routers

bus

4-42Network Layer Data Plane

8

Switching via interconnection network

sect overcome bus bandwidth limitationssect banyan networks crossbar other

interconnection nets initially developed to connect processors in multiprocessor

sect advanced design fragmenting datagram into fixed length cells switch cells through the fabric

sect Cisco 12000 switches 60 Gbps through the interconnection network

crossbar

4-43Network Layer Data Plane

Page 3: Chapter 3 outline Principles of congestion controlqyang/ele543/2020/Lecture4.pdf · 3 Transport Layer3-13 TCP congestion control: additive increase multiplicative decrease §approach:senderincreases

3

Transport Layer 3-13

TCP congestion control additive increase multiplicative decrease

sect approach sender increases transmission rate (window size) probing for usable bandwidth until loss occursbull additive increase increase cwnd by 1 MSS every

RTT until loss detectedbull multiplicative decrease cut cwnd in half after loss

cwnd

TC

P s

ende

r co

nges

tion

win

dow

siz

e

AIMD saw toothbehavior probing

for bandwidth

additively increase window size helliphellip until loss occurs (then cut window in half)

timeTransport Layer 3-14

TCP Congestion Control details

sect sender limits transmission

sect cwnd is dynamic function of perceived network congestion

TCP sending ratesect roughly send cwnd

bytes wait RTT for ACKS then send more bytes

last byteACKed sent not-

yet ACKed(ldquoin-flightrdquo)

last byte sent

cwnd

LastByteSent-LastByteAcked

lt cwnd

sender sequence number space

rate ~~cwndRTT

bytessec

Transport Layer 3-15

TCP Slow Start sect when connection begins

increase rate exponentially until first loss eventbull initially cwnd = 1 MSSbull double cwnd every RTTbull done by incrementing cwnd for every ACK received

sect summary initial rate is slow but ramps up exponentially fast

Host A

one segment

RTT

Host B

time

two segments

four segments

Transport Layer 3-16

TCP detecting reacting to loss

sect loss indicated by timeoutbull cwnd set to 1 MSS bull window then grows exponentially (as in slow start)

to threshold then grows linearlysect loss indicated by 3 duplicate ACKs TCP RENObull dup ACKs indicate network capable of delivering

some segments bull cwnd is cut in half window then grows linearly

sect TCP Tahoe always sets cwnd to 1 (timeout or 3 duplicate acks)

Transport Layer 3-17

Q when should the exponential increase switch to linear

A when cwnd gets to 12 of its value before timeout

Implementationsect variable ssthreshsect on loss event ssthresh

is set to 12 of cwnd just before loss event

TCP switching from slow start to CA

Check out the online interactive exercises for more examples httpgaiacsumassedukurose_rossinteractive Transport Layer 3-18

Summary TCP Congestion Control

timeoutssthresh = cwnd2

cwnd = 1 MSSdupACKcount = 0

retransmit missing segment

Lcwnd gt ssthresh

congestionavoidance

cwnd = cwnd + MSS (MSScwnd)dupACKcount = 0

transmit new segment(s) as allowed

new ACK

dupACKcount++duplicate ACK

fastrecovery

cwnd = cwnd + MSStransmit new segment(s) as allowed

duplicate ACK

ssthresh= cwnd2cwnd = ssthresh + 3

retransmit missing segment

dupACKcount == 3

timeoutssthresh = cwnd2cwnd = 1 dupACKcount = 0retransmit missing segment

ssthresh= cwnd2cwnd = ssthresh + 3retransmit missing segment

dupACKcount == 3cwnd = ssthreshdupACKcount = 0

New ACK

slow start

timeoutssthresh = cwnd2

cwnd = 1 MSSdupACKcount = 0

retransmit missing segment

cwnd = cwnd+MSSdupACKcount = 0transmit new segment(s) as allowed

new ACKdupACKcount++duplicate ACK

Lcwnd = 1 MSS

ssthresh = 64 KBdupACKcount = 0

NewACK

NewACK

NewACK

4

Transport Layer 3-19

TCP throughputsect avg TCP thruput as function of window size RTTbull ignore slow start assume always data to send

sect W window size (measured in bytes) where loss occursbull avg window size ( in-flight bytes) is frac34 Wbull avg thruput is 34W per RTT

W

W2

avg TCP thruput = 34

WRTT bytessec

Transport Layer 3-20

TCP Futures TCP over ldquolong fat pipesrdquo

sect example 1500 byte segments 100ms RTT want 10 Gbps throughput

sect requires W = 83333 in-flight segmentssect throughput in terms of segment loss probability L

[Mathis 1997]

to achieve 10 Gbps throughput need a loss rate of L = 210-10 ndash a very small loss rate

sect new versions of TCP for high-speed

TCP throughput = 122 MSSRTT L

Transport Layer 3-21

fairness goal if K TCP sessions share same bottleneck link of bandwidth R each should have average rate of RK

TCP connection 1

bottleneckrouter

capacity R

TCP Fairness

TCP connection 2

Transport Layer 3-22

Why is TCP fairtwo competing sessionssect additive increase gives slope of 1 as throughout increasessect multiplicative decrease decreases throughput proportionally

R

R

equal bandwidth share

Connection 1 throughput

Con

nect

ion

2 th

roug

hput

congestion avoidance additive increaseloss decrease window by factor of 2

congestion avoidance additive increaseloss decrease window by factor of 2

Transport Layer 3-23

Fairness (more)Fairness and UDPsect multimedia apps often

do not use TCPbull do not want rate

throttled by congestion control

sect instead use UDPbull send audiovideo at

constant rate tolerate packet loss

Fairness parallel TCP connections

sect application can open multiple parallel connections between two hosts

sect web browsers do this sect eg link of rate R with 9

existing connectionsbull new app asks for 1 TCP gets

rate R10bull new app asks for 11 TCPs

gets R2

Transport Layer 3-24

network-assisted congestion controlsect two bits in IP header (ToS field) marked by network router

to indicate congestionsect congestion indication carried to receiving hostsect receiver (seeing congestion indication in IP datagram) )

sets ECE bit on receiver-to-sender ACK segment to notify sender of congestion

Explicit Congestion Notification (ECN)

sourceapplicationtransportnetworklink

physical

destinationapplicationtransportnetworklink

physical

ECN=00 ECN=11

ECE=1

IP datagram

TCP ACK segment

5

Transport Layer 3-25

Chapter 3 summarysect principles behind transport

layer servicesbull multiplexing

demultiplexingbull reliable data transferbull flow controlbull congestion control

sect instantiation implementation in the Internetbull UDPbull TCP

nextsect leaving the network ldquoedgerdquo (application transport layers)

sect into the network ldquocorerdquo

sect two network layer chaptersbull data planebull control plane

Quiz 2

Consider the following network Host A wants tosimultaneously send messages to hosts B and C A isconnected to B and C via a broadcast channelmdasha packet sentby A is carried by the channel to both B and C Suppose thatthe broadcast channel connecting A B and C canindependently lose and corrupt messages (and so forexample a message sent from A might be correctly receivedby B but not by C) The stop-and-wait-like error-controlprotocol is used for reliably transferring packets from A to Band C such that A will send a new packet from the upperlayer until it knows that both B and C have correctlyreceived the current packet Packets from upper layers maybe queued in a sufficiently large buffer Give FSM descriptionsof A and C (Hint The FSM for B should be the same as forC)

Transport Layer 3-26

Quiz 3

Consider the GBN protocol with a sender window size of 4and sequence number range of 1024 Suppose that at time tthe next in-order packet that the receiver is expecting has asequence number of k Assume that the medium does notreorder message Answer the following questionsaWhat are the possible sets of sequence numbers inside thesenderrsquos window at time tbWhat are all possible values of the ACK field in all possiblemessages currently propagating back to the sender at time t

Transport Layer 3-27 Transport Layer 3-28

aHere we have a window size of N=3 Suppose the receiver has received packet k-1 and has ACKed that and all other preceding packets If all of these ACKs have been received by sender then senders window is [k k+N-1] Suppose next that none of the ACKs have been received at the sender In this second case the senders window contains k-1 and the N packets up to and including k-1 The senders window is thus [k-Nk-1] By these arguments the senders window is of size 3 and begins somewhere in the range [k-Nk]bIf the receiver is waiting for packet k then it has received (and ACKed) packet k-1 and the N-1 packets before that If none of those N ACKs have been yet received by the sender then ACK messages with values of [k-Nk-1] may still be propagating backBecause the sender has sent packets [k-N k-1] it must be the case that the sender has already received an ACK for k-N-1 Once the receiver has sent an ACK for k-N-1 it will never send an ACK that is less that k-N-1 Thus the range of in-flight ACK values can range from k-N-1 to k-1

Transport Layer 3-29

Chapter 4 network layer

chapter goalssect understand principles behind network layer

services bull network layer service modelsbull forwarding versus routingbull how a router worksbull generalized forwarding

sect instantiation implementation in the Internet

4-30

Network Layer Data Plane

6

Network layersect transport segment from

sending to receiving host sect on sending side

encapsulates segments into datagrams

sect on receiving side delivers segments to transport layer

sect network layer protocols in every host router

sect router examines header fields in all IP datagrams passing through it

applicationtransportnetworkdata linkphysical

applicationtransportnetworkdata linkphysical

networkdata linkphysical network

data linkphysical

networkdata linkphysical

networkdata linkphysical

networkdata linkphysical

networkdata linkphysical

networkdata linkphysical

networkdata linkphysical

networkdata linkphysical

networkdata linkphysicalnetwork

data linkphysical

4-31

Network Layer Data Plane

Two key network-layer functions

network-layer functionssectforwarding move packets from routerrsquos input to appropriate router outputsectrouting determine route taken by packets from source to destinationbull routing algorithms

analogy taking a tripsect forwarding process of

getting through single interchange

sect routing process of planning trip from source to destination

4-32

Network Layer Data Plane

Network service modelQ What service model for ldquochannelrdquo transporting datagrams from sender to receiver

example services for individual datagrams

sect guaranteed deliverysect guaranteed delivery with

less than 40 msec delay

example services for a flow of datagrams

sect in-order datagram deliverysect guaranteed minimum

bandwidth to flowsect restrictions on changes in

inter-packet spacing

4-33

Network Layer Data Plane

Router architecture overview

high-seed switching

fabric

routing processor

router input ports router output ports

forwarding data plane (hardware) operttes in

nanosecond timeframe

routing managementcontrol plane (software)operates in millisecond

time frame

sect high-level view of generic router architecture

4-34Network Layer Data Plane

linetermination

link layer

protocol(receive)

lookupforwarding

queueing

Input port functions

decentralized switchingsect using header field values lookup output

port using forwarding table in input port memory (ldquomatch plus actionrdquo)

sect goal complete input port processing at lsquoline speedrsquo

sect queuing if datagrams arrive faster than forwarding rate into switch fabric

physical layerbit-level reception

data link layereg Ethernetsee chapter 5

switchfabric

4-35Network Layer Data Plane

linetermination

link layer

protocol(receive)

lookupforwarding

queueing

Input port functions

decentralized switchingsect using header field values lookup output

port using forwarding table in input port memory (ldquomatch plus actionrdquo)

sect destination-based forwarding forward based only on destination IP address (traditional)

sect generalized forwarding forward based on any set of header field values

physical layerbit-level reception

data link layereg Ethernetsee chapter 5

switchfabric

4-36Network Layer Data Plane

7

DestinationAddress Range

11001000 00010111 00010000 00000000through11001000 00010111 00010111 11111111

11001000 00010111 00011000 00000000through11001000 00010111 00011000 11111111

11001000 00010111 00011001 00000000through11001000 00010111 00011111 11111111

otherwise

Link Interface

0

1

2

3

Q but what happens if ranges donrsquot divide up so nicely

Destination-based forwardingforwarding table

4-37Network Layer Data Plane

Longest prefix matching

Destination Address Range11001000 00010111 00010 11001000 00010111 00011000 11001000 00010111 00011 otherwise

DA 11001000 00010111 00011000 10101010

examplesDA 11001000 00010111 00010110 10100001 which interface

which interface

when looking for forwarding table entry for given destination address use longest address prefix that matches destination address

longest prefix matching

Link interface01

23

4-38Network Layer Data Plane

Longest prefix matching

sect wersquoll see why longest prefix matching is used shortly when we study addressing

sect longest prefix matching often performed using ternary content addressable memories (TCAMs)bull content addressable present address to TCAM

retrieve address in one clock cycle regardless of table sizebull Cisco Catalyst can up ~1M routing table

entries in TCAM

4-39Network Layer Data Plane

Switching fabricssect transfer packet from input buffer to appropriate

output buffersect switching rate rate at which packets can be

transfer from inputs to outputsbull often measured as multiple of inputoutput line ratebull N inputs switching rate N times line rate desirable

sect three types of switching fabrics

memory

memory

bus crossbar

4-40Network Layer Data Plane

Switching via memory

first generation routerssect traditional computers with switching under direct control

of CPUsect packet copied to systemrsquos memorysect speed limited by memory bandwidth (2 bus crossings per

datagram)

inputport(eg

Ethernet)

memoryoutputport(eg

Ethernet)

system bus

4-41Network Layer Data Plane

Switching via a bus

sect datagram from input port memoryto output port memory via a

shared bussect bus contention switching speed

limited by bus bandwidthsect 32 Gbps bus Cisco 5600

sufficient speed for access and enterprise routers

bus

4-42Network Layer Data Plane

8

Switching via interconnection network

sect overcome bus bandwidth limitationssect banyan networks crossbar other

interconnection nets initially developed to connect processors in multiprocessor

sect advanced design fragmenting datagram into fixed length cells switch cells through the fabric

sect Cisco 12000 switches 60 Gbps through the interconnection network

crossbar

4-43Network Layer Data Plane

Page 4: Chapter 3 outline Principles of congestion controlqyang/ele543/2020/Lecture4.pdf · 3 Transport Layer3-13 TCP congestion control: additive increase multiplicative decrease §approach:senderincreases

4

Transport Layer 3-19

TCP throughputsect avg TCP thruput as function of window size RTTbull ignore slow start assume always data to send

sect W window size (measured in bytes) where loss occursbull avg window size ( in-flight bytes) is frac34 Wbull avg thruput is 34W per RTT

W

W2

avg TCP thruput = 34

WRTT bytessec

Transport Layer 3-20

TCP Futures TCP over ldquolong fat pipesrdquo

sect example 1500 byte segments 100ms RTT want 10 Gbps throughput

sect requires W = 83333 in-flight segmentssect throughput in terms of segment loss probability L

[Mathis 1997]

to achieve 10 Gbps throughput need a loss rate of L = 210-10 ndash a very small loss rate

sect new versions of TCP for high-speed

TCP throughput = 122 MSSRTT L

Transport Layer 3-21

fairness goal if K TCP sessions share same bottleneck link of bandwidth R each should have average rate of RK

TCP connection 1

bottleneckrouter

capacity R

TCP Fairness

TCP connection 2

Transport Layer 3-22

Why is TCP fairtwo competing sessionssect additive increase gives slope of 1 as throughout increasessect multiplicative decrease decreases throughput proportionally

R

R

equal bandwidth share

Connection 1 throughput

Con

nect

ion

2 th

roug

hput

congestion avoidance additive increaseloss decrease window by factor of 2

congestion avoidance additive increaseloss decrease window by factor of 2

Transport Layer 3-23

Fairness (more)Fairness and UDPsect multimedia apps often

do not use TCPbull do not want rate

throttled by congestion control

sect instead use UDPbull send audiovideo at

constant rate tolerate packet loss

Fairness parallel TCP connections

sect application can open multiple parallel connections between two hosts

sect web browsers do this sect eg link of rate R with 9

existing connectionsbull new app asks for 1 TCP gets

rate R10bull new app asks for 11 TCPs

gets R2

Transport Layer 3-24

network-assisted congestion controlsect two bits in IP header (ToS field) marked by network router

to indicate congestionsect congestion indication carried to receiving hostsect receiver (seeing congestion indication in IP datagram) )

sets ECE bit on receiver-to-sender ACK segment to notify sender of congestion

Explicit Congestion Notification (ECN)

sourceapplicationtransportnetworklink

physical

destinationapplicationtransportnetworklink

physical

ECN=00 ECN=11

ECE=1

IP datagram

TCP ACK segment

5

Transport Layer 3-25

Chapter 3 summarysect principles behind transport

layer servicesbull multiplexing

demultiplexingbull reliable data transferbull flow controlbull congestion control

sect instantiation implementation in the Internetbull UDPbull TCP

nextsect leaving the network ldquoedgerdquo (application transport layers)

sect into the network ldquocorerdquo

sect two network layer chaptersbull data planebull control plane

Quiz 2

Consider the following network Host A wants tosimultaneously send messages to hosts B and C A isconnected to B and C via a broadcast channelmdasha packet sentby A is carried by the channel to both B and C Suppose thatthe broadcast channel connecting A B and C canindependently lose and corrupt messages (and so forexample a message sent from A might be correctly receivedby B but not by C) The stop-and-wait-like error-controlprotocol is used for reliably transferring packets from A to Band C such that A will send a new packet from the upperlayer until it knows that both B and C have correctlyreceived the current packet Packets from upper layers maybe queued in a sufficiently large buffer Give FSM descriptionsof A and C (Hint The FSM for B should be the same as forC)

Transport Layer 3-26

Quiz 3

Consider the GBN protocol with a sender window size of 4and sequence number range of 1024 Suppose that at time tthe next in-order packet that the receiver is expecting has asequence number of k Assume that the medium does notreorder message Answer the following questionsaWhat are the possible sets of sequence numbers inside thesenderrsquos window at time tbWhat are all possible values of the ACK field in all possiblemessages currently propagating back to the sender at time t

Transport Layer 3-27 Transport Layer 3-28

aHere we have a window size of N=3 Suppose the receiver has received packet k-1 and has ACKed that and all other preceding packets If all of these ACKs have been received by sender then senders window is [k k+N-1] Suppose next that none of the ACKs have been received at the sender In this second case the senders window contains k-1 and the N packets up to and including k-1 The senders window is thus [k-Nk-1] By these arguments the senders window is of size 3 and begins somewhere in the range [k-Nk]bIf the receiver is waiting for packet k then it has received (and ACKed) packet k-1 and the N-1 packets before that If none of those N ACKs have been yet received by the sender then ACK messages with values of [k-Nk-1] may still be propagating backBecause the sender has sent packets [k-N k-1] it must be the case that the sender has already received an ACK for k-N-1 Once the receiver has sent an ACK for k-N-1 it will never send an ACK that is less that k-N-1 Thus the range of in-flight ACK values can range from k-N-1 to k-1

Transport Layer 3-29

Chapter 4: network layer

chapter goals:
§ understand principles behind network layer services:
• network layer service models
• forwarding versus routing
• how a router works
• generalized forwarding
§ instantiation, implementation in the Internet

4-30

Network Layer Data Plane


Network layer
§ transport segment from sending to receiving host
§ on sending side: encapsulates segments into datagrams
§ on receiving side: delivers segments to transport layer
§ network layer protocols in every host, router
§ router examines header fields in all IP datagrams passing through it

[figure: the source and destination hosts run all five layers (application, transport, network, data link, physical); the routers along the path run only the network, data link, and physical layers]

4-31

Network Layer Data Plane

Two key network-layer functions

network-layer functions:
§ forwarding: move packets from router's input to appropriate router output
§ routing: determine route taken by packets from source to destination
• routing algorithms

analogy: taking a trip
§ forwarding: process of getting through a single interchange
§ routing: process of planning trip from source to destination

4-32

Network Layer Data Plane

Network service model
Q: what service model for the "channel" transporting datagrams from sender to receiver?

example services for individual datagrams:
§ guaranteed delivery
§ guaranteed delivery with less than 40 msec delay

example services for a flow of datagrams:
§ in-order datagram delivery
§ guaranteed minimum bandwidth to flow
§ restrictions on changes in inter-packet spacing

4-33

Network Layer Data Plane

Router architecture overview

§ high-level view of generic router architecture:

[figure: router input ports feed a high-speed switching fabric, which connects to router output ports; a routing processor sits above the fabric]
• routing, management control plane (software) operates in millisecond time frame
• forwarding data plane (hardware) operates in nanosecond time frame

4-34 Network Layer: Data Plane

Input port functions

[figure: line termination (physical layer: bit-level reception) → link layer protocol, receive (data link layer: e.g., Ethernet, see chapter 5) → lookup, forwarding, queueing → switch fabric]

decentralized switching:
§ using header field values, lookup output port using forwarding table in input port memory ("match plus action")
§ goal: complete input port processing at 'line speed'
§ queueing: if datagrams arrive faster than forwarding rate into switch fabric

4-35 Network Layer: Data Plane

Input port functions

[figure: same input port pipeline as the previous slide: line termination → link layer protocol (receive) → lookup, forwarding, queueing → switch fabric]

decentralized switching:
§ using header field values, lookup output port using forwarding table in input port memory ("match plus action")
§ destination-based forwarding: forward based only on destination IP address (traditional)
§ generalized forwarding: forward based on any set of header field values

4-36 Network Layer: Data Plane
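As an illustration of "match plus action" with generalized forwarding, here is a rough sketch of a rule table matched on several header fields; the rules, field names and priority order are hypothetical, not any particular router's or OpenFlow implementation's API:

import ipaddress

rules = [
    # match fields (missing field = wildcard)          action
    ({"dst_ip": "10.1.2.0/24", "tcp_dport": 80},       ("forward", 2)),
    ({"src_ip": "10.9.0.0/16"},                        ("drop", None)),
    ({"dst_ip": "10.1.0.0/16"},                        ("forward", 1)),
]

def matches(pattern, pkt):
    for field, want in pattern.items():
        if field.endswith("_ip"):
            if ipaddress.ip_address(pkt[field]) not in ipaddress.ip_network(want):
                return False
        elif pkt.get(field) != want:
            return False
    return True

def generalized_forward(pkt):
    for pattern, action in rules:      # first matching rule wins (priority order)
        if matches(pattern, pkt):
            return action
    return ("drop", None)              # table-miss default

print(generalized_forward({"src_ip": "10.2.3.4", "dst_ip": "10.1.2.9", "tcp_dport": 80}))
# ('forward', 2)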


Destination-based forwarding

forwarding table:

Destination Address Range                                Link Interface

11001000 00010111 00010000 00000000
  through                                                      0
11001000 00010111 00010111 11111111

11001000 00010111 00011000 00000000
  through                                                      1
11001000 00010111 00011000 11111111

11001000 00010111 00011001 00000000
  through                                                      2
11001000 00010111 00011111 11111111

otherwise                                                      3

Q: but what happens if ranges don't divide up so nicely?

4-37 Network Layer: Data Plane
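A small sketch (mine, not from the slides) of the destination-based lookup above, treating each table row as an inclusive range of 32-bit addresses:

# Destination-based forwarding over the table above; addresses are given as
# 32-bit binary strings for readability.
TABLE = [
    # (low, high, interface)
    ("11001000000101110001000000000000", "11001000000101110001011111111111", 0),
    ("11001000000101110001100000000000", "11001000000101110001100011111111", 1),
    ("11001000000101110001100100000000", "11001000000101110001111111111111", 2),
]

def lookup(dst_bits):
    dst = int(dst_bits, 2)
    for low, high, iface in TABLE:
        if int(low, 2) <= dst <= int(high, 2):
            return iface
    return 3  # "otherwise"

print(lookup("11001000000101110001011010100001"))  # falls in the first range -> 0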

Longest prefix matching

longest prefix matching: when looking for forwarding table entry for given destination address, use longest address prefix that matches destination address

Destination Address Range                                Link interface

11001000 00010111 00010*** *********                          0
11001000 00010111 00011000 *********                          1
11001000 00010111 00011*** *********                          2
otherwise                                                      3

examples:
DA: 11001000 00010111 00010110 10100001   which interface?
DA: 11001000 00010111 00011000 10101010   which interface?

4-38 Network Layer: Data Plane
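A sketch of longest prefix matching over the table above (my own illustration; real line cards do this in hardware rather than with a Python loop). The two prints answer the "which interface?" examples:

# Longest prefix matching: among all matching prefixes, pick the longest one.
PREFIXES = [
    ("110010000001011100010", 0),       # 11001000 00010111 00010***  -> link 0
    ("110010000001011100011000", 1),    # 11001000 00010111 00011000  -> link 1
    ("110010000001011100011", 2),       # 11001000 00010111 00011***  -> link 2
]

def lpm(dst_bits, default=3):
    best_len, best_iface = -1, default
    for prefix, iface in PREFIXES:
        if dst_bits.startswith(prefix) and len(prefix) > best_len:
            best_len, best_iface = len(prefix), iface
    return best_iface

print(lpm("11001000000101110001011010100001"))  # matches only 00010*   -> 0
print(lpm("11001000000101110001100010101010"))  # 00011000 beats 00011* -> 1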

Longest prefix matching

§ we'll see why longest prefix matching is used shortly, when we study addressing
§ longest prefix matching: often performed using ternary content addressable memories (TCAMs)
• content addressable: present address to TCAM, retrieve matching entry in one clock cycle, regardless of table size
• Cisco Catalyst: can hold up to ~1M routing table entries in TCAM

4-39 Network Layer: Data Plane

Switching fabrics
§ transfer packet from input buffer to appropriate output buffer
§ switching rate: rate at which packets can be transferred from inputs to outputs
• often measured as multiple of input/output line rate
• N inputs: switching rate N times line rate desirable
§ three types of switching fabrics: memory, bus, crossbar

4-40 Network Layer: Data Plane
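A tiny worked example (illustrative numbers of my own, not from the slides) of the "switching rate of N times line rate" rule of thumb:

# Fabric speed needed so N input ports at line rate R can all be switched
# simultaneously without blocking at the inputs.
def desirable_fabric_rate(num_ports, line_rate_gbps):
    return num_ports * line_rate_gbps

print(desirable_fabric_rate(16, 10))  # 16 x 10 Gb/s ports -> 160 Gb/s fabric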

Switching via memory
first generation routers:
§ traditional computers with switching under direct control of CPU
§ packet copied to system's memory
§ speed limited by memory bandwidth (2 bus crossings per datagram)

[figure: input port (e.g., Ethernet) → memory → output port (e.g., Ethernet), all attached to the system bus]

4-41 Network Layer: Data Plane
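A one-line consequence of the "2 bus crossings per datagram" point, with illustrative numbers of my own:

# Each datagram is written into and then read out of memory, so forwarding
# throughput is bounded by half the memory/bus bandwidth.
def max_forwarding_rate(memory_bandwidth_gbps):
    return memory_bandwidth_gbps / 2

print(max_forwarding_rate(64))  # 64 Gb/s memory bandwidth -> at most 32 Gb/s forwarded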

Switching via a bus
§ datagram from input port memory to output port memory via a shared bus
§ bus contention: switching speed limited by bus bandwidth
§ 32 Gbps bus, Cisco 5600: sufficient speed for access and enterprise routers

[figure: input ports connected to output ports over a single shared bus]

4-42 Network Layer: Data Plane


Switching via interconnection network
§ overcome bus bandwidth limitations
§ banyan networks, crossbar, other interconnection nets initially developed to connect processors in multiprocessor
§ advanced design: fragmenting datagram into fixed length cells, switch cells through the fabric
§ Cisco 12000: switches 60 Gbps through the interconnection network

[figure: crossbar interconnection]

4-43 Network Layer: Data Plane
