CMPE 252A: Computer Networks, Set 13: End-to-End Transmission Control (TCP and UDP)

Upload: dustin-obrien

Post on 21-Jan-2016


Transport Protocols

• Services:
  • Addressing of processes
  • Reliable or unreliable transport from source process to end process(es)
  • Multiplexing and demultiplexing
  • Flow control: avoid overflowing receiver’s buffer
  • Congestion control: avoid overflowing the network bottleneck
• Examples: UDP and TCP

Why Multiplexing

• IP delivers packets from source host to destination host.
• However, multiple processes run in the hosts!
  • Applications require communication among processes, not just host computers.
  • Example: Multiple telnet sessions, email, ftp sessions, and www can all be running concurrently in the same host.
• Ports are defined as the addresses of processes inside a host.
• How do we identify processes uniquely and efficiently?

Well-Known Applications

• Well-known ports under 1024, such as:
  • FTP: port 21
  • Telnet: port 23
  • HTTP: port 80

[Figure: Host A (client) sends a segment to Host B (server) with source port = x and destination port = 23; the reply carries source port = 23 and destination port = x.]

Transport Protocols

• Transport protocols used today are point-to-point.
• UDP used for: remote file server (NFS), name translation (DNS), intra-domain routing (RIP), network management (SNMP), multimedia applications.
• TCP used for: electronic mail (SMTP), file transfer (FTP), remote login (Telnet), web (HTTP).
• No standard multipoint e-t-e protocol yet!

UDP: User Datagram Protocol [RFC 768]

• “No frills,” “bare bones” Internet transport protocol
• “Best effort” service; UDP segments may be:
  • Lost
  • Delivered out of order to app
• Connectionless:
  • No handshaking between UDP sender, receiver
  • Each UDP segment handled independently of others

Why is there a UDP?
• No connection establishment (which can add delay)
• Simple: no connection state at sender, receiver
• Small header
• No congestion control: UDP can blast away as fast as desired

UDP

• Header specifies the minimum needed for multiplexing and framing.
• Source and destination ports: identify the end points.
• Often used for streaming multimedia apps:
  • Loss tolerant
  • Rate sensitive
• Other UDP uses: DNS
• Reliable transfer over UDP:
  • Must be at application layer
  • Application-specific error recovery

UDP segment format (each row is 32 bits wide):

  | Source port #  | Dest port # |
  | Length         | Checksum    |
  | Application data (message)   |

• Length: in bytes, of the UDP segment, including header.
• Checksum: optional; if not used, set to zero.

UDP Checksum

• Computed over a pseudo-header + UDP header + data + padding (to an even number of bytes if needed).
• Pseudo-header (bits 0 to 31 per row):

  | Source IP address                     |
  | Destination IP address                |
  | 00000000 | Protocol | Segment length  |
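
The checksum computation over the pseudo-header can be sketched as follows. This is a minimal illustration, assuming IPv4 addresses and the usual 16-bit one's-complement Internet checksum; the function names `internet_checksum` and `udp_checksum` are hypothetical and not taken from the slides.

```python
import socket
import struct

def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad to an even number of bytes
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF

def udp_checksum(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> int:
    length = 8 + len(payload)  # UDP header is 8 bytes
    # Pseudo-header: source IP, destination IP, zero byte, protocol (17 = UDP), length.
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, 17, length))
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)  # checksum field = 0
    csum = internet_checksum(pseudo + header + payload)
    # A transmitted value of 0 means "no checksum", so a computed 0 is sent as 0xFFFF.
    return csum or 0xFFFF
```

A receiver can verify a segment by running the same sum over the pseudo-header and the received segment, checksum field included; the result is zero when nothing was corrupted.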

High-Level TCP Characteristics

• Protocol implemented entirely at the ends
  • Fate sharing (on IP)
• Protocol has evolved over time and will continue to do so
  • Nearly impossible to change the header
  • Use options to add information to the header
  • Change processing at endpoints
  • Backward compatibility is what makes it TCP

Differences From Link Layer

• Logical link vs. physical link
  • Must establish connection
• Variable RTT
  • May vary within a connection
• Reordering
  • How long can packets live? Implies a max segment lifetime
• Endpoints need not match link
  • Buffer space availability
• Transmission rate
  • Must be found

TCP in a Nutshell

Abstraction:
• Reliable.
• Ordered.
• Point-to-point.
• Byte-stream.

Mechanisms:
• Window-based flow control.
• Sequence numbers/ordering, 3-way handshake.
• Reliability (ACK, retransmission policies).
• Congestion control.
• RTT estimation.

TCP Header

  | Source port           | Destination port  |
  | Sequence number                           |
  | Acknowledgement                           |
  | HdrLen | 0 | Flags     | Advertised window |
  | Checksum              | Urgent pointer    |
  | Options (variable)                        |
  | Data                                      |

Flags: SYN, FIN, RESET, PUSH, URG, ACK

History of TCP

• “The” Internet protocol for reliable end-to-end communication
• First key paper: V. Cerf and R. Kahn, “A Protocol for Packet Network Intercommunication,” IEEE Trans. Commun., 1974, pp. 637-648.
• Designed per se in the early ’80s: J. Postel, RFC 793 (also IP and UDP)
• Network assumptions:
  • Reliable links: losses due to congestion only!
  • Symmetric network connections
  • Implicit in-order delivery of packets (more than IP can promise!)

Evolution of TCP

[Timeline, 1974 to 1990]
• 1974: TCP described by Vint Cerf and Bob Kahn in IEEE Trans. Comm.
• 1975: Three-way handshake, Raymond Tomlinson, in SIGCOMM 75
• 1982: TCP & IP, RFC 793 & 791
• 1983: BSD Unix 4.2 supports TCP/IP
• 1984: Nagle’s algorithm to reduce overhead of small packets; predicts congestion collapse
• 1986: Congestion collapse observed
• 1987: Karn’s algorithm to better estimate round-trip time
• 1988: Van Jacobson’s algorithms: congestion avoidance and congestion control (most implemented in 4.3BSD Tahoe)
• 1990: 4.3BSD Reno: fast retransmit, delayed ACKs

TCP Through the 1990s

[Timeline, 1993 to 1996]
• 1993: TCP Vegas (Brakmo et al.): delay-based congestion avoidance
• 1994: ECN (Floyd): Explicit Congestion Notification
• 1994: T/TCP (Braden): Transaction TCP
• 1996: SACK TCP (Floyd et al.): Selective Acknowledgement
• 1996: Hoe: NewReno startup and loss recovery
• 1996: FACK TCP (Mathis et al.): extension to SACK

Services Provided

• End-to-end flow control
• Reliable byte stream
• In-order packet delivery (buffering)
• Connection-oriented
  • A socket is a <host address, port> pair; the two sockets at the ends uniquely identify a connection

TCP Service Model

• TCP connections are full-duplex and point-to-point.
• Byte stream (not message stream).
• Message boundaries are not preserved e2e.

[Figure: four 512-byte segments A, B, C, D sent as separate IP datagrams; the 2048 bytes of data are delivered to the application in a single READ.]

TCP Byte Stream

• When application passes data to TCP, TCP may send it immediately or buffer it.
• Sometimes application wants to send data immediately.
  • Example: interactive applications.
  • Use PUSH flag to force transmission.
• URGENT flag.
  • Also forces TCP to transmit at once.

TCP Header

Important fields:
• Source port and destination port: identify connection end points
• 32-bit SN: identifies the first data byte in the segment
• 32-bit ACK: identifies next byte expected
• 4-bit header length: how many 32-bit words in header
• 16-bit window size (max. 64 KB) advertised by the receiver (RAW)
• Checksum: checks header, data and pseudo-header
• Flags: SYN, FIN, ACK, URG, PUSH
• Options: way to add more information.

Important: Only one sequence number!
• Identifies the segment, but does not identify which retransmission of the segment is being sent!
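
The field layout listed above can be made concrete with a small decoder for the fixed 20-byte part of the header. This is only a sketch: the function name `parse_tcp_header` and the dictionary keys are illustrative choices, and options beyond the first 20 bytes are ignored.

```python
import struct

# Flag bits in the low 6 bits of the 16-bit "header length / flags" word.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header (network byte order)."""
    (src, dst, seq, ack, off_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len": (off_flags >> 12) * 4,  # 4-bit field counts 32-bit words
        "flags": [name for name, bit in FLAG_BITS.items() if off_flags & bit],
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
    }

# Example: a SYN from port 1234 to the telnet port, with no options.
syn = struct.pack("!HHIIHHHH", 1234, 23, 42, 0, (5 << 12) | 0x02, 65535, 0, 0)
header = parse_tcp_header(syn)
```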

TCP Header Flags

Six TCP flags:
• URG: indicates urgent data present; urgent pointer gives byte offset from current sequence number where urgent data are. Generally not used.
• ACK: indicates whether segment contains acknowledgment; if 0, acknowledgement number field ignored.
• PUSH: indicates PUSHed data so receiver delivers it to application immediately. Generally not used.
• RST: used to reset connection, reject invalid segment, or refuse to open connection.
• SYN: used to establish connection; connection request, SYN=1, ACK=0.
• FIN: used to release connection.

TCP Connection Management

[Figures: TCP client lifecycle and TCP server lifecycle state diagrams.]

TCP Transmission

• Sender process initiates connection.
• Once connection established, TCP can start sending data.
• Sender writes bytes to TCP stream.
• TCP sender breaks byte stream into segments.
• Each byte assigned sequence number.
• Segment sent and timer started.

TCP Transmission

• If timer expires, retransmit segment.
  • After retransmitting segment for maximum number of times, assumes connection is dead and closes it.
• If user aborts connection, sending TCP flushes its buffers and sends RESET segment.
• Receiving TCP decides when to pass received data to upper layer.

Timeout-Based Recovery

• Wait at least one RTT before retransmitting
• Importance of accurate RTT estimators:
  • Low RTT estimate: unneeded retransmissions
  • High RTT estimate: poor throughput
• RTT estimator must adapt to change in RTT
  • But not too fast, or too slow!
• Spurious timeouts
  • Violate the “conservation of packets” principle: they put more than a window’s worth of packets in flight

TCP Sender Events

Data received from app:
• Create segment with seq #
• Seq # is byte-stream number of first data byte in segment
• Start timer if not already running (think of timer as for oldest unacked segment)
• Expiration interval: TimeOutInterval

Timeout expires:
• Retransmit segment that caused timeout
• Restart timer

Acknowledgments:
• If ACK acknowledges previously unacked segments:
  • Update what is known to be acked
  • Start timer if there are outstanding segments
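
The three sender events can be sketched as a toy state machine. This is a simplified model under stated assumptions: there is no real network or clock, the timer is just a boolean, sequence-number wraparound is ignored, and the class name `SimpleTcpSender` and its attributes are made up for illustration.

```python
class SimpleTcpSender:
    """Toy model of the sender events: app data, timeout, ACK arrival."""

    def __init__(self, initial_seq: int = 0):
        self.next_seq = initial_seq   # seq # of the next new byte to send
        self.send_base = initial_seq  # oldest unacknowledged byte
        self.unacked = {}             # seq # of first byte -> payload
        self.timer_running = False
        self.sent = []                # log of (seq, payload) transmissions

    def on_app_data(self, data: bytes) -> None:
        # Seq # is the byte-stream number of the first data byte in the segment.
        self.unacked[self.next_seq] = data
        self.sent.append((self.next_seq, data))
        self.next_seq += len(data)
        if not self.timer_running:
            self.timer_running = True  # timer covers the oldest unacked segment

    def on_timeout(self) -> None:
        if self.unacked:
            oldest = min(self.unacked)
            self.sent.append((oldest, self.unacked[oldest]))  # retransmit
            self.timer_running = True                         # restart timer

    def on_ack(self, ack: int) -> None:
        if ack > self.send_base:       # acknowledges previously unacked data
            self.send_base = ack
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= ack]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)  # run only if data outstanding
```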

TCP ACK Generation [RFC 1122, RFC 2581]

Event at receiver -> TCP receiver action

1. Arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
   -> Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.
2. Arrival of in-order segment with expected seq #; one other segment has ACK pending
   -> Immediately send single cumulative ACK, ACKing both in-order segments.
3. Arrival of out-of-order segment with higher-than-expected seq #; gap detected
   -> Immediately send duplicate ACK, indicating seq # of next expected byte.
4. Arrival of segment that partially or completely fills gap
   -> Immediately send ACK, provided that segment starts at lower end of gap.
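
The four receiver cases can be expressed as a single decision function. The sketch below is illustrative only: the `state` dictionary layout and the returned action strings are assumptions, the 500 ms delayed-ACK timer is not modeled, and segments are assumed not to overlap.

```python
def on_segment(state: dict, seq: int, length: int):
    """Return (action, ack_number) for an arriving segment.

    state keys (hypothetical): 'expected' = next in-order byte,
    'ack_pending' = a delayed ACK is outstanding,
    'buffered' = out-of-order segments as {seq: length}.
    """
    if seq > state["expected"]:
        # Case 3: gap detected, buffer the data and send a duplicate ACK.
        state["buffered"][seq] = length
        return ("duplicate ACK", state["expected"])
    if seq < state["expected"]:
        # Old data: re-acknowledge what is already expected.
        return ("duplicate ACK", state["expected"])

    had_gap = bool(state["buffered"])
    state["expected"] += length
    # Absorb buffered segments that are now contiguous.
    while state["expected"] in state["buffered"]:
        state["expected"] += state["buffered"].pop(state["expected"])

    if had_gap:
        # Case 4: segment starts at the lower end of the gap.
        state["ack_pending"] = False
        return ("immediate ACK", state["expected"])
    if state["ack_pending"]:
        # Case 2: one cumulative ACK covers both in-order segments.
        state["ack_pending"] = False
        return ("immediate cumulative ACK", state["expected"])
    # Case 1: wait for the next segment (or the delayed-ACK timer).
    state["ack_pending"] = True
    return ("delayed ACK", state["expected"])
```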

TCP Seq. #’s and ACKs

Seq. #’s:
• Byte stream “number” of first byte in segment’s data

ACKs:
• Seq # of next byte expected from other side
• Cumulative ACK
• Piggybacking

NOTE: TCP spec does not dictate how receiver handles out-of-order segments (store them with modern hardware)

Simple telnet scenario (in time order):
1. User types ‘U’. Host A -> Host B: Seq=42, ACK=79, data = ‘U’
2. Host B ACKs receipt of ‘U’, echoes back ‘U’. Host B -> Host A: Seq=79, ACK=43, data = ‘U’
3. Host A ACKs receipt of echoed ‘U’. Host A -> Host B: Seq=43, ACK=80

Flow Control vs. Congestion Control

Congestion control:
• Global issue: concerns all routers and hosts on path from source to destination
• Make sure every subnet can handle the traffic

[Figure: senders reach a receiver through routers over 1 Mbps links.]

Flow Control vs. Congestion Control

Flow control:
• Involves two endpoints
• Make sure sender doesn’t transmit faster than receiver can absorb packets

[Figure: file transfer from a 1 Gbps server to a 1 Mbps PC.]

End-to-End Congestion Control

• Why do it at the transport layer?
  • Real fix to congestion is to slow down sender.
• Use law of “conservation of packets”.
  • Keep number of packets in the network constant, just below maximum that bottleneck can take
  • Don’t inject new packet until old one leaves.
• Congestion indicator: packet loss.

TCP Flow Control

• TCP is a sliding window protocol
  • For window size n, can send up to n bytes without receiving an acknowledgement
  • When the data is acknowledged then the window slides forward
• Each packet advertises a window size
  • Indicates number of bytes the receiver has space for
• Original TCP always sent entire window
  • Congestion control now limits this

Self-Clocking

• If we have large actual window, should we send data in one shot?
• No, use ACKs to clock sending new data.

[Figure: self-clocking diagram between sender and receiver, with labels Pb, Pr, Ar, Ab, As.]

TCP Congestion Control Mechanisms

Collection of interrelated mechanisms:
• Slow start.
• Congestion avoidance.
• Accurate retransmission timeout estimation.
• Fast retransmit.
• Fast recovery.

TCP Flow Control

• Transmission window, a.k.a. congestion window (cwnd)
  • Sliding window maintained by sender
  • Conservation of packets
• Receiver’s window
  • Set through socket API
  • Controlled by the receiver
  • Advertised to the sender in field of TCP header (RAW)

[Figure: segments 1 2 3 4 5 6 7 8 9 10 11 12 ... with cwnd marked; regions from left to right: sent and ACKed; sent, not ACKed; send ASAP; can’t send.]

Sender/Receiver State w/o Buffering at Receiver

[Figure: the sender side shows Sent & Acked, Sent Not Acked, OK to Send, and Not Usable regions, with markers Max ACK received and Next seqnum and the extent of the sender window. The receiver side shows Received & Acked, Acceptable Packet, and Not Usable regions, with markers Next expected and Max acceptable and the extent of the receiver window.]

Window Flow Control: Send Side

[Figure: send buffer regions (acknowledged, sent, to be sent, outside window) filled by App write; TCP header fields (Source Port, Dest. Port, Sequence Number, Acknowledgment, HL/Flags, Window, D. Checksum, Urgent Pointer, Options...) shown for the Packet Sent and the Packet Received.]

TCP Flow Control: Observations

• TCP sender not required to transmit data as soon as it comes in from application.
  • Example: when first 2 KB of data come in, sender could wait for more data since window is 4 KB.
• Receiver not required to send ACKs as soon as possible.
  • Example: Wait for data so ACK is piggybacked.

Data Flow

• Conservation of packets
  • Inject new packets at the rate ACKs are returned by receiver
• New window: cwnd
  • Initially cwnd = 1 segment
• Sender’s window = min(cwnd, RAW)

TCP Congestion Control: Slow Start

• Due to Van Jacobson (SIGCOMM 88)
• Algorithm:
  • Initialize cwnd = 1 MSS.
  • If an ACK is received before timeout: cwnd = cwnd + 1 MSS for each acknowledged segment
• Algorithm used at the beginning of a connection and after a timeout
• Leads to exponential growth in the amount of outstanding data in network
  • cwnd doubles every RTT epoch (i.e., once last segment in current window is acknowledged)
• How do we avoid congesting the network?
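
The doubling behavior follows directly from adding one MSS per acknowledged segment. A small sketch, assuming no losses and that every segment in the window is acknowledged within one RTT (the function name `slow_start_cwnd` is hypothetical):

```python
def slow_start_cwnd(rtts: int, mss: int = 1) -> list:
    """cwnd at the start of each RTT epoch during loss-free slow start."""
    cwnd = 1 * mss
    history = [cwnd]
    for _ in range(rtts):
        acked_segments = cwnd // mss   # one ACK per segment in the window
        cwnd += acked_segments * mss   # +1 MSS per acknowledged segment
        history.append(cwnd)
    return history

# With mss=1 (counting in segments) the window goes 1, 2, 4, 8, 16.
growth = slow_start_cwnd(4)
```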

Slow Start Example

[Figure: packet timeline with one RTT and one packet time marked. At 0R the sender sends packet 1; at 1R, packets 2-3; at 2R, packets 4-7; at 3R, packets 8-15.]

When Should Slow-Start End? Congestion Avoidance

• Want to end slow start when the pipe is full!
  • When cwnd > ssthresh.
  • Start with large ssthresh, but then refine it.
• Slow start continues until BWDP is exceeded, then:
  • Routers drop packets: losses occur
  • Need to stop exponential increase!
• Use congestion avoidance to deal with lost packets!
  • Slow down the transmission rate
  • Provide for linear increase of the transmission window
• Congestion avoidance implemented together with slow start

Congestion Avoidance

• Introduce a new variable, ssthresh
  • Initialized to 65,535 (max. window)
• On packet loss (timeout):
  • Set ssthresh = cwnd/2 and cwnd = 1
  • Re-enter slow start, until cwnd = ssthresh
• When cwnd = ssthresh, then:
  • Grow cwnd linearly until it reaches RAW
  • When ACK is received before timeout, set cwnd = cwnd + 1/cwnd
  • Hence, cwnd increases by 1 segment every RTT

Putting it together: Slow start and congestion avoidance

The algorithm:
• If cwnd < ssthresh: do slow start
• Else (cwnd ≥ ssthresh): do congestion avoidance
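
The combined rule can be sketched as a small class that reacts to ACKs and timeouts. This is a hedged illustration, with cwnd counted in segments; the class name `TahoeWindow` is made up, and the floor of 2 segments on ssthresh is an assumption borrowed from common practice rather than something stated in the slides.

```python
class TahoeWindow:
    """cwnd and ssthresh in units of segments."""

    def __init__(self, ssthresh: float = 64.0):
        self.cwnd = 1.0
        self.ssthresh = ssthresh

    def on_ack(self) -> None:
        if self.cwnd < self.ssthresh:
            self.cwnd += 1.0              # slow start: +1 segment per ACK
        else:
            self.cwnd += 1.0 / self.cwnd  # congestion avoidance: ~+1 per RTT

    def on_timeout(self) -> None:
        # Assumption: keep ssthresh at 2 segments or more.
        self.ssthresh = max(self.cwnd / 2.0, 2.0)
        self.cwnd = 1.0                   # re-enter slow start
```

Calling `on_ack()` once per acknowledged segment reproduces exponential growth below ssthresh and roughly linear growth above it.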

Problems with CA and SS

• Slow start is an attempt to discover the network bandwidth (quickly)
• Discovery proceeds by filling network queues in intermediate routers.
• Once queues are full, routers drop packets.
• Once loss is discovered, it’s too late!
• TCP sender reduces window when loss is discovered
• Queue level oscillates between full and cwnd/2
• What sort of problems does this introduce?

Tahoe TCP Congestion Control: Under-damped Feedback System!

[Figure: cwnd versus time (in round-trip times), with levels ssthresh, ssthresh/2, and 1 MSS marked. Phases in order: slow start, congestion avoidance, waiting for timeout, slow start, congestion avoidance.]

A very drastic reaction to congestion!

Tuning TCP Tahoe’s Congestion Control

• Coarse timeouts remained a problem, and fast retransmit was added with TCP Tahoe.
• Timeouts can cause connections to be idle for a long time waiting for timer to expire.
• Fast retransmit: may trigger retransmission of dropped packet sooner.
  • Complements regular timeouts.

Fast Retransmit

• When can duplicate ACKs occur?
  • Loss or packet re-ordering.
• Sender waits for some number of duplicate ACKs before retransmitting.
  • Assume packet re-ordering is infrequent.
  • Use receipt of 3 or more duplicate ACKs as loss indicator
  • Retransmit that segment before timeout.
• Generally, fast retransmit eliminates about half the coarse-grain timeouts.
• Conventional wisdom is that this yields roughly a 20% improvement in throughput.
• Note: fast retransmit does not eliminate the timeouts caused by small window sizes at the source, where too few duplicate ACKs are generated.
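
Duplicate-ACK counting can be sketched in a few lines. The class name `DupAckDetector` is hypothetical, and the sketch only signals when a retransmission should be triggered; it does not model the retransmission itself.

```python
DUP_ACK_THRESHOLD = 3  # the threshold named in the slides

class DupAckDetector:
    def __init__(self):
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack: int) -> bool:
        """Return True exactly when this ACK should trigger fast retransmit."""
        if ack == self.last_ack:
            self.dup_count += 1
            return self.dup_count == DUP_ACK_THRESHOLD
        # A new cumulative ACK resets the duplicate counter.
        self.last_ack = ack
        self.dup_count = 0
        return False
```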

Reno: Fast Recovery

• Goal:
  • Reduce the number of times connection is slow-started.
  • Use ACKs in the pipe for self-clocking.
• In congestion avoidance mode, after fast retransmit, reduce cwnd to half (rather than dropping it to 1).
• Reno vs. Tahoe: slow start only used at the beginning of a connection or when a timeout occurs.

Reno: Tahoe with Fast Retransmit and Recovery

• If 3 duplicate ACKs for segment N received:
  • Retransmit segment N.
  • Set ssthresh = 0.5*cwnd.
  • Set cwnd = ssthresh + 3*MSS. [account for 3 duplicate ACKs]
• For every subsequent duplicate ACK:
  • Increase cwnd by 1 segment.
• When “new” ACK received for retransmitted packet:
  • Reset cwnd = ssthresh
  • Resume congestion avoidance.
• Result: cwnd is reset to half of the old cwnd after fast recovery
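
The window arithmetic during fast recovery can be traced with a short sketch. It counts cwnd in segments (so 3*MSS becomes 3), and the class name `RenoFastRecovery` and its method names are illustrative assumptions.

```python
class RenoFastRecovery:
    """Tracks cwnd/ssthresh (in segments) through fast retransmit and recovery."""

    def __init__(self, cwnd: float, ssthresh: float):
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self.dup_acks = 0
        self.in_recovery = False

    def on_dup_ack(self) -> bool:
        """Return True when the lost segment should be retransmitted."""
        self.dup_acks += 1
        if self.in_recovery:
            self.cwnd += 1               # each extra dup ACK inflates the window
            return False
        if self.dup_acks == 3:
            self.ssthresh = self.cwnd / 2
            self.cwnd = self.ssthresh + 3  # account for the 3 duplicate ACKs
            self.in_recovery = True
            return True
        return False

    def on_new_ack(self) -> None:
        self.dup_acks = 0
        if self.in_recovery:
            self.cwnd = self.ssthresh    # deflate; resume congestion avoidance
            self.in_recovery = False
```

Starting from cwnd = 16, three duplicate ACKs give ssthresh = 8 and cwnd = 11; after the new ACK, cwnd returns to 8, half of the old window.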

Delayed ACKs

• Tries to optimize ACK transmission.
• Delay ACKs (500 msec) hoping to piggyback on data segment.
• Example: telnet to interactive editor:
  • Send 1 character at a time: 20-byte TCP header + 1-byte data + 20-byte IP header.
  • Receiver ACKs immediately: 40-byte ACK.
  • When editor reads character, window update: 40-byte datagram.
  • Then echoes character back: 41-byte datagram.

Example: Simple Bottleneck

• Packet size = 1 Kbyte
• Initial ssthresh = 32 packets
• BWDP (capacity of network) = 16.3 Kbyte
• Queue capacity = 17 packets


Reno: Congestion Window and Queue Growth

• Queues fill once window grows larger than 17 packets
• After packet loss, Reno cuts window and starts again
• See-saw oscillations in window and queue length
  • Increases end-to-end delays
  • Bad for real-time and interactive applications

So far…

• Have way to fill pipe (slow start).
• Have way to run at equilibrium (congestion avoidance).
• But tough transition.
  • No good initial ssthresh.
  • Large ssthresh causes packet loss.
• Need approaches to quickly recover from packet loss.

How Do Losses Occur?

• Bit errors detected by CRC
  • Wireless links: commonplace
• Congestion in network: packets dropped by routers
  • Competing data flows
  • Window exceeds bandwidth-delay product (BWDP)
  • Note that BWDP is a function of length (prop. delay) of link and bandwidth

Error Recovery

• After timeouts and duplicate ACKs.
• Sender retransmits segment after a timeout
• Sender must estimate connection RTT
  • Times one packet per window
  • Performs smooth average over time:
    • rtt = β * old_rtt + (1 - β) * rtt_sample (e.g., β = 0.875)
  • With a deviation estimate, and 0 < δ < 1:
    • Difference = rtt_sample - rtt_estimated
    • rtt_estimated = rtt_estimated + (δ * Difference)
    • dev = dev + δ * (|Difference| - dev)
    • RTO = rtt_estimated + 4 * dev
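
The estimator can be sketched as a small class, using the standard deviation update dev = dev + δ(|Difference| - dev). Assumptions to note: δ = 0.125 (which matches β = 0.875), the same gain is used for both the mean and the deviation, and the initial deviation is set to half the first sample; the class name `RttEstimator` is hypothetical.

```python
class RttEstimator:
    def __init__(self, first_sample: float, delta: float = 0.125):
        self.delta = delta
        self.rtt = first_sample
        self.dev = first_sample / 2   # assumption: a common initialization choice

    def update(self, sample: float) -> float:
        """Fold in a new RTT sample and return the resulting RTO."""
        difference = sample - self.rtt
        self.rtt += self.delta * difference
        self.dev += self.delta * (abs(difference) - self.dev)
        return self.rto()

    def rto(self) -> float:
        return self.rtt + 4 * self.dev

# Example in milliseconds: a steady 100 ms path, then one 180 ms sample.
est = RttEstimator(100.0)
est.update(100.0)
est.update(180.0)
```

The deviation term is what lets the timeout widen quickly when samples start to vary, without waiting for the smoothed mean to catch up.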

Error Recovery (cont.)

• Sender also retransmits segment after 3 duplicate ACKs
  • Receiver sends cumulative ACK stating the next in-order packet expected
  • Missing packet causes generation of duplicates
  • No theoretical reason for 3

Estimating RTT

• Karn, P. and Partridge, C. 1987, “Improving Round-Trip Time Estimates in Reliable Transport Protocols”
• Reno performs one RTT estimate per window of data
• What do we do when there is a timeout and retransmission?
  • When ACK arrives, does it refer to 1st or 2nd transmission?
• Karn and Partridge’s solution:
  • Don’t update RTT on any segments that have been retransmitted
  • Instead, double timeout on each failure, until successful
  • Goal: Induce exponential backoff!
• This is a direct result of using a single sequence number over a non-FIFO link!
• TCP option can be used with time stamps!
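
Karn and Partridge's two rules can be sketched as follows. This is a simplified illustration: the class name `KarnTimer`, the cap `max_rto`, and the rule RTO = 2 * smoothed RTT are assumptions chosen to keep the example short, not details from the slides.

```python
class KarnTimer:
    def __init__(self, rto: float = 1.0, max_rto: float = 64.0):
        self.rto = rto
        self.max_rto = max_rto   # assumption: cap on the backed-off timeout
        self.srtt = None         # smoothed RTT, unknown until first clean sample

    def on_timeout(self) -> None:
        # Double the timeout on each failure (exponential backoff).
        self.rto = min(self.rto * 2, self.max_rto)

    def on_ack(self, sample: float, was_retransmitted: bool) -> None:
        if was_retransmitted:
            return  # ambiguous: the ACK may belong to either transmission
        if self.srtt is None:
            self.srtt = sample
        else:
            self.srtt = 0.875 * self.srtt + 0.125 * sample
        self.rto = 2 * self.srtt  # assumption: simple RTO rule for illustration
```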

Example RTT Estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; curves: SampleRTT and Estimated RTT.]

TCP New Reno

• Two problem scenarios with TCP Reno:
  • Bursty losses: Reno cannot recover from bursts of 3+ losses
  • Packets arriving out-of-order can yield duplicate ACKs when in fact there is no loss.
• New Reno solution: try to determine the end of a burst loss.

TCP New Reno

• When duplicate ACKs trigger a retransmission for a lost packet, remember the highest packet sent from window in recover.
• Upon receiving an ACK:
  • If ACK < recover => partial ACK
  • If ACK ≥ recover => new ACK
• Partial ACK implies another lost packet:
  • Retransmit next packet, inflate window and stay in fast recovery.
• New ACK implies fast recovery is over:
  • Starting from 0.5 x cwnd, proceed with congestion avoidance (linear increase).
• Positive result: New Reno recovers from n losses in n round trips.
• Many servers support New Reno
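
The partial-ACK test can be sketched as below. The function names `classify_ack` and `newreno_recovery` are hypothetical, window inflation is not modeled, and the sketch only counts the extra retransmissions triggered before recovery ends.

```python
def classify_ack(ack: int, recover: int) -> str:
    """'partial' if the ACK does not yet cover everything sent before the loss."""
    return "partial" if ack < recover else "new"

def newreno_recovery(acks, recover: int, cwnd_before_loss: float):
    """Walk the ACKs seen during fast recovery.

    Returns (retransmissions triggered by partial ACKs, cwnd on exit),
    with cwnd on exit = None if recovery has not finished.
    """
    retransmits = 0
    for ack in acks:
        if classify_ack(ack, recover) == "partial":
            retransmits += 1                       # another hole: retransmit next packet
        else:
            return retransmits, cwnd_before_loss / 2  # recovery over, halve the window
    return retransmits, None
```

Each partial ACK arrives roughly one round trip after the previous retransmission, which is where the "n losses in n round trips" result comes from.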

Evaluation of TCP Reno

• Good:
  • Tries to adapt to network conditions, uses the Internet transparently.
• Undesirable:
  • Error recovery:
    • RTT measurements and assumptions of link reliability.
    • Sequence numbering evolved from ARQ schemes that assume in-order delivery of packets.
    • Caveat: wireless networks and mobility
  • Congests to discover bandwidth available in the connection:
    • Creates congestion and fills network queues until a loss occurs.
  • Poor performance over asymmetric networks
    • RTT is used, rather than forward delay!
    • Internet is asymmetric
  • Out-of-order packet delivery happens
    • When is a packet really lost?
    • No packet self clocking!
• Could be improved:
  • Every object is not just a byte stream with the same service needs!


Improvements to TCP

CUBIC: meant for high-speed networks. Meant to improve TCP friendliness and RTT fairness. Throughput is defined by packet loss rate only and not RTT.

SACK: include information in the ACK which indicates missing packets in the window

Vegas: use rate control instead of arrival of ACKs to pace data into network

Santa Cruz: use relative delay over forward path to anticipate queue buildup


Extras…

Goals for “TCP Santa Cruz”

1. Improve detection of congestion
   • Decouple error control from congestion control
   • Don’t rely on packet loss
   • Identify direction of congestion
2. Improve congestion control
   • Don’t fill network queues
   • Robust to congestion on reverse path and to ACK loss
   • Isolate forward throughput from events in reverse path
3. Provide high throughput, low delay and delay variation

Error Recovery

1. Improve RTT estimate by timing each packet, including retransmissions
   • Identify by SN and copy number
   • Eliminate Karn’s algorithm
   • Time packets when needed most: during congestion
2. Retransmit after 1st duplicate ACK if necessary (Vegas does this for original transmissions)
3. Receiver transmits an ACK window to indicate holes in the transmission stream

ACK Window

• Provides status of every packet within send window
• Each bit represents a specified number of bytes received
• Granularity of bit determined by the receiver

Congestion Control

• Detect changes in queue length at the bottleneck link
• Monitor the relative delay over the link:
  • Delay that one packet experiences with respect to another
  • Limit number of packets in bottleneck queue
  • Calculated by sender from a timestamp returned by the receiver

Relative Delay Calculation

D_{j,i} = R_{j,i} - S_{j,i}

• D_{j,i} = 0: no additional queuing
• D_{j,i} > 0: increased queuing on forward path
• D_{j,i} < 0: decreased queuing on forward path
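
A worked sketch of the relative delay, assuming S is the spacing between the transmissions of packets i and j at the sender and R is the spacing between their arrivals at the receiver (that interpretation, and the function name `relative_delay`, are assumptions not spelled out in the slides):

```python
def relative_delay(send_i: float, send_j: float,
                   recv_i: float, recv_j: float) -> float:
    """D = R - S for packets i and j (all times in seconds)."""
    s = send_j - send_i   # spacing at the sender
    r = recv_j - recv_i   # spacing at the receiver
    return r - s

# Packets sent 1.0 s apart that arrive 1.5 s apart picked up 0.5 s of extra queuing.
d = relative_delay(0.0, 1.0, 5.0, 6.5)
```

Because only differences of timestamps taken on the same host are used, the two clocks do not need to be synchronized.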

Congestion Control Algorithm

1. Let Nop be the desired number of packets, per session, queued at the bottleneck
2. For each pair of ACKs received, sender computes:
   • Relative delay: D = R - S
   • Current packet service time at receiver: pktS = R / # pkts received

Congestion Control Algorithm (cont.)

3. Translate the relative delay into packets:
   (a) Queuing over window interval: sum delay measurements for all packet pairs and divide by average packet service time
   (b) Total queuing:

Congestion Control Algorithm (cont.)

4. Window adjustment policy:
   • Controls amount of outstanding data in network
   • Goal is to maximize throughput and minimize delay
   • If n_i == Nop: maintain current window size
   • If n_i < Nop: increase by one segment
   • If n_i > Nop: decrease by one segment
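
The adjustment policy maps directly onto a three-way comparison. A minimal sketch, where the function name `adjust_window` and the lower bound `min_cwnd` of one segment are assumptions added for illustration:

```python
def adjust_window(cwnd: int, n_i: float, n_op: float, min_cwnd: int = 1) -> int:
    """Move the window one segment toward the operating point Nop.

    n_i: estimated number of packets queued at the bottleneck.
    n_op: desired number of queued packets (Nop).
    """
    if n_i < n_op:
        return cwnd + 1                   # queue too short: increase by one segment
    if n_i > n_op:
        return max(cwnd - 1, min_cwnd)    # queue too long: decrease by one segment
    return cwnd                           # at the operating point: hold
```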

Simulations

• ns-2 Network Simulator
• Derived protocol from existing TCP implementation
• Compare TCP-Santa Cruz to Reno and Vegas
• 3 configurations:
  • Simple bottleneck
  • Traffic on reverse path
  • Asymmetric configuration

Experiment #1: Simple Bottleneck

• Packet size = 1 Kbyte
• BWDP (capacity of network) = 16.3 Kbyte
• Queue capacity = 17 packets

Reno: Congestion Window and Queue Growth

• Queues fill once window grows larger than 17 packets
• After packet loss, Reno cuts window and starts again
• See-saw oscillations in window and queue length
  • Increases end-to-end delays
  • Bad for real-time and interactive applications

TCP-Santa Cruz: Congestion Window and Queue Growth

• Nop = 1 (minimal queuing to reduce delays)
• Window at desired operating point: 17 + 1 = 18
• No see-saw oscillations in window and queue length!
• Transmits at available bandwidth without introducing congestion
• No overflow of network queues

Simple Bottleneck - Summary

TCP Santa Cruz provides:
• Slightly improved throughput (not much room for improvement)
• Lower delay (20 - 45% improvement over Reno)
• Reduced delay variance over Reno and Vegas

Experiment #2 - Reverse Traffic

• Question: Is throughput affected by reverse path traffic?
• Goal: Isolate forward throughput from reverse path events!
  • Can’t be done with RTT measurements!
• No reason to slow forward path transmission rate

Reverse Traffic: Window Growth

• Reno
  • Window growth slowed because of lost and delayed ACK packets
  • Loss detection is also delayed
• Santa Cruz
  • Nop = 5
  • Achieves optimal window size: 17 + 5 = 22

Reverse Traffic - Summary

• Throughput: Santa Cruz achieves 47 - 67% improvement
• Delay: Santa Cruz achieves 45 - 59% improvement
• Delay variance: 3 orders of magnitude improvement

Experiment #3: Network Asymmetry

• ADSL, HFC, combination networks: e.g., telephone upstream, cable downstream
• Forward path: 24 Mbps, 3000 pkts/sec
• Reverse path: 320 kbps, 1000 pkts/sec
• Asymmetry factor: k = 3 (commonplace)

Network Asymmetry - Summary

• Throughput: Santa Cruz achieves 99% improvement
• Delay: Santa Cruz achieves 42 - 58% improvement over Reno

Conclusion

• High throughput
• Low end-to-end delay and delay variation
• Isolate forward throughput from events on reverse path
  • ACK loss, congestion on reverse path, asymmetric links
• Problems:
  • Modify AID (additive increase and decrease) to ensure fairness under multiple sources!
  • Problems with different bottlenecks?
  • Wireless

TCP-SACK

• Goal: Improve TCP error recovery mechanism
• Selectively acknowledge data received out of order, exposing the lost data within the transmission window
  • Uses sequence number ranges
  • Example: ACK = 1000, SACK = 1040:1080
• Limited by max. size of TCP header to 3 distinct ranges
• Important when there are multiple losses per window
  • Multiple losses often result in a timeout
• Significant performance improvements in wired networks
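
How a receiver might derive SACK ranges from its out-of-order buffer can be sketched as follows. This is an illustration under assumptions: ranges are (start, end) with the end exclusive, blocks are reported in sequence order (real stacks report the most recently changed block first), and the function name `sack_blocks` is hypothetical.

```python
def sack_blocks(cum_ack: int, received, max_blocks: int = 3):
    """Merge received byte ranges above the cumulative ACK into SACK blocks.

    received: iterable of (start, end) ranges, end exclusive.
    """
    blocks = []
    for start, end in sorted(received):
        if end <= cum_ack:
            continue  # already covered by the cumulative ACK
        if blocks and start <= blocks[-1][1]:
            # Overlapping or adjacent: extend the previous block.
            blocks[-1] = (blocks[-1][0], max(blocks[-1][1], end))
        else:
            blocks.append((start, end))
    return blocks[:max_blocks]  # header space limits the number of ranges

# Matches the slide's example: ACK = 1000, SACK = 1040:1080.
example = sack_blocks(1000, [(1040, 1080)])
```

The sender can then infer that bytes 1000 to 1039 are missing and retransmit only those, instead of everything after the cumulative ACK.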

MSS (Maximum Segment Size)

• Largest “chunk” of application-level data
• Can be specified in SYN segment, else default
• Typical values are 1460, 536 and 512 bytes
  • “Non-local address”: default 536 bytes
• Ex: In practice MSS is limited by MTU of LAN… limited by outgoing interface’s MTU of 1500 bytes
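
The relation between MTU and MSS is simple arithmetic once header sizes are fixed. A sketch assuming IPv4 and TCP headers of 20 bytes each with no options (the function name `mss_for_mtu` is hypothetical):

```python
IP_HEADER_BYTES = 20    # IPv4 header without options
TCP_HEADER_BYTES = 20   # TCP header without options

def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented IP datagram."""
    return mtu - IP_HEADER_BYTES - TCP_HEADER_BYTES

ethernet_mss = mss_for_mtu(1500)   # 1460 bytes
default_mss = mss_for_mtu(576)     # 536 bytes, the "non-local" default
```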