TCP: Reliable, In-Order Delivery
EE 122: Intro to Communication Networks
Fall 2006 (MW 4-5:30 in Donner 155)
Vern Paxson
TAs: Dilip Antony Joseph and Sukun Kim
http://inst.eecs.berkeley.edu/~ee122/
Materials with thanks to Jennifer Rexford, Ion Stoica, and colleagues at Princeton and UC Berkeley

Announcements
• Sukun is away this week. Dilip will cover his section and office hours.
• Dilip’s office has moved to 751 Soda (alcove). His office hours remain Fri 11-12.
• Today is the deadline for requesting discussion / possible regrading of midterm questions. Send email to do so.
• Project #3 out on Wednesday
– Can do individually or in a team of 2 people
– First phase due November 16 - no slip days
– Exercise good (better) time management

Today’s Lecture
• How does TCP achieve correct operation?
• Reliability in the face of IP’s meager “best effort” service
• 3-way handshake to establish connections
• 3-way or 4-way handshake to terminate conn.
• Retransmission to recover from loss
– We’ll only look at timeout-based retransmission today
• State diagrams as a tool for understanding complex protocol operation

TCP Service Model
• Reliable, in-order, byte-stream delivery
– and with good performance
• Challenges - the network can
– drop packets
   Even perhaps a large number
– delay packets
   Even perhaps for many seconds
– deliver packets out-of-order
   Follows from possibility of arbitrary delay
– replicate packets
   Weird, but it does sometimes happen
– corrupt packets
– (What’s missing?)

TCP Support for Reliable Delivery
• Checksum
– Used to detect corrupted data at the receiver
– …leading the receiver to drop the packet
• Sequence numbers
– Used to detect missing data
– ... and for putting the data back in order
• Retransmission
– Sender retransmits lost or corrupted data
– Timeout based on estimates of round-trip time
– Fast retransmit algorithm for rapid retransmission
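
To make the checksum bullet concrete, here is a minimal Python sketch of the 16-bit one's-complement Internet checksum in the style of RFC 1071. It is an illustration only: the real TCP checksum also covers a pseudo-header taken from the IP layer, which this sketch leaves out, and the function names `internet_checksum` and `looks_intact` are hypothetical.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's-complement checksum of data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back into the low 16 bits
    return ~total & 0xFFFF


def looks_intact(data_with_checksum: bytes) -> bool:
    """Receiver-side check: data plus a valid checksum sums to zero after complementing."""
    return internet_checksum(data_with_checksum) == 0


# Usage (assumes even-length data so the checksum lands on a word boundary).
payload = b"hello tcp!"
wire = payload + internet_checksum(payload).to_bytes(2, "big")
corrupted = bytes([wire[0] ^ 0x01]) + wire[1:]  # flip one bit in transit
print(looks_intact(wire), looks_intact(corrupted))
```

A receiver that sees `looks_intact(...)` return False would simply drop the packet and leave recovery to retransmission.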

TCP Header
Source port | Destination port
Sequence number
Acknowledgment
HdrLen | 0 | Flags | Advertised window
Checksum | Urgent pointer
Options (variable)
Data

TCP Header: Source port, Destination port
These should be familiar

TCP Header: Sequence number
Starting sequence number (byte offset) of data carried in this segment

TCP Header: Acknowledgment
Acknowledgment gives seq # just beyond highest seq. received in order.
If sender sends N in-order bytes starting at seq S then ack for it will be S+N.
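
The S+N rule can be sketched as a small receiver-side calculation. The Python below is a simplified illustration, not how any particular stack implements it: it ignores 32-bit sequence wraparound, and the function name `cumulative_ack` and its `(seq, length)` input format are assumptions made for this example.

```python
def cumulative_ack(expected: int, segments: list) -> int:
    """Return the ACK value after the given (seq, length) segments arrive.

    expected -- next byte the receiver is waiting for before these arrivals
    segments -- (seq, length) pairs in arrival order; may be reordered or duplicated
    Assumption: sequence numbers do not wrap around in this sketch.
    """
    buffered = {}  # seq -> length for data that arrived ahead of a gap
    for seq, length in segments:
        if seq + length <= expected:
            continue  # old duplicate: everything in it was already received
        buffered[seq] = max(length, buffered.get(seq, 0))
        progressed = True
        while progressed:  # pull in any buffered data that is now in order
            progressed = False
            for s, n in list(buffered.items()):
                if s <= expected < s + n:
                    expected = s + n
                    del buffered[s]
                    progressed = True
                elif s + n <= expected:
                    del buffered[s]  # fully covered by data already delivered
    return expected


print(cumulative_ack(1000, [(1000, 500)]))               # in order: S+N
print(cumulative_ack(1000, [(1500, 500)]))               # gap: ACK stays put
print(cumulative_ack(1000, [(1500, 500), (1000, 500)]))  # gap filled later
```

The second call shows why the ACK only reports the highest in-order byte: data beyond a gap is held but not yet acknowledged.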

TCP Header: HdrLen
Number of 4-byte words in TCP header; 5 = no options

TCP Header: 0 (reserved)
“Must Be Zero”: 6 bits reserved

TCP Header: Flags
We will get to these shortly

TCP Header: Advertised window
Buffer space available for receiving data. Used for TCP’s sliding window.
Interpreted as offset beyond Acknowledgment field’s value.

TCP Header: Urgent pointer
Used with URG flag to indicate urgent data (not discussed further)
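
As a recap of the header fields walked through above, the following Python sketch unpacks the fixed 20-byte part of a TCP header with the standard `struct` module. The field layout and flag bit positions follow RFC 793; the function name `parse_tcp_header` and the dictionary keys are hypothetical choices for this example.

```python
import struct

# Flag bits from least significant upward, as laid out in RFC 793.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]


def parse_tcp_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte portion of a TCP header (network byte order)."""
    (src, dst, seq, ack, off_flags, window, checksum, urgent) = struct.unpack(
        "!HHIIHHHH", raw[:20]
    )
    hdr_len_words = off_flags >> 12          # top 4 bits: header length in 4-byte words
    flag_bits = off_flags & 0x3F             # low 6 bits: URG ACK PSH RST SYN FIN
    flags = [name for bit, name in enumerate(FLAG_NAMES) if flag_bits & (1 << bit)]
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "hdr_len_bytes": hdr_len_words * 4,  # 5 words = 20 bytes = no options
        "flags": flags,
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
        "options": raw[20:hdr_len_words * 4],
    }


# Usage: a hand-built SYN from port 12345 to port 80 with made-up values.
syn = struct.pack("!HHIIHHHH", 12345, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
print(parse_tcp_header(syn))
```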

TCP “Stream of Bytes” Service
[Figure: Host A writes Byte 0, Byte 1, Byte 2, Byte 3, … Byte 80; Host B reads Byte 0, Byte 1, Byte 2, Byte 3, … Byte 80 in the same order]

… Provided Using TCP “Segments”
[Figure: the byte stream from Host A (Byte 0 … Byte 80) is carried to Host B inside TCP Data segments]
Segment sent when:
1. Segment full (Max Segment Size),
2. Not full, but times out, or
3. “Pushed” by application.

TCP Segment
• IP packet
– No bigger than Maximum Transmission Unit (MTU)
– E.g., up to 1,500 bytes on an Ethernet
• TCP packet
– IP packet with a TCP header and data inside
– TCP header ≥ 20 bytes long
• TCP segment
– No more than Maximum Segment Size (MSS) bytes
– E.g., up to 1460 consecutive bytes from the stream
[Figure: an IP packet is IP Hdr + IP Data; the IP Data is TCP Hdr + TCP Data (segment)]
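
The 1460-byte figure follows from the 1,500-byte Ethernet MTU once the headers are subtracted. The Python sketch below works through that arithmetic and then cuts a byte stream into MSS-sized segments. It assumes a 20-byte IPv4 header and a 20-byte TCP header (no options on either), and `segment_stream` is a hypothetical helper name.

```python
MTU = 1500        # Ethernet example from the slide
IP_HEADER = 20    # assumption: IPv4 header without options
TCP_HEADER = 20   # minimum TCP header, no options
MSS = MTU - IP_HEADER - TCP_HEADER


def segment_stream(stream: bytes, first_seq: int, mss: int = MSS) -> list:
    """Split a byte stream into (sequence number, payload) pairs of at most mss bytes."""
    return [
        (first_seq + offset, stream[offset:offset + mss])
        for offset in range(0, len(stream), mss)
    ]


# Usage: 3000 bytes of application data starting at sequence number 0.
segments = segment_stream(bytes(3000), first_seq=0)
print(MSS, [(seq, len(data)) for seq, data in segments])
```

Each segment's sequence number is the stream offset of its first byte, which is what lets the receiver put the data back in order.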

Sequence Numbers
[Figure: Host A sends TCP HDR + TCP Data to Host B, with bytes numbered starting from the ISN (initial sequence number). Sequence number = 1st byte of the segment’s data. ACK sequence number = next expected byte.]

Initial Sequence Number (ISN)
• Sequence number for the very first byte
– E.g., Why not just use ISN = 0?
• Practical issue
– IP addresses and port #s uniquely identify a connection
– Eventually, though, these port #s do get used again
– … ∃ a chance an old packet is still in flight
– … and might be associated with new connection
• ∴ TCP requires (RFC793) changing ISN over time
– Set from 32-bit clock that ticks every 4 microseconds
– … only wraps around once every 4.55 hours (the figure RFC 793 quotes; 2^32 × 4 µs works out to about 4.77 hours)
• To establish a connection, hosts exchange ISNs
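
A clock-driven ISN can be sketched in a few lines of Python. This is only an illustration of the 32-bit, 4-microsecond counter described above; `isn_from_clock` is a hypothetical name and the microsecond input is an assumption. Computing the wrap period directly gives about 4.77 hours, slightly more than the 4.55 hours RFC 793 quotes.

```python
TICK_MICROSECONDS = 4   # the clock advances once every 4 microseconds
SEQ_SPACE = 2 ** 32     # sequence numbers are 32 bits wide


def isn_from_clock(microseconds_since_boot: int) -> int:
    """Map elapsed time onto the 32-bit sequence space, one step per 4 microseconds."""
    return (microseconds_since_boot // TICK_MICROSECONDS) % SEQ_SPACE


# Time for the counter to wrap all the way around, in hours.
WRAP_HOURS = SEQ_SPACE * TICK_MICROSECONDS / 1_000_000 / 3600

print(isn_from_clock(0), isn_from_clock(4), isn_from_clock(1_000_000))
print(round(WRAP_HOURS, 2))
```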

Connection Establishment: TCP’s Three-Way Handshake

Establishing a TCP Connection
• Three-way handshake to establish connection
– Host A sends a SYN (open; “synchronize sequence numbers”) to host B
– Host B returns a SYN acknowledgment (SYN ACK)
– Host A sends an ACK to acknowledge the SYN ACK
[Figure: A sends SYN to B, B returns SYN ACK, A sends ACK; Data then flows in both directions. Each host tells its ISN to the other host.]

TCP Header: Flags
Flags: SYN, ACK, FIN, RST, PSH, URG
See /usr/include/netinet/tcp.h on Unix systems

Step 1: A’s Initial SYN Packet
A’s port | B’s port
Sequence number: A’s Initial Sequence Number
Acknowledgment: (Irrelevant since ACK not set)
HdrLen = 5 (20B) | 0 | Flags | Advertised window
Checksum | Urgent pointer
Options (variable)
Flags: SYN set (out of SYN, ACK, FIN, RST, PSH, URG)
A tells B it wants to open a connection…

Step 2: B’s SYN-ACK Packet
B’s port | A’s port
Sequence number: B’s Initial Sequence Number
Acknowledgment: ACK = A’s ISN plus 1
HdrLen = 5 (20B) | 0 | Flags | Advertised window
Checksum | Urgent pointer
Options (variable)
Flags: SYN and ACK set (out of SYN, ACK, FIN, RST, PSH, URG)
B tells A it accepts, and is ready to hear the next byte…
… upon receiving this packet, A can start sending data

Step 3: A’s ACK of the SYN-ACK
A’s port | B’s port
Sequence number: A’s ISN plus 1 (the SYN itself consumed one sequence number)
Acknowledgment: B’s ISN plus 1
HdrLen = 5 (20B) | 0 | Flags | Advertised window
Checksum | Urgent pointer
Options (variable)
Flags: ACK set (out of SYN, ACK, FIN, RST, PSH, URG)
A tells B it’s likewise okay to start sending
… upon receiving this packet, B can start sending data
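
The three steps can be replayed as a small Python sketch that tracks only the flags, sequence number, and acknowledgment number of each packet. It follows the numbering in RFC 793, where the final ACK carries sequence number x + 1 because the SYN consumed one sequence number. The `Segment` type and `three_way_handshake` function are hypothetical names for this illustration.

```python
from typing import NamedTuple, Optional

SEQ_SPACE = 2 ** 32  # sequence arithmetic is modulo 2^32


class Segment(NamedTuple):
    flags: tuple         # which of SYN/ACK are set
    seq: int             # sequence number field
    ack: Optional[int]   # acknowledgment field; None when ACK is not set


def three_way_handshake(isn_a: int, isn_b: int) -> list:
    """Return the three segments exchanged when A opens a connection to B."""
    syn = Segment(("SYN",), isn_a, None)
    syn_ack = Segment(("SYN", "ACK"), isn_b, (syn.seq + 1) % SEQ_SPACE)
    ack = Segment(("ACK",), syn_ack.ack, (syn_ack.seq + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]


# Usage with the ISNs from the RFC 793 example: A picks 100, B picks 300.
for segment in three_way_handshake(100, 300):
    print(segment)
```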

Timing Diagram: 3-Way Handshaking
Client (initiator): Active Open, connect()
Server: Passive Open, listen(), then accept()
Client → Server: SYN, SeqNum = x
Server → Client: SYN + ACK, SeqNum = y, Ack = x + 1
Client → Server: ACK, Ack = y + 1
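
The listen(), accept(), and connect() calls in the diagram map directly onto the sockets API. The Python sketch below runs both ends over the loopback interface using the standard `socket` module; the operating system performs the SYN, SYN + ACK, ACK exchange underneath `connect()`. The function name `loopback_exchange` is hypothetical, and the sketch assumes loopback networking is available.

```python
import socket
import threading


def loopback_exchange(payload: bytes) -> bytes:
    """Open a TCP connection over loopback and return what the server side read."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    server.listen(1)                # passive open
    port = server.getsockname()[1]
    chunks = []

    def serve() -> None:
        conn, _addr = server.accept()   # returns once a handshake has completed
        with conn:
            while True:
                chunk = conn.recv(4096)
                if not chunk:           # empty read: the client closed its side
                    break
                chunks.append(chunk)

    worker = threading.Thread(target=serve)
    worker.start()
    # Active open: connect() makes the OS send the SYN and wait for the SYN + ACK.
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(payload)
    worker.join()
    server.close()
    return b"".join(chunks)


print(loopback_exchange(b"hello"))
```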

What if the SYN Packet Gets Lost?
• Suppose the SYN packet gets lost
– Packet is lost inside the network, or:
– Server discards the packet (e.g., listen queue is full)
• Eventually, no SYN-ACK arrives
– Sender sets a timer and waits for the SYN-ACK
– … and retransmits the SYN if needed
• How should the TCP sender set the timer?
– Sender has no idea how far away the receiver is
– Hard to guess a reasonable length of time to wait
– SHOULD (RFCs 1122 & 2988) use default of 3 seconds
   Other implementations instead use 6 seconds

SYN Loss and Web Downloads
• User clicks on a hypertext link
– Browser creates a socket and does a “connect”
– The “connect” triggers the OS to transmit a SYN
• If the SYN is lost…
– 3-6 seconds of delay: can be very long
– User may become impatient
– … and click the hyperlink again, or click “reload”
• User triggers an “abort” of the “connect”
– Browser creates a new socket and another “connect”
– Essentially, forces a faster send of a new SYN packet!
– Sometimes very effective, and the page comes quickly

5 Minute Break
Questions Before We Proceed?

Tearing Down the Connection

Normal Termination, One Side At A Time
• Finish (FIN) to close and receive remaining bytes
– FIN occupies one octet in the sequence space
• Other host ack’s the octet to confirm
• Closes A’s side of the connection, but not B’s
– Until B likewise sends a FIN
– Which A then acks
[Figure: timeline between A and B: SYN, SYN ACK, ACK, Data, then A’s FIN and B’s ACK, then B’s FIN and A’s ACK. Labels: “Connection now half-closed”, “Connection now closed”, and “Timeout: avoid reincarnation; can retransmit FIN ACK if lost”.]
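
Because the FIN occupies one octet of sequence space, the acknowledgment for it is the FIN's sequence number plus one. The Python sketch below lists the four packets of a one-side-at-a-time close under that rule. It is bookkeeping only, with no timers or retransmission, and `four_way_close` is a hypothetical name.

```python
SEQ_SPACE = 2 ** 32  # sequence arithmetic is modulo 2^32


def four_way_close(a_next_seq: int, b_next_seq: int) -> list:
    """Return (sender, flag, number) for each packet of a 4-way close.

    a_next_seq / b_next_seq -- sequence number just past the last data byte each side sent.
    For a FIN the number is its sequence number; for an ACK it is the acknowledgment number.
    """
    return [
        ("A", "FIN", a_next_seq),                    # FIN sits just beyond A's last byte
        ("B", "ACK", (a_next_seq + 1) % SEQ_SPACE),  # acks the FIN's octet: half-closed
        ("B", "FIN", b_next_seq),                    # B finishes sending later
        ("A", "ACK", (b_next_seq + 1) % SEQ_SPACE),  # A acks B's FIN, then waits out a timeout
    ]


for packet in four_way_close(5000, 9000):
    print(packet)
```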

Normal Termination, Both Together
• Same as before, but B sets FIN with their ack of A’s FIN
[Figure: timeline between A and B: SYN, SYN ACK, ACK, Data, then A’s FIN, B’s FIN + ACK, and A’s ACK. Labels: “Connection now closed” and “Timeout: avoid reincarnation; can retransmit FIN ACK if lost”.]

Sending/Receiving the FIN Packet
• Sending a FIN: close()
– Process has finished sending data via the socket
– Process calls “close()” to close the socket
– Once TCP has sent all of the outstanding bytes…
– … then TCP sends a FIN
   Even if bytes not yet ack’d
   Because FIN has seqno beyond all the bytes …
   … and thus won’t be ack’d until all bytes are delivered
• Receiving a FIN: EOF
– Process is reading data from the socket
– Eventually, the attempt to read returns an EOF
– All bytes prior to sender calling close() have been delivered
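
The close()/EOF pairing is easy to observe from an application. In the Python sketch below, one side sends some bytes and calls close(); the other side keeps reading until `recv` returns an empty byte string, which is how the sockets API reports EOF. It runs over loopback with the standard `socket` module; `read_until_eof` is a hypothetical name.

```python
import socket
import threading


def read_until_eof(payload: bytes) -> list:
    """Return every value recv() produced on the reader, ending with the EOF marker b''."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def writer() -> None:
        conn, _addr = server.accept()
        conn.sendall(payload)       # hand all bytes to TCP
        conn.close()                # TCP sends the FIN after the outstanding bytes

    worker = threading.Thread(target=writer)
    worker.start()
    reads = []
    with socket.create_connection(("127.0.0.1", port)) as client:
        while True:
            chunk = client.recv(4096)
            reads.append(chunk)
            if chunk == b"":        # EOF: the peer's FIN has arrived
                break
    worker.join()
    server.close()
    return reads


reads = read_until_eof(b"last bytes before close")
print(reads)
```

The EOF only shows up after every byte sent before close() has been read, matching the last bullet above.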

Abrupt Termination
• A sends a RESET (RST) to B
– E.g., because app. process on A crashed
• That’s it
– B does not ack the RST
– Thus, RST is not delivered reliably
– And: any data in flight is lost
– But: if B sends anything more, will elicit another RST
[Figure: timeline between A and B: SYN, SYN ACK, ACK, Data, then A’s RST; later Data from B elicits another RST from A]

Reliability: TCP Retransmission

Reasons for Retransmission
[Figure: three sender/receiver timelines, each showing Packet, ACK, and Timeout. Panels: “Packet lost”; “ACK lost” (results in a DUPLICATE PACKET); “Early timeout” (results in DUPLICATE PACKETS).]

How Long Should Sender Wait?
• Sender sets a timeout to wait for an ACK
– Too short: wasted retransmissions
– Too long: excessive delays when packet lost
• TCP sets retransmission timeout (RTO) as function of RTT
– Expect ACK to arrive an RTT after data sent
– … plus slop to allow for variations (e.g., queuing, MAC)
• But: how does the sender know the RTT?
• And: what’s a good estimate for “slop”?

RTT Estimation
• Use exponential averaging:
SampleRTT = AckRcvdTime − SendPacketTime
EstimatedRTT = α × EstimatedRTT + (1 − α) × SampleRTT
α = 7/8 (for one measurement per flight)
[Figure: SampleRTT and EstimatedRTT plotted against Time]
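
The exponential average is a one-line update. The Python sketch below applies it to a short series of samples using the 7/8 weight from the slide. Seeding the estimate with the first sample is an assumption made here, since the slide does not say how the estimate is initialized; the function names are hypothetical.

```python
ALPHA = 7 / 8  # weight on the old estimate (value from the slide)


def update_estimated_rtt(estimated: float, sample: float, alpha: float = ALPHA) -> float:
    """One step of the exponential average: mostly old estimate, a little new sample."""
    return alpha * estimated + (1 - alpha) * sample


def smooth(samples: list) -> list:
    """Run the average over a series of SampleRTT values (milliseconds here)."""
    estimated = samples[0]  # assumption: seed the estimate with the first sample
    history = [estimated]
    for sample in samples[1:]:
        estimated = update_estimated_rtt(estimated, sample)
        history.append(estimated)
    return history


# Usage: a single 180 ms spike barely moves an estimate that was sitting at 100 ms.
print(smooth([100.0, 180.0, 100.0, 100.0]))
```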

Jacobson/Karels Algorithm
• Compute “slop” in terms of observed variability
– One solution: use standard deviation (requires expensive square root computation)
– Use mean deviation instead
Difference = SampleRTT − EstimatedRTT
Deviation = Deviation + β × (|Difference| − Deviation)
RTO = µ × EstimatedRTT + φ × Deviation
β = 1/4 (again, for one measurement per flight)
µ = 1
φ = 4
– Implementations often use a coarse-grained (500 msec) timer, so resulting value is large
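
Putting the mean-deviation update together with the RTT average gives a small RTO calculator. The Python sketch below uses the slide's constants (7/8 for the RTT average, 1/4 for the deviation gain, 1 and 4 as the RTO multipliers); the parameter names `alpha`, `beta`, `mu`, and `phi` are labels chosen for this example. Starting the deviation at zero and seeding the estimate with the first sample are assumptions, and the coarse 500 msec timer granularity is not modeled.

```python
class RtoEstimator:
    """Mean-deviation RTO estimator in the style of Jacobson/Karels (illustrative sketch)."""

    def __init__(self, first_sample: float, alpha: float = 7 / 8, beta: float = 1 / 4,
                 mu: float = 1.0, phi: float = 4.0) -> None:
        self.estimated_rtt = first_sample  # assumption: seed with the first sample
        self.deviation = 0.0               # assumption: no variability observed yet
        self.alpha, self.beta, self.mu, self.phi = alpha, beta, mu, phi

    def update(self, sample_rtt: float) -> float:
        """Fold in one SampleRTT and return the new RTO."""
        difference = sample_rtt - self.estimated_rtt
        self.estimated_rtt = self.alpha * self.estimated_rtt + (1 - self.alpha) * sample_rtt
        self.deviation += self.beta * (abs(difference) - self.deviation)
        return self.rto()

    def rto(self) -> float:
        return self.mu * self.estimated_rtt + self.phi * self.deviation


# Usage: steady 100 ms samples, then one 180 ms outlier (times in milliseconds).
estimator = RtoEstimator(100.0)
print(estimator.update(100.0))
print(estimator.update(180.0))
```

The outlier raises the RTO well beyond the new average because the deviation term is multiplied by four.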

Problem: Ambiguous Measurement
• How to differentiate between the real ACK, and ACK of the retransmitted packet?
[Figure: two Sender/Receiver timelines, each showing an Original Transmission, a Retransmission, and an ACK. In both it is unclear whether SampleRTT should be measured from the original transmission or from the retransmission.]

Karn/Partridge Algorithm
• Measure SampleRTT only for original transmissions
– Once a segment has been retransmitted, do not use it for any further measurements
• Also, employ exponential backoff
– Every time RTO timer expires, set RTO ← 2·RTO
– (Up to maximum ≥ 60 sec)
– Every time new measurement comes in (= successful original transmission), collapse RTO back to computed value
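
The two rules above, skipping samples from retransmitted segments and doubling the RTO on each expiry, fit in a small class. The Python sketch below is an illustration under stated assumptions: the 60-second cap is just one value satisfying the slide's "≥ 60 sec", the `recompute_rto` callback stands in for whatever RTT estimator is in use, and all names are hypothetical.

```python
class KarnPartridgeTimer:
    """Tracks the armed RTO with exponential backoff and sample filtering (sketch)."""

    def __init__(self, computed_rto: float, max_rto: float = 60.0) -> None:
        self.computed_rto = computed_rto  # value derived from the RTT estimator
        self.rto = computed_rto           # value actually armed on the timer
        self.max_rto = max_rto            # assumption: cap chosen as 60 seconds

    def on_timeout(self) -> float:
        """RTO timer expired: double the timeout, up to the cap."""
        self.rto = min(2 * self.rto, self.max_rto)
        return self.rto

    def on_ack(self, sample_rtt: float, was_retransmitted: bool, recompute_rto) -> bool:
        """Handle an ACK; return True if its RTT sample was used."""
        if was_retransmitted:
            return False  # ambiguous sample: ignore it and keep the backed-off RTO
        self.computed_rto = recompute_rto(sample_rtt)
        self.rto = self.computed_rto  # clean measurement: collapse back to computed value
        return True


# Usage with a stand-in estimator that simply doubles the sample (seconds).
timer = KarnPartridgeTimer(computed_rto=1.0)
print(timer.on_timeout(), timer.on_timeout())
print(timer.on_ack(0.3, was_retransmitted=True, recompute_rto=lambda s: 2 * s), timer.rto)
print(timer.on_ack(0.3, was_retransmitted=False, recompute_rto=lambda s: 2 * s), timer.rto)
```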

State Diagrams
• For complicated protocols, operation depends critically on current mode of operation
• Important tool for capturing this: state diagram
• At any given time, protocol endpoint is in a particular state
– Dictates its current behavior
• Endpoint transitions to other states on events
– Interaction with lower layer
   Reception of certain types of packets
– Interaction with upper layer
   New data arrives to send, or received data is consumed
– Timers
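
A state diagram translates naturally into a lookup table keyed by (state, event). The Python sketch below encodes a subset of the TCP connection states, using the state names from RFC 793; the event names are hypothetical labels for this example, and several transitions (simultaneous open and close, resets) are left out.

```python
# (current state, event) -> next state; a subset of the RFC 793 diagram.
TRANSITIONS = {
    ("CLOSED", "active_open"): "SYN_SENT",      # upper layer: connect()
    ("CLOSED", "passive_open"): "LISTEN",       # upper layer: listen()
    ("LISTEN", "recv_syn"): "SYN_RCVD",         # lower layer: SYN arrives
    ("SYN_SENT", "recv_syn_ack"): "ESTABLISHED",
    ("SYN_RCVD", "recv_ack"): "ESTABLISHED",
    ("ESTABLISHED", "close"): "FIN_WAIT_1",     # upper layer: close(), FIN sent
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",
    ("CLOSE_WAIT", "close"): "LAST_ACK",
    ("LAST_ACK", "recv_ack"): "CLOSED",
    ("TIME_WAIT", "timeout"): "CLOSED",         # timer event
}


def run(events: list, state: str = "CLOSED") -> str:
    """Feed events through the table and return the final state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not valid in state {state}")
        state = TRANSITIONS[key]
    return state


# Usage: a client that opens, closes first, and waits out the final timeout.
print(run(["active_open", "recv_syn_ack", "close", "recv_ack", "recv_fin", "timeout"]))
```

Keeping the table separate from the loop means the current state alone dictates which events are acceptable, which is the point the slide makes.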

TCP State Diagram
[Figure: TCP state diagram; the image is not captured in this transcript]

Summary
• Reliable, in-order, byte-stream delivery
– Sequence numbers
– Acknowledgments
– 3-way handshake to establish
– 3-way or 4-way handshake to terminate
– Timer-based retransmission
– State diagram to keep it all straight
• What’s missing?
– Performance
• Next lecture
– Congestion control