Programming TCP for responsiveness

Copyright (C) 2016 DeNA Co.,Ltd. All Rights Reserved. Programming TCP for responsiveness DeNA Co., Ltd. Kazuho Oku 1

Upload: kazuho-oku

Post on 24-Jan-2017


TRANSCRIPT

Page 1: Programming TCP for responsiveness

Programming TCP for responsiveness

DeNA Co., Ltd.
Kazuho Oku

Page 2: Programming TCP for responsiveness

This talk explains the TCP latency optimization implemented in version 2.1 of the H2O HTTP/2 server.

Page 3: Programming TCP for responsiveness

Background

Page 4: Programming TCP for responsiveness

TCP slow start

■ Initial Congestion Window (IW) = 10
  ⁃ only 10 packets can be sent in the first RTT
  ⁃ used to be IW = 3
■ window increase: 1.5x per RTT

[Figure: TCP slow start (IW 10, MSS 1460). x-axis: RTT (1 to 8); y-axis: bytes transmitted (0 to 800,000).]
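The growth described on this slide can be reproduced with a few lines of arithmetic. The sketch below is an idealized model only (no loss, no receive-window limit, window multiplied by a fixed factor each RTT); the function name `slow_start_bytes` is made up for illustration.

```c
/* Cumulative bytes sent after `rtts` round trips when the window starts at
 * `iw` segments of `mss` octets and is multiplied by `growth` every RTT.
 * Idealized model: no packet loss and no receive-window limit. */
double slow_start_bytes(unsigned rtts, unsigned iw, unsigned mss, double growth)
{
    double window = (double)iw * mss; /* bytes sendable in the current RTT */
    double total = 0.0;
    for (unsigned i = 0; i < rtts; i++) {
        total += window;
        window *= growth;
    }
    return total;
}
```

With IW = 10, MSS = 1460 and a growth factor of 1.5, the first RTT carries 14,600 bytes and the total after eight RTTs is about 719,000 bytes, which fits the 0 to 800,000 range of the chart.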

Page 5: Programming TCP for responsiveness

Why 1.5x?

"During slow start, a TCP increments cwnd by at most SMSS bytes for each ACK received that cumulatively acknowledges new data. (snip) The delayed ACK algorithm specified in [RFC1122] SHOULD be used by a TCP receiver. When using delayed ACKs, a TCP receiver MUST NOT excessively delay acknowledgments. Specifically, an ACK SHOULD be generated for at least every second full-sized segment, and MUST be generated within 500 ms of the arrival of the first unacknowledged packet."

TCP Congestion Control (RFC 5681)

In other words, with delayed ACKs the sender receives roughly one ACK per two segments, so cwnd grows by about half of its size per RTT instead of doubling.
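A minimal sketch of how the two quoted rules combine, assuming cwnd grows by exactly one segment per ACK and the receiver sends one ACK per `segs_per_ack` full-sized segments. The function name is hypothetical, and details such as the delayed-ACK timer for a leftover odd segment are ignored.

```c
#include <stdint.h>

/* cwnd (in segments) after `rtts` round trips of slow start.
 * Assumption: one ACK arrives per `segs_per_ack` segments sent, and each
 * ACK increases cwnd by one segment. */
uint32_t cwnd_after(uint32_t iw, uint32_t rtts, uint32_t segs_per_ack)
{
    uint32_t cwnd = iw;
    for (uint32_t i = 0; i < rtts; i++)
        cwnd += cwnd / segs_per_ack; /* number of ACKs received in this RTT */
    return cwnd;
}
```

With `segs_per_ack = 1` the window doubles every RTT (10, 20, 40, ...); with `segs_per_ack = 2` it grows 10, 15, 22, ..., which is the 1.5x rate.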

Page 6: Programming TCP for responsiveness

Flow of the ideal HTTP

■ fastest within the limits of TCP/IP
■ receive a request in 0-RTT, and:
  ⁃ first send CSS/JS*
  ⁃ then send the HTML
  ⁃ then send the images*

*: but only the ones not cached by the browser

[Diagram: client and server timelines; a request and its response, with 1 RTT marked.]

Page 7: Programming TCP for responsiveness

The reality in HTTP/2

■ TCP establishment: +1 RTT*
■ TLS handshake: +2 RTT**
■ HTML fetch: +1 RTT
■ JS, CSS fetch: +2 RTT***
■ Total: 6 RTT

*: 0 RTT on reconnection
**: 1 RTT on reconnection
***: servers often cannot switch to sending JS, CSS instantly, because of the output already buffered in the TCP send buffer

[Diagram: client and server timelines with 1 RTT marked; messages in order: TCP SYN, TCP SYN ACK, four TLS Handshake flights, GET /, HTML, GET css,js, css,js.]

Page 8: Programming TCP for responsiveness

Ongoing optimizations

■ TCP Fast Open
  ⁃ initial establishment in 1 RTT
  ⁃ re-establishment in 0 RTT
■ TLS 1.3
  ⁃ initial handshake completes in 1 RTT
  ⁃ resumption in 0 RTT
■ what can be done in the HTTP/2 layer?

Page 9: Programming TCP for responsiveness

Programming TCP for responsiveness

Page 10: Programming TCP for responsiveness

Programming TCP for responsiveness

Answer: TCP Urgent Indications (i.e. MSG_OOB)


Page 12: Programming TCP for responsiveness

TCP Urgent Indications

■ out-of-band messaging for TCP
  ⁃ used by telnet!
■ can only send 1 octet
  ⁃ conflicting specs on how to handle multi-octet messages
■ cannot be used for HTTP/2
■ RFC 6093 "recommends against the use of urgent mechanism" (RFC 7414)

Page 13: Programming TCP for responsiveness

Typical sequence of HTTP/2

[Diagram: client and server timelines. The client sends GET /; the server responds with "HTTP/2 200 OK" followed by the HTML (<!DOCTYPE HTML>…<SCRIPT SRC="jquery.js">…); the client then sends GET /jquery.js. Labels: 1 RTT, and * marking part of the server's transmission.]

The server needs to switch from sending HTML to sending JS at this very moment (which means that the amount of data sent in * must be smaller than IW).

Page 14: Programming TCP for responsiveness

Buffering in TCP and TLS layer

[Diagram: HTTP/2 frames, TLS records, TCP send buffer and BIO buffer; labels: unacked, CWND, poll threshold, sent immediately, not immediately sent.]

    // ordinary code (non-blocking): keep writing until OpenSSL
    // reports SSL_ERROR_WANT_WRITE
    while (SSL_write(…) > 0)
        ;

Page 15: Programming TCP for responsiveness

Why do we have buffers?

■ TCP send buffer:
  ⁃ reduces ping-pong between the kernel and the application
■ BIO buffer:
  ⁃ holds data that couldn't be stored in the TCP send buffer

[Diagram: HTTP/2 frames, TLS records, TCP send buffer and BIO buffer; labels: unacked, CWND, poll threshold, sent immediately, not immediately sent.]

Page 16: Programming TCP for responsiveness

Improvement: poll-then-write

[Diagram: HTTP/2 frames, TLS records and TCP send buffer (no BIO buffer); labels: unacked, CWND, poll threshold, sent immediately, not immediately sent.]

    // only call SSL_write when poll notifies the app
    while (poll_for_write(fd) == SOCKET_IS_READY)
        SSL_write(…);
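The poll-then-write idea can be sketched with plain POSIX calls. This is an illustration only: it writes to a raw file descriptor with `write(2)` instead of going through `SSL_write`, and the helper name `write_when_ready` is hypothetical.

```c
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Wait until the kernel reports `fd` as writable, then issue exactly one
 * write. Returns the number of bytes written, 0 if the timeout expired
 * before the descriptor became writable, or -1 on error. */
ssize_t write_when_ready(int fd, const void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = {.fd = fd, .events = POLLOUT};
    int n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1; /* poll itself failed */
    if (n == 0)
        return 0; /* not writable yet: generate nothing, keep data in the app */
    if (!(pfd.revents & POLLOUT))
        return -1; /* POLLERR, POLLHUP or POLLNVAL */
    return write(fd, buf, len);
}
```

Because data is generated only after the descriptor is reported writable, nothing piles up in a user-space buffer, and the application can still decide what to send next at the moment of the write.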

Page 17: Programming TCP for responsiveness

Adjust poll threshold

[Diagram: HTTP/2 frames, TLS records and TCP send buffer; labels: unacked, CWND, poll threshold, sent immediately, not immediately sent.]

■ set the poll threshold to the end of CWND?
  ⁃ setsockopt(TCP_NOTSENT_LOWAT)
  ⁃ on Linux, the minimum is CWND + 1 octet
    • becomes unstable when set to CWND + 0
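A minimal sketch of setting the option, assuming a platform whose `<netinet/tcp.h>` defines `TCP_NOTSENT_LOWAT`. The wrapper name `set_notsent_lowat` is hypothetical; where the option is missing, the wrapper simply reports failure.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Ask the kernel to report the socket as writable only while the amount of
 * unsent data queued in the send buffer is below `octets`.
 * Returns 0 on success, -1 on failure or if the option is unavailable. */
int set_notsent_lowat(int fd, int octets)
{
#ifdef TCP_NOTSENT_LOWAT
    return setsockopt(fd, IPPROTO_TCP, TCP_NOTSENT_LOWAT, &octets, sizeof(octets));
#else
    (void)fd;
    (void)octets;
    return -1; /* option not defined on this platform */
#endif
}
```

Calling `set_notsent_lowat(fd, 1)` on a connected TCP socket corresponds to the "CWND + 1 octet" setting mentioned above: poll reports the socket as writable only once less than one unsent octet remains queued.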

Page 18: Programming TCP for responsiveness

Adjust poll threshold

[Diagram: HTTP/2 frames, TLS records and TCP send buffer; labels: unacked, CWND, poll threshold, sent immediately, not immediately sent.]

    // only call SSL_write when poll notifies the app
    while (poll_for_write(fd) == SOCKET_IS_READY)
        SSL_write(…);

Page 19: Programming TCP for responsiveness

Further improvement: read TCP states

[Diagram: HTTP/2 frames, TLS records and TCP send buffer; labels: unacked, CWND, poll threshold, sent immediately, not immediately sent.]

    // calculate the size of data to send by calling getsockopt(TCP_INFO)
    if (poll_for_write(fd) == SOCKET_IS_READY) {
        capacity = CWND - unacked + TWO_MSS - TLS_overhead;
        SSL_write(prepare_http2_frames(capacity));
    }
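One way to read the capacity formula as concrete arithmetic is sketched below. It assumes Linux units (CWND and unacked counted in packets, MSS in octets) and one TLS record per packet; the function name and parameter names are illustrative and not taken from H2O.

```c
#include <stddef.h>
#include <stdint.h>

/* Payload octets that can be handed to the TLS layer right now.
 * cwnd, unacked: packets; mss, tls_overhead: octets.
 * Two extra packets are allowed on top of the free part of the window. */
size_t suggested_write_size(uint32_t cwnd, uint32_t unacked, uint32_t mss,
                            uint32_t tls_overhead)
{
    uint32_t sendable = cwnd > unacked ? cwnd - unacked : 0;
    if (mss <= tls_overhead)
        return 0; /* no room for payload in a packet */
    return (size_t)(sendable + 2) * (mss - tls_overhead);
}
```

For example, with CWND = 10, nothing in flight, MSS = 1460 and 29 octets of per-record TLS overhead, the result is 12 * 1431 = 17,172 octets.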

Page 20: Programming TCP for responsiveness

Negative impact of additional delay

■ increased delay between receiving an ACK and sending data, since:
  ⁃ traditional approach: completes within the kernel
  ⁃ this approach: the application needs to be notified to generate new data
■ outcome:
  ⁃ increase of CWND becomes slower
  ⁃ leads to slower peak speed?
    • depends on how CWND at peak is calculated
  ⁃ does the kernel use the TCP timestamp for the matter?

Page 21: Programming TCP for responsiveness

Countermeasures

■ optimize for responsiveness only when necessary
  ⁃ i.e. when RTT is big and CWND is small
  ⁃ impact of optimization is proportional to unsent_bytes / CWND
■ disable optimization if additional delay is significant
  ⁃ when epoll returns immediately, estimated additional delay is equal to the time spent by the loop

Page 22: Programming TCP for responsiveness

Configuration Directives

■ http2-latopt-min-rtt
  ⁃ minimum TCP RTT to enable the optimization
  ⁃ default: UINT_MAX (disabled)
■ http2-latopt-max-cwnd
  ⁃ maximum CWND to enable (in octets)
  ⁃ default: 65535
■ http2-max-additional-delay
  ⁃ max. additional delay (as the ratio to TCP RTT)
  ⁃ latopt disabled if the delay is greater
  ⁃ default: 0.1
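The three directives can be read as three conditions that must all hold. The sketch below is not H2O's code: the struct, the function name and the choice of microseconds as the time unit are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Thresholds mirroring the three directives (time unit assumed: microseconds). */
struct latopt_conditions {
    uint32_t min_rtt_us;         /* cf. http2-latopt-min-rtt */
    uint32_t max_cwnd_octets;    /* cf. http2-latopt-max-cwnd */
    double max_additional_delay; /* cf. http2-max-additional-delay, ratio to RTT */
};

/* Returns true when the latency optimization should be applied:
 * RTT is big enough, CWND is small enough, and the delay added by the
 * event loop is small relative to the RTT. */
bool latopt_should_enable(const struct latopt_conditions *c, uint32_t rtt_us,
                          uint32_t cwnd_octets, uint32_t loop_delay_us)
{
    if (rtt_us < c->min_rtt_us)
        return false;
    if (cwnd_octets > c->max_cwnd_octets)
        return false;
    if ((double)loop_delay_us > c->max_additional_delay * (double)rtt_us)
        return false;
    return true;
}
```

With a minimum RTT of 50 ms, the default CWND limit of 65535 octets and the default ratio of 0.1, a connection with a 250 ms RTT and a 14,600-octet CWND qualifies as long as the loop adds no more than 25 ms.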

Page 23: Programming TCP for responsiveness

Pseudo-code

    size_t get_suggested_write_size() {
        socklen_t len = sizeof(tcp_info);
        getsockopt(fd, IPPROTO_TCP, TCP_INFO, &tcp_info, &len);
        // optimize only when RTT is big and CWND is small
        if (tcp_info.tcpi_rtt < min_rtt || tcp_info.tcpi_snd_cwnd > max_cwnd)
            return UNKNOWN;

        // per-record TLS overhead depends on the cipher suite
        switch (SSL_get_current_cipher(ssl)->id) {
        case TLS1_CK_RSA_WITH_AES_128_GCM_SHA256:
        case …:
            tls_overhead = 5 + 8 + 16; // record header + explicit nonce + auth tag
            break;
        default:
            return UNKNOWN;
        }

        packets_sendable = tcp_info.tcpi_snd_cwnd > tcp_info.tcpi_unacked
                               ? tcp_info.tcpi_snd_cwnd - tcp_info.tcpi_unacked
                               : 0;
        return (packets_sendable + 2) * (tcp_info.tcpi_snd_mss - tls_overhead);
    }

Page 24: Programming TCP for responsiveness

Benchmark (1)

■ conditions:
  ⁃ server in Ireland, client in Tokyo (RTT 250 ms)
  ⁃ load a tiny JS file at the top of a large HTML
■ result: delay decreased from 511 ms to 250 ms
  ⁃ i.e. JS fetch latency was 2 RTT, became 1 RTT
    • similar results in other environments

Page 25: Programming TCP for responsiveness

Benchmark (2)

■ using the same data as the previous benchmark
■ server: Sakura VPS (Ishikari DC)

[Chart: downloading HTML (and JS within), RTT ~25 ms. Bars for HTML and JS comparing master and latopt; y-axis: milliseconds (0 to 300).]

Page 26: Programming TCP for responsiveness

Conclusion

■ near-optimal result can be achieved
  ⁃ by adjusting poll threshold and reading TCP states
  ⁃ 1-packet overhead due to restriction in Linux kernel
■ 1-RTT improvement in H2O
  ⁃ estimated 1-RTT improvement per the depth of the load graph

Page 27: Programming TCP for responsiveness

Under the hood

Page 28: Programming TCP for responsiveness

TCP_NOTSENT_LOWAT

■ supported by Linux, OS X
■ on Linux:
  ⁃ sysctl:
    • set to -1: use kernel default
    • set to 0: sshd hangs
    • set to positive int: override kernel default
  ⁃ setsockopt:
    • set to 0: use default (sysctl or kernel)
    • set to int: override default

Page 29: Programming TCP for responsiveness

Unit of CWND

■ Linux: # of packets
  ⁃ if INITCWND is 10, you can send at most 10 packets at once, regardless of their size
■ BSD (incl. OS X): octets
  ⁃ you can send CWND*MSS octets, regardless of the number of packets
    • if CWND=10 and MSS=1460, it is possible to send 14,600 packets containing 1-octet payload
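The difference between the two accounting schemes can be shown as a small calculation. Both function names are hypothetical, and the model ignores everything except the congestion window.

```c
#include <stdint.h>

/* Packet-based accounting (Linux): the window limits the number of packets,
 * so the payload size does not matter. */
uint32_t max_packets_linux(uint32_t cwnd_packets, uint32_t payload_octets)
{
    (void)payload_octets;
    return cwnd_packets;
}

/* Octet-based accounting (BSD): the window limits the number of octets,
 * so smaller payloads allow more packets. */
uint32_t max_packets_bsd(uint32_t cwnd_octets, uint32_t payload_octets)
{
    return payload_octets ? cwnd_octets / payload_octets : 0;
}
```

With a window of 10 segments and MSS = 1460, 1-octet payloads give 10 packets under packet-based accounting and 14,600 packets under octet-based accounting, matching the example on the slide.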

Page 30: Programming TCP for responsiveness

Determining amount of data that can be sent immediately

OS        MSS            CWND             inflight        send buffer (inflight + unsent)
Linux     tcpi_snd_mss   tcpi_snd_cwnd*   tcpi_unacked*   ioctl(SIOCOUTQ)
OS X**    tcpi_maxseg    tcpi_snd_cwnd    -               tcpi_snd_sbbytes
FreeBSD   tcpi_snd_mss   tcpi_snd_cwnd    -               ioctl(FIONWRITE)
NetBSD    tcpi_snd_mss   tcpi_snd_cwnd*   -               ioctl(FIONWRITE)

■ calculate either of:
  ⁃ CWND - inflight
  ⁃ max(CWND - (inflight + unsent), 0)
■ units used in the calculation must be the same
  ⁃ NetBSD: fail

*: units of values marked are packets, unmarked are octets
**: sometimes the values of tcpi_* are returned as zeros
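The two calculations can be sketched as pure functions once the inputs are in consistent units. The function names are hypothetical; in practice the arguments would come from the `TCP_INFO` fields and ioctl results listed in the table.

```c
#include <stddef.h>
#include <stdint.h>

/* "CWND - inflight" with Linux units: both values are in packets, so the
 * difference is converted to octets by multiplying with the MSS. */
size_t sendable_from_inflight(uint32_t cwnd_packets, uint32_t inflight_packets,
                              uint32_t mss)
{
    if (cwnd_packets <= inflight_packets)
        return 0;
    return (size_t)(cwnd_packets - inflight_packets) * mss;
}

/* "max(CWND - (inflight + unsent), 0)" with BSD units: both values are
 * already in octets. */
size_t sendable_from_sndbuf(uint32_t cwnd_octets, uint32_t sndbuf_octets)
{
    return cwnd_octets > sndbuf_octets ? (size_t)(cwnd_octets - sndbuf_octets) : 0;
}
```

Both functions clamp at zero, since the send buffer or the number of in-flight packets can temporarily exceed the window, for example right after the window shrinks.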