
Effect of Pipelining and Multiplexing in Estimating HTTP/2.0 Web Object Sizes

Ricardo [email protected]

INESC TEC, Faculty of Engineering, University of Porto, Porto, Portugal

Abstract: HTTP response size analysis is a well-known side-channel attack. With the deployment of HTTP/2.0, response size estimation attacks are generally dismissed with the argument that pipelining and response multiplexing prevent eavesdroppers from finding out response sizes. Yet the impact that pipelining and response multiplexing actually have in estimating HTTP response sizes has not been adequately investigated. In this paper we set out to help understand the effect of pipelining and response multiplexing in estimating the size of web objects on the Internet. We conduct an experiment that collects HTTP response sizes and TLS record sizes from 10k popular web sites. We gather evidence on and discuss reasons for the limited amount of pipelining and response multiplexing used on the Internet today: only 29% of the HTTP/2.0 web objects we observe are pipelined and only 5% multiplexed. We also provide worst case results under different attack assumptions and show how effective a simple model for estimating response sizes from TLS record sizes can be. Our conclusion is that pipelining and especially response multiplexing can yield, as expected, a perceivable increase in relative object size estimation error, yet the limited extent of multiplexing observed on the Internet today and the relative simplicity of attacks on the current pipelining mechanisms hinder their ability to help prevent web object size estimation.

I. INTRODUCTION

HTTP with TLS encryption prevents attacks that inspect HTTP payload and signaling. HTTP response size analysis is a well-known side-channel attack [1] that sidesteps payload encryption by using eavesdropped sizes of web objects to identify web applications. Up to HTTP/1.1, the web client typically waits for the response to the current HTTP request before issuing the next request. This makes it straightforward to find the size of HTTP responses by tapping into the TCP/IP connection and filtering data by client-to-server and server-to-client directions. With the deployment of HTTP/2.0 [2] and its pipelining and multiplexing mechanisms, most authors assume HTTP response size analysis attacks can be prevented. With request pipelining, clients no longer need to wait for the response to the current HTTP request to issue the next request. With response multiplexing, servers no longer need to wait for the end of the current response to start sending the next response. Distinguishing web object sizes by eavesdropping TLS record sizes should thus be infeasible, or at least harder than without HTTP/2.0.

Yet the extent of the effect of pipelining and multiplexing on estimating HTTP response sizes on the Web has not been adequately investigated. The fact that these mechanisms exist and are deployed does not mean that they are used and that they have an effect on response size estimation. Web content from a web page is often not pulled from the server at once and, if it is, HTTP signaling information may leak through TLS to help the attacker. This means that the privacy of regular web site users, who have no particular reason for using anonymity tools like Tor [3], may be more at risk than is believed, and that the transition to HTTP/2.0 alone may not fully prevent this risk. This is especially relevant for the growing amount of traffic that goes through proxies and content delivery networks that share IP addresses between applications and for which a simple IP database lookup would not suffice for identifying web sites and applications.

In this paper we collect the network traffic that our browser generates when opening a web page from a list of popular web sites, which may or may not be using the HTTP/2.0 protocol. We analyze a set of empirical statistical distributions obtained from this captured traffic and report on the details of how the different concepts of HTTP/2.0 are used and the impact this has on the ability of an attacker to estimate web object sizes without having to break the TLS encryption. We provide definitions for HTTP/2.0 concepts and for pipelining and multiplexing in section II, which can be useful for better understanding the rest of the paper. In section III we provide examples of pipelining and multiplexing for a small set of web sites and describe the evolution of pipelining and multiplexing for that set of web sites throughout a month. Then we describe details of the traffic capture methodology in section IV and characterize HTTP/2.0 traffic in section V, starting with TCP streams and HTTP/2.0 web objects and drilling down to HTTP/2.0 frames and how they are mapped into TLS records. Evidence for the limited extent of pipelining and multiplexing is presented in section VI, followed by a discussion on two potential reasons. In particular we explore the relation between pipelining and the number of web objects per TCP stream and between multiplexing and the number of HTTP/2.0 frame segments per web object. We then consider the perspective of an attacker and define three increasingly stronger attack assumptions in section VII. We estimate web object sizes under each assumption and quantify the worst case error of an attack that does no better than meeting the assumption. Finally, we present an example of an actual attack in section VIII where we explore TLS record size patterns for a set of server IP addresses and their relation with HTTP/2.0 signaling. We discuss related work in section IX and present our conclusions and future work in section X.

arXiv:1707.00641v1 [cs.CR] 3 Jul 2017

II. DEFINITIONS

Figure 1 illustrates the request-response sequence for an HTTP/2.0 toy example with three web objects. We can observe that the three web objects are pipelined since the request header for the second object is sent by the client before the response data for the first object is fully received. The same situation happens with the second and third object. We can also observe that the second and third objects are multiplexed because response data for the third web object is sent before all response data for the second object is sent by the server. This figure also illustrates the mapping of web objects to frames and TLS records.

Figure 1. HTTP/2.0 toy example for three web objects. Left: request and response sequence diagram for the three objects. Right: TLS records and HTTP/2.0 frame segments, HTTP/2.0 streams and frames for the three web objects. C2S: client to server records, S2C: server to client records, H: header frames, D: data frames. [Diagram not reproduced. Its labels are: web objects 1 to 3 (index.html, image.gif, javascript.js), Streams 1 to 3, frame segments, TLS records, pipelining segment, multiplexing segment, muxed records, pipelined objects, and muxed objects.]

The rest of this section provides definitions related to HTTP/2.0, pipelining, and multiplexing. We also include the definition of a set of indicators related to pipelining and multiplexing that we believe are useful for observing privacy on the Internet.

A. HTTP/2.0

1) HTTP/2.0 web object: typically HTML, script, image, or other web resource response data sent in an HTTP/2.0 stream by the server.

2) HTTP/2.0 stream: carries the request, response, header, and data bytes for a web object and also supports HTTP/2.0 signaling messages.

3) HTTP/2.0 frame: encapsulates request, response, header, and data as well as signaling for an HTTP/2.0 stream. Two or more HTTP/2.0 frames may be needed to send request or response bytes.

4) HTTP/2.0 frame segment: supports splitting frames and packaging them into encrypted TLS application records. One frame segment contains an HTTP/2.0 frame or part of it. One or more HTTP/2.0 frame segments are encapsulated in TLS application records, which are then multiplexed on the TCP stream by the server and the client.

B. Pipelining and Multiplexing

1) Pending requests and Active HTTP/2.0 streams: Pending requests are HTTP/2.0 stream requests which the server has received but has not started or finished responding to. Active HTTP/2.0 streams are streams for which the server has started but not finished sending header and data bytes. A server can have pending requests and no active HTTP/2.0 streams if it has pending requests to which it has not started responding.

2) Pipelining Segment: a set of consecutive bytes sent by the server on a particular TCP stream containing header and data bytes for one or more HTTP/2.0 streams and that a) begins when a server with no prior pending requests starts responding to a new request and b) ends when the server sends the last bytes of all active HTTP/2.0 streams and has no pending requests.

3) Multiplexing Segment: a set of data bytes of a Pipelining Segment that are sent when the server has two or more active HTTP/2.0 streams. Multiplexing segments start and finish when the number of active HTTP/2.0 streams changes, i.e. after sending the last bytes of an active HTTP/2.0 stream or upon sending the first bytes of a newly active HTTP/2.0 stream. The number of objects in a multiplexing segment is defined as the number of active HTTP/2.0 streams in that segment.

4) Multiplexing Record: a TLS record that contains two or more HTTP/2.0 frame segments from different HTTP/2.0 streams. This typically occurs with small HTTP/2.0 signaling and header frames.
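The Multiplexing Segment definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `(stream_id, nbytes)` tuple layout and the function name `multiplexed_bytes` are hypothetical, the input is the list of server-to-client frame segments of one pipelining segment in send order, and a stream is treated as active from its first to its last frame segment.

```python
def multiplexed_bytes(segments):
    """Count bytes sent while two or more HTTP/2.0 streams are active.

    segments: list of (stream_id, nbytes) tuples in send order
    (hypothetical layout). A stream is assumed to be active from its
    first to its last frame segment in the list.
    """
    first, last = {}, {}
    for i, (sid, _) in enumerate(segments):
        first.setdefault(sid, i)  # index of the first segment of each stream
        last[sid] = i             # index of the last segment of each stream

    total = 0
    for i, (_, nbytes) in enumerate(segments):
        # Number of streams whose first..last span contains position i.
        active = sum(1 for sid in first if i in range(first[sid], last[sid] + 1))
        if active >= 2:
            total += nbytes
    return total


# Toy input in the spirit of Figure 1: stream 1 is sent alone,
# streams 2 and 3 are interleaved. Sizes are made up.
toy = [(1, 500), (2, 100), (3, 200), (2, 100), (3, 200), (2, 50)]
print(multiplexed_bytes(toy))
```

In the toy input, the first segment of stream 2 and its last segment are sent while no other stream is active, so only the three segments in between count towards the multiplexing segment.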

C. Network Privacy Observatory (NPO) Indicators

1) HTTP/2.0 / Encrypted HTTP: Proportion of the sum of TLS record content lengths for TLS records carrying HTTP/2.0 frames to the sum of all TLS record content lengths.

2) Pipelining / HTTP/2.0: Proportion of the number of bytes in pipelining segments to the sum of TLS record content lengths for TLS records carrying HTTP/2.0 frames.

3) Multiplexing / Pipelining: Proportion of the number of bytes in multiplexing segments to the number of bytes in pipelining segments.
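The three indicators are plain ratios of byte counts, as the following Python sketch shows. The function name, the argument names, and the dictionary keys are illustrative assumptions, and the byte counts in the usage line are made up.

```python
def npo_indicators(tls_bytes, http2_bytes, pipelining_bytes, multiplexing_bytes):
    """Return the three NPO indicators as proportions in [0, 1].

    tls_bytes: sum of all TLS record content lengths
    http2_bytes: sum of content lengths of TLS records carrying HTTP/2.0 frames
    pipelining_bytes: number of bytes in pipelining segments
    multiplexing_bytes: number of bytes in multiplexing segments
    """
    def ratio(num, den):
        # An empty denominator is reported as 0.0 rather than raising.
        return num / den if den else 0.0

    return {
        "http2_over_encrypted_http": ratio(http2_bytes, tls_bytes),
        "pipelining_over_http2": ratio(pipelining_bytes, http2_bytes),
        "multiplexing_over_pipelining": ratio(multiplexing_bytes, pipelining_bytes),
    }


# Made-up byte counts for one capture.
print(npo_indicators(1000, 500, 400, 80))
```

Each indicator is normalized by the previous quantity, so a low Multiplexing / Pipelining value can coexist with a high Pipelining / HTTP/2.0 value.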

III. OBSERVING PIPELINING AND MULTIPLEXING DAILY

We have been observing multiplexing and pipelining daily on a small set of web site pages. The evolution of the NPO indicators for this set of web site pages is shown in Figure 2. We can see that the NPO indicators are relatively stable and do not show a tendency to increase or decrease during the month, with HTTP/2.0 at one half, Pipelining at four fifths, and Multiplexing at one fifth. Small variations may be caused by changes in the actual number of captures performed per web site, deviating from the target of 3 daily captures for each of the 10 web sites. The May 7 capture in particular has a significant drop in unique web sites (8/10) and captures (12/30), yet this does not visibly affect the NPO indicators more than on other days.

Figure 2. Evolution of NPO Indicators in May 2017. Top: number of unique web sites captured in the day in proportion to the 10 target web sites (thick line), number of captures per day in proportion to 3x10 attempted captures (thin line), proportion of captures per web site (boxes). Center: sum for all web sites of per web site average byte count. Bottom: daily NPO indicators. On some days of the month we were not able to successfully capture any data. [Plot not reproduced. The x axis covers capture dates from 20170502 to 20170529; the legend lists the web sites amazon.com, baidu.com, bing.com, facebook.com, google.com, taobao.com, twitter.com, wikipedia.org, yahoo.com, and youtube.com; the center panel shows total byte count for HTTPS, HTTP2, Pipelining, and Multiplexing.]

These results, and in particular the pipelining results, seem promising. However, when we split the data by web site, we see that the results are not equally distributed per web site. The average NPO indicators for each web site and their boxplots are shown in Figure 3. We can observe that 4 web sites have small values for the three indicators, meaning they are not using HTTP/2.0 greatly; and in the cases where they are, estimating web object size is not harder than in HTTP/1.1. One web site does not use HTTP/2.0 in full, its HTTP/2.0 boxplot showing a close to zero median and an almost 0-to-1 interquartile range. The remaining 5 web sites more fully utilize HTTP/2.0, with smaller interquartile ranges and a median larger than 80%. Looking at the Pipelining indicator for these 5 web sites, we see that only three have a median larger than 75% and that for the other two pipelining is only at around half. When we look at the Multiplexing indicator the landscape is dire: no web site has a median larger than 20%.

Figure 3. NPO Indicators per web site: average (top) and boxplot diagrams (bottom three). [Plot not reproduced. Panels: HTTP2, Pipelining, and Multiplexing, each on a 0.0 to 1.0 axis, for amazon.com, baidu.com, bing.com, facebook.com, google.com, taobao.com, twitter.com, wikipedia.org, yahoo.com, and youtube.com.]

To illustrate the variable nature of pipelining and multiplexing, in Figure 4 we show three diagrams of web objects, their sizes, and the relative time at which they are sent. Each diagram represents the web objects in a TCP stream that we found in each of three captures of the same web site. The first observation is that although the sizes and timings of the web objects are not exactly the same, they are very similar and this part of the web site is likely to be relatively static. The second observation is that although most objects keep the same status (pipelined, multiplexed, or neither) in all three captures, some do not. If we intuitively map web objects between the TCP streams of the three captures and highlight objects A, B, and E, and groups C and D as shown in Figure 4, this is more evident. Object E is pipelined in captures 1 and 3 and not pipelined in capture 2, perhaps given the proximity in time to the objects in group D in captures 1 and 3. Arguments for why some objects are pipelined in some captures and multiplexed in others end there. Object A is multiplexed in capture 3 but not capture 1, when neighboring objects (B, C, and the two pipelined objects before A) do not change much between these captures. In capture 3 objects A and C seem to be multiplexed together, while in capture 2 this seems to happen with A and B.

IV. EXPERIMENT METHODOLOGY

The basis for the remainder of this work is a set of HTTPS requests issued in January 2017 to the first 10k of Alexa’s top 1M web site pages [4]. We prepend the "https://" prefix to each of the 10k Alexa web site page names and load this URL in the browser. For each web site page we collect a tcpdump capture of all incoming and outgoing traffic through the client’s Ethernet interface. We also save a file with the pre-master secrets from the browser on each capture, which we then use to decrypt TLS records in the tcpdump pcap capture files. This is critical for the validation of our work as we can know exactly which part of the HTTP/2.0 streams each TLS record carries. Because of the complexity of web site pages, the HTTPS request for each of the 10k web sites will likely cause the browser to issue other HTTPS and non-HTTPS requests to the same and other server IP addresses. We use the pre-master secrets to parse the pcap files and populate a database with TLS records, HTTP/2.0 frames, and HTTP/2.0 streams. With this information we are able to characterize the extent of HTTP/2.0 pipelining and multiplexing and assess the success of an attack.

We use tcpdump 4.7.4 to capture pcap traffic files, tshark 2.2.5 to parse pcap files, and Chromium 57.0 on Ubuntu 16.04 desktop Linux with Selenium Python webdriver 3.3.3 to open web pages.
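A per-site capture of this kind could be planned roughly as in the Python sketch below, which only builds the command lines and does not run them. Several details are assumptions rather than the authors' actual setup: the interface name `eth0`, the `captures` directory, the file naming, the use of the `SSLKEYLOGFILE` environment variable to make Chromium write the key log, and the `ssl.keylog_file` preference used to hand that file to a tshark 2.x release.

```python
def capture_plan(site, iface="eth0", workdir="captures"):
    """Build the URL, environment, and command lines for one capture.

    All paths and the interface name are hypothetical placeholders.
    Nothing is executed here; the caller decides how to run the commands.
    """
    url = "https://" + site
    pcap = f"{workdir}/{site}.pcap"
    keylog = f"{workdir}/{site}.keys"
    return {
        "url": url,
        # Assumption: the browser exports TLS secrets when this variable is set.
        "env": {"SSLKEYLOGFILE": keylog},
        # Capture all traffic on the interface into a pcap file.
        "tcpdump": ["tcpdump", "-i", iface, "-w", pcap],
        # Read the pcap back and point the dissector at the key log file.
        "tshark": ["tshark", "-r", pcap, "-o", f"ssl.keylog_file:{keylog}"],
    }


plan = capture_plan("example.org")
print(plan["url"])
print(" ".join(plan["tcpdump"]))
```

Keeping one pcap file and one key log per site makes it possible to decrypt each capture independently of the others.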


Figure 4. Web object size, time, pipelining, and multiplexing for three TCP streams of three different captures of the same web site and that appear to carry the same web objects. Web objects are marked with an asterisk if they are pipelined, an asterisk and circle if they are multiplexed, and a plus sign if they are not pipelined or multiplexed. Web objects A, B, and E as well as groups C and D are marked out in the three captures.

V. WEB TRAFFIC CHARACTERIZATION

Approximately 30% of the traffic bytes in our data set are not TLS encrypted and can be analyzed directly. 40% of the traffic bytes are from HTTP versions prior to HTTP/2.0, which do not fully support pipelining and multiplexing and mostly follow a simple sequential request-reply protocol that is straightforward to attack. Only 30% of the traffic bytes are used for HTTP/2.0 and thus potentially offer pipelining and multiplexing protection against response size attacks.

A. HTTP/2.0 Server IP Addresses, TCP Streams, Web Objects

We counted 75k TCP streams that send HTTP/2.0 frames for 145k HTTP/2.0 web objects from 2.8k distinct server IP addresses. The distribution of the number of TCP streams per distinct server IP address for the whole experiment is heavy-tailed. 50% of the distinct server IP addresses have only one or two TCP streams each, whereas the top 1% of server IP addresses have more than 42% of the 75k TCP streams. The largest number of TCP streams per server IP address is 2334. This may relate to the popularity of the content available in some servers throughout the set of 10k web sites for which we captured traffic.

We also found that one third of the TCP streams do not carry any data or header frames but only HTTP/2.0 signaling frames such as SETTINGS or WINDOW_UPDATE. Only 49k TCP streams from 2.5k distinct server IP addresses actually carry web objects. The largest number of web objects downloaded from a single TCP connection is 541, the top 1% of TCP streams have 19% of the web objects, and more than 60% of the TCP streams have only one web object, totaling 18% of the web objects. This means that 18% of the web objects carried in the 30% of traffic bytes that fully support pipelining and multiplexing are not multiplexed or pipelined, as they do not share a TCP stream with other web objects.

B. Web Object Size and HTTP/2.0 Frame Size

Figure 5 shows the distribution of web object sizes and how this changes with the number of frames per object. 45% of the web objects are delivered in a single HTTP/2.0 frame, 18% in two HTTP/2.0 frames, and the remaining 37% in three or more HTTP/2.0 frames. As expected, the single-frame web object sizes tend to be smaller than two-frame and three or more-frame web objects. However we notice that 1) half of the single-frame web objects are larger than 1k bytes, some reaching 10k bytes, and 2) more than 20% of web objects are sent in two frames even if their size is relatively small, below 1k bytes.

Figure 5. Distribution of the sizes of web objects.

Figure 6 shows the distribution of HTTP/2.0 frame sizes and how this changes with the number of segments per frame. 35% of the frames are delivered in a single HTTP/2.0 frame segment, 15% in two HTTP/2.0 frame segments, 22% in 12 frame segments, and 14% in 13 HTTP/2.0 frame segments. The largest observed frame size is 16384 bytes, which according to the standard corresponds to the smallest possible maximum frame size setting for HTTP/2.0 [5].

Figure 6. Distribution of the sizes of HTTP/2.0 response data frames.

C. TLS records and HTTP/2.0 Frame Segments

We collected over 8.6 million TLS records, of which 4.1 million are HTTP/2.0 application records and 3.7 million are HTTP/2.0 application records sent by the server. We observe that most TLS records are used to send either data frames or non-data frames. In fact, only 1% of server-to-client TLS records have both data and non-data frame segments. 3.29 million TLS records are used to send HTTP/2.0 data frames and 145k TLS records to send HTTP/2.0 header frames. Figure 7 shows the distribution of TLS record and HTTP frame segment sizes, for either data or header frames. Most HTTP/2.0 data frame segments are 1381 and 1389 bytes long, corresponding to TLS record sizes of 1405 bytes and 1413 bytes, respectively. This could be related to cross-layer interaction between IP MTU and TLS record size.

Of the TLS application records that encapsulate server-to-client HTTP/2.0 data frames, 94% encapsulate one HTTP/2.0 data frame segment. The difference between TLS record size and the size of the HTTP/2.0 frame segment it carries is 24 bytes for an overwhelming 96% of data frame records, mostly due to the (frame segment, TLS record) size pairs (1381, 1405) and (1389, 1413) that take up 86% of the single frame TLS records. There is also a 24 byte difference for 74% of the other single frame TLS records. More than 99% of the TLS records that encapsulate HTTP/2.0 header frames are single frame segment records and 97% of those have a 33 byte difference between TLS record size and HTTP frame segment size.
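Since the overhead is nearly constant, a frame segment size can be approximated from a TLS record size by subtraction. The Python sketch below illustrates this for the single frame segment case; the constant and function names are hypothetical, and the `is_header` flag is an assumption, since an eavesdropper would have to infer the frame type by other means.

```python
DATA_OVERHEAD = 24    # bytes, the difference observed for most data frame records
HEADER_OVERHEAD = 33  # bytes, the difference observed for most header frame records


def segment_size_from_record(record_size, is_header=False):
    """Approximate the HTTP/2.0 frame segment size carried by one TLS record.

    Assumes the record carries a single frame segment and that the
    typical overhead applies; the result is clamped at zero.
    """
    overhead = HEADER_OVERHEAD if is_header else DATA_OVERHEAD
    return max(record_size - overhead, 0)


# The two most common data record sizes reported above.
print(segment_size_from_record(1405))
print(segment_size_from_record(1413))
```

Records that carry several frame segments, or that fall outside the 24 and 33 byte patterns, would need a different correction.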


75%) pipelining or multiplexing effect are small in number: only 1% of the captures for strong pipelining, 0.3% for strong multiplexing, and 0.08% for both strong pipelining and multiplexing.

1http://isthewebhttp2yet.com


the stronger will the pipelining effect be. The precise timing of the object requests should of course also be of relevance.

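Figure 9. Distribution of the number of web objects in the same TCP stream of a web object of a given size. [Plot not reproduced. Cumulative distribution P(x) on a 0 to 1 axis against a logarithmic x axis from 1E+00 to 1E+03, with one curve per web object size range: [10,100], ]100, 1k], ]1k, 10k], and above 10k bytes.]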

B. Multiplexing and the number of HTTP/2.0 frame segments per pipelined web object

Out of 42.7k pipelined web objects, 14.3k have a single HTTP/2.0 frame segment. If by chance the single HTTP/2.0 frame segment objects are sent by the server in the middle of other, multi-HTTP/2.0 frame segment web objects, then they will be part of a multiplexing segment. This happens to only 1.7k of the single-HTTP/2.0 frame segment web objects.

Out of the remaining 28.4k web objects with two or more HTTP/2.0 frame segments, 22.7k objects are not multiplexed. This is extremely interesting as there does not appear to be any obvious reason why pipelined web objects with more than one HTTP/2.0 frame segment should not be multiplexed. Having received requests for two or more web objects in the pipelining segment and with more than two HTTP/2.0 frame segments to send, the server should be able to round robin the frame segments and multiplex the web objects. In a majority of cases (22.7k out of 28.4k) it does not. This contributes to the small extent of multiplexing.
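To make the round robin idea concrete, the Python sketch below interleaves the frame segments of several pending web objects, one segment per stream per turn. It is a hypothetical scheduler written for illustration only and does not describe how any particular server orders its frame segments; the function name and the input layout are assumptions.

```python
from itertools import zip_longest


def round_robin(streams):
    """Interleave frame segments, taking one segment per stream per turn.

    streams: dict mapping stream_id -> list of frame segment sizes
    (hypothetical layout). Returns (stream_id, size) pairs in send order.
    """
    queues = [[(sid, size) for size in sizes] for sid, sizes in streams.items()]
    order = []
    for turn in zip_longest(*queues):
        # Streams that have run out of segments contribute None and are skipped.
        order.extend(item for item in turn if item is not None)
    return order


# Two pending objects with three and two frame segments (made-up sizes).
print(round_robin({2: [1381, 1381, 500], 3: [1381, 200]}))
```

With such a schedule, every object that has two or more frame segments and shares a pipelining segment with another such object would end up multiplexed.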

The extent of multiplexing for smaller web objects is much smaller than for larger sizes. From 6% for web objects larger than 10k and 8% for web objects with size in range ]1k, 10k], multiplexing drops to 2% for web objects with size in range ]100, 1k] and to 0.3% for range ]10, 100]. This could be explained by the number of single HTTP/2.0 frame segment web objects in each web object size range: 95% of the web objects in size range ]10, 100] have a single HTTP/2.0 frame segment, 80% in range ]100, 1k], 40% in range ]1k, 10k], and only 4% for web objects larger than 10k bytes. Smaller web objects seem to be multiplexed to a smaller extent than others because they mostly have a single HTTP/2.0 frame segment.
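A breakdown of this kind can be computed by bucketing web objects into the size ranges and taking the share of multiplexed objects in each bucket, as in the Python sketch below. The function names and the `(size_bytes, is_multiplexed)` input layout are illustrative, and reading 1k as 1000 bytes is an assumption.

```python
def size_range(size):
    """Map a web object size in bytes to one of the size ranges used above."""
    if size > 10_000:
        return ">10k"
    if size > 1_000:
        return "]1k, 10k]"
    if size > 100:
        return "]100, 1k]"
    if size > 10:
        return "]10, 100]"
    return "up to 10"


def multiplexed_share(objects):
    """Return the proportion of multiplexed objects per size range.

    objects: iterable of (size_bytes, is_multiplexed) pairs (hypothetical layout).
    """
    counts = {}
    for size, muxed in objects:
        key = size_range(size)
        total, hits = counts.get(key, (0, 0))
        counts[key] = (total + 1, hits + int(muxed))
    return {key: hits / total for key, (total, hits) in counts.items()}


# Made-up objects, not taken from the data set.
sample = [(50, False), (500, False), (500, True), (20_000, True)]
print(multiplexed_share(sample))
```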

VII. ATTACK ASSUMPTIONS AND WORST CASE RESULTS

Worst case attack results can be characterized by defining the boundary beyond which the attacker cannot do worse in underestimating or overestimating web object size, according to some attack assumptions. We define a set of attack assumptions and provide results for worst case attacks under each assumption. Our rationale is that 1) best case attacks would always be able to predict web object sizes accurately, so there is no point in defining best case boundaries here, and 2) if the resulting worst case boundary turns out to be good enough for the attacker, then the only thing that the attacker needs is to meet the assumption in order to be successful. We start by showing that even without pipelining estimating the size of web objects is not free of error and establish a no-pipelining baseline to better understand the impact of pipelining and multiplexing.

A. Estimating Web Object Size Without Pipelining

For the 102k web objects that are not pipelined it is straightforward to estimate their size. For non-pipelined web objects, the client issues the request for the object and has to wait for the server to send the web object before it sends a new request. Thus we simply need to sum the sizes of server-to-client TLS records between client-to-server TLS records. However, this sum will almost never be exactly equal to the size of the web object: TLS records carrying the web object have encryption overhead and may include HTTP/2.0 signaling.
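The summation described above can be sketched in Python as follows. The `(direction, size)` record layout, the `"C2S"` and `"S2C"` labels, and the function name are assumptions made for the example; no correction for overhead or signaling is applied.

```python
def estimate_response_sizes(records):
    """Estimate non-pipelined response sizes from TLS record sizes.

    records: list of (direction, size) pairs in capture order, where
    direction is "C2S" or "S2C" (hypothetical layout). Server-to-client
    bytes are summed between consecutive client-to-server records.
    """
    estimates = []
    current = None  # None until the first client request is seen
    for direction, size in records:
        if direction == "C2S":
            if current:  # close the previous response if any bytes arrived
                estimates.append(current)
            current = 0
        elif current is not None:
            current += size
    if current:
        estimates.append(current)
    return estimates


# Two requests answered with three and one server records (made-up sizes).
trace = [("C2S", 120), ("S2C", 1405), ("S2C", 1405), ("S2C", 300),
         ("C2S", 90), ("S2C", 500)]
print(estimate_response_sizes(trace))
```

Server records seen before the first client record are ignored, and consecutive client records are treated as one request.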

We take the actual object size $s_{act}$ and the estimated size $s_{est}$, computed as the sum of the sizes of all TLS records that contain frame segments of that object, and compute the relative error $e = (s_{est} - s_{act}) / s_{act}$. Figure 10 shows the distribution of error e for different ranges of web object size $s_{act}$. Larger web objects have smaller relative error and very small (smaller than 100 bytes) web objects have extremely large error. All objects smaller than 10 bytes have an error e larger than 10.
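A short Python sketch shows why the relative error shrinks with object size. The example values are illustrative and assume an object carried in a single TLS record with 24 bytes of overhead and no signaling; the function name is hypothetical.

```python
def relative_error(s_est, s_act):
    """Relative size estimation error e = (s_est - s_act) / s_act."""
    return (s_est - s_act) / s_act


# A 1381-byte object seen as one 1405-byte TLS record.
print(round(relative_error(1405, 1381), 4))

# A 50-byte object seen as one 74-byte TLS record.
print(round(relative_error(74, 50), 4))
```

The same 24 bytes of overhead weigh far more on the 50-byte object than on the 1381-byte one, which is consistent with the trend described above.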

Figure 10. Cumulative distributions of non-pipelined web object estimate error for different ranges of web object sizes (in bytes). [Plot not reproduced. Cumulative distribution P(x) on a 0 to 1 axis against a logarithmic x axis from 1E-03 to 1E+03, with one curve per web object size range: [10,100], ]100, 1k], ]1k, 10k], and above 10k bytes.]

B. Attack Assumptions

Assumption 1: The positions of the beginning and end bytes of all pipelining segments are known, as well as the number of web objects in each pipelining segment.

Assumption 2: The positions of the beginning and end bytes of all multiplexing segments are known, as well as the number of web objects in each multiplexing segment.

Assumption 3: The TLS records that carry each HTTP/2.0 stream are known.

Assumption 3 is the strongest assumption and covers pipelining and multiplexing segments. Assumption 2 covers pipelining segments only. Assumption 1 is the weakest assumption, only covering client-server requests and not pipelining or multiplexing segments. Multiplexing records are not covered by any of the assumptions and we believe this would require breaking the encoding on the TLS records.

Our three assumptions yield high and low value estimates. The worst case estimates $s_{est}$ are computed using the high or low value estimate that yields the largest error e. In each of the assumptions and in the case where two or more web objects are in the same segment (or record in the case of assumption 3), we additionally assume that all web objects in the segment/record are extremely small except for one, which is dominant and has approximately the size of the segment/record. Since the exact dominant web object is not assumed to be known, the high value estimate for all web objects in the segment/record is the size of the segment/record and the low value estimate is zero. In the case of a single web object segment/record, both the high and low value estimates are equal to the size of the segment/record.
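The choice between the high and low value estimates can be sketched in Python as below. The function name and its arguments are hypothetical, and "largest error" is interpreted here as largest absolute error, which is an assumption.

```python
def worst_case_estimate(segment_size, n_objects, s_act):
    """Return (s_est, e) for one web object in a segment or record.

    segment_size: size of the segment/record that contains the object
    n_objects: number of web objects sharing that segment/record
    s_act: actual size of the object, used only to pick the worst case
    """
    if n_objects > 1:
        candidates = [segment_size, 0]  # high and low value estimates
    else:
        candidates = [segment_size]     # high and low estimates coincide

    def error(s_est):
        return (s_est - s_act) / s_act

    # Assumption: the worst case is the estimate with the largest |e|.
    s_est = max(candidates, key=lambda s: abs(error(s)))
    return s_est, error(s_est)


# Made-up values: a 10000-byte segment shared by two objects.
print(worst_case_estimate(10000, 2, 9000))  # dominant object
print(worst_case_estimate(10000, 2, 100))   # small object
```

For the dominant object the low estimate of zero is the worse one, while for the small object the high estimate is, so the worst case error is never better than -1 for objects that share a segment.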

For assumptions 1 and 3 we additionally require that the request header size is known so that it can yield a value for $s_{est}$ that considers only the data part of the server response.

C. Worst Case Results

We provide results per assumption broken down by web object size in Figure 11 and per web object size broken down by assumption in Figure 12.

We can see from Figure 11 that in general and for the different assumptions, larger web objects have smaller relative error. The impact of the mechanisms that can affect estimated web object size (like pipelining and multiplexing segments, multiplexing TLS records, and TLS encoding overhead) seems not to grow as much as the web object size. Figure 10 shows the error for non-pipelined web objects and also supports this intuition. One exception is the case of web objects between 1k and 10k under assumption A1. The worse error distribution in this size range is likely due to the proportion of pipelined web objects, which in this range is almost twice as high as in other ranges.

Figure 12 shows that the error is relatively small under the weaker assumption A1 and for larger web objects. Under assumption A1, 70% of web objects larger than 10k have an error smaller than 2%, while 50% of web objects between 1k and 10k have an error smaller than 10%. Figure 12 also shows that the A2 and A3 curves are very close to each other. An attacker would thus not be required to meet the stronger A3 assumption but only A2 to have worst case results similar to A3. Overall we observe that a more extensive use of pipelining and multiplexing can bring the error curves closer to the bottom right part of the CDFs, especially for 1k and larger web objects.

D. Results for Pipelining and Multiplexing Objects Only

Figure 11. Error of worst case assumptions, broken down by assumption: A1, A2, A3 from top to bottom. [Plot not reproduced. Each panel shows the cumulative distribution P(x) on a 0 to 1 axis against a logarithmic error axis, with one curve per web object size range: [10,100], ]100, 1k], ]1k, 10k], and above 10k bytes.]

The previous section provides worst case results for non-pipelined, pipelined, and multiplexed objects together under the different attack assumptions. Here we focus on pipelined and multiplexed web objects only, in order to better gauge the effect of pipelining and multiplexing. Figure 13 shows how strong the pipelining and multiplexing effects are by comparing worst case results for the weaker assumption that covers the effect (A2 for pipelining and A3 for multiplexing) with the strongest assumption that does not cover the effect (A1 for pipelining and A2 for multiplexing).

VIII. EXAMPLE ATTACK

In the previous section we defined attack assumptions and discussed worst case results without proposing a specific attack. In this section we provide an illustrative example of an attack that meets a weaker form of Assumption 2 in which the number of web objects in each segment is not known. This attack, however, is only valid for some server IP addresses.

A. Description

Our example attack has two parts. In the first part we address the feature in HTTP/1.1 and beyond that allows a TCP connection to be reused for multiple requests. We segment the TLS records of each TCP connection into sets of HTTP request-response sequences that are contained entirely in each segment. Our assumption in segmenting TCP connections is that the TLS records of ready-to-send HTTP responses are sent back-to-back. We look for gaps in the sequence of timestamps of these records and use these gaps to segment the TCP connections. In HTTP/1.1 a gap will exist between the timestamp of the last TLS record of the current response and the timestamp of the first TLS record of the next response. This gap is at least one round trip time, as the client has to wait until the current response is completely received before sending the next request. More generally, in order to include pipelining and response multiplexing, we take a gap

[Figure: P(x) plotted against a logarithmic horizontal axis (1E-04 to 1E+03); caption not available.]
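The gap-based segmentation described in Section VIII-A can be illustrated with a minimal Python sketch. It assumes that an eavesdropper has the timestamps and sizes of the server-to-client TLS records of one TCP connection, and it starts a new segment whenever two consecutive records are further apart than a threshold. The names `TlsRecord`, `segment_by_gap`, and `min_gap` are hypothetical, and the threshold value of 0.05 s is an assumption chosen for the example, not a value taken from the experiment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TlsRecord:
    """One server-to-client TLS record as seen by an eavesdropper (hypothetical type)."""
    timestamp: float  # capture time, in seconds
    size: int         # TLS record size, in bytes


def segment_by_gap(records: List[TlsRecord], min_gap: float) -> List[List[TlsRecord]]:
    """Start a new segment whenever two consecutive records are more than min_gap apart."""
    segments: List[List[TlsRecord]] = []
    for record in sorted(records, key=lambda r: r.timestamp):
        if segments and record.timestamp - segments[-1][-1].timestamp <= min_gap:
            segments[-1].append(record)   # back-to-back with the previous record
        else:
            segments.append([record])     # first record, or a gap was found
    return segments


# Illustrative capture: two bursts of records separated by roughly 100 ms.
records = [
    TlsRecord(0.000, 100), TlsRecord(0.001, 200), TlsRecord(0.002, 300),
    TlsRecord(0.100, 50), TlsRecord(0.101, 60),
]
segments = segment_by_gap(records, min_gap=0.05)  # 0.05 s is an assumed threshold
bytes_per_segment = [sum(r.size for r in segment) for segment in segments]
```

In this toy capture the two bursts end up in separate segments. The sketch only sums TLS record bytes per segment; it does not model TLS or HTTP/2.0 framing overhead, and it does not separate individual web objects inside a segment.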

of the attack target web objects have sizes between 1k and 10k, 20% between 100 and 1k, and 13% between 10 and 100 bytes. Compared with the whole data set, the proportion of pipelined web objects in each size range is smaller: 19% for size range ]10,100], 12% for ]100,1k], 15% for ]1k,10k], and 9% for web objects larger than 10k. The largest proportion of pipelining is in the range ]10,100], while for the whole data set it is in the range ]1k,10k].

C. Number of Web Objects

Our attack found 92.7k web objects, 4.8k more than the number of web objects in the target TCP streams. This difference can be broken down into three parts. 1) Our attack found 1.1k web objects in 0.8k TCP streams that do not carry web objects, resulting in an additional 1.1k web objects. 2) Our attack found the correct number of web objects in 28.8k TCP streams. The attack underestimated the number of web objects in 2.3k TCP streams, mostly (96%) by only one web object. The attack overestimated the number of web objects in a similar number of TCP streams (2.3k), however with 30.0% of the attacks on these TCP streams yielding 3 or more web objects more than the actual number. The difference between the number of overestimated (6.9k) and underestimated (2.5k) web objects is an additional 4.3k web objects. 3) Finally, the attack did not find any web objects for 0.4k TCP streams that do carry web objects; this corresponds to 0.6k fewer web objects. Summing parts 1 and 2 and subtracting part 3 results in the 4.8k more objects found in the attack.
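The accounting above can be checked with a few lines of Python. The figures are the rounded values reported in the text, in thousands of web objects; the variable names are hypothetical. Note that the rounded counts 6.9k and 2.5k differ by 4.4k while the text reports 4.3k, presumably because the difference was computed on the unrounded counts, so the sketch uses the reported 4.3k.

```python
# Rounded figures from the text, in thousands of web objects.
found_by_attack = 92.7      # web objects found by the attack
spurious = 1.1              # part 1: found in streams that carry no web objects
net_overestimate = 4.3      # part 2: overestimated minus underestimated, as reported
missed = 0.6                # part 3: objects in streams where nothing was found

# Net surplus of objects found by the attack.
surplus = round(spurious + net_overestimate - missed, 1)

# Number of web objects in the target TCP streams implied by the surplus.
implied_actual = round(found_by_attack - surplus, 1)
```

The surplus evaluates to 4.8k, matching the text, which implies about 87.9k web objects in the target TCP streams.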

We now try to understand the difference between the number of web objects in a TCP stream and the number of web objects found in the attack on that stream, which we call Attack. To do so, we explore the relation between Attack and the number of pipelined and multiplexed objects in each TCP stream. Figure 14 shows a stronger correlation between the Attack variable and the number of multiplexed objects than between Attack and the number of pipelined objects. The CDFs in Figure 14 confirm this; the difference between the Attack distributions for TCP streams with and without multiplexing is larger than for TCP streams with and without pipelining. This makes sense since our attack is related to Assumption 2, which covers pipelining but not multiplexing.
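A minimal sketch of this kind of per-stream comparison is given below. It assumes the Pearson coefficient as the correlation measure and takes Attack as the absolute difference between the actual and the found number of web objects; the text does not state either choice, so both are assumptions. The per-stream counts are made up for illustration and do not reproduce the results of the experiment.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Made-up per-stream counts, for illustration only.
obj = [3, 5, 8, 2, 10]      # actual web objects per TCP stream (Obj)
found = [3, 6, 11, 2, 14]   # web objects found by the attack
pipe = [1, 2, 2, 0, 3]      # pipelined web objects per stream (Pipe)
mux = [0, 1, 2, 0, 4]       # multiplexed web objects per stream (Mux)

# Assumed definition: Attack is the absolute difference per stream.
attack = [abs(f - o) for f, o in zip(found, obj)]

r_pipe = pearson(attack, pipe)
r_mux = pearson(attack, mux)
```

With real capture data, comparing `r_mux` against `r_pipe` would give the kind of comparison summarized in the left part of Figure 14.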

D. Web Object Size

We take all TCP streams for which our attack resulted in the correct number of web objects in the stream and analyze the error of the web object sizes found in the attack. To map the web object sizes found in the attack to the actual web objects in the TCP stream, we order both sets of response sizes by response header timestamp and assign the first web object found in the attack to the first web object in the stream, and so forth. We then take the estimated web object size and the actual web object size and apply error e as defined in section VII-A. Figure 15 shows the resulting distributions of error e, grouped by web object size and by whether the web object was pipelined, multiplexed, or neither.
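The pairing step can be sketched as follows. Each web object is represented as a hypothetical `(timestamp, size)` tuple, both lists are sorted by timestamp, and objects are paired by position. Because the definition of error e in section VII-A is not reproduced in this section, the sketch assumes a relative error, |estimated - actual| / actual; this is an assumption and may differ from the definition used in the paper.

```python
from typing import List, Tuple

SizedObject = Tuple[float, int]  # (response header timestamp in seconds, size in bytes)


def pair_in_order(estimated: List[SizedObject],
                  actual: List[SizedObject]) -> List[Tuple[int, int]]:
    """Pair the i-th estimated object with the i-th actual object, by timestamp order."""
    if len(estimated) != len(actual):
        raise ValueError("pairing requires the same number of objects in both lists")
    est_sorted = sorted(estimated, key=lambda obj: obj[0])
    act_sorted = sorted(actual, key=lambda obj: obj[0])
    return [(e[1], a[1]) for e, a in zip(est_sorted, act_sorted)]


def relative_error(estimated_size: int, actual_size: int) -> float:
    """Assumed form of error e: absolute difference relative to the actual size."""
    return abs(estimated_size - actual_size) / actual_size


# Illustrative stream with two web objects; values are made up.
estimated = [(0.20, 1100), (0.10, 480)]
actual = [(0.12, 500), (0.25, 1000)]

pairs = pair_in_order(estimated, actual)
errors = [relative_error(est, act) for est, act in pairs]
```

Positional pairing is only meaningful when the attack found the correct number of web objects in the stream, which is why the function rejects lists of different lengths.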

[Figure 14 plots omitted: a correlation plot over Obj, Attack, Pipe, and Mux, and two CDF panels titled Pipelining and Multiplexing; horizontal axis X: Attack per TCP Stream (0 to 100); vertical axis P(X<x).]

Figure 14. Left: Correlation between the number of web objects (Obj), the difference between Obj and the number of web objects found in the attack (Attack), and the number of pipelined (Pipe) and multiplexed (Mux) web objects, all per TCP stream. Middle and right: comparison of the distribution of the Attack variable per TCP stream for a) no pipelined objects, b) one or more pipelined objects, c) no multiplexed objects, d) one or more multiplexed objects.

[Figure 15 plots omitted: five CDF panels titled All objects, > 10k, ]1k,10k], ]100,1k], and < 100; horizontal axis: Error e (0.0 to 2.0); legend: No Pipe, Pipe, Mux.]

Figure 15. Distribution of the error e of the attack (cf. section VII-A). This distribution is shown for 1) all objects (left) and for objects with size in the ranges in bytes shown for each of the 4 images on the right and 2) objects that were not pipelined (thick line), objects that were pipelined (normal line), and objects that were multiplexed (thin line). Notice that no objects smaller than 100 bytes were multiplexed, thus only thick and normal lines are visible on the image on the right.

We observe that the error distributions for non-pipelined and pipelined objects are very similar. The exception is the pair of distributions for web objects smaller than 100 bytes, which causes the difference between the pipelined and non-pipelined distributions for all objects between e = 1.0 and e = 1.5. The error distributions for pipelined web objects are much closer to zero than the error distributions for multiplexed objects. For web objects larger than 1k in particular, the distributions of error e for pipelined objects are extremely close to zero: more than 94% of web objects larger than 1k have error e smaller than 6%, and more than 95% of web objects larger than 10k have error smaller than 3%. Distributions for multiplexed objects show higher error than for non-pipelined and pipelined objects. This is expected given that our attack is related to Assumption 2, which covers pipelining but not multiplexing.
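Statements such as "more than 94% of web objects have error e smaller than 6%" are readings of an empirical distribution at a threshold. The sketch below shows one way to compute such fractions per group; the function names, the group labels, and the sample errors are hypothetical and for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def ecdf_at(errors: List[float], x: float) -> float:
    """Fraction of errors strictly smaller than x."""
    if not errors:
        raise ValueError("need at least one error value")
    return sum(1 for e in errors if e < x) / len(errors)


def ecdf_by_group(samples: Iterable[Tuple[str, float]], x: float) -> Dict[str, float]:
    """Evaluate the empirical distribution at x separately for each group label."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for label, error in samples:
        groups[label].append(error)
    return {label: ecdf_at(errors, x) for label, errors in groups.items()}


# Made-up (group, error e) samples.
samples = [
    ("pipe", 0.01), ("pipe", 0.02), ("pipe", 0.50),
    ("mux", 0.30), ("mux", 0.04),
]
fractions = ecdf_by_group(samples, x=0.06)
```

Evaluating the same function over a range of thresholds would trace out curves of the kind shown in Figure 15.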

IX. RELATED WORK

We are not aware of any related work that tries to understand the effect of pipelining and response multiplexing in estimating HTTP/2.0 response sizes. The extent of pipelining and multiplexing on a set of web pages on the Internet is also not discussed in any work that we are aware of. In this section we go through related work in areas close to other aspects of our work, namely side channel attacks on encrypted channels and response size estimation.

A. Encrypted Channels

Many side channel attacks have been proposed on different encrypted channels. A decade ago, motivated by the CPU-intensive nature of application traffic classification techniques such as Snort [7] and by the legal difficulties of analyzing traffic content, researchers started using features of the traffic that did not rely on its content, namely packet size, direction, and timing [8]. Researchers quickly realized that these features could be used together with machine learning techniques [9] to classify encrypted traffic, which Snort cannot do. Experiments for classifying SSH-tunneled application traffic [10] or identifying HTTPS webmail traffic [11] have been motivated mostly by network management requirements, for example the need to provide different quality of service for applications with different traffic requirements, which existing methods were unable to do for encrypted traffic.

Another, possibly more pressing, motivation for studying side channel attacks on encrypted channels is user privacy: some information related to application data and user behavior may leak through the channel and could be used against its owner. The channel that possibly raises most privacy concerns is TOR [12], which is designed precisely to protect user anonymity. Numerous attacks on TOR have been reported [13], [14], focusing for example on ingress-egress traffic correlation, the specifics of the TOR onioning protocol including circuits and layers, and active attacks from compromised browsers or TOR nodes. The goal of attacks on TOR could be inferring the type of application being used [15] or identifying the user. Attacks on more traditional VPN tunnels with the aim of identifying web pages have been described [16] and use traffic features such as burst size and surge period. Encrypted voice and video traffic using voice- and video-specific protocols has also been subject to side channel attacks, with the intent of uncovering spoken sentences in encrypted VoIP communications [17] or identifying the video that a user last watched [18]. Finally, privacy attacks on wireless channels have also been reported, focusing e.g. on mobile application traffic classification over WLAN with WPA2 encryption [19] and on commercial smart home sensors with 802.15.4 wireless interfaces [20]. Attacks on wireless channels are especially motivating as the attacker is not required to enter private target premises and unlawfully tap the target network.

Recently, one of the major aspects of network management, security, has again started to drive side-channel attack research. As network managers face ever more encrypted traffic transiting their networks that they have neither access nor authorization to decrypt, whether from customers or from different departments in their organization, the relevance of side channel attack techniques for identifying intrusions over encrypted traffic grows. This may be especially relevant in a cloud environment where tenants deploy a wide range of modern HTTP applications, or in an IoT environment where HTTPS-capable yet resource-limited and possibly less frequently updated smart devices receive sensor and actuation requests from anywhere on the Internet.

HTTP over TLS is possibly the most widely used encrypted channel. We cover HTTP over TLS next, focusing on web object response sizes and how they can be estimated by an attacker.

B. Estimating Response Size and Identifying Web Sites

An in-depth and interesting discussion of size-exposing techniques can be found in [21]. Starting with the initial warnings that eavesdropping on encrypted communications could reveal information about their content [22] and examples of attacks to identify web sites using fingerprinting [1], the authors are quick to focus on online social networks. In addition to identifying which web page the user is accessing, the state of the user could be inferred using the set of web resources that social network web sites send over the Internet. Estimating the size of web objects can now have much more severe consequences than a decade or two ago as it more directly exposes the user.

[23] illustrates the possibly high impact consequences of estimating response sizes in health-care, tax, investment, and online search web sites. Earlier, [24] had shown that very good performance could be achieved in an attack to identify web sites from a set of 100k different sites, although web pages with dynamic content could be a problem. Although the issue of identifying dynamic web pages has been addressed since then [16], it remains an open issue given the limited set of web pages that was studied. [21] provides a more detailed study on dynamic pages and is able to retrieve the exact size of web objects, but the solution requires an active attack, which may not always be feasible.

Pipelining is referred to in [24] as a mechanism that could prevent web object size estimation. However, at that time pipelining was not used despite having been defined in the HTTP protocol. Otherwise, related work assumes it is straightforward to obtain adequate estimations of web object sizes from the stream of TLS records.

X. CONCLUSION

Pipelining and multiplexing yield a perceivable increase in the relative object size estimation error. However, 1) attacking pipelining is viable, as we have shown for some server IP addresses, and 2) the current extent of multiplexing, and to a lesser degree that of pipelining, is limited. These two issues suggest that despite the potential of pipelining and multiplexing, their use needs to increase if they are to better help prevent web object size estimation.

We see many interesting related research issues ahead. The first issue is to understand to what extent it is possible to automatically detect new types of web servers and attacks, e.g. by correlating TLS record sizes with the beginning and end of HTTP responses. The second issue is to better understand why pipelining and multiplexing are used or not in apparently similar circumstances and for different captures of the same sequence of requests. We plan to develop a controlled environment where pipelining and multiplexing can be adjusted and different web applications can be installed. This will also allow us to observe the trade-off between object size estimation and quality of service. This leads to the third research issue, which is to explore different countermeasures at the web engine, web application, browser, and service provider that can reduce the effectiveness of web object size estimation. Finally, the fourth research issue is related to measuring the effect of pipelining and multiplexing in reducing the efficiency of web page identification, for example using attack data and developing features that consider specific temporal sequences of web object sizes in each TCP stream.

REFERENCES

[1] Andrew Hintz. Fingerprinting Websites Using Traffic Analysis. In Roger Dingledine and Paul Syverson, editors, Privacy Enhancing Technologies, number 2482 in Lecture Notes in Computer Science, pages 171–178. Springer Berlin Heidelberg, April 2002. DOI: 10.1007/3-540-36467-6_13.

[2] Matteo Varvello, Kyle Schomp, David Naylor, Jeremy Blackburn, Alessandro Finamore, and Konstantina Papagiannaki. Is the Web HTTP/2 Yet? In Thomas Karagiannis and Xenofontas Dimitropoulos, editors, Passive and Active Measurement, number 9631 in Lecture Notes in Computer Science, pages 218–232. Springer International Publishing, March 2016. DOI: 10.1007/978-3-319-30505-9_17.

[3] Gaofeng He, Ming Yang, Xiaodan Gu, Junzhou Luo, and Yuanyuan Ma. A novel active website fingerprinting attack against Tor anonymous system. In Proceedings of the 2014 IEEE 18th International Conference on Computer Supported Cooperative Work in Design (CSCWD), pages 112–117, May 2014.

[4] Zakir Durumeric, James Kasten, Michael Bailey, and J. Alex Halderman. Analysis of the HTTPS Certificate Ecosystem. In Proceedings of the 2013 Conference on Internet Measurement Conference, IMC ’13, pages 291–304, New York, NY, USA, 2013. ACM.

[5] Mike Belshe, Martin Thomson, and Roberto Peon. Hypertext Transfer Protocol Version 2 (HTTP/2). 2015.

[6] R. Morla. An initial study of the effect of pipelining in hiding HTTP/2.0 response sizes. ArXiv e-prints, July 2016.

[7] Martin Roesch and others. Snort: Lightweight intrusion detection for networks. In LISA, volume 99, pages 229–238, 1999.

[8] Laurent Bernaille, Renata Teixeira, Ismael Akodkenou, Augustin Soule, and Kave Salamatian. Traffic classification on the fly. SIGCOMM Comput. Commun. Rev., 36(2):23–26, April 2006.

[9] A. Dainotti, W. de Donato, A. Pescape, and P. Salvo Rossi. Classification of Network Traffic via Packet-Level Hidden Markov Models. In IEEE Global Telecommunications Conference, 2008. IEEE GLOBECOM 2008, pages 1–5, November 2008.

[10] M. Dusi, A. Este, F. Gringoli, and L. Salgarelli. Using GMM and SVM-Based Techniques for the Classification of SSH-Encrypted Traffic. In IEEE International Conference on Communications, 2009. ICC ’09, pages 1–6, June 2009.

[11] Dominik Schatzmann, Wolfgang Mühlbauer, Thrasyvoulos Spyropoulos, and Xenofontas Dimitropoulos. Digging into HTTPS: Flow-based Classification of Webmail Traffic. In Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement, IMC ’10, pages 322–327, New York, NY, USA, 2010. ACM.

[12] Roger Dingledine, Nick Mathewson, and Paul Syverson. Tor: The second-generation onion router. Technical report, Naval Research Lab Washington DC, 2004.

[13] Iskander Sanchez-Rola, Davide Balzarotti, and Igor Santos. The onions have eyes: A comprehensive structure and privacy analysis of tor hidden services. In Proceedings of the 26th International Conference on World Wide Web, pages 1251–1260. International World Wide Web Conferences Steering Committee, 2017.

[14] Prateek Mittal, Ahmed Khurshid, Joshua Juen, Matthew Caesar, and Nikita Borisov. Stealthy traffic analysis of low-latency anonymous communication using throughput fingerprinting. In Proceedings of the 18th ACM Conference on Computer and Communications Security, CCS ’11, pages 215–226, New York, NY, USA, 2011. ACM.

[15] Gaofeng He, Ming Yang, Junzhou Luo, and Xiaodan Gu. Inferring Application Type Information from Tor Encrypted Traffic. In 2014 Second International Conference on Advanced Cloud and Big Data (CBD), pages 220–227, November 2014.

[16] Yan Shi and S. Biswas. Website fingerprinting using traffic analysis of dynamic webpages. In 2014 IEEE Global Communications Conference (GLOBECOM), pages 557–563, December 2014.

[17] C.V. Wright, L. Ballard, S.E. Coull, F. Monrose, and G.M. Masson. Spot Me if You Can: Uncovering Spoken Phrases in Encrypted VoIP Conversations. In IEEE Symposium on Security and Privacy, 2008. SP 2008, pages 35–49, May 2008.

[18] Ran Dubin, Amit Dvir, Ofir Pele, and Ofer Hadar. I Know What You Saw Last Minute - Encrypted HTTP Adaptive Video Streaming Title Classification. arXiv:1602.00490 [cs], February 2016. arXiv: 1602.00490.

[19] Q. Wang, A. Yahyavi, B. Kemme, and W. He. I know what you did on your smartphone: Inferring app usage over encrypted data traffic. In 2015 IEEE Conference on Communications and Network Security (CNS), pages 433–441, September 2015.

[20] B. Copos, K. Levitt, M. Bishop, and J. Rowe. Is Anybody Home? Inferring Activity From Smart Home Network Traffic. In 2016 IEEE Security and Privacy Workshops (SPW), pages 245–251, May 2016.

[21] Tom Van Goethem, Mathy Vanhoef, Frank Piessens, and Wouter Joosen. Request and Conquer: Exposing Cross-Origin Resource Size. In USENIX Security Symposium, pages 447–462, 2016.

[22] David Wagner and Bruce Schneier. Analysis of the SSL 3.0 Protocol. In Proceedings of the 2nd Conference on Proceedings of the Second USENIX Workshop on Electronic Commerce - Volume 2, WOEC’96, pages 4–4, Berkeley, CA, USA, 1996. USENIX Association.

[23] S. Chen, R. Wang, X. Wang, and K. Zhang. Side-Channel Leaks in Web Applications: A Reality Today, a Challenge Tomorrow. In 2010 IEEE Symposium on Security and Privacy, pages 191–206, May 2010.

[24] Qixiang Sun, Daniel R. Simon, Yi-Min Wang, Wilf Russell, Venkata N. Padmanabhan, and Lili Qiu. Statistical Identification of Encrypted Web Browsing Traffic. In Proceedings of the 2002 IEEE Symposium on Security and Privacy, SP ’02, pages 19–, Washington, DC, USA, 2002. IEEE Computer Society.


