

Moving Edge-Side Includes to the Real Edge – the Clients

Zhen Xiao

AT&T Labs -- Research

Joint work with Misha Rabinovich (AT&T Labs – Research),

Fred Douglis (IBM T.J. Watson Research Center),

Chuck Kalmanek (AT&T Labs – Research)


Motivation

• Exponential growth of Web traffic on the Internet.
• Caching is essential for reducing network congestion and page display time.
• But more and more pages contain dynamic content.
  – News headlines, stock quotes, time of day, etc.
  – Bad for caching!

Problem: how to facilitate caching for dynamic content?


A Closer Look …

• Dynamic pages are not all that dynamic
  – Most bytes are in a static page template.
  – Dynamic portions are a small fraction of the entire page.


[Figure: the page with two dynamic regions marked, Fragment 1 and Fragment 2]

• Full page: 30731 bytes
• Stock quotes: 1231 bytes (4%)
• News headlines: 927 bytes (3%)

Solution: separate cache control for each component of the page!


Edge-Side Includes (ESI)

• An XML-based mark-up language proposed to the W3C: http://www.w3.org/TR/esi-lang
• A mechanism for fragmenting a Web page into a set of components
  Example: <esi:include src="http://www.att.com/news.xml" />
  – Exception handling
  – Conditional inclusion
• Separate cache control for each component
• Download changed fragments only
• Assemble the page at the edge servers (i.e., reverse proxies)
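The assembly step can be sketched as follows. This is a minimal illustration, not a full ESI processor: it handles only the simple quoted `<esi:include src="..." />` form shown above, and `expandIncludes` and the `fragments` map are hypothetical names introduced for the example.

```javascript
// Minimal sketch of ESI-style page assembly (assumption: only the simple
// quoted <esi:include src="..." /> form is handled; no exception handling
// or conditional inclusion).
// `fragments` is a hypothetical map from fragment URL to fragment body.
function expandIncludes(template, fragments) {
  const includeTag = /<esi:include\s+src="([^"]*)"\s*\/>/g;
  return template.replace(includeTag, (tag, src) => {
    // Keep the tag when the fragment is unavailable so the gap stays visible.
    const known = Object.prototype.hasOwnProperty.call(fragments, src);
    return known ? fragments[src] : tag;
  });
}

const template =
  '<html><body><h1>News</h1>' +
  '<esi:include src="http://www.att.com/news.xml" />' +
  '</body></html>';

const page = expandIncludes(template, {
  "http://www.att.com/news.xml": "<ul><li>Headline</li></ul>",
});
console.log(page);
```

The same function could run on an edge server or in a browser; only the place where the fragments are fetched and cached differs.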


Comparison with HTML Tags

• IMG: inclusion of images only
• APPLET: inclusion of Java applets
• OBJECT: generic inclusion of HTML code, but
  – Not supported by any major browser yet
  – Only allows a simple inclusion


AT&T Page with ESI Mark-ups

[Figure: the AT&T page without and with ESI mark-up]

Page contents:
• News headlines (change a few times a day)
• Stock quotes (change every minute)
• Page template (seldom changes)

Without ESI:
• http://www.att.com/index.html (expires Feb 17, 12:30)

With ESI:
• http://www.att.com/index.html, the page template (expires May 1, 10:30), containing
  <esi:include src="/news.xml" />
  <esi:include src="/stocks.xml" />
• News fragment (expires Feb 17, 15:30)
• Stocks fragment (expires Feb 17, 12:30)
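The effect of per-component expiry can be sketched with the times from this slide. The helper names (`staleComponents`, `wholePageExpiry`) and the "MM-DD HH:MM" time encoding are assumptions made for the example; the slides give no year, so times are only compared within one year.

```javascript
// Expiry times from the slide, written as zero-padded "MM-DD HH:MM" strings
// so that plain string comparison orders them within a single year
// (an encoding chosen for this sketch, not taken from the slides).
const components = [
  { url: "/index.html", expires: "05-01 10:30" }, // page template
  { url: "/news.xml", expires: "02-17 15:30" },   // news fragment
  { url: "/stocks.xml", expires: "02-17 12:30" }, // stocks fragment
];

// Return the URLs whose cached copy has expired at time `now`.
function staleComponents(parts, now) {
  return parts.filter((c) => c.expires <= now).map((c) => c.url);
}

// Without per-component control, the whole page must expire together
// with its shortest-lived part.
function wholePageExpiry(parts) {
  return parts.map((c) => c.expires).sort()[0];
}

console.log(staleComponents(components, "02-17 13:00"));
console.log(wholePageExpiry(components));
```

At 13:00 on Feb 17 only the stocks fragment needs to be refetched, whereas a page cached as a single object would already have expired in full.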


Akamai’s Approach for ESI-encoded Contents

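Example: A client fetches the AT&T entry page. Assume that, of the cached components, only the stock quotes have changed.

[Figure: message flow between Browser, Edge server, and Origin server]

No ESI encoding:
• Browser → Edge server: GET /www.att.com
• Edge server → Origin server: GET /www.att.com
• Origin server → Edge server: full page
• Edge server → Browser: full page

With ESI encoding (template, news cached at the edge server):
• Browser → Edge server: GET /www.att.com
• Edge server → Origin server: GET /stocks.xml
• Origin server → Edge server: stocks
• Edge server: page assembly
• Edge server → Browser: full page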


Bottleneck of the Last Mile

• A large percentage of Internet users still rely on dial-up connections.
  – Network traffic & revenue analysis: 79% of consumer subscribers as of March 2002.
  – Jupiter Media Metrix, Aug 2001: 59% of the predicted online households in the US in 2006.
• The speed of the last mile dominates the page display time.

ESI does NOT help dial-up customers!


Client-Side Includes: Addressing the Last Mile

• Key idea: Assemble page components in the clients’ browsers instead of on edge servers.

• Use existing technologies inside Internet Explorer
  – Page parsing and assembly: JavaScript
  – Retrieval of page components: ActiveX

• No browser modifications or reconfigurations necessary.

• Works well with or without a Content Distribution Network (CDN)!


Comparison of Page Assembly Alternatives

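[Figure: message flow between Browser, Edge server, and Origin server]

No ESI encoding:
• Browser → Edge server: GET /www.att.com
• Edge server → Origin server: GET /www.att.com
• Origin server → Edge server: full page
• Edge server → Browser: full page

With ESI (template, news cached at the edge server):
• Browser → Edge server: GET /www.att.com
• Edge server → Origin server: GET /stocks.xml
• Origin server → Edge server: stocks
• Edge server: page assembly
• Edge server → Browser: full page

With CSI (template, news cached in the browser):
• Browser → Edge server: GET /stocks.xml
• Edge server → Origin server: GET /stocks.xml
• Origin server → Edge server: stocks
• Edge server → Browser: stocks
• Browser: page assembly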


ESI versus CSI

• Same markup language (ESI)
• ESI:
  – Reduces content providers’ costs at origin server (less load and bandwidth).
• CSI:
  – Reduces content providers’ costs at origin server (less load and bandwidth).
  – Reduces content providers’ costs for CDN (less bandwidth from edge to clients).
  – Reduces bandwidth consumption over the last mile.
  – Reduces browser download times.
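A rough per-hop byte count makes the difference concrete. The sketch below uses the AT&T page sizes quoted elsewhere in the slides and assumes the scenario where only the stock quotes have changed; it ignores HTTP headers and assumes the CSI wrapper and script are already cached. `bytesPerHop` is a hypothetical helper, and the figures are simple arithmetic, not measurements.

```javascript
// Sizes in bytes, taken from the slides for the AT&T page.
const FULL_PAGE = 30731;
const STOCKS_FRAGMENT = 1231;

// Bytes sent on each hop when only the stock quotes have changed.
// Assumptions: HTTP headers are ignored; with CSI, the wrapper, script,
// template, and news fragment are all served from the browser cache.
function bytesPerHop(scheme) {
  switch (scheme) {
    case "none": // no ESI encoding: the full page crosses both hops
      return { originToEdge: FULL_PAGE, edgeToBrowser: FULL_PAGE };
    case "esi": // assembly at the edge: the last mile still carries the full page
      return { originToEdge: STOCKS_FRAGMENT, edgeToBrowser: FULL_PAGE };
    case "csi": // assembly in the browser: only the fragment crosses both hops
      return { originToEdge: STOCKS_FRAGMENT, edgeToBrowser: STOCKS_FRAGMENT };
    default:
      throw new Error("unknown scheme: " + scheme);
  }
}

for (const scheme of ["none", "esi", "csi"]) {
  console.log(scheme, bytesPerHop(scheme));
}
```

Under these assumptions ESI shrinks only the origin-to-edge transfer, while CSI shrinks the edge-to-browser transfer as well.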


Implementation (with a CDN)

[Figure: message flow between Browser, Edge server, and Origin server]

• Wrapper: GET /www.att.com (cacheable, immutable for given page)
• CSI script: GET CSI JavaScript (cacheable, same for all pages)
• Typically satisfied from client’s cache
• Obtain fragments using ActiveX
• Obtain fragments using HTTP


Wrapper for JavaScript/ActiveX Implementations of CSI

<HTML>
<BODY>
<SCRIPT SRC="csi.js"></SCRIPT>
<SCRIPT>run("template.html")</SCRIPT>
</BODY>
</HTML>


Implementation (without a CDN)

[Figure: message flow between Browser and Origin server]

• Wrapper: GET /www.att.com (cacheable, immutable for given page)
• CSI script: GET CSI JavaScript (cacheable, same for all pages)
• Typically satisfied from client’s cache
• Obtain fragments using ActiveX


What about non-IE browsers?

• No ActiveX
• Small fraction of all clients
• Solution: CSI/ESI
  – Optimize for the common case (MSIE browsers)
  – JavaScript redirection
  – Use CSI for MSIE; ESI for the rest

• No performance benefit for non-IE users, but support them functionally


Wrapper choosing between client and server-side page assembly

<HTML>
<BODY>
<SCRIPT>
<!--
if (!window.ActiveXObject)
  window.location = "/cgi-bin/esi.pl/template.html";
// -->
</SCRIPT>
<SCRIPT SRC="csi.js"></SCRIPT>
<SCRIPT>
<!--
run("template.xml");
//-->
</SCRIPT>
If your browser does not support JavaScript, please click
<A href="/cgi-bin/esi.pl/template.html">here</A>
</BODY>
</HTML>
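The wrapper's decision can be isolated in a small function that is testable outside a browser. `redirectTarget` is a hypothetical helper written for illustration; the plain objects below merely stand in for a browser `window`.

```javascript
// Hypothetical helper mirroring the wrapper's check: returns the URL to
// redirect to for server-side (ESI) assembly, or null when the page can be
// assembled in the browser (CSI).
function redirectTarget(win, esiUrl) {
  return win.ActiveXObject ? null : esiUrl;
}

const ESI_URL = "/cgi-bin/esi.pl/template.html";
const ieLike = { ActiveXObject: function () {} }; // stand-in for an MSIE window
const other = {};                                 // stand-in for any other browser

console.log(redirectTarget(ieLike, ESI_URL)); // null: assemble with CSI
console.log(redirectTarget(other, ESI_URL));  // ESI URL: assemble on the server
```

In the wrapper, a non-null result corresponds to assigning the URL to `window.location`; browsers without JavaScript fall back to the plain link.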


Performance Evaluation

• Synthetic pages: randomly generated contents
  – Sizes: 20K, 60K, 100K
  – Template (80%) + four fragments (5% each)
• AT&T page: http://www.att.com
• Wall Street Journal page: http://online.wsj.com
  – One template, three fragments
• Network connection: 56K modem
• Server: 864 MHz Pentium III, 256 MB memory, RedHat Linux 7.0, Apache 1.3.
• Client: IBM T22 ThinkPad laptop, 1 GHz CPU, 128 MB memory, Windows 2000, Internet Explorer 6.0.
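The synthetic page layout described above (80% template plus four 5% fragments) could be generated along these lines. This is a sketch under stated assumptions: "K" is taken to mean 1024 bytes, content is random lowercase letters, and `syntheticLayout` and `randomContent` are hypothetical names; the slides do not describe the actual generator.

```javascript
// Assumption: "K" means 1024 bytes; the slides do not say.
const K = 1024;

// Split a page of `totalBytes` into a template (80%) and four fragments (5% each).
function syntheticLayout(totalBytes) {
  const fragment = Math.round(totalBytes / 20); // 5% of the page
  return {
    template: totalBytes - 4 * fragment,
    fragments: Array(4).fill(fragment),
  };
}

// Fill a component with random lowercase letters (one byte per character).
function randomContent(length) {
  let text = "";
  for (let i = 0; i < length; i++) {
    text += String.fromCharCode(97 + Math.floor(Math.random() * 26));
  }
  return text;
}

for (const size of [20 * K, 60 * K, 100 * K]) {
  console.log(size, syntheticLayout(size));
}
```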


Sizes of AT&T and WSJ Page with ESI encoding

                 AT&T Page       WSJ Page
Full Page        30731 (100%)    79608 (100%)
Page Template    28661 (93%)     56324 (71%)
Current Time     N/A             55 (0%)
News Headlines   927 (3%)        20161 (25%)
Stock Quotes     1231 (4%)       3166 (4%)

All numbers are in bytes
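The percentages in the table follow from dividing each component by the full page size and rounding to the nearest whole percent. The sketch below recomputes them from the byte counts in the table; the `sizes` structure and the `shares` helper are hypothetical names used only for this check.

```javascript
// Byte counts copied from the table above.
const sizes = {
  "AT&T": { full: 30731, parts: { template: 28661, news: 927, stocks: 1231 } },
  "WSJ": { full: 79608, parts: { template: 56324, time: 55, news: 20161, stocks: 3166 } },
};

// Share of the full page taken by each component, rounded to a whole percent.
function shares(page) {
  const result = {};
  for (const [name, bytes] of Object.entries(page.parts)) {
    result[name] = Math.round((100 * bytes) / page.full);
  }
  return result;
}

for (const [name, page] of Object.entries(sizes)) {
  console.log(name, shares(page));
}
```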


Download Time of Synthetic Pages

[Figure: download times of synthetic pages. Chart labels: ESI processing overhead; nothing cached; CSI script cached; template cached]

Conclusion: substantial improvement in display time in the common case


Download Time of AT&T Page

[Figure: download times of the AT&T page. Chart annotations: 25%, 45%]


Download Time of WSJ Page

[Figure: download times of the WSJ page. Chart annotations: 27%, 38%]


Related Work

• XInclude: http://www.w3.org/TR/xinclude
  – Not supported by any major browser
  – Requires the template and fragments be valid XML pages
• Server-side includes: ASP, JSP, PHP
  – Easy management of the Web site
  – No reduction in bandwidth consumption
• HPP: closest to our work
  – Implemented as a browser plug-in
  – Supports loop construct

• Delta-encoding


Benefits of CSI

• Improves user experience.
  – Reduces amount of content transferred over the last mile.
• No browser modifications or reconfiguration.
• Reduces total cost for content providers by:
  – Reducing bandwidth consumption of origin server.
  – Reducing amount of content served by CDN’s edge servers.

• Extends benefits of ESI to content providers who do not use a CDN.