web overview
![Page 1: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/1.jpg)
Web Overview
• The birth of the Web: 1989
• Now the Web is about everything
– Business (HR systems, e.g. NUHR)
– Online shopping (Amazon), banking (Citibank, Chase)
– Communications (Gmail, Facebook)
• It has become mission-critical
– Performance
– Security
![Page 2: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/2.jpg)
Web 2.0
• Web 1.0
– Basic HTML + images
• What is Web 2.0?
– No one really gives a clear definition
– Features
• AJAX (Asynchronous JavaScript and XML)
• DOM (Document Object Model)
• Flash
• CSS (Cascading Style Sheets)
• User involvement: wikis, blogs, social networks
![Page 3: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/3.jpg)
Web 2.0 Basics - JavaScript
• JavaScript
– A scripting language with C/C++-like syntax
– A dynamic, weakly typed language
• eval()
• No need to declare object types
• Web 2.0 websites are JavaScript-heavy
– Google Maps (510KB)
– Google Calendar (152KB)
– Facebook (558KB)
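A minimal sketch of the language features listed above (no declared types, implicit conversions, and eval()). The variable names are illustrative only and do not come from the slides.

```javascript
// A variable needs no declared type and can hold values of different types.
let value = 42;                 // currently a number
const firstType = typeof value;
value = "forty-two";            // now a string
const secondType = typeof value;

// Weak typing: operands are converted implicitly.
const concatenated = "5" + 1;   // the number is converted to a string
const subtracted = "5" - 1;     // the string is converted to a number

// eval() parses and runs a string as JavaScript source at run time.
const evaluated = eval("2 + 3 * 4");

console.log(firstType, secondType, concatenated, subtracted, evaluated);
```

Because eval() runs arbitrary source text, it is also a common source of security problems, which is one reason it matters for the security theme of this overview.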
![Page 4: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/4.jpg)
DOM (Document Object Model)
• One of the first JavaScript/DOM-heavy apps: Gmail
• DOM Event API: keyboard and mouse events
• DOM CSS API

    <html>
      <head>
        <title>Sample Document</title>
      </head>
      <body>
        <h1>An HTML Document</h1>
        <p>This is a <i>simple</i> document.</p>
      </body>
    </html>
![Page 5: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/5.jpg)
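AJAX (Asynchronous JavaScript and XML)
• Foundation of popular web apps: Google Maps, Gmail, Facebook, etc.
• Can transfer any object between the browser and the web server, e.g. as XML or JSON (JavaScript Object Notation)

    const url = "/example";            // hypothetical URL, not given on the slide
    const req = new XMLHttpRequest();

    // Application-defined function that consumes the response body.
    function callback(text) { /* … */ }

    // Runs on every readyState change; 4 means the request has completed.
    function handler() {
      if (req.readyState === 4 && req.status === 200) {
        callback(req.responseText);
      }
    }

    req.onreadystatechange = handler;  // register the callback
    req.open("GET", url, true);        // third argument true: asynchronous
    req.send(null);

• XMLHttpRequest lets JavaScript fetch a URL directly
• Registering a callback function makes the request asynchronous
• The response can be either plain text or XML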
![Page 6: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/6.jpg)
WebProphet: Automating Performance Prediction for Web Services
Zhichun Li, Ming Zhang, Zhaosheng Zhu, Yan Chen, Albert Greenberg and Yi-Min Wang
Northwestern University, Microsoft Research
![Page 7: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/7.jpg)
Large-scale Web Services
• Most large-scale online services today are web-based
– Web search, maps, webmail, calendar, online stores, etc.
• Provided by Online Service Providers (OSPs)
– MSN, Google, Yahoo, Amazon, etc.
• Hosted by multiple data centers around the world
• More and more complex
– Yahoo Maps: 110 embedded objects, complex object dependencies, and 670KB of JavaScript
![Page 8: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/8.jpg)
Performance Is Important
• Amazon: 100 ms of extra delay costs 1% of sales
• Google found that 500 ms of extra delay reduces revenue by up to 20%
[Figure: OSP A and OSP B, each with its revenue; one is marked "SLOW!"]
• Need a tool to understand and improve the user-perceived performance.
![Page 9: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/9.jpg)
Large Web Services Are Complex
[Figure: Browser, DNS, Internet, Frontend DCs, OSP Internal Network, Backend DCs]
Potential Performance Problems
• Complex UI → large browser delay
• Poor object dependency → more RTTs (an online map needs 40~60 HTTP objects)
• Complex DNS redirection (CNAME) → long DNS queries
• Different servers → more DNS queries
• RTT and packet loss interact with TCP
• Overload → long response time
• Overload → long response time for dynamic content
• RTT, packet loss
Need a tool to diagnose why a page is slow and where the bottleneck is.
![Page 10: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/10.jpg)
Performance Prediction Problem
• There are many ways to optimize performance, but trying them one by one is far too costly.
• What will the performance be under hypothetical optimization strategies?
• How can the predicted performance be evaluated quickly?
[Figure: Optimization → Performance ???]
![Page 11: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/11.jpg)
Outline
• Motivation
• Design
• Dependency Extraction
• Performance Prediction
• Implementation
• Evaluation
• Conclusion
![Page 12: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/12.jpg)
Client Side Performance Prediction
• Provider-based techniques
– Hard to consider multiple data sources
– Object dependencies
– Page rendering time
[Figure: Internet, CDN, and two data centers]
![Page 13: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/13.jpg)
The Page Load Time Decomposition
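[Figure: decomposition tree]
• Page load time: object dependency, load time of each object i
• Load time of object i: client delay, net delay, server delay
• Net delay: DNS delay, TCP 3-way handshake, data transfer
• Data transfer: RTT, packet loss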
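A small sketch of the decomposition, assuming the components simply add up (the slide's tree suggests this but does not state it). The field names and the millisecond values are made up for illustration.

```javascript
// Hypothetical timing record for one object; all values in milliseconds.
const example = {
  clientDelay: 20,
  dns: 30,
  tcpHandshake: 50,
  dataTransfer: 120,
  serverDelay: 80,
};

// Net delay: DNS delay + TCP 3-way handshake + data transfer.
function netDelay(t) {
  return t.dns + t.tcpHandshake + t.dataTransfer;
}

// Load time of one object: client delay + net delay + server delay.
function objectLoadTime(t) {
  return t.clientDelay + netDelay(t) + t.serverDelay;
}

console.log(netDelay(example), objectLoadTime(example));
```

The page load time is not the sum of these per-object values, since objects load in parallel subject to their dependencies.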
![Page 14: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/14.jpg)
System Architecture
[Figure: Measurement Engine, Dependency Extractor, and Performance Predictor, connected by PDGs, New Scenarios, and Results]
![Page 15: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/15.jpg)
![Page 16: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/16.jpg)
What are dependencies?
• Embedded objects in an HTML page depend on that page
• Object requests generated by JavaScript depend on the corresponding .js files
• External CSS and JavaScript files block the other embedded objects in the HTML page
• Event triggers: for example, when image B fires its "onload" event, image A is loaded by JavaScript
![Page 17: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/17.jpg)
Dependency Definitions
• Descendant(X): objects that depend on X
• Ancestor(X): objects that X depends on
• Parent(X): objects that X directly depends on. "Direct" means the object can be the last among X's ancestors
• Build the PDG (parental dependency graph) from the parent relationships
![Page 18: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/18.jpg)
Discover Ancestors and Descendants
• We discover the Descendant(X) sets by using time perturbation through an HTTP proxy.
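A rough sketch of the idea, under the assumption that the proxy holds back one object and every object whose request then starts noticeably later is counted as a descendant. The function name, the threshold, and all timings are hypothetical; the slides do not give the actual procedure.

```javascript
// baseline and perturbed map object names to request start times (ms).
// Objects whose start time shifts by at least thresholdMs when `delayed`
// is held back by the proxy are treated as descendants of `delayed`.
function descendantsOf(delayed, baseline, perturbed, thresholdMs) {
  const result = [];
  for (const name of Object.keys(baseline)) {
    if (name === delayed) continue;
    if (perturbed[name] - baseline[name] >= thresholdMs) {
      result.push(name);
    }
  }
  return result.sort();
}

// Made-up measurements: the proxy delays one object by 1000 ms per run.
const baseline = { A: 0, B: 100, C: 20, D: 250 };
const runDelayingA = { A: 0, B: 1100, C: 25, D: 1250 };
const runDelayingB = { A: 0, B: 100, C: 18, D: 1250 };

const descA = descendantsOf("A", baseline, runDelayingA, 500);
const descB = descendantsOf("B", baseline, runDelayingB, 500);
console.log(descA, descB);
```

The threshold separates real dependency shifts from ordinary timing noise, such as the few milliseconds by which C moves in these made-up runs.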
![Page 19: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/19.jpg)
Extract non-stream parents
• Stream vs. non-stream
– HTML objects are stream objects; all other types of objects are non-stream
• Non-stream parent extraction
[Figure: example graphs with objects A, B, C, D and X, Y, Z]
Descendant(A) = {B, D}
Descendant(B) = {D}
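One plausible way to turn descendant sets into parents is a transitive reduction: Y is a parent of X if X is a descendant of Y and no other descendant of Y also has X as a descendant. This rule is an assumption, since the slide does not spell out the algorithm; the sketch applies it to the example sets above.

```javascript
// descendants: Map from object name to the Set of its descendants.
// Returns a Map from object name to the list of its direct parents.
function extractParents(descendants) {
  const parents = new Map();
  for (const [y, descY] of descendants) {
    for (const x of descY) {
      // y is direct unless it reaches x only through another descendant z.
      let direct = true;
      for (const z of descY) {
        if (z !== x && descendants.has(z) && descendants.get(z).has(x)) {
          direct = false;
          break;
        }
      }
      if (direct) {
        if (!parents.has(x)) parents.set(x, []);
        parents.get(x).push(y);
      }
    }
  }
  return parents;
}

const descendants = new Map([
  ["A", new Set(["B", "D"])],
  ["B", new Set(["D"])],
]);
const parents = extractParents(descendants);
console.log(parents.get("B"), parents.get("D"));
```

Under this rule A is the parent of B and B is the parent of D, while A is only an ancestor of D.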
![Page 20: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/20.jpg)
Extract stream parents
• 1) Load the HTML page very slowly
• 2) Delay other known non-stream parents
[Figure: objects X, Y, and Z, with Offset(Z) marked]
![Page 21: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/21.jpg)
Extract stream parents
• 1) Load the HTML page very slowly
• 2) Delay other known non-stream parents
[Figure: objects X, Y, and Z, with Offset(Z) and Offset2(Z) marked]
![Page 22: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/22.jpg)
![Page 23: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/23.jpg)
Performance Prediction Procedure
• Extract object timing information (input: packet trace)
• Annotate client delay (input: PDG)
• Adjust each object according to the new scenario (input: new scenario)
• Simulate the page load process (input: PDG)
![Page 24: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/24.jpg)
Object Timing Info
• Basic object timing info
– DNS: DNS lookup time
– TCP: TCP handshaking time
– HTTP: request transfer time, response time, reply transfer time
• Adding client delay info
– Client delay: between Parent(X) and X
![Page 25: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/25.jpg)
Adjust Object Timing Info
• DNS: adjust the DNS lookup time directly
• Server response time: change the response time
• RTT: [Figure: timeline segments shifted by ΔRTT, m * ΔRTT, and n * ΔRTT]
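A sketch of the RTT adjustment under one reading of the figure labels: the TCP handshake shifts by ΔRTT, the request transfer by m * ΔRTT, and the reply transfer by n * ΔRTT, where m and n are taken to be the number of round trips each transfer needs. That mapping, the field names, and the numbers are all assumptions.

```javascript
// timing: hypothetical per-object record in milliseconds.
// m and n: assumed round-trip counts for the request and reply transfers.
function adjustForRtt(timing, deltaRtt) {
  return {
    ...timing,
    tcpHandshake: timing.tcpHandshake + deltaRtt,
    requestTransfer: timing.requestTransfer + timing.m * deltaRtt,
    replyTransfer: timing.replyTransfer + timing.n * deltaRtt,
  };
}

const before = {
  dns: 30,
  tcpHandshake: 80,
  requestTransfer: 80,
  responseTime: 40,
  replyTransfer: 240,
  m: 1,
  n: 3,
};

// New scenario: the RTT drops by 30 ms, e.g. after moving the object to a CDN.
const after = adjustForRtt(before, -30);
console.log(after.tcpHandshake, after.requestTransfer, after.replyTransfer);
```

The DNS lookup time and the server response time are left untouched here, matching the slide's point that those are adjusted separately.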
![Page 26: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/26.jpg)
Simulating Page Load Process I
• Browser behaviors
[Figure: timelines for loading one object, each starting with a client delay]
– (a) with both DNS and TCP time: client delay, DNS time, TCP time, HTTP time
– (b) with TCP time: client delay, TCP time, HTTP time
– (c) without DNS and TCP time: client delay, HTTP time
– (d) with TCP waiting time: client delay, TCP waiting time, HTTP time
– Time markers: Tp (last parent available), Tr (HTTP request ready), Tf, Tl
![Page 27: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/27.jpg)
Simulating Page Load Process II
• Page load process
– Find the earliest candidate C from CandidateQueue
– Load C according to the conditions in the previous slide
– Find new candidates whose parents are all available
– Adjust timings of new candidates
– Insert new candidates into CandidateQueue
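The loop above can be sketched as follows. This is a simplified model, not the WebProphet simulator: each object carries a single precomputed load time, and the browser connection conditions from the previous slide are ignored. The data shape and the sample page are hypothetical.

```javascript
// objects: { name: { parents: [names], loadTime: ms } }
// Returns a Map from object name to its simulated finish time.
function simulatePageLoad(objects) {
  const finish = new Map();
  const queued = new Set();
  const candidateQueue = []; // entries: { name, start }

  // Enqueue every object not yet queued whose parents are all available.
  function enqueueReady() {
    for (const [name, obj] of Object.entries(objects)) {
      if (queued.has(name)) continue;
      if (obj.parents.every((p) => finish.has(p))) {
        // An object can start once its last parent has finished.
        const start = Math.max(0, ...obj.parents.map((p) => finish.get(p)));
        candidateQueue.push({ name, start });
        queued.add(name);
      }
    }
  }

  enqueueReady();
  while (candidateQueue.length > 0) {
    // Find the earliest candidate C and load it.
    candidateQueue.sort((a, b) => a.start - b.start);
    const c = candidateQueue.shift();
    finish.set(c.name, c.start + objects[c.name].loadTime);
    enqueueReady();
  }
  return finish;
}

const page = {
  html: { parents: [], loadTime: 200 },
  css: { parents: ["html"], loadTime: 100 },
  js: { parents: ["html"], loadTime: 150 },
  img: { parents: ["css", "js"], loadTime: 50 },
};
const finish = simulatePageLoad(page);
const pageLoadTime = Math.max(...finish.values());
console.log(pageLoadTime);
```

In this sample page the image waits for both the CSS and the JavaScript file, so the slower of the two determines when it can start.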
![Page 28: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/28.jpg)
![Page 29: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/29.jpg)
WebProphet Framework
[Figure: framework components: Web Agent, browser, control plug-in, web robot, scripting API, application transaction script snippet, pcap trace logger, agent network, web proxy, Dependency Extractor, Trace Analyzer, annotate object timing info, page simulator, Performance Predictor; data passed between them: traces, PDGs, new scenario input, results]
The whole system is about 12,000 lines of code.
![Page 30: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/30.jpg)
Dependency Extraction Results
• Google and Yahoo Search
• Validation: manual code analysis
[Figure: PDGs for Google Search and Yahoo Search; node labels include HTML, CSS, JavaScript, and images]
![Page 31: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/31.jpg)
Dependency Extraction Results
• Google and Yahoo Maps
• Validation: create fake pages with the same PDGs and validate the fake pages
[Figure: PDGs for Google Maps and Yahoo Maps; each node is labeled with object counts such as #HTML=1, #JS=1, #IMG=17]
![Page 32: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/32.jpg)
Prediction Accuracy
• Evaluate both the median and the 95th percentile
• Controlled experiment
– 50% of cases have a prediction error below 6.1%
– 90% of cases have a prediction error below 16.2%
• PlanetLab experiment
– Prediction error of the median is below 6.1%
– Prediction error of the 95th percentile is below 10.7%
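A sketch of how such numbers could be computed, assuming the prediction error is the relative difference between predicted and measured page load time and using a nearest-rank percentile. The slides define neither, so both choices are assumptions, and the sample values are made up.

```javascript
// Nearest-rank percentile: the smallest value such that at least p percent
// of the samples are less than or equal to it.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Relative prediction error, as a fraction of the measured value.
function relativeError(predicted, measured) {
  return Math.abs(predicted - measured) / measured;
}

// Made-up page load times in milliseconds: 100, 200, ..., 2000.
const measuredTimes = Array.from({ length: 20 }, (_, i) => (i + 1) * 100);
const medianTime = percentile(measuredTimes, 50);
const p95Time = percentile(measuredTimes, 95);
const medianError = relativeError(1061, medianTime);
console.log(medianTime, p95Time, medianError);
```

Reporting the 95th percentile alongside the median shows whether predictions also hold for the slowest page loads.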
![Page 33: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/33.jpg)
Usage Scenarios
• Analyze how to improve Yahoo Maps
– Only want to optimize a small number of objects
– Use a greedy search
– Evaluate 2,176 hypothetical scenarios and find that
• Moving 5 objects to a CDN: 14.8%
• Reducing the client delays of 14 objects by half: 26.6%
• Combining both: 40.1%
![Page 34: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/34.jpg)
![Page 35: Web Overview](https://reader036.vdocuments.mx/reader036/viewer/2022062810/56815dc3550346895dcbee87/html5/thumbnails/35.jpg)
Conclusions
• Developed a novel technique to extract the object dependencies of complex web pages
• Implemented a simple yet effective model to simulate the page load process
• Applied WebProphet to Yahoo Maps to show that it can be useful for performance optimization