1 © Nokia Siemens Networks Internet Success Factors and Challenges / Joachim Charzinski / Sep. 2008 Public
Locality and service structure of popular Web sites
Joachim Charzinski
Nokia Siemens Networks
ITG Fachgruppe 5.2.1, Network Planning & Traffic Economics
14 Nov. 2008
Disclaimers
• The opinions stated in this presentation represent the author’s scientific views and are not necessarily identical to Nokia Siemens Networks’ business directions.
• Trademarks are owned by their respective owners and used here only to illustrate things in context.
There are many ideas and questions around…
• service based charging for Internet traffic with DPI
• distance based charging for Internet traffic
• QoS reservations for applications
• end-to-end Ethernet pipes
• local breakouts and services
• network based caching in access/aggregation networks
• NAT dimensioning
• Web 2.0 “mashups” traffic characteristics
• strict packet filter configurations for security
Evolution of traffic relations
[Figure: traffic relation patterns for telephone, early computer, broadcast, multicast, peer to peer, early Web and today’s Web]
What’s different? How much?
Web 2.0 Mashups
• re-mixing of existing services into new services
[Figure: existing services 1, 2 and 3 expose APIs; new service A is a server side mashup, new service B is a client side mashup. Image source: flickr via http://metaatem.net/words/mashup]
Web inclusion versus mashups
[Figure: a page showing an advertisement banner and current train delays, loaded from an advertisement banner server and a data server; next to it, new service B as a client side mashup of existing services 2 and 3 through their APIs]
“Classical” Web inclusion
• whole element included from remote service
• images, frames
• advertisement servers, click-through billing services, contents distribution networks
<li><img src="http://ad.de.doubleclick.net/ad/N3995.yahoo_DE/B3152413.7;sz=1x1;ord=1223414745?" width=1 height=1 border=0 style='display:none;'>
Mashup
• raw data included from remote services
• local data processing
• maps, images, blogs/information
<script type="text/javascript" src="http://www.google.com/jsapi?key=ABQIAAAA3nuEoGKhRfKaTwhFg7OdgxS6mGdPc-RjK_luIMxI5IejX_bbThTmLSxjVS7PFK_Jwc8dOvCuFMqQOw"></script>
…
google.maps.Event.addListener(markers[46], "click", function() {…
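Both inclusion styles make the browser contact hosts other than the one that served the page. A minimal sketch of how such remote hosts could be listed from a page's HTML is given here; the page content, the host name www.example.com and the helper names are assumptions for illustration and are not taken from the measurement tooling behind these slides.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class RemoteSourceCollector(HTMLParser):
    """Collect host names referenced by src attributes (img, script, iframe, ...)."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "src" and value:
                host = urlparse(value).hostname
                if host:  # relative URLs have no host and stay local
                    self.hosts.add(host)


def remote_hosts(html, own_host):
    """Return hosts other than own_host that the page pulls elements from."""
    collector = RemoteSourceCollector()
    collector.feed(html)
    return sorted(h for h in collector.hosts if h != own_host)


# Hypothetical page; the two remote URLs are shortened forms of the slide's examples.
page = """
<img src="/logo.png">
<img src="http://ad.de.doubleclick.net/ad/example;sz=1x1" width=1 height=1>
<script type="text/javascript" src="http://www.google.com/jsapi?key=EXAMPLE"></script>
"""
print(remote_hosts(page, "www.example.com"))
```

A static scan like this only sees hosts named in the markup. Hosts contacted later by scripts, as in the mashup case, show up only in a packet trace.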
Outline
• Motivation
• Measurements
• Results
Internet measurement methods
• passive measurement
  – observe (aggregate) traffic from real users
  – lots of data, often statistically significant
  – only anonymized traces available / usable due to privacy legislation
  – no correlation back to user actions or Web sites visited
  – no full address visibility required for CIDR prefix investigations
• active measurement
  – injection of IP packets or TCP data transfers
  – measurement of Internet (not Web) characteristics
    ▪ latency, packet loss, re-ordering
• actively initiated measurements
  – defined Web workload
    ▪ list of elements to retrieve
    ▪ list of sites to visit
  – concentrate on service rather than packet level
  – observe latency, download speeds
  – analysis of IP addresses, networks and service structures
CIDR: Classless Inter-Domain Routing
Measurements used here
• actively initiated measurements
• visit 100 Web sites most popular in the US according to alexa.com (06/2008)
• visited homepages only
• automated process: for each site do
  ▪ start packet trace
  ▪ open browser to load home page
  ▪ close browser after home page has loaded (or after 1 min timeout)
  ▪ stop and store packet trace
• observed measures
  – traffic (rates, volumes, number of packets)
  – locality structure (number of hosts, network prefixes, AS numbers, DNS SLDs)
  – analysis of CDN usage

Sites   connections   packets   unique hosts   unique NPs   unique ASs
100     2294          103046    492            234          157

AS: Autonomous System
CDN: Content Distribution Network
NP: Network Prefix
SLD: Second Level Domain
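The automated per-site process can be sketched as a small driver loop. The capture and browser steps are passed in as functions because the slides do not name the tools used; every function name below and the stand-in implementations are assumptions, and only the 1 min timeout comes from the slide.

```python
import time


def measure_site(url, start_trace, load_page, stop_trace, timeout_s=60):
    """Run one measurement: trace on, load home page, trace off."""
    trace = start_trace(url)
    started = time.monotonic()
    loaded = load_page(url, timeout_s)  # expected to return False on timeout
    elapsed = time.monotonic() - started
    packets = stop_trace(trace)
    return {"url": url, "loaded": loaded, "seconds": elapsed, "packets": packets}


def measure_all(urls, start_trace, load_page, stop_trace):
    """Visit each home page once, one after the other."""
    return [measure_site(u, start_trace, load_page, stop_trace) for u in urls]


# Stand-ins so the sketch runs without a capture tool or a browser.
def fake_start_trace(url):
    return {"url": url, "packets": []}


def fake_load_page(url, timeout_s):
    return True


def fake_stop_trace(trace):
    return trace["packets"]


results = measure_all(["http://www.yahoo.com/"],
                      fake_start_trace, fake_load_page, fake_stop_trace)
print(results[0]["url"], results[0]["loaded"])
```

Keeping one trace per site is what allows hosts, prefixes and connections to be attributed to a single home page afterwards.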
Different Notions of Locality
• Hosts: server 1, server 2, ..., server n (64.236.16.136, 64.236.16.160, 157.166.255.12); client 88.64.48.1
• Organizations: akamai.net, doubleclick.net, cnn.com, google.com, ...; client in arcor-ip.net
• Routing Domains (NPs, ASs): network 1, network 2, ..., network n
  – ASs: AS 3356, AS 1668, AS 5662, AS 3209
  – network prefixes: 64.236.0.0/16, 157.166.224.0/19, 195.50.128.0/18, 204.160.0.0/13, 207.120.0.0/14
  – access network: 88.64.0.0/14
• Geographic Locations: server 1, server 2, ..., server n; client
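The first three notions can be derived from an address or host name alone. The sketch below maps the slide's example server addresses to the slide's example prefixes and reduces a host name to its second level domain. The prefix list stands in for a real routing table, and the two-label SLD rule is a simplifying assumption; neither is confirmed as the method used for these measurements.

```python
import ipaddress

# Prefixes copied from the slide; a real analysis would use a BGP routing table.
PREFIXES = [ipaddress.ip_network(p) for p in (
    "64.236.0.0/16", "157.166.224.0/19", "195.50.128.0/18",
    "204.160.0.0/13", "207.120.0.0/14",
)]


def network_prefix(ip):
    """Longest matching prefix from PREFIXES, or None if no prefix covers ip."""
    addr = ipaddress.ip_address(ip)
    matches = [n for n in PREFIXES if addr in n]
    return max(matches, key=lambda n: n.prefixlen, default=None)


def second_level_domain(hostname):
    """Naive SLD: last two DNS labels (wrong for names such as example.co.uk)."""
    return ".".join(hostname.rstrip(".").split(".")[-2:])


servers = ["64.236.16.136", "64.236.16.160", "157.166.255.12"]
prefixes = {str(network_prefix(ip)) for ip in servers}
print(len(set(servers)), "hosts in", len(prefixes), "network prefixes")
```

The three example hosts fall into two prefixes, which is why host counts and prefix counts are reported separately.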
Active and passive usage of services
[Figure: loading the cnn.com, ebay.com, wikipedia.org and weather.com home pages]
• passive usage: Which sites are using this service?
• active usage: Which services does this site use?
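The two views are inversions of each other: active usage maps a site to the services it contacts, passive usage maps a service to the sites that contact it. A minimal sketch with invented observations (the site-to-service sets below are hypothetical, not measured data):

```python
from collections import defaultdict

# Hypothetical observations: site -> services (DNS SLDs) contacted while loading it.
active_usage = {
    "cnn.com": {"cnn.com", "akamai.net", "doubleclick.net"},
    "weather.com": {"weather.com", "akamai.net"},
    "wikipedia.org": {"wikipedia.org"},
}


def passive_usage(active):
    """Invert site -> services into service -> sites using it."""
    passive = defaultdict(set)
    for site, services in active.items():
        for service in services:
            passive[service].add(site)
    return dict(passive)


passive = passive_usage(active_usage)
print(sorted(passive["akamai.net"]))
```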
Outline
• Motivation
• Measurements
• Results
Locality example for a top Web site’s home page
Yahoo! homepage http://www.yahoo.com/
• 30 connections
• 298 kbytes
• 15 hosts
• 11 network prefixes
• 9 ASs
• Akamai share: 8 hosts, 5 NPs, 5 ASs, 3 DNS SLDs
[Figure: loading the yahoo.com home page; labels: client, Internet, xlink.net]
Page size and locality characteristics
          Size      Number of Connections    Number of   HTTP       Network    AS        DNS
          (Bytes)   total      max. conc.    Hosts       Requests   Prefixes   Numbers   SLDs
max.      11.7M     107        45            21          284        14         13        15
average   600k      22.94      15.3          8.24        62.2       5.65       5.04      5.15
min.      943       1          1             1           2          1          1         1

[Figure: ccdf of the number of HTTP requests, connections and max. concurrent connections per home page]
[Figure: ccdf of the number of hosts, NPs, ASs and DNS SLDs per home page]
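The plots on this slide are complementary cumulative distribution functions (ccdf). As a reminder of what such a curve contains, here is a minimal sketch of an empirical ccdf, using the convention P(X > x); the sample values are invented and the slides do not state which convention the author's plots use.

```python
def ccdf(samples):
    """Return sorted (x, P(X > x)) pairs for the distinct values in samples."""
    n = len(samples)
    ordered = sorted(samples)
    points = []
    for i, x in enumerate(ordered):
        # emit one point per distinct value, at its last occurrence
        if i == n - 1 or ordered[i + 1] != x:
            points.append((x, (n - i - 1) / n))
    return points


# Hypothetical per-site host counts, not the measured data.
hosts_per_site = [1, 3, 3, 8, 21]
for x, p in ccdf(hosts_per_site):
    print(x, p)
```

Read this way, a point (8, 0.2) says that 20% of the sampled sites contact more than 8 hosts.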
Content distribution networks (CDNs) allow loading more contents during the same time
                Page Size    Number of   Number of   Data Rate    Latency
                (in Bytes)   Hosts       ASs         (in bit/s)   (in sec)
Average (all)   599k         8.2         5.0         286k         10.0
without CDN     465k         3.9         2.4         196k         9.2
with CDN        651k         9.9         6.1         320k         10.4

[Figure: ccdf of bytes downloaded for home page (0 to 2M), sites using CDNs vs. sites not using CDNs]
[Figure: ccdf of total time to load home page in sec (0 to 70), sites using CDNs vs. sites not using CDNs]
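The slide title can be checked against the averages in its table. The short calculation below takes the with-CDN and without-CDN averages from this slide and forms their ratios; the dictionary keys are illustrative names only.

```python
# Averages copied from the table on this slide.
with_cdn = {"page_bytes": 651e3, "rate_bit_s": 320e3, "latency_s": 10.4}
without_cdn = {"page_bytes": 465e3, "rate_bit_s": 196e3, "latency_s": 9.2}

ratios = {key: with_cdn[key] / without_cdn[key] for key in with_cdn}
for key, value in ratios.items():
    print(f"{key}: {value:.2f}x")
```

On these averages, home pages of sites using CDNs are about 40% larger while taking about 13% longer to load.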
Usage of Contents Distribution Networks
• CDN share of connections; CDN share of total: 35%
• CDN share of volume; CDN share of total: 46%
[Figure: two pie charts, one for connections and one for volume, split by Akamai, Footprint, Google, Mirrorimage and PantherCDN; slice labels 20%, 12%, 4%, 2%, 62% and 68%, 7%, 17%, 3%, 5%]
CDN domains in Arcor network
• Akamai = akamai.net, akamaiedge.net, akadns.net, akam.net via arcor-ip.de, level3.net
• Footprint = footprint.net via level3.net
• Google = l.google.com via google.com
• Mirrorimage = mirrorimage.net, instacontent.net, mirror-image.net via mii.net, ripe.net
• Panthercdn = panthercdn.com via ripe.net
*data sets used: Web top 100 home pages + Web top 25 browsing sessions
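Given the CDN domain list on this slide, a host name can be attributed to a CDN by suffix matching. The sketch below assumes the classification is done on DNS names alone. The sample host names are hypothetical, and the slides do not confirm that this is how the shares were computed.

```python
# Domain suffixes taken from the slide's list of CDN domains.
CDN_DOMAINS = {
    "Akamai": ("akamai.net", "akamaiedge.net", "akadns.net", "akam.net"),
    "Footprint": ("footprint.net",),
    "Google": ("l.google.com",),
    "Mirrorimage": ("mirrorimage.net", "instacontent.net", "mirror-image.net"),
    "Panthercdn": ("panthercdn.com",),
}


def cdn_of(hostname):
    """Return the CDN whose domain is a suffix of hostname, or None."""
    name = hostname.lower().rstrip(".")
    for cdn, domains in CDN_DOMAINS.items():
        for domain in domains:
            if name == domain or name.endswith("." + domain):
                return cdn
    return None


def cdn_share(hostnames):
    """Fraction of entries (e.g. one per connection) that resolve to any CDN."""
    if not hostnames:
        return 0.0
    return sum(cdn_of(h) is not None for h in hostnames) / len(hostnames)


# Hypothetical canonical names, as a DNS lookup might return them.
names = ["a1234.g.akamai.net", "www.l.google.com", "www.example.com", "cdn.example.org"]
print(cdn_share(names))
```

Matching on a leading dot avoids counting an unrelated name such as notakamai.net as Akamai.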
Passive usage of services
[Figure: loading the cnn.com, ebay.com, wikipedia.org and weather.com home pages]
Passive usage results
• typical popularity rank distribution known from contents
[Figure: Rank distribution of service usage; number of sites using service vs. service rank, both axes logarithmic from 1 to 100]
• heavily used services are more distributed
[Figure: Number of network prefixes observed per service; number of NPs seen (0 to 14) vs. service rank (0 to 180)]
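A rank distribution like the one plotted here can be obtained by counting, for each service, how many sites use it and sorting by that count. A minimal sketch follows; the service-to-sites mapping is hypothetical and the function name is illustrative.

```python
def rank_distribution(passive):
    """Return [(rank, service, number of sites)] with rank 1 = most used service."""
    counts = sorted(((len(sites), service) for service, sites in passive.items()),
                    key=lambda item: (-item[0], item[1]))
    return [(rank, service, n) for rank, (n, service) in enumerate(counts, start=1)]


# Hypothetical service -> sites mapping, not the measured data.
passive = {
    "akamai.net": {"cnn.com", "weather.com", "ebay.com"},
    "doubleclick.net": {"cnn.com", "ebay.com"},
    "wikipedia.org": {"wikipedia.org"},
}
for rank, service, n in rank_distribution(passive):
    print(rank, service, n)
```

Ties are broken alphabetically here so that the ranking is reproducible; the choice of tie-break is arbitrary.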
Outline
• Motivation
• Measurements
• Results
• Conclusions
Web results summary
• like the Internet itself, the Web has become a highly distributed infrastructure
• most Web sites are realizing their service in a highly distributed way
  – deliberate distribution of contents (CDNs)
  – service hosting
  – advertisement related element inclusion
  – mashup of Internet Web services
• consequences for network architectures etc.
  – service or distance oriented charging would be highly non-transparent to users
  – explicit QoS support by admission control is practically infeasible for Web traffic
  – local caching in access or aggregation networks
    ▪ can cover only top sites
    ▪ will have to interwork with multiple CDN services
  – availability of whole service critical due to large number of components
    ▪ loose coupling service model already employed on the Web
  – dedicated security configurations hard to do without impact on important sites
  – large number of ports (~30) required simultaneously per user (e.g. on NAT devices or firewalls)
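The port requirement translates directly into a NAT dimensioning bound. The sketch below is a worst-case estimate under stated assumptions: one external port per connection, all users loading a page at the same moment, and ports 1024 to 65535 usable on each public address. None of these assumptions comes from the slides; only the per-user figures do.

```python
def users_per_public_ip(ports_per_user, usable_ports=64512):
    """Users one NAT address can serve if each needs ports_per_user at once.

    usable_ports=64512 assumes ports 1024..65535 are available for one
    transport protocol; real devices may reserve more.
    """
    return usable_ports // ports_per_user


print(users_per_public_ip(30))  # ~30 ports per user, from the summary
print(users_per_public_ip(45))  # maximum of 45 concurrent connections observed
```

With about 30 ports per user this gives roughly 2000 simultaneously active users per public address, before any headroom for other applications.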