#16 application measurement

#16 Application Measurement Presentation by Bobin John

Upload: william-mason

Post on 30-Dec-2015



TRANSCRIPT

#16 Application Measurement

Presentation by Bobin John

1st paper:

Measurement, Modeling & Analysis of a Peer-to-Peer File-Sharing Workload (KaZaa paper)

KaZaa paper
P2P file sharing is the most dominant
This paper deals with KaZaa
A 200-day trace is taken
A model is developed
Locality-awareness can improve KaZaa performance

KaZaa paper: Trace Methodology

KaZaa trace summary statistics

KaZaa “usernames” used
KaZaaLite … IPs used
Easy to distinguish: KaZaa-specific HTTP headers
Auto-update transactions filtered out

KaZaa paper: User Characteristics
KaZaa users are patient

KaZaa paper: User Characteristics
Users slow down as they age
2 reasons: attrition & slowing down over time

KaZaa paper: Client Activity

KaZaa paper: Object Characteristics
Diverse workload

KaZaa paper: Object Characteristics
Object Dynamics
  Clients fetch objects at most once
  Popularity of objects is often short-lived
  Most popular objects tend to be recently born objects
  Most requests are for old objects

KaZaa paper: Object Characteristics
NOT Zipf-like
Web access patterns follow the Zipf property
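The contrast between Zipf-like Web access and "fetch at most once" can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function names, the exponent `alpha=1.0`, and the toy population sizes are hypothetical and are not taken from the paper. Every client draws from the same Zipf distribution; in fetch-at-most-once mode a client redraws when it hits an object it already has, so no object can collect more requests than there are clients.

```python
import random
from collections import Counter


def zipf_weights(num_objects, alpha=1.0):
    """Unnormalised Zipf weights: the object of rank r gets weight 1 / r**alpha."""
    return [1.0 / (rank ** alpha) for rank in range(1, num_objects + 1)]


def simulate(num_clients, requests_per_client, num_objects,
             fetch_at_most_once, seed=0):
    """Return a Counter mapping object rank (0 = most popular) to request count."""
    if fetch_at_most_once and requests_per_client > num_objects:
        raise ValueError("a client cannot fetch more distinct objects than exist")
    rng = random.Random(seed)
    ranks = range(num_objects)
    weights = zipf_weights(num_objects)
    counts = Counter()
    for _ in range(num_clients):
        seen = set()  # objects this client has already fetched
        for _ in range(requests_per_client):
            obj = rng.choices(ranks, weights=weights, k=1)[0]
            # Assumed P2P behaviour: redraw instead of fetching a duplicate.
            while fetch_at_most_once and obj in seen:
                obj = rng.choices(ranks, weights=weights, k=1)[0]
            seen.add(obj)
            counts[obj] += 1
    return counts


if __name__ == "__main__":
    web = simulate(50, 20, 200, fetch_at_most_once=False)
    p2p = simulate(50, 20, 200, fetch_at_most_once=True)
    print("top-5 request counts, repeated fetches allowed:",
          [web[r] for r in range(5)])
    print("top-5 request counts, fetch at most once:    ",
          [p2p[r] for r in range(5)])
```

With repeated fetches allowed, the head of the popularity curve keeps the Zipf shape; with fetch-at-most-once the most popular objects are capped at one request per client, which flattens the head.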

KaZaa paper: Model

KaZaa paper: Model for P2P file-sharing workloads
Model Description

KaZaa paper: Model for P2P
File-sharing effectiveness diminishes with client age

KaZaa paper: Model for P2P
New object arrivals improve performance

KaZaa paper: Model for P2P
New clients cannot stabilize performance

KaZaa paper: Model for P2P
Model validation

KaZaa paper: New idea!
How to reduce bandwidth cost?
Use a proxy cache
  Legal & political problems
Locality-aware request routing
  Centralized request redirection: redirector
  Decentralized request redirection: supernodes
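A centralized redirector can be sketched as a lookup table from objects to the internal peers that hold them. This is a minimal illustration, not the design from the paper: the `Redirector` class, its method names, and the choice of the first available peer are all assumptions made for the example.

```python
class Redirector:
    """Hypothetical centralized redirector for one organization."""

    def __init__(self):
        # object id -> set of internal peers known to hold a copy
        self._holders = {}

    def register(self, peer, object_id):
        """Record that an internal peer now holds object_id."""
        self._holders.setdefault(object_id, set()).add(peer)

    def route(self, requester, object_id, online_peers):
        """Pick where a request should be served from.

        Returns ("internal", peer) when an online internal peer other than
        the requester holds the object (a hit), else ("external", None).
        """
        candidates = self._holders.get(object_id, set()) - {requester}
        available = sorted(p for p in candidates if p in online_peers)
        if available:
            return ("internal", available[0])
        return ("external", None)


if __name__ == "__main__":
    redirector = Redirector()
    redirector.register("peer-a", "song.mp3")
    print(redirector.route("peer-b", "song.mp3", online_peers={"peer-a"}))
    print(redirector.route("peer-b", "song.mp3", online_peers=set()))
```

The `online_peers` argument makes the availability question explicit: an object held only by peers that are currently offline still counts as a miss and has to be fetched externally.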

KaZaa paper: Locality awareness
Methodology
Benefits

KaZaa paper: Locality awareness
Accounting for Hits & Misses

KaZaa paper: Locality awareness
Availability

KaZaa paper: Conclusion
KaZaa workload is different
Does not follow Zipf
Can be improved with locality awareness
Drawbacks
  A trace from a university ought not to be generalized to all KaZaa/P2P applications
  Further implementation details of locality-awareness?
  Scope of use for such a locality-awareness tool?
    I don’t think universities would like this

2nd paper:

An analysis of Internet Chat systems

Chat paper: Why is chat a worthwhile target for traffic characterization?
Chat offers computer-mediated communication
Used by a large number of people … potential of being habit-forming

Chat paper: Different types of chat systems
Internet Relay Chat [IRC]
Web-based chat systems
ICQ & AIM
Gale

Chat paper: Problem in analyzing chat traffic
Multitude & diversity of systems & protocols
Chat protocol realized on top of HTTP protocol … difficult to separate chat traffic
Resource limitations due to filtering demands

Chat paper: IRC
Set of connected servers
Client connection requests on port 6667
Unique nicknames
Discussion channels
Channel operators
Medium to share data
IRC operator

Chat paper: Web-chat
Not tty-based … Web browser interface
A single server to connect to
3 classes of chat systems:
  HTML-Web-Chat
  Applet-Web-Chat
  Applet-IRC-Chat
Difference between IRC & Web-chat is only “social”

Chat paper: Identifying IRC chat traffic
Packet monitor that captures all TCP traffic involving port 6667
Can only capture text & control messages
Data/file transfers cannot be captured as they run on other TCP connections
IRC’s packet size distribution is mainly dominated by small packets
IRC session should last more than a few minutes
IRC sends keep-alive messages
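The IRC indicators listed on this slide (port 6667, mostly small packets, sessions longer than a few minutes) can be combined into a simple per-flow check. The sketch below is illustrative only: the `Flow` record and all three thresholds are assumptions, since the slide gives no concrete cut-off values.

```python
from dataclasses import dataclass

IRC_PORT = 6667
MIN_SESSION_SECONDS = 180   # assumed stand-in for "a few minutes"
SMALL_PACKET_BYTES = 200    # assumed cut-off for a "small" packet
MIN_SMALL_FRACTION = 0.5    # assumed share of small packets in an IRC flow


@dataclass
class Flow:
    """Hypothetical summary of one TCP connection."""
    src_port: int
    dst_port: int
    duration_seconds: float
    packet_sizes: list


def looks_like_irc(flow):
    """Apply the three slide heuristics to one flow."""
    if IRC_PORT not in (flow.src_port, flow.dst_port):
        return False
    if flow.duration_seconds < MIN_SESSION_SECONDS:
        return False
    if not flow.packet_sizes:
        return False
    small = sum(1 for size in flow.packet_sizes if size <= SMALL_PACKET_BYTES)
    return small / len(flow.packet_sizes) >= MIN_SMALL_FRACTION


if __name__ == "__main__":
    chat = Flow(51234, 6667, 1800.0, [60, 72, 90, 64, 1400])
    print(looks_like_irc(chat))
```

Because file transfers run on other TCP connections, a check like this only ever sees the text and control channel, which is consistent with the small-packet expectation.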

Chat paper: Identifying Web-chat traffic
HTML-Web-chat:
  Appropriate cache-control headers
  Adding state information
  Cache-Control: Must-revalidate & Cache-Control: Private indicate non-chat traffic
  Use of scripting languages, e.g., JavaScript
  Use of applet windows, e.g., Java

Chat paper: Identifying Web-chat traffic
Applet-Web-chat:
  User would have accessed a Java file or a script or even a page like “xxxchatyyy” … “chat” could occur even in the path
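The "chat" substring hint can be checked against both the host name and the path of a requested URL. This is a rough sketch with a hypothetical helper name; a match only marks a candidate, since unrelated URLs can contain "chat" as well.

```python
from urllib.parse import urlsplit


def url_mentions_chat(url):
    """Return True when "chat" appears in the host name or the path of url."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path.lower()
    return "chat" in host or "chat" in path


if __name__ == "__main__":
    for url in ("http://www.xxxchatyyy.example/index.html",
                "http://example.org/rooms/Chat/applet.class",
                "http://example.org/index.html"):
        print(url, url_mentions_chat(url))
```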

Chat paper: Overall strategy for extracting chat traffic

Chat paper: Overall strategy for extracting chat traffic
Repeat this process:
  Identify traffic that cannot be chat traffic
  Remove it
Steps that filter out more non-chat traffic have to be implemented earlier
Other steps that need more processing or pre-processing should be implemented later

Chat paper: Overall strategy for extracting chat traffic
Eliminate traces from ports < 1024 except port 80
Also eliminate traces from well-known application ports (e.g., Gnutella - 6346)
Group packets into flows
Mark & filter them according to the previous table
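The port-based elimination and the grouping of packets into flows might look roughly like the sketch below. Several details are assumptions: the packet dictionaries and their field names are hypothetical, the exclusion list contains only the Gnutella port named on the slide, and a packet is dropped when either of its two ports is eliminated.

```python
from collections import defaultdict

HTTP_PORT = 80
# Assumed exclusion list; the slide names only Gnutella (6346) as an example.
EXCLUDED_APP_PORTS = {6346}


def port_may_carry_chat(port):
    """Keep port 80 and unreserved ports that are not known non-chat applications."""
    if port == HTTP_PORT:
        return True
    if port < 1024:
        return False
    return port not in EXCLUDED_APP_PORTS


def group_into_flows(packets):
    """Drop packets on eliminated ports, then group packet sizes by connection.

    Both directions of a connection share one key because the two
    (address, port) endpoints are sorted before being used as the key.
    """
    flows = defaultdict(list)
    for pkt in packets:
        if not (port_may_carry_chat(pkt["sport"])
                and port_may_carry_chat(pkt["dport"])):
            continue
        key = tuple(sorted([(pkt["src"], pkt["sport"]),
                            (pkt["dst"], pkt["dport"])]))
        flows[key].append(pkt["size"])
    return dict(flows)


if __name__ == "__main__":
    packets = [
        {"src": "10.0.0.1", "sport": 51000, "dst": "10.0.0.2", "dport": 80, "size": 120},
        {"src": "10.0.0.2", "sport": 80, "dst": "10.0.0.1", "dport": 51000, "size": 300},
        {"src": "10.0.0.1", "sport": 51001, "dst": "10.0.0.3", "dport": 22, "size": 90},
    ]
    for key, sizes in group_into_flows(packets).items():
        print(key, sizes)
```

Running the cheap port check before the grouping step follows the ordering rule of the strategy: filters that discard the most non-chat traffic go first, and steps that need more processing come later.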

Chat paper: Experiment
At University of Saarland
Resource partitioning
Traces were generated after filtering
950GB > 1.2GB > 238MB (WEBCHAT1), 192MB (IRC1), 350MB (WEBCHAT2)

Chat paper: Validation
2 aspects:
  Recall – ability of a system to present all relevant items
  Precision – ability of a system to present only relevant items
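Recall and precision as defined here can be computed from two sets: the connections the filter identified as chat, and the connections that really are chat. The function name and the toy connection labels below are made up for illustration and are not data from the paper.

```python
def recall_and_precision(identified, actual):
    """Return (recall, precision) for identified items against the real ones.

    recall    = correctly identified items / all real items
    precision = correctly identified items / all identified items
    """
    identified, actual = set(identified), set(actual)
    true_positives = len(identified & actual)
    recall = true_positives / len(actual) if actual else 0.0
    precision = true_positives / len(identified) if identified else 0.0
    return recall, precision


if __name__ == "__main__":
    identified = {"conn-1", "conn-2", "conn-3"}        # flagged as chat
    actual = {"conn-2", "conn-3", "conn-4", "conn-5"}  # really chat
    print(recall_and_precision(identified, actual))
```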

Chat paper: Validation
Lots of calculations
“we can expect to locate about 91.7% of all real chat connections and that we expect that at least 93.1% of all connections we identify are indeed chat connections.”

Chat paper: Results
Session durations

Chat paper: Results
Interarrival times of sessions

Chat paper: Results
Packet sizes

Chat paper: Results
Sent & Received bytes

Chat paper: Conclusion
Chat traffic was successfully filtered out
Accuracy was above 90%
Drawbacks
  Use of this work?