#16 Application Measurement
Presentation by Bobin John.
1st paper: Measurement, Modeling & Analysis of a Peer-to-Peer File-Sharing Workload (KaZaa paper)
KaZaa paper
  P2P file sharing is the most dominant
  This paper deals with KaZaa
  200-day trace is taken
  Model is developed
  Locality-awareness can improve KaZaa performance
KaZaa paper: Trace Methodology
  KaZaa trace summary statistics
  KaZaa “usernames” used
  KaZaaLite … IPs used
  Easy to distinguish KaZaa-specific HTTP headers
  Auto-update transactions filtered out
KaZaa paper: User Characteristics
  Users slow down as they age
  2 reasons: attrition & slowing down over time
KaZaa paper: Object Characteristics
  Object Dynamics
    Clients fetch objects at most once
    Popularity of objects is often short-lived
    Most popular objects tend to be recently born objects
    Most requests are for old objects
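A minimal simulation sketch of the fetch-at-most-once behavior listed above. It assumes (the slides do not state this) that every client draws objects from a Zipf-like popularity distribution with weight 1/rank; the function name, parameters and population sizes are illustrative only. With repeat fetching a single client can request the top object many times, whereas with fetch-at-most-once the request count of any object is capped at the number of clients.

```python
import random
from collections import Counter


def simulate_requests(num_clients, num_objects, requests_per_client,
                      at_most_once, seed=0):
    """Count requests per object rank (rank 0 is the most popular object)."""
    if at_most_once and requests_per_client > num_objects:
        raise ValueError("a client cannot fetch more distinct objects than exist")
    rng = random.Random(seed)
    ranks = list(range(num_objects))
    # Assumed Zipf-like popularity: weight of rank r is 1 / (r + 1).
    weights = [1.0 / (r + 1) for r in ranks]
    counts = Counter()
    for _ in range(num_clients):
        fetched = set()  # objects this client has already downloaded
        for _ in range(requests_per_client):
            obj = rng.choices(ranks, weights=weights)[0]
            # Under fetch-at-most-once, redraw until the object is new to the client.
            while at_most_once and obj in fetched:
                obj = rng.choices(ranks, weights=weights)[0]
            fetched.add(obj)
            counts[obj] += 1
    return counts


repeat = simulate_requests(100, 200, 20, at_most_once=False)
once = simulate_requests(100, 200, 20, at_most_once=True)
print("top object, repeat fetching:", repeat[0])
print("top object, fetch-at-most-once:", once[0])
```

The exact counts depend on the seed; the point of the sketch is only that the head of the popularity curve is flattened when each client fetches an object at most once.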
KaZaa paper: New idea!
  How to reduce bandwidth cost?
    Use a proxy cache
      Legal & political problems
    Locality-aware request routing
      Centralized request redirection: redirector
      Decentralized request redirection: supernodes
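A rough sketch of what a centralized redirector could look like, assuming it simply remembers which peers inside the organization already hold an object and points new requests at one of them. The class and method names are hypothetical, and peer availability, bandwidth and the decentralized supernode variant are all ignored here.

```python
from collections import defaultdict


class Redirector:
    """Hypothetical centralized redirector for locality-aware request routing."""

    def __init__(self):
        # object id -> set of internal peers known to hold a copy
        self._holders = defaultdict(set)

    def record_download(self, peer, object_id):
        """Remember that an internal peer now holds the object."""
        self._holders[object_id].add(peer)

    def route(self, requester, object_id):
        """Return ("internal", peer) if another internal peer has the object,
        otherwise ("external", None) so the request leaves the organization."""
        candidates = sorted(
            p for p in self._holders.get(object_id, ()) if p != requester
        )
        if candidates:
            return ("internal", candidates[0])
        return ("external", None)


redirector = Redirector()
redirector.record_download("peer-a", "song.mp3")
print(redirector.route("peer-b", "song.mp3"))
print(redirector.route("peer-b", "movie.avi"))
```

Every request served by an internal peer is one that does not consume external bandwidth, which is the saving the locality-aware idea aims for.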
KaZaa paper: Conclusion
  KaZaa workload is different
  Does not follow Zipf
  Can be improved with locality awareness
  Drawbacks
    A trace from a university ought not to be generalized to all KaZaa/P2P applications
    Further implementation details of locality-awareness?
    Scope of use for such a locality-awareness tool?
      I don’t think universities would like this
Chat paper: Why is chat a worthwhile target for traffic characterization?
  Chat offers computer-mediated communication
  Used by a large number of people … potential of being habit-forming
Chat paper: Different types of chat systems
  Internet Relay Chat [IRC]
  Web-based chat systems
  ICQ & AIM
  Gale
Chat paper: Problems in analyzing chat traffic
  Multitude & diversity of systems & protocols
  Chat protocol realized on top of HTTP protocol … difficult to separate chat traffic
  Resource limitations due to filtering demands
Chat paper: IRC
  Set of connected servers
  Client connection requests on port 6667
  Unique nicknames
  Discussion channels
  Channel operators
  Medium to share data
  IRC operator
Chat paper: Web-chat
  Not tty-based … Web browser interface
  A single server to connect to
  3 classes of chat systems:
    HTML-Web-Chat
    Applet-Web-Chat
    Applet-IRC-Chat
  Difference between IRC & Web-chat is only “social”
Chat paper: Identifying IRC chat traffic
  Packet monitor that captures all TCP traffic involving port 6667
    Can only capture text & control messages
    Data/file transfers cannot be captured as they run on other TCP connections
  IRC’s packet size distribution is mainly dominated by small packets
  IRC session should last more than a few minutes
  IRC sends keep-alive messages
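The IRC criteria above can be sketched as a simple per-flow heuristic. The thresholds (three minutes, 200 bytes) are assumptions chosen for illustration, not values from the paper, and the `Flow` record is a hypothetical summary of one TCP connection. Keep-alive detection is left out.

```python
from dataclasses import dataclass
from statistics import median
from typing import List

IRC_PORT = 6667
MIN_DURATION_S = 180.0     # assumed meaning of "more than a few minutes"
MAX_MEDIAN_SIZE = 200      # assumed bound for "small packets", in bytes


@dataclass
class Flow:
    """Hypothetical summary of one TCP connection."""
    src_port: int
    dst_port: int
    duration_s: float
    packet_sizes: List[int]


def looks_like_irc(flow: Flow) -> bool:
    """Apply the port, duration and packet-size criteria in that order."""
    if IRC_PORT not in (flow.src_port, flow.dst_port):
        return False
    if flow.duration_s < MIN_DURATION_S:
        return False
    if not flow.packet_sizes:
        return False
    return median(flow.packet_sizes) <= MAX_MEDIAN_SIZE


chat_session = Flow(40000, 6667, 1800.0, [60, 72, 90, 64, 110])
bulk_transfer = Flow(40001, 6667, 1800.0, [1460, 1460, 1460, 900])
print(looks_like_irc(chat_session), looks_like_irc(bulk_transfer))
```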
Chat paper: Identifying Web-chat traffic
  HTML-Web-chat:
    Appropriate cache-control headers
    Adding state information
    Cache-Control: Must-revalidate & Cache-Control: Private indicate non-chat traffic
    Use of scripting languages, e.g., JavaScript
    Use of applet windows, e.g., Java
Chat paper: Identifying Web-chat traffic
  Applet-Web-chat:
    User would have accessed a Java file or a script or even a page like “xxxchatyyy” … “chat” could occur even in the path
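A small sketch of how the two Web-chat hints above might be combined for a single HTTP transaction: the string "chat" appearing in the host name or path, and the two Cache-Control directives the slides list as indicating non-chat traffic. The function names are hypothetical, and a real classifier would need more signals (scripts, applets, state information) than this.

```python
from urllib.parse import urlsplit

# Directives the slides list as indicating non-chat traffic.
NON_CHAT_DIRECTIVES = {"must-revalidate", "private"}


def cache_control_directives(headers):
    """Return the lowercase Cache-Control directive names from a header dict."""
    value = ""
    for name, header_value in headers.items():
        if name.lower() == "cache-control":
            value = header_value
    return {
        part.strip().lower().split("=")[0]
        for part in value.split(",")
        if part.strip()
    }


def is_web_chat_candidate(url, response_headers):
    """True if the URL mentions "chat" and no non-chat directive is present."""
    if cache_control_directives(response_headers) & NON_CHAT_DIRECTIVES:
        return False
    parts = urlsplit(url)
    return "chat" in parts.netloc.lower() or "chat" in parts.path.lower()


print(is_web_chat_candidate("http://www.example.org/webchat/room1", {}))
print(is_web_chat_candidate("http://www.example.org/webchat/room1",
                            {"Cache-Control": "private, max-age=0"}))
```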
Chat paper: Overall strategy for extracting chat traffic
  Repeat this process:
    Identify traffic that cannot be chat traffic
    Remove it
  Steps that filter out more non-chat traffic have to be implemented earlier
  Other steps that need more processing or pre-processing should be implemented later
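The repeated identify-and-remove process can be sketched as a list of elimination steps applied in a chosen order. The helper below is hypothetical: it runs the steps as given and reports how much each one removed, which is the kind of information one would use to decide which steps belong early and which late.

```python
def run_elimination(items, steps):
    """Apply elimination steps in order.

    steps is a list of (name, cannot_be_chat) pairs, where cannot_be_chat is a
    predicate returning True for items that cannot be chat traffic.
    Returns the surviving items and the number removed by each step.
    """
    remaining = list(items)
    removed_by_step = {}
    for name, cannot_be_chat in steps:
        kept = [item for item in remaining if not cannot_be_chat(item)]
        removed_by_step[name] = len(remaining) - len(kept)
        remaining = kept
    return remaining, removed_by_step


# Illustrative input: server ports of observed connections (made-up values).
ports = [22, 80, 6346, 6667, 443]
steps = [
    ("low ports", lambda port: port < 1024 and port != 80),
    ("gnutella", lambda port: port == 6346),
]
remaining, removed = run_elimination(ports, steps)
print(remaining, removed)
```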
Chat paper: Overall strategy for extracting chat traffic
  Eliminate traces from ports < 1024 except port 80
  Also eliminate traces from well-known application ports (e.g., Gnutella - 6346)
  Group packets into flows
  Mark & filter them according to the previous table
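The first three steps above can be sketched as follows, assuming a packet is dropped when either of its ports is excluded and that a flow is the set of packets sharing the same pair of endpoints regardless of direction. The `Packet` record and function names are hypothetical, and the marking step is omitted because the table it refers to is not part of this transcript.

```python
from collections import defaultdict
from typing import NamedTuple

HTTP_PORT = 80
KNOWN_NON_CHAT_PORTS = {6346}  # Gnutella, the example given on the slide


class Packet(NamedTuple):
    """Hypothetical minimal packet record."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    size: int


def port_excluded(port):
    """Ports below 1024 (except 80) and known application ports are excluded."""
    if port in KNOWN_NON_CHAT_PORTS:
        return True
    return port < 1024 and port != HTTP_PORT


def flow_key(pkt):
    """Direction-independent key: the two endpoints in sorted order."""
    a = (pkt.src_ip, pkt.src_port)
    b = (pkt.dst_ip, pkt.dst_port)
    return (a, b) if a <= b else (b, a)


def group_into_flows(packets):
    """Drop excluded packets, then group the rest by flow key."""
    flows = defaultdict(list)
    for pkt in packets:
        if port_excluded(pkt.src_port) or port_excluded(pkt.dst_port):
            continue
        flows[flow_key(pkt)].append(pkt)
    return dict(flows)


packets = [
    Packet("10.0.0.1", 40000, "10.0.0.2", 6667, 60),
    Packet("10.0.0.2", 6667, "10.0.0.1", 40000, 80),
    Packet("10.0.0.1", 40001, "10.0.0.3", 22, 100),
    Packet("10.0.0.1", 40002, "10.0.0.4", 80, 500),
    Packet("10.0.0.1", 40003, "10.0.0.5", 6346, 300),
]
flows = group_into_flows(packets)
print(len(flows))
```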
Chat paper: Experiment
  At University of Saarland
  Resource partitioning
  Traces were generated after filtering
  950GB > 1.2GB > 238MB (WEBCHAT1), 192MB (IRC1), 350MB (WEBCHAT2)
Chat paper: Validation
  2 aspects:
    Recall – ability of a system to present all relevant items
    Precision – ability of a system to present only relevant items
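In the usual formulation, recall is the fraction of real chat connections that were found and precision is the fraction of identified connections that really are chat. A minimal sketch, with made-up counts that are not taken from the paper:

```python
def recall(true_positives, false_negatives):
    """Fraction of all relevant items that were found."""
    relevant = true_positives + false_negatives
    return true_positives / relevant if relevant else 0.0


def precision(true_positives, false_positives):
    """Fraction of all reported items that are relevant."""
    reported = true_positives + false_positives
    return true_positives / reported if reported else 0.0


# Illustrative counts only: 90 chat connections found, 10 missed,
# 5 non-chat connections wrongly reported as chat.
print(recall(90, 10))
print(precision(90, 5))
```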
Chat paper: Validation
  Lots of calculations
  “we can expect to locate about 91.7% of all real chat connections and that we expect that at least 93.1% of all connections we identify are indeed chat connections.”