
Page 1: Internet Measurement Masterclass 2006

Internet Measurement Masterclass 2006

10:00 Session 1: Kick off, problem space, thinking ahead, you and the law
Andrew Moore - Queen Mary, University of London

11:00 Morning tea

11:15 Session 2: Monitoring with Windows and how not to be deluged with data
Dinan Gunawardena - Microsoft Research Cambridge

12:15 Hardware selection for monitoring
Fabian Schneider - TU Berlin

12:45 Lunch + concurrently with Endace hardware demonstration

13:45 Session 3: Netflow, and routing data as a source of measurement
Steve Uhlig - Delft University of Technology

14:45 Afternoon tea

15:00 Session 4: Statistics for the measurement community
Steven Gilmour - Queen Mary, University of London

15:45 Wrap-up

16:00 beer / NGN ProgNet06 workshop starts

Page 2: Internet Measurement Masterclass 2006

Kick-off

Andrew Moore

Queen Mary, University of London

www.dcs.qmul.ac.uk/~awm

Page 3: Internet Measurement Masterclass 2006

What we won’t cover

• Active measurement (AMP, ping, traceroute, rrt, planetlab)

• Exhaustive survey of current measurement research

• I’m happy to provide opinion on these things in a break, but I am not an active-measurement expert; I don’t even play one on television.

Page 4: Internet Measurement Masterclass 2006

WHY Measure?

• Measuring something helps you understand it

Few would argue that the Internet is not important enough to understand

- Good data outlives bad theory - Jeff Dozier

- Measure what is measurable, make measurable what is not. - after Galileo

Page 5: Internet Measurement Masterclass 2006

Why? (a non-exhaustive list)

• Measurements are inputs to

– validate a model

– drive a simulation

– test a new approach

• Measurements help understanding (fault-finding)

• Measurements are often part of the accounting process

Page 6: Internet Measurement Masterclass 2006

Why so hard?

Pick your (Endace¹) Dag board, plug it in and go. Right?

Wrong.

- Law

- Level 2 is not always

  - accessible

  - monitor-able

- Operations staff hate you

- Data on the wire is not the only first class measurement object

- Hardware doesn’t work

- Wrong Measurements

- Wrong Interpretation

- Wrong Problem

¹ Other monitoring boards are available

Page 7: Internet Measurement Masterclass 2006

Where should I start?

• Ask WHY are you measuring?

“Measure twice & cut once” is great for carpenters, but

“Think (at least) twice and measure once” is better for us.

Page 8: Internet Measurement Masterclass 2006

Pick the right tool for the right job

• Measurement of packets on a wire in your lab

– Great for observing one specific use of one set of applications in one place in the Internet

– Terrible for telling you how many mobile devices are used for IPtv in China, or the connectivity among world ISPs, or ….

Page 9: Internet Measurement Masterclass 2006

Uh-Oh

• Who are you going to measure? 1 user? 1000 users?

• When? (what time of the day?)

• Where? (your personal machine, a campus? a country?)

• How?

– How long? a day? week? month?

– What method are you going to use?

Page 10: Internet Measurement Masterclass 2006

Law (I am Not a Lawyer and this is UK Law)

• If in doubt, seek out advice

• Everything is illegal

• Don’t ask a question you don’t want to know the answer to.

• We care about

– RIPA (Interception)

– DPA (personal-data storage)

Many Thanks to Richard Clayton and Andrew Cormack

Page 11: Internet Measurement Masterclass 2006

Data Protection Act 1998

• Overriding aim is to protect the interests of (and avoid risks to) the Data Subject

• Data processing must comply with the eight principles (as interpreted by the regulator)

• All data controllers must “notify” (£35) the Information Commissioner (unless exempt)

– Exceptions for “private use”, “basic business purpose”: see the website

Page 12: Internet Measurement Masterclass 2006

Data Protection Act (1998)

• Principle 7 is especially relevant

– Appropriate technical and organizational measures shall be taken against unauthorized or unlawful processing of personal data and against accidental loss or destruction of, or damage to, personal data

• The Information Commissioner advises that a risk-based approach should be taken in determining what measures are appropriate

– Management and organizational measures are as important as technical ones

– Pay attention to data over its entire lifetime

Page 13: Internet Measurement Masterclass 2006

RIP Act 2000

• Part I, Chapter I interception

• Part I, Chapter II communications data

• Part II surveillance & informers

• Part III encryption

– not as relevant for this

• Part IV oversight

– sets up tribunal and interception commissioner

Page 14: Internet Measurement Masterclass 2006

RIP Act 2000 - Interception

• Tapping a telephone (or copying an email) is “interception”. It must be authorized by a warrant signed by the secretary of state.

– SoS means the home secretary (or similar). Power delegation is temporary. Product is not admissible in court

• Some sensible exceptions exist

– Delivered data

– Stored data that can be accessed by the production of an order

– Techies running a network

– “Lawful business practice”

Page 15: Internet Measurement Masterclass 2006

Lawful Business Practice

• Regulations prescribe how not to commit an offence under the RIP Act. They do not specify how to avoid problems with the DPA (or other legislation)

• Must make all reasonable efforts to tell all users of the system that interception may occur

Page 16: Internet Measurement Masterclass 2006

Law One-slider

• If in doubt - ask someone!

• Why do you want to do this?

– bare minimum, no “data for data’s sake”

– the onus is on you at all times to justify what you are doing

• Unless you want to have to keep the DPA happy, don’t keep any personal identifiers

• Use your University ethics committee

I am NOT a Lawyer!
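One way to avoid keeping raw identifiers is to replace each IP address with a keyed hash before anything is written to disk. The sketch below is illustrative only and is not something the slides prescribe: the function name `pseudonymize_ip`, the 16-character truncation and the per-study key are all assumptions. Pseudonymised data may still count as personal data, so this does not replace advice from someone who is a lawyer.

```python
import hashlib
import hmac
import ipaddress
import secrets


def pseudonymize_ip(address: str, key: bytes) -> str:
    """Return a keyed-hash token for an IP address (hypothetical helper)."""
    # Parse first so that equivalent spellings of an address map to one token.
    packed = ipaddress.ip_address(address.strip()).packed
    digest = hmac.new(key, packed, hashlib.sha256).hexdigest()
    # Truncation is an arbitrary choice made here for readability.
    return digest[:16]


if __name__ == "__main__":
    # Assumption: one secret key per study, stored separately from the traces.
    study_key = secrets.token_bytes(32)
    for addr in ("192.0.2.1", "192.0.2.1", "2001:db8::1"):
        print(addr, "->", pseudonymize_ip(addr, study_key))
```

The same address always yields the same token under one key, so per-host analysis still works, while a different key gives unrelated tokens.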

Page 17: Internet Measurement Masterclass 2006

(Good) Measurement Principles

• Check your methodology

• Keep all Meta-data

• Calibrate your experiments

• Automate all processing

– it’s a documentation trail

– cache those intermediate results; they tell you where you went wrong

• Visualize your data at every stage

– this helps ensure you didn’t goof

Page 18: Internet Measurement Masterclass 2006

Check your Methodology

• Talk to people around you, find a mentor and even an antagonist

• Better they find something wrong than the external examiner or the reviewers of the paper

• Consider the scope of a reasonable measurement and the claims you can make

Page 19: Internet Measurement Masterclass 2006

Meta-Data

• the filter you used on tcpdump is meta-data

• your methodology is meta-data

• the day/time of the week is meta-data

• the hardware you used is meta-data

• (possibly) how much alcohol in your blood-stream is meta-data

Keep it all
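A cheap way to "keep it all" is to write a small sidecar file next to every trace at capture time. This is a minimal sketch under assumptions: the helper name `write_capture_metadata`, the `.meta.json` suffix and the chosen fields are illustrative, not a standard.

```python
import json
import platform
from datetime import datetime, timezone
from pathlib import Path


def write_capture_metadata(trace_path, capture_filter, notes=""):
    """Write <trace>.meta.json describing how the trace was taken (hypothetical helper)."""
    now = datetime.now(timezone.utc)
    meta = {
        "trace_file": Path(trace_path).name,
        "capture_filter": capture_filter,      # e.g. the expression given to tcpdump
        "recorded_at_utc": now.isoformat(),
        "weekday": now.strftime("%A"),         # day of the week is meta-data too
        "hostname": platform.node(),
        "machine": platform.machine(),
        "os": platform.platform(),
        "notes": notes,                        # methodology, known problems, ...
    }
    sidecar = Path(str(trace_path) + ".meta.json")
    sidecar.write_text(json.dumps(meta, indent=2))
    return sidecar
```

For example, `write_capture_metadata("lab.pcap", "tcp port 80", notes="bench test")` would leave `lab.pcap.meta.json` beside the trace.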

Page 20: Internet Measurement Masterclass 2006

Calibrate your experiments

• Test your assumptions

– (been assuming the network is busiest at midday? Okay, this is the moment you find that 3:30 is the busy time)

• “bench-test” your setup; this is just good science

– test your processing scripts many (many) times

• Most departments do not have good test equipment; this is no excuse
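The busy-hour assumption can be checked directly from packet timestamps. The sketch below assumes the timestamps are epoch seconds and buckets them by hour of day in UTC; the function names are illustrative, and a real check would use local time and more than one day of data.

```python
from collections import Counter
from datetime import datetime, timezone


def packets_per_hour(timestamps):
    """Count packets in each hour of the day (UTC) from epoch-second timestamps."""
    counts = Counter()
    for ts in timestamps:
        counts[datetime.fromtimestamp(ts, tz=timezone.utc).hour] += 1
    return counts


def busiest_hour(timestamps):
    """Return the hour (0-23) with the most packets; ties go to the hour seen first."""
    counts = packets_per_hour(timestamps)
    if not counts:
        raise ValueError("no timestamps given")
    return max(counts, key=counts.get)


if __name__ == "__main__":
    # Made-up sample: one packet at 12:00 UTC and three in the 15:00 hour.
    sample = [43200, 54000, 54060, 55000]
    print(busiest_hour(sample))
```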

Page 21: Internet Measurement Masterclass 2006

Automate your processing

• Make is your friend

• intermediate processing (and the scripts/code that did it) are more meta-data

• critical when you want to reproduce your results (and have others reproduce your results)
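The rule make applies is simple: rebuild a target only if it is missing or older than something it depends on. The sketch below mimics that rule in Python so that cached intermediate results are reused; it is a stand-in to show the idea, not a replacement for make, and the names `is_stale` and `build` are illustrative.

```python
from pathlib import Path


def is_stale(target, sources):
    """True if target is missing or older than any source file."""
    target = Path(target)
    if not target.exists():
        return True
    target_mtime = target.stat().st_mtime
    return any(Path(src).stat().st_mtime > target_mtime for src in sources)


def build(target, sources, recipe):
    """Run recipe(sources, target) only when needed; return True if it ran."""
    if is_stale(target, sources):
        recipe(sources, target)
        return True
    return False
```

Each processing stage becomes one `build` call whose recipe is a script kept under version control, so the chain from raw trace to final plot is recorded and re-runnable.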

Page 22: Internet Measurement Masterclass 2006

Visualize your data

• visualize your data early and often

• scatter plots are always useful

• identify/understand those outliers now

– problem? or expected result?
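A plot is the first tool for spotting outliers; a numeric flag can then point at the rows worth a closer look. The sketch below uses the modified z-score based on the median absolute deviation, with the conventional constants 0.6745 and 3.5. This technique is a choice made here for illustration, not one named in the slides, and it only flags points: deciding "problem or expected result" is still up to you.

```python
import statistics


def flag_outliers(values, threshold=3.5):
    """Return (index, value) pairs whose modified z-score exceeds threshold."""
    median = statistics.median(values)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return []  # no spread to compare against; inspect the plot instead
    flagged = []
    for index, value in enumerate(values):
        score = 0.6745 * abs(value - median) / mad
        if score > threshold:
            flagged.append((index, value))
    return flagged


if __name__ == "__main__":
    # Made-up round-trip times in milliseconds with one suspicious sample.
    print(flag_outliers([10, 11, 9, 10, 12, 10, 95]))
```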

Page 23: Internet Measurement Masterclass 2006

My first network monitor

• configurations

– monitor and method

• gotcha

• backhaul network

• storage, archive, index

Page 24: Internet Measurement Masterclass 2006

Configuration

• Hardware selection

– How are you going to remote-admin this machine?

• OS / Software selection

– Much work in unix domain; that doesn’t make it good-work; Dinan

– tcpdump/pcap is standard and lots of tools

• Not fast, loss-error prone, timestamps are junk,

– divorce the data representation from the method

• tcpdump is a useful offline tool but dagtools, CoMo and others (nprobe, etc) are simply better online

– consider the right tool for the task

Page 25: Internet Measurement Masterclass 2006

Hardware (getting the traffic)

• Passive taps

– invasive installation

– no impact in operation

– “stealing photons”

• Port Mirrors (e.g. Cisco SPAN)

– be vewy vewy careful.

• jitter, loss, reordering

– fantastic for multiple/redundant links

• multiple copies of packets


Page 26: Internet Measurement Masterclass 2006

Hardware 2

• Remember about physical layers?

• Observing traffic at end systems is pretty easy (but imposes an overhead)

• intermediate networks may not be trivial to monitor:

– Packet over Ethernet, Packet over Sonet are not the only possibilities

• Aside from weird layer-2s, maybe encrypted,

Page 27: Internet Measurement Masterclass 2006

Getting the data to somewhere useful

• Out of Band backhaul

– Co-schedule Measurements

– FedEx the disks (realistically - postgrad-u-haul)

– Co-locate storage/processing

• storage & processing = heat/power

– Dedicated backhaul, e.g. using (a piece of) the dedicated research net
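Whether to ship disks or use a link comes down to simple arithmetic: transfer time is volume in bits divided by usable link rate. The sketch below shows that calculation; the function name and the example figures (1 TB of traces, a 100 Mbit/s link) are hypothetical.

```python
def transfer_days(volume_bytes, link_bits_per_second, utilisation=1.0):
    """Days needed to move volume_bytes over a link at the given usable fraction."""
    if link_bits_per_second <= 0:
        raise ValueError("link rate must be positive")
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    seconds = volume_bytes * 8 / (link_bits_per_second * utilisation)
    return seconds / 86400  # seconds per day


if __name__ == "__main__":
    # Hypothetical case: 1 TB (10**12 bytes) over a fully usable 100 Mbit/s link.
    print(f"{transfer_days(1e12, 100e6):.2f} days")
```

In this made-up case the transfer takes 80,000 seconds, a little over 22 hours, and halving the usable share of the link doubles it. That is the point at which a courier (or a postgrad) starts to look competitive.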

Page 28: Internet Measurement Masterclass 2006

Tools

• tcpdump (libpcap) - but know the limitations

a) no records of loss

b) microsecond accuracy only - and RARELY that

c) simultaneous arrival times are possible

d) no record of precision or accuracy or filter or conditions or monitor-circumstance or equipment failure or …

• gnuplot (or any plotting package)

scatter plots are always useful (combined with eye-squared)
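The tcpdump limitations listed above follow from what the classic pcap file format stores: a global header with a magic number, version, snapshot length and link type, then per-packet records holding a timestamp and two lengths. Nothing records loss, the filter or the conditions. The sketch below reads only those headers from raw bytes to report the timestamp resolution and to count back-to-back packets with identical timestamps; the function names are illustrative and it does not handle the newer pcapng format.

```python
import struct

MAGIC_MICRO = 0xA1B2C3D4  # classic pcap: seconds + microseconds
MAGIC_NANO = 0xA1B23C4D   # variant with seconds + nanoseconds


def read_timestamps(data):
    """Return (resolution, [(seconds, fraction), ...]) from raw pcap bytes."""
    # The magic number also reveals the byte order the file was written in.
    for endian in ("<", ">"):
        (magic,) = struct.unpack(endian + "I", data[:4])
        if magic in (MAGIC_MICRO, MAGIC_NANO):
            break
    else:
        raise ValueError("not a classic pcap file")
    resolution = "microsecond" if magic == MAGIC_MICRO else "nanosecond"
    stamps = []
    offset = 24  # the global header is 24 bytes
    while offset + 16 <= len(data):
        sec, frac, incl_len, _orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16]
        )
        stamps.append((sec, frac))
        offset += 16 + incl_len  # skip the record header and the captured bytes
    return resolution, stamps


def count_repeated_timestamps(stamps):
    """Count consecutive packets that carry exactly the same timestamp."""
    return sum(1 for a, b in zip(stamps, stamps[1:]) if a == b)
```

Whatever the file claims as resolution is a unit, not a statement of accuracy, which is why the filter, clock source and hardware belong in separately kept meta-data.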

Page 29: Internet Measurement Masterclass 2006

Sharing: Providing Access to the data

• Law may prevent access

• Either need to control who gets data

OR

• Ship code to monitor (Mogul et al, MineNet 2005/6)

• One Platform: CoMo http://como.sourceforge.net


Page 30: Internet Measurement Masterclass 2006

These guys do run the Internet (or why I should be nice to my ops guys)

• Looking for a real problem?

• Wondering about actual impact?

• Talk to your front line

• Sysadmins and Operators are front-line

• They are rarely stupid

• Don’t have the time to “think outside the box”

• they will be honest with you (brutally honest in most cases)

• www.nanog.org

• www.ripe.org


Page 31: Internet Measurement Masterclass 2006

Next….

• Let’s examine hardware and Operating Systems issues, specifically:

– Windows: the other operating-system

– Data-management: how to prevent success-disaster

– So you want to monitor 10Gbps?

Page 32: Internet Measurement Masterclass 2006

Suppliers

• NetOptics - fibre splitters

• Endace - capture hardware

Page 33: Internet Measurement Masterclass 2006

UK specific resources

• Janet’s NDA and AUP: http://www.ja.net/development/traffic-data/

• Data Protection Act: http://www.hmso.gov.uk/acts/acts1998/19980029.htm

• RIPA: http://www.legislation.hmso.gov.uk/acts/acts2000/20000023.htm

Page 34: Internet Measurement Masterclass 2006

Specific references

• Mark Crovella & Bala Krishnamurthy, Internet Measurement, Wiley, 2006

• Walter Willinger, Pragmatic Approach to Dealing with High Variability, IMC 2004

• Vern Paxson, Sound Internet Measurement, IMC 2004

Very early “what I did with my measurements” papers; these papers are grandparents of much Internet measurement work

• kc claffy et al., A parameterizable methodology for Internet traffic flow profiling, IEEE JSAC, 1995

• V. Paxson, End-to-End Routing Behavior in the Internet, IEEE/ACM Transactions on Networking, Vol. 5, No. 5, pp. 601-615, October 1997