TRANSCRIPT
CSE 592 INTERNET CENSORSHIP
(FALL 2015)
LECTURE 07
PROF. PHILLIPA GILL – STONY BROOK UNIVERSITY
WHERE WE ARE
Admin note:
• Update on grades (coming soon!)
Last time:
• Challenges of measuring censorship
• Principles to keep in mind when designing censorship measurements
• Ethics/legality (finishing this today)
• Creative solutions to the challenges
• SpookyScan
• Questions?
TEST YOUR UNDERSTANDING
1. What types of information controls might we want to study?
2. What sort of data/applications would we want to study in each case?
3. What does the lack of ground truth mean for how we interpret censorship data?
4. What challenges arise because of the adversarial environment where we make censorship measurements?
5. How can we try to reduce the impact of these challenges on our work?
6. What data sources might we correlate with to validate censorship?
7. What is the dual-use property that censorship measurements can exploit?
8. What is an important property of communication channels for censorship?
9. Can someone explain how SpookyScan works?
TODAY
• Encore/Ethics of measuring censorship (wrap up of last time)
• Traffic Differentiation & Net Neutrality
• Glasnost
ENCORE: LIGHTWEIGHT MEASUREMENT OF WEB CENSORSHIP WITH CROSS-ORIGIN REQUESTS
Governments around the world realize the Internet is a key communication tool
• … working to clamp down on it!
How can we measure censorship?
Main approaches:
User-based testing: Give users software/tools to perform measurements
• E.g., ONI testing, ICLab
External measurements: Probe the censor from outside the country via carefully crafted packets/probes
• E.g., IPID side channels, probing the great firewall/great cannon
ENCORE: LIGHTWEIGHT MEASUREMENT OF WEB CENSORSHIP WITH CROSS-ORIGIN REQUESTS
Censorship measurement challenges:
Gaining access to vantage points
Managing user risk
Obtaining high fidelity technical data
Encore key idea: a script that has the browser query Web sites for testing
ENCORE: USING CROSS-SITE JAVASCRIPT TO MEASURE CENSORSHIP
• Basic idea: Recruit webmasters instead of vantage points
• Have the webmaster include a JavaScript snippet that causes the user's browser to fetch the sites to be tested (a sketch of such a script appears below)
• Use timing information to infer whether resources are fetched directly
• Operates in an 'opt-out' model
  • Users may have already executed the JavaScript prior to opting out
• Argument
  • Not requiring informed consent gives users plausible deniability
• Steps taken to mitigate risk
  • Include common 3rd-party domains (they're already loaded by many pages anyway)
  • Include 3rd parties that are already included on the main site
• One project option is to investigate these strategies!
Example site hosting Encore: http://www.cs.princeton.edu/~feamster/
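To make the basic idea concrete, here is a minimal, hypothetical sketch of the kind of cross-origin measurement script a webmaster might embed. The test URL and reporting endpoint are placeholders, not Encore's actual targets or collection server, and the real system is considerably more careful about what it loads and how it reports.

```typescript
// Hypothetical sketch of an Encore-style cross-origin measurement.
// The test URL and reporting endpoint are placeholders, not the real
// Encore targets or collector; this only illustrates the mechanism.
const TEST_URLS: string[] = ["http://third-party.example.com/favicon.ico"];
const REPORT_ENDPOINT = "https://collector.example.com/report";

function measure(url: string): Promise<{ url: string; loaded: boolean; ms: number }> {
  return new Promise((resolve) => {
    const img = new Image();            // cross-origin fetch via an <img> tag
    const start = performance.now();
    const finish = (loaded: boolean) =>
      resolve({ url, loaded, ms: performance.now() - start });
    img.onload = () => finish(true);    // resource loaded in the user's browser
    img.onerror = () => finish(false);  // blocked, reset, or simply unavailable
    img.src = url;
  });
}

async function runMeasurements(): Promise<void> {
  const results = await Promise.all(TEST_URLS.map(measure));
  // A single failure is ambiguous (outage vs. censorship); the server
  // aggregates many clients' reports before drawing any conclusion.
  navigator.sendBeacon(REPORT_ENDPOINT, JSON.stringify(results));
}

runMeasurements();
```

Note that the script runs from the visiting user's network vantage point, which is exactly what makes the consent question below so thorny.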
ETHICAL CONSIDERATIONS
• Different measurement techniques have different levels of risk
• In-country measurements
  • How risky is it to have people access censored sites?
  • What is the threshold for risk?
  • Risk-benefit trade-off?
  • How to make sure people are informed?
• Side channel measurements
  • Causes unsuspecting clients to send RSTs to a server
  • What is the risk?
  • Not stateful communication …
  • … but what about a censor that just looks at flow records?
  • Mitigation idea: make sure the machine being measured is not a user device
• JavaScript-based measurements
• Is lack of consent enough deniability?
TRAFFIC DIFFERENTIATION
• The act of identifying and discriminating against certain types of Internet traffic
• Example:
• Comcast + BitTorrent
Comcast's interference affects all types of content, meaning that, for instance, an independent movie producer who wanted to distribute his work using BitTorrent and his Comcast connection could find that difficult or impossible — as would someone pirating music.
THE RESULT?
WHAT EXACTLY IS TRAFFIC DIFFERENTIATION?
• Traffic is identified and performance is degraded
• How can traffic be identified?
  • IP address
  • Port
  • Host name
  • Payload
  • Flow-level characteristics (a classification sketch follows below)
• Large body of work on “traffic classification” to identify different types of traffic
  • Many products: e.g., Sandvine
• How might performance be degraded?
  • Lower-priority queues
  • Spoofing duplicate ACKs (tested but not deployed)
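As a rough illustration of identification by port, host name, and payload, here is a small hypothetical sketch. The field names and signatures are invented for this example; commercial classifiers such as Sandvine rely on far richer payload and flow-level features.

```typescript
// Hypothetical sketch: classify a flow by destination port, host name,
// and the first payload bytes. Signatures here are toy examples; real
// DPI products also use flow-level features (packet sizes, timing, ...).
interface Flow {
  dstPort: number;
  hostName?: string;        // from the HTTP Host header or TLS SNI, if visible
  firstPayload: Uint8Array; // first bytes of the first data packet
}

const ascii = (bytes: Uint8Array): string => String.fromCharCode(...bytes);

function classify(flow: Flow): string {
  if (flow.dstPort === 6881) return "bittorrent?";            // common BT port
  if (flow.hostName?.includes("youtube")) return "youtube";
  if (ascii(flow.firstPayload.slice(0, 4)) === "GET ") return "http";
  if (flow.firstPayload[0] === 0x13 &&
      ascii(flow.firstPayload.slice(1, 20)) === "BitTorrent protocol")
    return "bittorrent";                                      // BT handshake
  return "unknown";
}
```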
UNDERLYING ISSUE: NET NEUTRALITY
They want to deliver vast amounts of information over the Internet. And again, the Internet is not something that you just dump something on. It's not a big truck. It's a series of tubes.
And if you don't understand, those tubes can be filled and if they are filled, when you put your message in, it gets in line and it's going to be delayed by anyone that puts into that tube enormous amounts of material, enormous amounts of material.
NET NEUTRALITY
The principle that ISPs and governments should treat data on the Internet equally
• No discrimination (performance or cost) based on
  • User, content, site, application, etc.
• Debated since the early 2000s
• Mainly in context of last-mile providers wanting to block certain sites/protocols
• Example: A local ISP approached a colleague for a collaboration on traffic classification… guess why?
• Vint Cerf (co-inventor of IP), Tim Berners-Lee (creator of Web) speak out in favor of Net Neutrality
HISTORY OF NET NEUTRALITY IN US
• 2008 FCC serves cease and desist to Comcast in relation to BitTorrent blocking
• June 2010 US court of appeals rules that the FCC doesn't have the power to regulate ISPs' networks or their management practices
• Dec. 2010 FCC Open Internet Order: bans cable television and phone providers from preventing access to competing services (e.g., Netflix)
• 2012 variety of complaints: vs. AT&T (for restricting Facetime), Comcast (for restricting Netflix)
• Jan. 2014 court says FCC doesn’t have authority to enforce net neutrality because ISPs are not “common carriers”
  • A common carrier is liable for the goods it carries (e.g., oil pipelines)
  • ISPs are treated like common carriers but are not liable for third-party content (e.g., slander, copyright infringement)
HISTORY OF NET NEUTRALITY IN US (2)
• As of Jan. ‘14 FCC could not enforce net neutrality because ISPs were not common carriers
• Issue: should ISPs be reclassified as common carriers (under Title II of the Communications Act of 1934)?
• Feb. 2015 – FCC votes to apply common carrier status to ISPs
• Mar. 2015 – FCC published new net neutrality rules
• Net neutrality now also applies to mobile networks
ALTERNATE VIEWS ON NET NEUTRALITY
• FCC rules about “no blocking, no throttling and no paid prioritization” sound good but don't address the real problem
• Key issue: lack of competition
• If ISPs had to compete on price and service there would be incentives for them to provide good performance
• Without competition…
• … ISPs can leave congested interconnects until content providers yield and pay for private interconnects
• Two technical mechanisms:
  • Traffic differentiation: identify + degrade
  • Interconnect congestion: refuse to provide higher bandwidth
    • … forces content providers into paid private peering
    • Currently outside the scope of the FCC rules!
HOW CAN TECHNOLOGY HELP?
• Increasing transparency of traffic differentiation
• Give users tools to detect traffic differentiation when it happens
  • Glasnost (reading presentation)
  • Traffic Differentiator (ACKs: slides prepared by Arash Molavi Kakhki (NEU) & Adrian Li (SBU))
• Measure interconnect congestion
• https://www.caida.org/publications/presentations/2015/mapping_internet_interdomain_congestion_aims/mapping_internet_interdomain_congestion_aims.pdf
Goals
Reliably detect differentiation in cellular networks
• On any app traffic
• Without requiring root privileges or OS modifications
• With few assumptions about traffic characteristics or packet shaper implementations
Our approach is the only known way to test differentiation from non-rooted mobile devices
Related Work
                      Switzerland           Glasnost          Us
Applications Tested   P2P                   P2P and video     Any application
Features Tested       Packet manipulation   Performance       Both
Desktop App           Yes                   Browser plugin    Yes
Smartphone App        No                    No                Yes
Previous work explored this problem for limited protocols and in limited environments.
Other closely related work: NetDiff, NetPolice, NANO, Bonafide
Key Contributions
• Design and implementation of a traffic-differentiation detection system
• Validation of our approach using commercial shaping devices
• Evaluating statistical techniques for identifying such differentiation
• An Android app for any user to run our tests and see the results from our server
Record & Replay
(Diagram showing the two phases: 1. Record, 2. Replay)
Methodology
Pipeline: Record target application traffic using Meddle and tcpdump → Replay traffic alternately, tunneled and untunneled → Parse pcap and create transcript of packets → Analyze throughput, RTT, jitter, packet loss
Record:
• Avoid running tcpdump on users' devices.
• Utilize Meddle (a VPN proxy over IPsec) to record network traffic.
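A minimal sketch of the server-side capture, assuming a Linux VPN server whose tunnel interface is named tun0; the interface and file names are placeholders, and the real Meddle deployment differs in its details.

```typescript
// Hypothetical sketch: record a user's tunneled traffic on the VPN server
// with tcpdump, so nothing privileged has to run on the phone itself.
// The interface name ("tun0") and output file are placeholders.
import { spawn } from "node:child_process";

function recordSession(tunInterface: string, outFile: string): () => void {
  const capture = spawn("tcpdump", ["-i", tunInterface, "-w", outFile]);
  capture.stderr.on("data", (chunk) => console.error(chunk.toString()));
  return () => capture.kill("SIGINT"); // call the returned function to stop
}

const stop = recordSession("tun0", "youtube-session.pcap");
setTimeout(stop, 60_000); // capture one minute of the target app's traffic
```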
Methodology
(Pipeline as above; this step: Parse pcap and create transcript of packets)
Parse:
• Create two objects, one for the client side and one for the server side.
• Handle unrelated/noise traffic.
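A hypothetical sketch of what the parse step's output might look like; the type and function names are invented, and the real parser works on a pcap file and handles much more (retransmissions, multiple flows, and so on).

```typescript
// Hypothetical sketch of the parse step: turn captured packets into a
// client-side and a server-side transcript, dropping unrelated traffic.
// Names are invented for illustration; the real tool parses a pcap file.
interface ParsedPacket {
  flowId: string;        // e.g. "srcIP:srcPort-dstIP:dstPort"
  fromClient: boolean;   // direction of the packet
  timestampMs: number;   // offset from the start of the recording
  payload: Uint8Array;   // application-layer bytes
}

interface Transcript {
  clientPackets: ParsedPacket[]; // what the client sent, in order
  serverPackets: ParsedPacket[]; // what the server sent, in order
}

function buildTranscript(packets: ParsedPacket[], targetFlows: Set<string>): Transcript {
  const clientPackets: ParsedPacket[] = [];
  const serverPackets: ParsedPacket[] = [];
  for (const p of packets) {
    // Drop noise: background flows and empty (pure-ACK) packets.
    if (!targetFlows.has(p.flowId) || p.payload.length === 0) continue;
    (p.fromClient ? clientPackets : serverPackets).push(p);
  }
  return { clientPackets, serverPackets };
}
```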
Methodology
(Pipeline as above; this step: Replay traffic alternately, tunneled and untunneled)
Replay:
• Replay the salient features of the application traffic so that it is subject to the same differentiation from middleboxes
• Alternate tunneled and plaintext replays to control visibility for packet shapers
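A minimal sketch of the replay side under these assumptions: the recorded client payloads are re-sent to a replay server over TCP in their original order and rough timing. The host, port, and data shapes are placeholders; the real replay system also reproduces server responses, ports, and more.

```typescript
// Hypothetical sketch: re-send the recorded client payloads to a replay
// server, preserving order and rough timing, so middleboxes on the path
// see traffic that looks like the original application. Host/port are
// placeholders; each test is run once in plaintext and once tunneled.
import { connect } from "node:net";

async function replay(
  packets: { timestampMs: number; payload: Uint8Array }[],
  host: string,
  port: number,
): Promise<void> {
  const socket = connect(port, host);
  await new Promise<void>((resolve) => socket.once("connect", resolve));
  const start = Date.now();
  for (const pkt of packets) {
    // Wait until this packet's original time offset before sending it.
    const wait = pkt.timestampMs - (Date.now() - start);
    if (wait > 0) await new Promise((resolve) => setTimeout(resolve, wait));
    socket.write(pkt.payload);
  }
  socket.end();
}
```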
Methodology
(Pipeline as above; this step: Analyze throughput, RTT, jitter, packet loss)
Analyze:
• Quantify differentiation in terms of throughput, round-trip time, jitter, loss, …
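A small sketch of one part of the analysis: converting each replay's byte arrivals into per-interval throughput samples, which can then be compared between the exposed and tunneled runs (for example with the KS-style test sketched later in these slides). The 100 ms interval is an arbitrary choice for illustration.

```typescript
// Hypothetical sketch of the analysis step: bucket each replay's byte
// arrivals into fixed intervals and report throughput per interval. The
// resulting samples from the exposed and tunneled replays are compared
// to decide whether differentiation occurred.
function throughputSeries(
  samples: { timestampMs: number; bytes: number }[],
  intervalMs = 100, // arbitrary bucket size for illustration
): number[] {
  const buckets: number[] = [];
  for (const s of samples) {
    const i = Math.floor(s.timestampMs / intervalMs);
    buckets[i] = (buckets[i] ?? 0) + s.bytes;
  }
  // Bytes per interval -> bytes per second; empty buckets count as zero.
  return Array.from(buckets, (b) => ((b ?? 0) * 1000) / intervalMs);
}
```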
Proof of Concept
Replay produces traffic nearly identical to the original traffic.
Validation: VPN Overhead
VPN overhead is introduced by:
• IPsec encapsulation
• Latency added by going through the VPN server
We put the VPN and replay servers on the same machine to minimize latency.
Validation: Detectable?
(Figure: shaper effect on YouTube replay traffic)
Validation: Shaping Result
Effect of changing different parameters of YouTube traffic on detection by a commercial shaping device:

Changes in traffic | Detection (original ports) | Detection (different ports)
No changes | YouTube | YouTube
Added a packet with 1 byte of data to the beginning of traffic | HTTP | P2P
Added 1 byte of random data to the beginning of first packet | HTTP | P2P
Replaced "GET" with a random string (same size) | HTTP | P2P
Replaced "youtube" string with a random one (first packet only) | HTTP | P2P
Replaced "youtube" string with a random one (first packet, HOST header only) | YouTube | YouTube
Added one byte of random data to the end of first packet | YouTube | YouTube
Added "GET" to beginning of first packet | YouTube | YouTube
Evaluating Techniques
How to determine differentiation?
(Figure: a case where shaping is hard to detect!)
Techniques Comparison
Comparison of the two-sample KS test (NetPolice) and our weighted KS test.
We allow a difference up to a threshold t = a_max / w.
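For background, here is a minimal sketch of the plain two-sample KS statistic that both techniques build on: the largest distance between the two samples' empirical CDFs. The weighting scheme and the threshold rule t = a_max / w from the slides are not reproduced here, so treat this as background rather than the authors' exact method.

```typescript
// Minimal sketch of the two-sample Kolmogorov-Smirnov statistic: the
// maximum distance between the empirical CDFs of two samples (e.g.,
// per-interval throughputs of the exposed vs. tunneled replays). The
// weighted variant and threshold rule from the slides are not shown.
function ksStatistic(a: number[], b: number[]): number {
  const ecdf = (sample: number[]) => {
    const sorted = [...sample].sort((x, y) => x - y);
    return (x: number) => {
      let count = 0;
      for (const v of sorted) { if (v <= x) count++; else break; }
      return count / sorted.length; // fraction of the sample <= x
    };
  };
  const fa = ecdf(a);
  const fb = ecdf(b);
  let d = 0;
  for (const x of [...a, ...b]) d = Math.max(d, Math.abs(fa(x) - fb(x)));
  return d; // a large d suggests the two runs saw different performance
}

// Example: const d = ksStatistic(exposedThroughputs, tunneledThroughputs);
```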
Accuracy
Accuracy against loss
Accuracy for different apps
HOW CAN TECHNOLOGY HELP?
• Increasing transparency of traffic differentiation
• Give users tools to detect traffic differentiation when it happens
  • Glasnost (reading presentation)
  • Traffic Differentiator (ACKs: slides prepared by Arash Molavi Kakhki (NEU) & Adrian Li (SBU))
• Measure interconnect congestion
• https://www.caida.org/publications/presentations/2015/mapping_internet_interdomain_congestion_aims/mapping_internet_interdomain_congestion_aims.pdf
HANDS ON ACTIVITY
Try the Differentiation Detector app (search for "Differentiation Detector" in the Android market).