TRANSCRIPT
CSE 592 INTERNET CENSORSHIP
(FALL 2015)
LECTURE 07
PROF. PHILLIPA GILL – STONY BROOK UNIVERSITY
WHERE WE ARE
Admin note:
• Update on grades (coming soon!)
Last time:
• Challenges of measuring censorship
• Principles to keep in mind when designing censorship measurements
• Ethics/legality (finishing this today)
• Creative solutions to the challenges
• SpookyScan
• Questions?
TEST YOUR UNDERSTANDING
1. What types of information controls might we want to study?
2. What sort of data/applications would we want to study in each case?
3. What does the lack of ground truth mean for how we interpret censorship data?
4. What challenges arise because of the adversarial environment where we make censorship measurements?
5. How can we try to reduce the impact of these challenges on our work?
6. What data sources might we correlate with to validate censorship?
7. What is the dual-use property that censorship measurements can exploit?
8. What is an important property of communication channels for censorship?
9. Can someone explain how SpookyScan works?
TODAY
• Encore/Ethics of measuring censorship (wrap up of last time)
• Traffic Differentiation & Net Neutrality
• Glasnost
ENCORE: LIGHTWEIGHT MEASUREMENT OF WEB CENSORSHIP WITH CROSS-ORIGIN REQUESTS
Governments around the world realize the Internet is a key communication tool
• … working to clamp down on it!
How can we measure censorship?
Main approaches:
User-based testing: Give users software/tools to perform measurements
• E.g., ONI testing, ICLab
External measurements: Probe the censor from outside the country via carefully crafted packets/probes
• E.g., IPID side channels, probing the great firewall/great cannon
ENCORE: LIGHTWEIGHT MEASUREMENT OF WEB CENSORSHIP WITH CROSS-ORIGIN REQUESTS
Censorship measurement challenges:
Gaining access to vantage points
Managing user risk
Obtaining high fidelity technical data
Encore key idea: a script that has the browser query Web sites for testing
ENCORE: USING CROSS-SITE JAVASCRIPT TO MEASURE CENSORSHIP
• Basic idea: Recruit webmasters instead of vantage points
• Have the webmaster include a JavaScript snippet that causes the user's browser to fetch the sites to be tested (a sketch of such a script appears below)
• Use timing information to infer whether resources are fetched directly
• Operates in an 'opt-out' model
  • Users may have already executed the JavaScript prior to opting out
• Argument
  • Not requiring informed consent gives users plausible deniability
• Steps taken to mitigate risk
  • Include common 3rd-party domains (they're already loaded by many pages anyway)
  • Include 3rd parties that are already included on the main site
• One project option is to investigate these strategies!
Example site hosting Encore: http://www.cs.princeton.edu/~feamster/
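To make the basic idea concrete, here is a minimal, hypothetical sketch of the kind of cross-origin measurement script a webmaster might embed. The test URL and reporting endpoint are placeholders, not Encore's actual targets or collection server, and the real system is considerably more careful about what it loads and how it reports.

```typescript
// Hypothetical sketch of an Encore-style cross-origin measurement.
// The test URL and reporting endpoint are placeholders, not the real
// Encore targets or collector; this only illustrates the mechanism.
const TEST_URLS: string[] = ["http://third-party.example.com/favicon.ico"];
const REPORT_ENDPOINT = "https://collector.example.com/report";

function measure(url: string): Promise<{ url: string; loaded: boolean; ms: number }> {
  return new Promise((resolve) => {
    const img = new Image();            // cross-origin fetch via an <img> tag
    const start = performance.now();
    const finish = (loaded: boolean) =>
      resolve({ url, loaded, ms: performance.now() - start });
    img.onload = () => finish(true);    // resource loaded in the user's browser
    img.onerror = () => finish(false);  // blocked, reset, or simply unavailable
    img.src = url;
  });
}

async function runMeasurements(): Promise<void> {
  const results = await Promise.all(TEST_URLS.map(measure));
  // A single failure is ambiguous (outage vs. censorship); the server
  // aggregates many clients' reports before drawing any conclusion.
  navigator.sendBeacon(REPORT_ENDPOINT, JSON.stringify(results));
}

runMeasurements();
```

Note that the script runs from the visiting user's network vantage point, which is exactly what makes the consent question below so thorny.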
ETHICAL CONSIDERATIONS
• Different measurement techniques have different levels of risk
• In-country measurements
  • How risky is it to have people access censored sites?
  • What is the threshold for risk?
  • Risk-benefit trade-off?
  • How to make sure people are informed?
• Side channel measurements
  • Causes unsuspecting clients to send RSTs to a server
  • What is the risk?
  • Not stateful communication …
  • … but what about a censor that just looks at flow records?
  • Mitigation idea: make sure the machine being measured is not a user device
• JavaScript-based measurements
• Is lack of consent enough deniability?
TRAFFIC DIFFERENTIATION
• The act of identifying and discriminating against certain types of Internet traffic
• Example:
• Comcast + BitTorrent
Comcast's interference affects all types of content, meaning that, for instance, an independent movie producer who wanted to distribute his work using BitTorrent and his Comcast connection could find that difficult or impossible — as would someone pirating music.
THE RESULT?
WHAT EXACTLY IS TRAFFIC DIFFERENTIATION?
• Traffic is identified and performance is degraded
• How can traffic be identified?
  • IP address
  • Port
  • Host name
  • Payload
  • Flow-level characteristics (a classification sketch follows below)
• Large body of work on “traffic classification” to identify different types of traffic
  • Many products: e.g., Sandvine
• How might performance be degraded?
  • Lower-priority queues
  • Spoofing duplicate ACKs (tested but not deployed)
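As a rough illustration of identification by port, host name, and payload, here is a small hypothetical sketch. The field names and signatures are invented for this example; commercial classifiers such as Sandvine rely on far richer payload and flow-level features.

```typescript
// Hypothetical sketch: classify a flow by destination port, host name,
// and the first payload bytes. Signatures here are toy examples; real
// DPI products also use flow-level features (packet sizes, timing, ...).
interface Flow {
  dstPort: number;
  hostName?: string;        // from the HTTP Host header or TLS SNI, if visible
  firstPayload: Uint8Array; // first bytes of the first data packet
}

const ascii = (bytes: Uint8Array): string => String.fromCharCode(...bytes);

function classify(flow: Flow): string {
  if (flow.dstPort === 6881) return "bittorrent?";            // common BT port
  if (flow.hostName?.includes("youtube")) return "youtube";
  if (ascii(flow.firstPayload.slice(0, 4)) === "GET ") return "http";
  if (flow.firstPayload[0] === 0x13 &&
      ascii(flow.firstPayload.slice(1, 20)) === "BitTorrent protocol")
    return "bittorrent";                                      // BT handshake
  return "unknown";
}
```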
UNDERLYING ISSUE: NET NEUTRALITY
They want to deliver vast amounts of information over the Internet. And again, the Internet is not something that you just dump something on. It's not a big truck. It's a series of tubes.
And if you don't understand, those tubes can be filled and if they are filled, when you put your message in, it gets in line and it's going to be delayed by anyone that puts into that tube enormous amounts of material, enormous amounts of material.
NET NEUTRALITY
The principle that ISPs and governments should treat data on the Internet equally
• No discrimination (performance or cost) based on
  • User, content, site, application, etc.
• Debated since the early 2000s
• Mainly in context of last-mile providers wanting to block certain sites/protocols
• Example: A local ISP approached a colleague for a collaboration on traffic classification… guess why?
• Vint Cerf (co-inventor of IP), Tim Berners-Lee (creator of Web) speak out in favor of Net Neutrality
HISTORY OF NET NEUTRALITY IN US
• 2008 FCC serves cease and desist to Comcast in relation to BitTorrent blocking
• June 2010 US court of appeals rules that the FCC doesn't have the power to regulate ISPs' networks or their management practices
• Dec. 2010 FCC Open Internet Order: bans cable television and phone providers from preventing access to competing services (e.g., Netflix)
• 2012 variety of complaints: vs. AT&T (for restricting Facetime), Comcast (for restricting Netflix)
• Jan. 2014 court says FCC doesn’t have authority to enforce net neutrality because ISPs are not “common carriers”
  • A common carrier is liable for the goods it carries (e.g., oil pipelines)
  • ISPs are treated like common carriers but are not liable for third-party content (e.g., slander, copyright infringement)
HISTORY OF NET NEUTRALITY IN US (2)
• As of Jan. ‘14 FCC could not enforce net neutrality because ISPs were not common carriers
• Issue: should ISPs be reclassified as common carriers (under Title II of the Communications Act of 1934)?
• Feb. 2015 – FCC votes to apply common carrier status to ISPs
• Mar. 2015 – FCC published new net neutrality rules
• Net neutrality now also applies to mobile networks
ALTERNATE VIEWS ON NET NEUTRALITY
• FCC rules about “no blocking, no throttling and no paid prioritization” sound good but don't address the real problem
• Key issue: lack of competition
• If ISPs had to compete on price and service there would be incentives for them to provide good performance
• Without competition…
• … ISPs can leave congested interconnects until content providers yield and pay for private interconnects
• Two technical mechanisms:
  • Traffic differentiation: identify + degrade
  • Interconnect congestion: refuse to provide higher bandwidth
    • … forces content providers into paid private peering
    • Currently outside the scope of the FCC rules!
HOW CAN TECHNOLOGY HELP?
• Increasing transparency of traffic differentiation
• Give users tools to detect traffic differentiation when it happens
  • Glasnost (reading presentation)
  • Traffic Differentiator (ACKs: slides prepared by Arash Molavi Kakhki (NEU) & Adrian Li (SBU))
• Measure interconnect congestion
• https://www.caida.org/publications/presentations/2015/mapping_internet_interdomain_congestion_aims/mapping_internet_interdomain_congestion_aims.pdf
Goals
Reliably detect differentiation in cellular networks
• On any app traffic
• Without requiring root privileges or OS modifications
• With few assumptions about traffic characteristics or packet shaper implementations
Our approach is the only known way to test differentiation from non-rooted mobile devices
Related Work
                      Switzerland           Glasnost          Us
Applications Tested   P2P                   P2P and video     Any application
Features Tested       Packet manipulation   Performance       Both
Desktop App           Yes                   Browser plugin    Yes
Smartphone App        No                    No                Yes
Previous work explored this problem for limited protocols and in limited environments.
Other closely related work: NetDiff, NetPolice, NANO, Bonafide
Key Contributions
• Design and implementation of a traffic-differentiation detection system
• Validation of our approach using commercial shaping devices
• Evaluating statistical techniques for identifying such differentiation
• An Android app for any user to run our tests and see the results from our server
Record & Replay
(Diagram showing the two phases: 1. Record, 2. Replay)
Methodology
Pipeline: Record target application traffic using Meddle and tcpdump → Replay traffic alternately, tunneled and untunneled → Parse pcap and create transcript of packets → Analyze throughput, RTT, jitter, packet loss
Record:
• Avoid running tcpdump on users' devices.
• Utilize Meddle (a VPN proxy over IPsec) to record network traffic.
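A minimal sketch of the server-side capture, assuming a Linux VPN server whose tunnel interface is named tun0; the interface and file names are placeholders, and the real Meddle deployment differs in its details.

```typescript
// Hypothetical sketch: record a user's tunneled traffic on the VPN server
// with tcpdump, so nothing privileged has to run on the phone itself.
// The interface name ("tun0") and output file are placeholders.
import { spawn } from "node:child_process";

function recordSession(tunInterface: string, outFile: string): () => void {
  const capture = spawn("tcpdump", ["-i", tunInterface, "-w", outFile]);
  capture.stderr.on("data", (chunk) => console.error(chunk.toString()));
  return () => capture.kill("SIGINT"); // call the returned function to stop
}

const stop = recordSession("tun0", "youtube-session.pcap");
setTimeout(stop, 60_000); // capture one minute of the target app's traffic
```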
Methodology
(Pipeline as above; this step: Parse pcap and create transcript of packets)
Parse:
• Create two objects, one for the client side and one for the server side.
• Handle unrelated/noise traffic.
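A hypothetical sketch of what the parse step's output might look like; the type and function names are invented, and the real parser works on a pcap file and handles much more (retransmissions, multiple flows, and so on).

```typescript
// Hypothetical sketch of the parse step: turn captured packets into a
// client-side and a server-side transcript, dropping unrelated traffic.
// Names are invented for illustration; the real tool parses a pcap file.
interface ParsedPacket {
  flowId: string;        // e.g. "srcIP:srcPort-dstIP:dstPort"
  fromClient: boolean;   // direction of the packet
  timestampMs: number;   // offset from the start of the recording
  payload: Uint8Array;   // application-layer bytes
}

interface Transcript {
  clientPackets: ParsedPacket[]; // what the client sent, in order
  serverPackets: ParsedPacket[]; // what the server sent, in order
}

function buildTranscript(packets: ParsedPacket[], targetFlows: Set<string>): Transcript {
  const clientPackets: ParsedPacket[] = [];
  const serverPackets: ParsedPacket[] = [];
  for (const p of packets) {
    // Drop noise: background flows and empty (pure-ACK) packets.
    if (!targetFlows.has(p.flowId) || p.payload.length === 0) continue;
    (p.fromClient ? clientPackets : serverPackets).push(p);
  }
  return { clientPackets, serverPackets };
}
```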
Methodology
(Pipeline as above; this step: Replay traffic alternately, tunneled and untunneled)
Replay:
• Replay the salient features of the application traffic so that it is subject to the same differentiation from middleboxes
• Alternate tunneled and plaintext replays to control visibility for packet shapers
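A minimal sketch of the replay side under these assumptions: the recorded client payloads are re-sent to a replay server over TCP in their original order and rough timing. The host, port, and data shapes are placeholders; the real replay system also reproduces server responses, ports, and more.

```typescript
// Hypothetical sketch: re-send the recorded client payloads to a replay
// server, preserving order and rough timing, so middleboxes on the path
// see traffic that looks like the original application. Host/port are
// placeholders; each test is run once in plaintext and once tunneled.
import { connect } from "node:net";

async function replay(
  packets: { timestampMs: number; payload: Uint8Array }[],
  host: string,
  port: number,
): Promise<void> {
  const socket = connect(port, host);
  await new Promise<void>((resolve) => socket.once("connect", resolve));
  const start = Date.now();
  for (const pkt of packets) {
    // Wait until this packet's original time offset before sending it.
    const wait = pkt.timestampMs - (Date.now() - start);
    if (wait > 0) await new Promise((resolve) => setTimeout(resolve, wait));
    socket.write(pkt.payload);
  }
  socket.end();
}
```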
Methodology
(Pipeline as above; this step: Analyze throughput, RTT, jitter, packet loss)
Analyze:
• Quantify differentiation in terms of throughput, round-trip time, jitter, loss, …
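A small sketch of one part of the analysis: converting each replay's byte arrivals into per-interval throughput samples, which can then be compared between the exposed and tunneled runs (for example with the KS-style test sketched later in these slides). The 100 ms interval is an arbitrary choice for illustration.

```typescript
// Hypothetical sketch of the analysis step: bucket each replay's byte
// arrivals into fixed intervals and report throughput per interval. The
// resulting samples from the exposed and tunneled replays are compared
// to decide whether differentiation occurred.
function throughputSeries(
  samples: { timestampMs: number; bytes: number }[],
  intervalMs = 100, // arbitrary bucket size for illustration
): number[] {
  const buckets: number[] = [];
  for (const s of samples) {
    const i = Math.floor(s.timestampMs / intervalMs);
    buckets[i] = (buckets[i] ?? 0) + s.bytes;
  }
  // Bytes per interval -> bytes per second; empty buckets count as zero.
  return Array.from(buckets, (b) => ((b ?? 0) * 1000) / intervalMs);
}
```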
Proof of Concept
Replay produces traffic nearly identical to the original traffic.
Validation: VPN Overhead
VPN overhead is introduced by:
• IPsec encapsulation
• Latency added by going through the VPN server
We put the VPN and replay servers on the same machine to minimize latency.
Validation: Detectable?
(Figure: shaper effect on YouTube replay traffic)
Validation: Shaping Result
Effect of changing different parameters of YouTube traffic on detection by a commercial shaping device:

Changes in traffic | Detection (original ports) | Detection (different ports)
No changes | YouTube | YouTube
Added a packet with 1 byte of data to the beginning of traffic | HTTP | P2P
Added 1 byte of random data to the beginning of first packet | HTTP | P2P
Replaced "GET" with a random string (same size) | HTTP | P2P
Replaced "youtube" string with a random one (first packet only) | HTTP | P2P
Replaced "youtube" string with a random one (first packet, HOST header only) | YouTube | YouTube
Added one byte of random data to the end of first packet | YouTube | YouTube
Added "GET" to beginning of first packet | YouTube | YouTube
Evaluating Techniques
How to determine differentiation?
(Figure: a case where shaping is hard to detect!)
Techniques Comparison
Comparison of the two-sample KS test (NetPolice) and our weighted KS test.
We allow a difference up to a threshold t = a_max / w.
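For background, here is a minimal sketch of the plain two-sample KS statistic that both techniques build on: the largest distance between the two samples' empirical CDFs. The weighting scheme and the threshold rule t = a_max / w from the slides are not reproduced here, so treat this as background rather than the authors' exact method.

```typescript
// Minimal sketch of the two-sample Kolmogorov-Smirnov statistic: the
// maximum distance between the empirical CDFs of two samples (e.g.,
// per-interval throughputs of the exposed vs. tunneled replays). The
// weighted variant and threshold rule from the slides are not shown.
function ksStatistic(a: number[], b: number[]): number {
  const ecdf = (sample: number[]) => {
    const sorted = [...sample].sort((x, y) => x - y);
    return (x: number) => {
      let count = 0;
      for (const v of sorted) { if (v <= x) count++; else break; }
      return count / sorted.length; // fraction of the sample <= x
    };
  };
  const fa = ecdf(a);
  const fb = ecdf(b);
  let d = 0;
  for (const x of [...a, ...b]) d = Math.max(d, Math.abs(fa(x) - fb(x)));
  return d; // a large d suggests the two runs saw different performance
}

// Example: const d = ksStatistic(exposedThroughputs, tunneledThroughputs);
```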
Accuracy
Accuracy against loss
Accuracy for different apps
HOW CAN TECHNOLOGY HELP?
• Increasing transparency of traffic differentiation
• Give users tools to detect traffic differentiation when it happens
  • Glasnost (reading presentation)
  • Traffic Differentiator (ACKs: slides prepared by Arash Molavi Kakhki (NEU) & Adrian Li (SBU))
• Measure interconnect congestion
• https://www.caida.org/publications/presentations/2015/mapping_internet_interdomain_congestion_aims/mapping_internet_interdomain_congestion_aims.pdf
HANDS ON ACTIVITY
Try the Differentiation Detector app (search for "Differentiation Detector" in the Android market).