ConceptDoppler : A Weather Tracker for Internet Censorship
Presenter : 장 공 수


Post on 20-Jan-2016


Page 1: ConceptDoppler : A Weather Tracker for Internet censorship Presenter : 장 공 수

ConceptDoppler : A Weather Tracker for Internet censorship

Presenter : 장 공 수


Hanyang Univ. Computer Security Lab.

Paper Information

Title : ConceptDoppler : A Weather Tracker for Internet Censorship

Authors : Jedidiah R. Crandall, Daniel Zinn, Michael Byrd, Earl Barr, Rich East

Published : ACM Conference on Computer and Communications Security (CCS), 2007


Contents

1. INTRODUCTION

2. PROBING THE GFC

3. LSA-BASED PROBING

4. FUTURE WORK

5. CONCLUSIONS


1. Introduction(1/3)

■ Internet Censorship in China

• Called the “Great Firewall of China,” or “Golden Shield”
– IP address blocking
– DNS redirection
– Legal restrictions
– etc.
– Keyword filtering

• Blog servers, chat, HTTP traffic

All probing can be performed from outside of China.


1. Introduction(2/3)

■ This Research Has Two Parts

• Where is the keyword filtering implemented?
– Internet measurement techniques to locate the filtering routers

• What words are being censored?
– Efficient probing via document summary techniques


1. Introduction(3/3)

■ Keyword-based Censorship

● The ability to filter keywords is an effective tool for governments that censor the Internet.

- Censorship comprises numerous techniques, including IP address blocking, DNS redirection, and a myriad of legal restrictions, but the ability to filter keywords in URL requests or HTML responses allows a fine granularity of control that achieves the censor’s goal at low cost. (Manually filtering web content can also be precise but is prohibitively expensive.)

● Censorship is an economic activity.

- The Internet has economic benefits, and censorship methods blunter than keyword filtering, such as blocking entire web sites or services, decrease those benefits.

- For example, while the Chinese government has shut down e-mail service for entire ISPs, temporarily blocked Internet traffic from overseas universities, and could conceivably stop any flow of information, it has also been responsive to complaints about censorship from Chinese citizens.


2. Probing The GFC(1/5)

■ ConceptDoppler’s Infrastructure

They used the netfilter module Queue to capture all packets elicited by probes.

They accessed these packets in Perl and Python scripts, using SWIG to wrap the system library libipq.

They recorded all packets sent and received, in their entirety, in a PostgreSQL database.

Their experiments required the construction of TCP/IP packets.

For this they used Scapy, a Python library for packet manipulation.


2. Probing The GFC(2/5)

■ The GFC Does Not Filter Peremptorily at All Times

Slipping Filtered Keywords Through

Target : They launched probes against www.yahoo.cn for 72 hours.

Method

- They started by sending “FALUN” (a known filtered keyword) until they received RSTs from the GFC, at which point they switched to “TEST” (a word known not to be filtered) until they got a valid HTTP response to their GET request.

- After each test that provoked a RST, they waited for 30 seconds before probing with “TEST”; after tests that did not trigger RSTs, they waited for 5 seconds, then probed with “FALUN”.
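The alternating probe schedule on this slide can be sketched as a small state machine. This is only an illustration under stated assumptions: `send_probe` is a hypothetical callable (not part of the presented tooling) that returns True when a probe elicits a RST, and `sleep` is injected so the logic can be exercised without real delays or network traffic.

```python
from typing import Callable, List, Tuple

FILTERED_KEYWORD = "FALUN"  # described on the slide as a known filtered keyword
BENIGN_KEYWORD = "TEST"     # described on the slide as known not to be filtered
WAIT_AFTER_RST = 30         # seconds, as stated on the slide
WAIT_AFTER_OK = 5           # seconds, as stated on the slide


def run_probe_cycle(
    send_probe: Callable[[str], bool],
    rounds: int,
    sleep: Callable[[float], None],
) -> List[Tuple[str, bool]]:
    """Alternate between the filtered and the benign keyword.

    send_probe(keyword) is a hypothetical stand-in for the real probe: it
    returns True if a RST came back and False if a valid HTTP response did.
    """
    log: List[Tuple[str, bool]] = []
    keyword = FILTERED_KEYWORD
    for _ in range(rounds):
        got_rst = send_probe(keyword)
        log.append((keyword, got_rst))
        if got_rst:
            # A RST was provoked: back off, then check with the benign word.
            sleep(WAIT_AFTER_RST)
            keyword = BENIGN_KEYWORD
        else:
            # No RST: short pause, then try the filtered word (again).
            sleep(WAIT_AFTER_OK)
            keyword = FILTERED_KEYWORD
    return log


if __name__ == "__main__":
    scripted = iter([True, True, False, False, True])  # made-up responses
    waits: List[float] = []
    for kw, rst in run_probe_cycle(lambda kw: next(scripted), 5, waits.append):
        print(kw, "RST" if rst else "OK")
```

Counting the rounds in which “FALUN” is logged without a RST gives the number of filtered keywords that slipped through.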


2. Probing The GFC(3/5)

■ Filtering Statistics From 00:00 to 24:00

The x-axis is the time of day and the y-axis is measured in individual probes.

What is most important to notice in the figure is that there are diurnal patterns, with GFC filtering becoming less effective and sometimes letting more than one fourth of offending packets through, possibly during busy Internet traffic periods.

(A value of 0 on the x-axis of the figure corresponds to midnight (00:00) Pacific Standard Time, which is 3 in the afternoon (15:00) in Beijing.)


2. Probing The GFC(4/5)

■ Discovering GFC Routers

<Figure : GFC router discovery using TTLs>

The goal of this experiment is to identify the IP address of the first GFC router between the probing site s and t, a target web site within China, as shown in the figure. The general idea of the experiment is to increase the TTL field of the packets they send out, starting from low values corresponding to routers outside of China.

To identify GFC routers, Algorithm 1 randomly selects a target IP address from T, the list of targets compiled above.
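The TTL-stepping idea can be illustrated with a simulated path. This is a sketch, not the paper's Algorithm 1: `probe_with_ttl` is a hypothetical callable, the simulated path assumes a RST appears for every TTL at or beyond the filtering hop, and all addresses are documentation-range placeholders.

```python
import random
from typing import Callable, List, Optional, Tuple

# (address of the router that answered with ICMP time-exceeded, RST seen?)
ProbeResult = Tuple[Optional[str], bool]


def find_first_filtering_router(
    probe_with_ttl: Callable[[str, int], ProbeResult],
    target: str,
    max_ttl: int = 30,
) -> Optional[Tuple[int, Optional[str]]]:
    """Raise the TTL one hop at a time until a probe elicits a RST.

    Returns (hop, router address) for the first such hop, or None if no
    RST is seen within max_ttl hops.
    """
    for ttl in range(1, max_ttl + 1):
        router, got_rst = probe_with_ttl(target, ttl)
        if got_rst:
            return ttl, router
    return None


def make_simulated_path(routers: List[str], filter_hop: int):
    """Fake probe function for a path whose filter sits at filter_hop (1-based)."""
    def probe(target: str, ttl: int) -> ProbeResult:
        router = routers[ttl - 1] if ttl <= len(routers) else None
        # Simplifying assumption: the packet is filtered once it reaches the hop.
        return router, ttl >= filter_hop
    return probe


if __name__ == "__main__":
    rng = random.Random(0)
    targets = ["198.51.100.10", "198.51.100.20"]  # placeholder target list T
    target = rng.choice(targets)                  # random target selection
    path = ["192.0.2.1", "192.0.2.2", "203.0.113.1", "203.0.113.2"]
    print(target, find_first_filtering_router(make_simulated_path(path, 3), target))
```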


2. Probing The GFC(5/5)

<ISP Distribution of First Hops> <Filtering by hop within China>

Filtering does not always, or even principally, occur at the first hop into China’s address space: only 29.6% of filtering occurs at the first hop and 11.8% occurs beyond the third, with as many as 13 hops in one case.

Routers within CHINANET-* perform 83.3% of all filtering.

☞ GFC ≠ Firewall


3. LSA-Based Probing(1/4)

■ Discovering Blacklisted Keywords Using LSA

To test for new filtered keywords efficiently, they must try only words that are related to concepts that they suspect the government might filter.

Latent semantic analysis (LSA) is a way to summarize the semantics of a corpus of text conceptually.

■ Reason for Using LSA

They encoded the terms with UTF-8 HTTP encoding and tested each against search.yahoo.cn.com, waiting 100 seconds after a RST and 5 seconds otherwise.

A RST packet indicates that a word was filtered and is therefore on the blacklist.

Then, by manual filtering, they removed 56 false positives from the final filtered keyword list.
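The encoding step can be sketched with Python's standard library. This is an assumption-laden illustration: the host `example.com`, the path `/search`, and the query parameter name `p` are placeholders, not details taken from the slides.

```python
from urllib.parse import quote


def build_probe_request(keyword: str, host: str, path: str = "/search") -> bytes:
    """Build a minimal HTTP GET whose query string carries the keyword.

    The keyword is percent-encoded from its UTF-8 bytes. The path and the
    query parameter name "p" are hypothetical placeholders.
    """
    encoded = quote(keyword, safe="", encoding="utf-8")
    request = (
        f"GET {path}?p={encoded} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return request.encode("ascii")


if __name__ == "__main__":
    print(build_probe_request("FALUN", "example.com").decode("ascii"))
```

Non-ASCII terms become `%XX` sequences of their UTF-8 bytes, so the resulting request is pure ASCII.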


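LSA Background(1/2)

■ What is LSA?

Latent semantic analysis

The word-document model describes the occurrences of terms in documents.

■ LSA Word-document matrix X

$$
X = \begin{pmatrix}
w_{11} & \cdots & w_{1j} & \cdots & w_{1N} \\
\vdots &        & \vdots &        & \vdots \\
w_{i1} & \cdots & w_{ij} & \cdots & w_{iN} \\
\vdots &        & \vdots &        & \vdots \\
w_{M1} & \cdots & w_{Mj} & \cdots & w_{MN}
\end{pmatrix}
$$

Rows correspond to words w_1, ..., w_M and columns to documents d_1, ..., d_N.

w_ij : weight (importance) of the i-th word in the j-th document

tf_ij : count of the i-th word in the j-th document

df_i : number of documents that contain the i-th word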
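The slide names term frequency and document frequency as the ingredients of each weight but does not spell out the formula. The sketch below assumes one common choice, tf multiplied by log(N / df); the function name and the toy documents are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def tfidf_matrix(docs: List[List[str]]) -> Tuple[List[str], List[List[float]]]:
    """Return (terms, X) where X[i][j] weights term i in document j.

    Assumed weighting: tf_ij * log(N / df_i), with N the number of documents.
    Rows are terms and columns are documents, matching the slide's layout.
    """
    n_docs = len(docs)
    counts = [Counter(doc) for doc in docs]
    terms = sorted({term for doc in docs for term in doc})
    # df_i: number of documents containing term i
    df = {term: sum(1 for c in counts if term in c) for term in terms}
    matrix = [
        [c[term] * math.log(n_docs / df[term]) for c in counts]
        for term in terms
    ]
    return terms, matrix


if __name__ == "__main__":
    toy_docs = [["falun", "gong"], ["falun", "news"], ["weather", "news"]]
    terms, X = tfidf_matrix(toy_docs)
    for term, row in zip(terms, X):
        print(term, [round(w, 3) for w in row])
```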


LSA Background(2/2)

T0 : orthogonal, unit-length columns
D0 : orthogonal, unit-length columns
S0 : diagonal matrix
t : number of terms (rows) of matrix X
d : number of documents (columns) of matrix X
m : rank of matrix X (≤ min(t, d))

T : t × k
S : k × k
D’ : k × d

Example
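The symbols listed on this slide fit the standard singular value decomposition used in LSA. The equations themselves did not survive in the transcript, so the block below is a reconstruction from those definitions, assuming k ≤ m is the number of dimensions retained.

```latex
% Full decomposition of the t x d word-document matrix X (rank m)
X = T_0 \, S_0 \, D_0', \qquad
T_0 \in \mathbb{R}^{t \times m},\;
S_0 \in \mathbb{R}^{m \times m},\;
D_0' \in \mathbb{R}^{m \times d}

% Rank-k approximation: keep only the k largest singular values (k <= m)
X \approx \hat{X} = T \, S \, D', \qquad
T \in \mathbb{R}^{t \times k},\;
S \in \mathbb{R}^{k \times k},\;
D' \in \mathbb{R}^{k \times d}
```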


3. LSA-Based Probing(2/4)

■ LSA of Chinese Wikipedia

■ Start with a large corpus (Chinese-language Wikipedia)

• n = 94,863 documents and m = 942,033 terms
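Once terms are represented as vectors in the reduced LSA space, candidate keywords can be chosen by ranking terms against a seed concept. The sketch below assumes cosine similarity as the relatedness measure and uses made-up two-dimensional vectors; neither the measure nor the data is taken from the slides.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_related_terms(
    seed: str, vectors: Dict[str, Sequence[float]], top_n: int
) -> List[Tuple[str, float]]:
    """Return the top_n terms most similar to the seed term, best first."""
    seed_vec = vectors[seed]
    scored = [
        (term, cosine(seed_vec, vec))
        for term, vec in vectors.items()
        if term != seed
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    # Hypothetical term vectors in a 2-dimensional concept space.
    toy_vectors = {
        "seed": [1.0, 0.0],
        "near": [0.9, 0.1],
        "mid": [0.5, 0.5],
        "far": [0.0, 1.0],
    }
    print(rank_related_terms("seed", toy_vectors, 2))
```

Only the highest-ranked terms would then be probed, which keeps the number of probes small.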


3. LSA-Based Probing(4/4)

■ LSA Results

In total, they discovered 122 unknown keywords.


4. Future Work

■ Discovering Unknown Keywords

1. Applying LSA to larger Chinese corpora
2. Keeping the corpus up-to-date on current events
3. Technical implementation
4. Implementation possibilities
5. HTML responses
6. More complex rulesets
7. Imprecise filtering (e.g., breasts, breast cancer)

■ Internet Measurement

1. IP tunneling or traffic engineering
2. IXPs Technical implementation
3. Route dependency
4. HTML responses
5. Destination dependency


5. Conclusions

• GFC keyword filtering is more a panopticon than a firewall, motivating surveillance rather than evasion as a focus of technical research.

☞ GFC ≠ Firewall, GFC ≈ Panopticon

• Probing the GFC is arduous, motivating efficient probing via LSA.


© The New Yorker Collection 1993 Peter Steiner from cartoonlink.com. All rights reserved.

Thank you very much !!!