Strengthening Forensic Investigations of Child Pornography on P2P Networks

Marc Liberatore (1), Brian Neil Levine (1), Clay Shields (2)

(1) University of Massachusetts Amherst
(2) Georgetown University

Conference on emerging Networking EXperiments and Technologies (CoNEXT 2010)


Outline

Measurement, forensics, and investigations

Measurements of P2P distribution of child pornography (CP)

Tagging, a technique for improving the value of evidence

Measurement vs. Forensics

Network measurement is a sampling of relevant information about a network. Network measurement aims to meet a scientific standard.

Forensic measurement is a set of measurements used to establish identity, intent, and actions. Forensic measurement aims to meet a legal standard.

RoundUp — A Tool for P2P Investigations

We built and deployed RoundUp, a tool for forensic measurement of Gnutella.

RoundUp is in use by over 600 investigators in ICAC, as well as at the FBI.

RoundUp measures Gnutella traffic, and can forensically measure specific traffic.

(Liberatore, Erdely, Kerle, Levine, and Shields in [DFRWS2010])

Finding Candidates

Goal

Find evidence of a crime through observations on the Internet.

Evidence:

may be direct or hearsay

includes files of interest, hash values, filenames

is ultimately associated with a user (IP address? GUID?)

Use the p2p system to find candidates for further investigation.

This process is measurement!

Evidence

A candidate is chosen for further investigation by jurisdiction, type/quantity of files, and observed history.

The investigator directly connects to:

determine all files shared by a peer

find other corroborating evidence (IP, GUID, vendor id)

perform a single-source download

This process should be forensic measurement!

Subpoena and Search Warrant

Network investigation done; shoe-leather work remains:

Subpoena ISP for DHCP records / billing information

Search warrant for premises — written broadly

Once on site:

Examine media and seize if appropriate

Validate that evidence on media corresponds to network observations

Identifying Offenders

Investigators use observed IPs to obtain search warrants.

Investigators use network (IP) and application (GUID, PeerID) identifiers to identify offenders, link observations, and discern intent.

What did investigators observe?

How good (reliable, consistent, etc.) are IPs and GUIDs?

Measurement Summary

From 2009-10-05 through 2010-03-02:

3.07 million IP addresses

799,556 GUIDs

19,000 distinct items of CP (by hash)

[Figure: number of GUIDs with ≥ x known CP files (y-axis, log scale) versus number of known child pornography files shared (x-axis, log scale). Series: all GUIDs; GUIDs in US.]

[Figure: F_c(x) (y-axis, 0 to 1) versus distinct dates seen (x-axis, log scale). Series: GUIDs with at least 10 CP files; GUIDs with at least 3 CP files; GUIDs with at least 1 CP file.]

Gnutella IDs

Many GUIDs are mapped 1:1 to IPs – but not all!

[Figure: number of GUIDs with ≥ x IPs (or cities) (y-axis, log scale) versus IPs (or cities) per GUID (x-axis, log scale). Series: cities per GUID; IPs per GUID.]
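The curve described on this slide can be derived from raw observations roughly as follows. This is a minimal sketch, not RoundUp's implementation: it assumes observations arrive as (GUID, IP) pairs, and the function names and sample values are hypothetical.

```python
from collections import defaultdict


def ips_per_guid(observations):
    """Count the distinct IP addresses seen for each GUID.

    `observations` is an iterable of (guid, ip) pairs (assumed format).
    """
    seen = defaultdict(set)
    for guid, ip in observations:
        seen[guid].add(ip)
    return {guid: len(ips) for guid, ips in seen.items()}


def guids_with_at_least(counts, x):
    """Number of GUIDs observed with at least x distinct IPs."""
    return sum(1 for c in counts.values() if c >= x)


# Hypothetical sample: one GUID is 1:1 with an IP, another is not.
sample = [
    ("guid-a", "192.0.2.1"),
    ("guid-a", "192.0.2.1"),
    ("guid-b", "192.0.2.2"),
    ("guid-b", "198.51.100.9"),
    ("guid-b", "203.0.113.5"),
]
counts = ips_per_guid(sample)
```

Evaluating `guids_with_at_least(counts, x)` over a range of x gives one point per x on the plotted curve; swapping the pair order gives the GUIDs-per-IP view.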

BitTorrent IDs

Same trends are present in BitTorrent:

[Figure: number of PeerIDs with ≥ x IPs (or cities) (y-axis, log scale) versus IPs (or cities) per PeerID (x-axis, log scale). Series: cities per PeerID; IPs per PeerID.]

IPs to GUIDs Also Unreliable

Again, many are 1:1, but not all.

[Figure: number of IPs with ≥ x app-level IDs (y-axis, log scale) versus app-level IDs per IP address (x-axis, log scale). Series: Gnutella GUIDs; BitTorrent PeerIDs.]

What’s Going On Here?

Many anomalies can be explained:

One GUID observed in 329 cities, using 398 IP addresses: actually a botnet

Many GUIDs stay in the same geographic area: mobile users

IPs with several GUIDs may be NAT

Some clients generate new IDs per download

Tor

But we know this list isn’t exhaustive.

And we can’t always map anomalies to explanations.

What to Do?

Recall goals: to identify offenders, link observations, discern intent.

Proposed Solution

Tag network traffic such that recoverable markings are left on suspect’s machine.

An Analogy

Pay drug dealers with marked bills, recover bills on arrest.

The Tagging Process

Deliver data to remote clients with tagged bits; recover bits on arrest. Key concerns:

Finding appropriate vectors for tag delivery.

Ensuring tags are covert.

Quantifying the false positive rate.

Vectors and Covert Tags

An ideal vector allows arbitrary input, persists indefinitely, and is detrimental to disable.

We’ll take what we can get, for example:

BitTorrent peer IP caches

DNS cache entries

p2p payload data

log files

Ideally we’d find them by automated (static?) analysis.

We’ll tag with bit strings that have no overt meaning.

Example Tags

BitTorrent peer caches store IPs:

    {'ip': '83.253.52.14',
     'port': 6886,
     'prot': 1,
     'src': 'Tracker'}, ... },
    {'ip': '87.7.101.196',
     'port': 54650,
     'prot': 1,
     'src': 'PeerExchange'},
    ...

Values can be added to a peer’s cache through peer exchange.

Investigators can use these IPs (which may be spoofed) as a tag.
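To make the idea of an IP address serving as a tag concrete, here is a minimal sketch. It assumes the tag is a 32-bit integer carried in a single IPv4 address and that recovered cache entries are dictionaries with an 'ip' key, as in the listing on this slide. The function names are hypothetical, and the sketch ignores how the entry is delivered and which address ranges a client would accept.

```python
import ipaddress


def tag_to_ipv4(tag: int) -> str:
    """Encode a 32-bit tag as a dotted-quad IPv4 string."""
    if not 0 <= tag < 2**32:
        raise ValueError("tag must fit in 32 bits")
    return str(ipaddress.IPv4Address(tag))


def ipv4_to_tag(ip: str) -> int:
    """Decode a dotted-quad IPv4 string back into its 32-bit integer."""
    return int(ipaddress.IPv4Address(ip))


def find_tags(cache_entries, known_tags):
    """Return the known tags present among recovered peer-cache entries."""
    recovered = {ipv4_to_tag(entry["ip"]) for entry in cache_entries}
    return sorted(known_tags & recovered)
```

A 32-bit tag is short, which is why the false positive analysis later in the talk matters for this vector.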

More Tags

Vuze log files record all unknown PeerIDs:

    - [2009] Log File Opened for Vuze 4.2.0.2
    - [0406 09:16:22] unknown_client [LTEP]:
      "Unknown KG/2.2.2.0" / "KGet/2.2.2"
      [4B4765742F322E322E32],
      Peer ID: 2D4B47323232302D494775533761494E45425245
    - [0406 09:22:14] mismatch_id [LTEP]:
      "BitTorrent SDK 2.0.0.0" / "BitTorrent SDK 2.0"
      [426974546F7272656E742053444B20322E30],
      Peer ID: 2D4245323030302D275951473141595027646262

PeerIDs are arbitrary, 20-byte values.

sha1("Detective John Doe, case #1234, ...") would make a great PeerID-based tag.
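A SHA-1 digest is exactly 20 bytes, which is why it fits the PeerID field. The sketch below shows one way such a tag could be derived and rendered in the uppercase hex form seen in the log excerpt. The description string and function names are hypothetical placeholders, not values from the talk.

```python
import hashlib


def peer_id_tag(description: str) -> bytes:
    """Derive a 20-byte PeerID-sized tag from a free-text description."""
    return hashlib.sha1(description.encode("utf-8")).digest()


def as_log_hex(peer_id: bytes) -> str:
    """Render a PeerID as uppercase hex, matching the log excerpt's style."""
    return peer_id.hex().upper()


# Hypothetical description; a real one would be chosen by the investigator.
tag = peer_id_tag("Investigator A, case 0000 (hypothetical)")
tag_hex = as_log_hex(tag)
```

Because the digest has no overt meaning, only someone who knows the original description can show that a recovered PeerID corresponds to it.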

False Positive Rate

Let tags be of length $n$.

Assume a priori a number of taggable events $T = 2^{n/f}$, where $f > 1$.

If an investigator recovers $L$ candidate tags from a machine:

$$\Pr\{\text{False positive}\} = 1 - \Pr\{\text{no candidate matches}\} = 1 - \left(1 - \frac{2^{n/f}}{2^{n}}\right)^{L}$$

But often vectors have small $n$: if $L = 2000$ and $n \le 32$, the chance of a false positive is greater than 3%. Too high?
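The figure quoted above can be checked numerically. The slide does not state the value of f behind the 3% figure; the sketch below assumes f = 2, which reproduces roughly 3% for n = 32 and L = 2000. The function name is hypothetical.

```python
import math


def false_positive_rate(n: int, f: float, L: int) -> float:
    """Pr{false positive} = 1 - (1 - 2^(n/f) / 2^n)^L.

    n: tag length in bits; f: divisor giving T = 2^(n/f) used tags;
    L: number of candidate tags recovered from the machine.
    """
    p_match = 2.0 ** (n / f - n)  # chance one candidate equals a used tag
    # log1p/expm1 keep precision when p_match is very small.
    return -math.expm1(L * math.log1p(-p_match))


# f = 2 is an assumption, not a value given on the slide.
rate = false_positive_rate(n=32, f=2, L=2000)
```

Under that assumption `rate` comes out just above 0.03, and shorter tags make it larger.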

[Figure: tagging table. Of the $2^{n}$ possible tags of length $n$, $T = 2^{n/f}$ tags are ever used and $2^{n} - 2^{n/f}$ are never used.]

Alternate Tagging Techniques: Ordered Subsets

Solution: Break each tag into k subtags that fit constraints.

Subtags can be stored in a preserved order (e.g., a log file):

$$\Pr\{\text{False positive}\} = 1 - \Pr\{\text{no full tag matches}\} \le 1 - \left(1 - \binom{L}{k}\frac{1}{2^{n}}\right)^{2^{n/f}}$$
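The ordered-subtag bound can be evaluated directly. This is a minimal sketch; the function name and the parameter values in the example are hypothetical and chosen only for illustration.

```python
import math


def ordered_subtag_bound(n: int, f: float, k: int, L: int) -> float:
    """Upper bound 1 - (1 - C(L, k) / 2^n)^(2^(n/f)) on Pr{false positive}.

    n: total tag length in bits; f: divisor giving 2^(n/f) used tags;
    k: subtags per tag; L: subtags recovered in preserved order.
    """
    p = math.comb(L, k) / 2.0**n  # per-tag match chance (union bound)
    if p >= 1.0:
        return 1.0  # the bound is vacuous for these parameters
    used_tags = 2.0 ** (n / f)
    return -math.expm1(used_tags * math.log1p(-p))


# Illustrative values only: a 64-bit tag split into 2 subtags.
bound = ordered_subtag_bound(n=64, f=2, k=2, L=2000)
```

With these illustrative values the bound is on the order of 10^-4, well below the single short tag case on the previous slide.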

Without ordering, there are several other approaches.

Alternate Tagging Techniques: Unordered Subsets

We can subtag k times per observation

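$$\Pr\{\text{F.P.}\} = \Pr\{k \text{ or more of } L \text{ subtags match}\} = 1 - \sum_{i=0}^{k-1} \binom{L}{i} \left(2^{\frac{n}{fk} - \frac{n}{k}}\right)^{i} \left(1 - 2^{\frac{n}{fk} - \frac{n}{k}}\right)^{L-i} \qquad (1)$$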

We can reserve bits to impose order:

$$\Pr\{\text{F.P.}\} = 1 - \Pr\{\text{none of } (L/k)^{k} \text{ subtags match}\} = 1 - \left(1 - \frac{2^{rk/f}}{2^{rk}}\right)^{(L/k)^{k}}$$

Subtags can contain implicit ordering (e.g., fixed CIDR bits): a special case of built-in reserved bits.
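Equation (1) is a binomial tail and can be computed as written. The sketch below is a direct transcription under the slide's symbols; the function name is hypothetical. Because it subtracts a sum from 1, it loses precision once the result approaches floating-point rounding error.

```python
import math


def unordered_subtag_fp(n: int, f: float, k: int, L: int) -> float:
    """Equation (1): Pr{k or more of L recovered subtags match}.

    Each subtag matches with probability p = 2^(n/(f*k) - n/k).
    """
    p = 2.0 ** (n / (f * k) - n / k)
    fewer_than_k = sum(
        math.comb(L, i) * p**i * (1.0 - p) ** (L - i) for i in range(k)
    )
    return 1.0 - fewer_than_k


# With k = 1 the expression reduces to the single-tag formula.
single = unordered_subtag_fp(n=32, f=2, k=1, L=2000)
```

Only the first k terms are summed, so the binomial coefficients stay small even for large L.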

More Bits, Lower FP Probability

[Figure: probability of false positive (y-axis, log scale, 10^-13 to 10^-1) versus bits per subtag (x-axis, 0 to 30). Series: sequenced subtags (A); labeled subtags (B2); k subtags in set (B1); CIDR blocks.]

More Bits, More Taggable Sessions

[Figure: number of taggable sessions (y-axis, log scale, 10^2 to 10^14) versus bits per subtag (x-axis, 0 to 30). Series: sequenced subtags (A); labeled subtags (B2); k subtags in set (B1); CIDR blocks.]

Conclusions

Forensic measurements have different standards and goals from typical network measurements.

Network and application-level identifiers may suffice for probable cause, but are not 100% reliable.

Tagging allows for the flexible creation of forensically verifiable identifiers.

Acknowledgments

This work was supported in part by National Institute of Justice Award 2008-CE-CX-K005 and in part by the National Science Foundation awards CNS-0905349, CNS-1018615 and DUE-0830876. The opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect those of their employers, the U.S. Department of Justice, the National Science Foundation, or ICAC.
