nsa skynet
TRANSCRIPT
-
8/9/2019 NSA SKYNET
1/40
-
8/9/2019 NSA SKYNET
2/40
-
8/9/2019 NSA SKYNET
3/40
-
8/9/2019 NSA SKYNET
4/40
-
8/9/2019 NSA SKYNET
5/40
-
8/9/2019 NSA SKYNET
6/40
-
8/9/2019 NSA SKYNET
7/40
-
8/9/2019 NSA SKYNET
8/40
-
8/9/2019 NSA SKYNET
9/40
-
8/9/2019 NSA SKYNET
10/40
-
8/9/2019 NSA SKYNET
11/40
-
8/9/2019 NSA SKYNET
12/40
-
8/9/2019 NSA SKYNET
13/40
-
8/9/2019 NSA SKYNET
14/40
-
8/9/2019 NSA SKYNET
15/40
-
8/9/2019 NSA SKYNET
16/40
-
8/9/2019 NSA SKYNET
17/40
-
8/9/2019 NSA SKYNET
18/40
-
8/9/2019 NSA SKYNET
19/40
-
8/9/2019 NSA SKYNET
20/40
-
8/9/2019 NSA SKYNET
21/40
-
8/9/2019 NSA SKYNET
22/40
Given a handful of courier selectors, can we find othersthat “behave similarly” by analyzing GSM metadata?
It’s worth noting that:• we are looking for
different people usingphones in similar ways• without using any call
chaining techniquesfrom known selectors
• by scanning throughall selectors seen inPakistan that have notleft Af/Pak (~55M)
TOP SECRET//COMINT//REL TO USA, FVEY
TOP SECRET//COMINT//REL TO USA, FVEY
-
8/9/2019 NSA SKYNET
23/40
-
8/9/2019 NSA SKYNET
24/40
This presentation describes our search forAQSL couriers using behavioral profiling
Preliminary SIGINT Findings
Cross Validation Experimenton AQSL Couriers
Behavioral Feature Extraction
TOP SECRET//COMINT//REL TO USA, FVEY
TOP SECRET//COMINT//REL TO USA, FVEY
-
8/9/2019 NSA SKYNET
25/40
Counting unique UCELLIDs shows that courierstravel more often than typical Pakistani selectors
TOP SECRET//COMINT//REL TO USA, FVEY
TOP SECRET//COMINT//REL TO USA, FVEY
TOP SECRET//COMINT//REL TO USA FVEY
-
8/9/2019 NSA SKYNET
26/40
By examining multiple features at once, we can see someindicative behaviors of our courier selectors
TOP SECRET//COMINT//REL TO USA, FVEY
TOP SECRET//COMINT//REL TO USA, FVEY
-
8/9/2019 NSA SKYNET
27/40
TOP SECRET//COMINT//REL TO USA FVEY
-
8/9/2019 NSA SKYNET
28/40
Now, we’ll describe a cross validation experimenton the AQSL selectors that we were provided
Cross Validation Experimenton AQSL Couriers
TOP SECRET//COMINT//REL TO USA, FVEY
TOP SECRET//COMINT//REL TO USA, FVEY
TOP SECRET//COMINT//REL TO USA FVEY
-
8/9/2019 NSA SKYNET
29/40
Our initial detector uses the centroid of the AQSLcouriers to “find other selectors like these”
AQSL Cross-ValidationExperiment• 7 MSISDN/IMSI pairs• Hold each pair out
and score them whentraining the centroidon the rest
• Assume that random
draws of Pakistaniselectors arenontargets
• How well do we do?
TOP SECRET//COMINT//REL TO USA, FVEY
TOP SECRET//COMINT//REL TO USA, FVEY
TOP SECRET//COMINT//REL TO USA FVEY
-
8/9/2019 NSA SKYNET
30/40
Our initial detector uses the centroid of the AQSLcouriers to “find other selectors like these”
AQSL Cross-ValidatExperiment• Initial experiments
showed EER in
10-20% range• Here, performance
much worse againthese nontargets:
• Seen in Pakistan• Not seen outside
Af/Pak• Not FVEY selecto
TOP SECRET//COMINT//REL TO USA, FVEY
TOP SECRET//COMINT//REL TO USA, FVEY
TOP SECRET//COMINT//REL TO USA FVEY
-
8/9/2019 NSA SKYNET
31/40
Statistical algorithms are able to find the couriers at verylow false alarm rates, if we’re allowed to miss half of them
Random ForestClassifier • 7 MSISDN/IMSI pairs• Hold each pair out and
then try to find them af learning how to distingremaining couriers froother Pakistanis(using 100k random selectors here)
• Assume that randomdraws of Pakistaniselectors are nontarge
• 0.18% False Alarm Ra50% Miss Rate
better
TOP SECRET//COMINT//REL TO USA, FVEY
TOP SECRET//COMINT//REL TO USA, FVEY
TOP SECRET//COMINT//REL TO USA FVEY
-
8/9/2019 NSA SKYNET
32/40
We’ve been experimenting with several errormetrics on both small and large test sets
100k Test Selectors 55M Test Selectors
Training Data Classifier Features
False AlarmRate at 50%
Miss Rate
MeanReciprocal
Rank
TaskedSelectors in
Top 500
TaskedSelectors in
Top 100
None Random None 50%1/23k
(simulated)
0.64
(active/Pak)
0.13
(active/Pak)
KnownCouriers
CentroidAll 20% 1/18k
Outgoing
43% 1/27k
RandomForest
0.18% 1/9.9 5 1+ AnchorySelectors
Random Forest:• 0.18% false alarm rate at 50% miss rate• 7x improvement over random performance when
evaluating its tasked precision at 100
TOP SECRET//COMINT//REL TO USA, FVEY
TOP SECRET//COMINT//REL TO USA, FVEY
TOP SECRET//COMINT//REL TO USA, FVEY
-
8/9/2019 NSA SKYNET
33/40
To get more training data we scraped selectors from S2I11Anchory reports containing keyword “courier”
Anchory Selectors• Searched for reports
containing “S2I11”
AND “courier”• Filtered out non-mobile
numbers and keptselectors with“interesting” travelpatterns seen inSmartTracker
TOP SECRET//COMINT//REL TO USA, FVEY
TOP SECRET//COMINT//REL TO USA, FVEY
TOP SECRET//COMINT//REL TO USA, FVEY
-
8/9/2019 NSA SKYNET
34/40
Adding selectors from Anchory reports to the training datareduced the false alarm rates even further
Anchory Selectors• Searched for reports
containing “S2I11”
AND “courier”• Filtered out non-mob
numbers and keptselectors with“interesting” travelpatterns seen inSmartTracker
TOP SECRET//COMINT//REL TO USA, FVEY
TOP SECRET//COMINT//REL TO USA, FVEY
TOP SECRET//COMINT//REL TO USA, FVEY
-
8/9/2019 NSA SKYNET
35/40
We’ve been experimenting with several errormetrics on both small and large test sets
100k Test Selectors 55M Test Selectors
Training Data Classifier Features
False AlarmRate at 50%
Miss Rate
MeanReciprocal
Rank
TaskedSelectors in
Top 500
TaskedSelectors in
Top 100
None Random None 50%1/23k
(simulated)
0.64
(active/Pak)
0.13
(active/Pak)
KnownCouriers
CentroidAll 20% 1/18k
Outgoing
43% 1/27k
RandomForest
0.18% 1/9.9 5 1+ AnchorySelectors
0.008% 1/14 21 6
Random Forest trained on Known Couriers + Anchory Selectors:• 0.008% false alarm rate at 50% miss rate• 46x improvement over random performance when
evaluating its tasked precision at 100
,
TOP SECRET//COMINT//REL TO USA, FVEY
TOP SECRET//COMINT//REL TO USA, FVEY
-
8/9/2019 NSA SKYNET
36/40
Now, we’ll investigate some findings after running theseclassifiers on +55M Pakistani selectors via MapReduce
Preliminary SIGINT Findings
,
TOP SECRET//COMINT//REL TO USA, FVEY
-
8/9/2019 NSA SKYNET
37/40
-
8/9/2019 NSA SKYNET
38/40
-
8/9/2019 NSA SKYNET
39/40
TOP SECRET//COMINT//REL TO USA, FVEY
-
8/9/2019 NSA SKYNET
40/40
Preliminary results indicate that we’re on theright track, but much remains
Cross Validation Experiment: – Random Forest classifier operating at
0.18% false alarm rate at 50% miss – Enhancing training data with Anchory
selectors reduced that to 0.008% – Mean Reciprocal Rank is ~1/10
Preliminary SIGINT Findings:
– Behavioral features helped discoversimilar selectors with “courier-like”travel patterns
– High number of tasked selectors atthe top is hopefully indicative of thedetector performing well “in the wild”