![Page 1: RAIN: Refinable Attack Investigation with On-demand Inter](https://reader034.vdocuments.mx/reader034/viewer/2022042415/625f28fb33c4a004a215a025/html5/thumbnails/1.jpg)
RAIN: Refinable Attack Investigation with On-demand Inter-process Information Flow Tracking
Yang Ji, Sangho Lee, Evan Downing, Weiren Wang, Mattia Fazzini, Taesoo Kim, Alessandro Orso, and Wenke Lee
ACM CCS 2017
Oct 31, 2017
More and more data breaches
[Figure: chart of data breaches per half-year, 2013-H1 to 2017-H1. Series: number of data breaches; number of breached records (millions). Source: Breach Level Index by Gemalto.]
Is attack investigation accurate?
[Figure: a process reads A, B, and C and issues several sends. Speech bubble: “Hmm, I only want C!”]
A, B, or C? Dependency confusion!
[Figure: speech bubble “Let me change the offer price.” Labels: recv, read, several write events, File archive, and three “?” marks.]
Is this file affected? Dependency confusion!
Related work
[Figure: diagram with three axes: Accuracy, Runtime Efficiency, and Analysis Efficiency]
• System-call-based
  • DTrace, Protracer, LSM, Hi-Fi
• Dynamic Information Flow Tracking (DIFT)
  • Panorama, Dtracker
• DIFT + Record replay
  • Arnold
RAIN
[Figure: diagram with three axes (Accuracy, Runtime Efficiency, Analysis Efficiency), with RAIN marked]
• We use
  • Record replay
  • Graph-based pruning
  • Selective DIFT
• We achieve
  • High accuracy
  • Runtime efficiency
  • Highly improved analysis efficiency
Threat model
• Trusts the OS
  • RAIN tracks user-level attacks.
• Tracks explicit channels
  • Side and covert channels are out of scope.
• Records all attacks from their inception
  • Hardware trojans and OS backdoors are out of scope.
Architecture
[Figure: RAIN architecture.
Target host: RAIN kernel module, customized libc. Logs flow from the target host to the analysis host.
Analysis host: provenance graph builder; triggering and reachability analysis; replay and selective DIFT.
Pipeline: coarse-level graph → (prune) → pruned sub-graph → (refine) → refined sub-graph.]
OS-level record replay
1. Records external inputs
2. Captures the thread switching from the pthread interface, not the produced internal data
3. Records system-wide executions
[Figure: a process group containing Thread 1 and Thread 2. Labels: external inputs (file, socket, randomness), internal data, IPC, thread switching (via pthread).]
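The idea of recording only external inputs and regenerating everything else during replay can be illustrated with a toy sketch. This is not how RAIN or its underlying record replay framework works internally; the `Recorder`, `Replayer`, and `workload` names are hypothetical, and thread scheduling is left out entirely.

```python
import random


class Recorder:
    """Runs the workload for real and logs every external input (hypothetical)."""

    def __init__(self):
        self.log = []

    def external_input(self, source, produce):
        value = produce()                 # perform the real, nondeterministic input
        self.log.append((source, value))  # only the input is logged
        return value


class Replayer:
    """Feeds logged inputs back instead of performing them again (hypothetical)."""

    def __init__(self, log):
        self._entries = iter(log)

    def external_input(self, source, produce):
        logged_source, value = next(self._entries)
        if logged_source != source:
            raise RuntimeError("replay diverged from the recording")
        return value                      # 'produce' is never called during replay


def workload(env):
    # Internal data (the formatted string) is recomputed, not recorded.
    nonce = env.external_input("randomness", lambda: random.getrandbits(32))
    data = env.external_input("socket", lambda: b"payload")
    return f"{nonce:08x}:{data.decode()}"


recorder = Recorder()
recorded_output = workload(recorder)
replayed_output = workload(Replayer(recorder.log))
print(recorded_output == replayed_output)
```

Because the replay consumes the same inputs in the same order, the internal data it produces matches the recorded run without ever having been stored.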
Coarse-level logging and graph building
• Keeps logging system-call events
• Constructs a graph to represent:
  • the processes, files, and sockets as nodes
  • the events as causality edges
[Figure: example graph with process P1, objects A, B, and C, and edges labeled Send, Read, and Read.
A: Attacker site
B: /docs/report.doc
C: /tmp/errors.zip
P1: /usr/bin/firefox]
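A minimal sketch of how system-call events might be turned into causality edges follows. The event tuple layout, the `INBOUND`/`OUTBOUND` sets, and the `build_graph` helper are assumptions for illustration, not RAIN's actual log format; the node names come from the example legend.

```python
# Hypothetical event format: (timestamp, process, syscall, object).
EVENTS = [
    (1, "/usr/bin/firefox", "read", "/docs/report.doc"),
    (2, "/usr/bin/firefox", "read", "/tmp/errors.zip"),
    (3, "/usr/bin/firefox", "send", "Attacker site"),
]

# Assumed classification of system calls by data direction.
INBOUND = {"read", "recv", "mmap"}   # data flows object -> process
OUTBOUND = {"write", "send"}         # data flows process -> object


def build_graph(events):
    """Return causality edges as (source, destination, syscall, timestamp)."""
    edges = []
    for timestamp, process, syscall, obj in events:
        if syscall in INBOUND:
            edges.append((obj, process, syscall, timestamp))
        elif syscall in OUTBOUND:
            edges.append((process, obj, syscall, timestamp))
    return edges


for edge in build_graph(EVENTS):
    print(edge)
```

At this coarse level the graph only says that the process read both files before sending, which is exactly where dependency confusion comes from.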
Pruning
• Does every recorded execution need replay and DIFT? No!
• Prunes the data in the graph based on trigger analysis results
  • Upstream
  • Downstream
  • Point-to-point
  • Interference
Upstream
[Figure: provenance graph with processes P1, P2, and P3 and objects A to F; edges labeled Send, Read, Write, and Mmap.
A: Attacker site
B: /docs/report.doc
C: /tmp/errors.zip
D: /docs/ctct1.csv
E: /docs/ctct2.pdf
F: /docs/loss.csv
P1: /usr/bin/firefox
P2: /usr/bin/TextEditor
P3: /bin/gzip]
Downstream
[Figure: provenance graph with processes P1, P2, and P3 and objects A to E; edges labeled Read and Write.
A: Tampered file /docs/ctct.csv
B: Seasonal report docs/s1.csv
C: Seasonal report docs/s2.csv
D: Budget report docs/bgt.csv
E: Half-year report docs/h2.pdf
P1: Spreadsheet editor
P2: Auto-budget program
P3: Auto-report program]
Point-to-point
[Figure: provenance graph with processes P1 to P4 and objects A to F; edges labeled Read, Write, and Send; markers 1 and 2.
A: Tampered file /docs/ctct.csv
B: Seasonal report docs/s1.csv
C: Seasonal report docs/s2.csv
D: Budget report docs/bgt.csv
E: Half-year report docs/h2.pdf
F: Document archive server
P1: Spreadsheet editor
P2: Auto-budget program
P3: Auto-report program
P4: Firefox browser]
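One way to read the upstream, downstream, and point-to-point analyses is as reachability queries over the provenance graph: backward from a node, forward from a node, and the intersection of the two. The sketch below takes that reading; the edge list is an invented topology (including the extra node `X`), since the figure's exact edges are not recoverable here, and timestamps are ignored.

```python
from collections import defaultdict, deque

# Hypothetical topology: an edge (u, v) means data may flow from u to v.
EDGES = [
    ("A", "P1"), ("P1", "B"), ("P1", "C"),
    ("B", "P2"), ("P2", "D"),
    ("C", "P3"), ("P3", "E"), ("E", "P4"), ("P4", "F"),
    ("X", "P2"),  # unrelated input, reaches D but not F
]


def _adjacency(edges):
    forward, backward = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        forward[src].add(dst)
        backward[dst].add(src)
    return forward, backward


def _reach(start, adjacency):
    """Breadth-first search returning every node reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def downstream(edges, source):
    """Everything the source could have influenced."""
    return _reach(source, _adjacency(edges)[0])


def upstream(edges, sink):
    """Everything that could have influenced the sink."""
    return _reach(sink, _adjacency(edges)[1])


def point_to_point(edges, source, sink):
    """Nodes lying on some path from source to sink."""
    return downstream(edges, source) & upstream(edges, sink)


print(sorted(point_to_point(EDGES, "A", "F")))
```

Only the nodes returned by such a query would then need replay and DIFT; everything else can be pruned from the workload.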
Interference
• Insight: only inbound and outbound files that interfere in a process can possibly produce causality.
  • We determine interference according to the time order of inbound and outbound I/O events.
[Figure: two examples. P2 with B and D: a Read and a Write at times t1 and t2, with t1 < t2. P3 with C, E, and F: a Read, an Mmap, and a Write at times t1, t2, and t3, with t1 < t2 < t3.]
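The time-order rule can be sketched as follows: within one process, an inbound object can only affect an outbound object if the inbound event happened first. The event layout and the `interfering_pairs` helper are assumptions for illustration, and the assignment of objects to events in the sample data is hypothetical.

```python
def interfering_pairs(events):
    """events: (timestamp, direction, object) tuples for ONE process,
    where direction is "in" (read, recv, mmap) or "out" (write, send).
    Returns (inbound_object, outbound_object) pairs that may be causally related."""
    inbound = [(t, obj) for t, direction, obj in events if direction == "in"]
    outbound = [(t, obj) for t, direction, obj in events if direction == "out"]
    return [
        (in_obj, out_obj)
        for t_in, in_obj in inbound
        for t_out, out_obj in outbound
        if t_in < t_out  # input must precede output to interfere
    ]


# Hypothetical process: reads X, writes Y, then reads Z.
print(interfering_pairs([(1, "in", "X"), (2, "out", "Y"), (3, "in", "Z")]))
# Hypothetical process: two inputs both precede one output.
print(interfering_pairs([(1, "in", "C"), (2, "in", "E"), (3, "out", "F")]))
```

In the first example Z is read only after Y was written, so the Z to Y dependency can be pruned without replaying anything.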
Refinement - selective DIFT
• Replays and conducts DIFT on the necessary part of the execution
  • Aggregation
  • Upstream
  • Downstream
  • Point-to-point
Upstream refinement
[Figure: provenance graph with processes P1, P2, and P3 and objects A to F; edges labeled Send, Read, Write, and Mmap.]
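To show why refinement resolves dependency confusion, here is a toy taint-propagation sketch over a replayed operation list. Real DIFT engines work at the instruction level; the operation format, buffer names, and `replay_with_taint` helper below are hypothetical simplifications.

```python
# Hypothetical replayed operations of one process: (operation, source, destination).
OPS = [
    ("read", "B", "buf1"),    # file B is loaded into buf1
    ("read", "C", "buf2"),    # file C is loaded into buf2
    ("copy", "buf2", "out"),  # only buf2 is copied to the output buffer
    ("send", "out", "A"),     # the output buffer is sent to A
]


def replay_with_taint(ops):
    """Propagate source labels through buffers and report what reaches each sink."""
    taint = {}  # buffer name -> set of source labels
    flows = {}  # sink name -> set of source labels that actually reached it
    for operation, src, dst in ops:
        if operation == "read":
            taint[dst] = {src}
        elif operation == "copy":
            taint[dst] = set(taint.get(src, set()))
        elif operation == "send":
            flows.setdefault(dst, set()).update(taint.get(src, set()))
        else:
            raise ValueError(f"unknown operation: {operation}")
    return flows


print(replay_with_taint(OPS))
```

A coarse system-call graph would link both B and C to A because both reads precede the send, whereas the taint labels show that only C reached the sink in this toy trace.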
Implementation summary
• RAIN is built on top of:
  • Arnold, the record replay framework
  • Dtracker (Libdft) and Dytan, the taint engines

| Host | Module | LoC |
|---|---|---|
| Target host | Kernel module | 2,200 C (Diff) |
| Target host | Trace logistics | 1,100 C |
| Analysis host | Provenance graph | 6,800 C++ |
| Analysis host | Trigger/Pruning | 1,100 Python |
| Analysis host | Selective refinement | 900 Python |
| Analysis host | DIFT Pin tools | 3,500 C/C++ (Diff) |
Evaluations
• Runtime performance
• Accuracy
• Analysis cost
• Storage footprint
Runtime overhead: 3.22% SPEC CPU2006
[Figure: per-benchmark runtime overhead (axis 0% to 160%) for bzip2, perlbench, calculix, gamess, bwaves, sjeng, omnetpp, mcf, astar, h264ref, hmmer, xalancbmk, gobmk, libquantum, sphinx3, milc, zeusmp, gromacs, leslie3d, namd, lbm, dealII, soplex, povray, GemsFDTD, tonto, wrf, and gcc. Series: Logging; Logging+Recording.]
Multi-thread runtime overhead: 5.35% SPLASH-3
[Figure: per-benchmark runtime overhead (axis 0% to 140%) for ocean-c, ocean-n, fmm, radiosity, water-n, water-s, barnes, and volrend.]
IO intensive application: less than 50%
[Figure: overhead (axis 0 to 1.6) for kernel copy, movie download, libc compilation, and Firefox session. Series: Logging; Logging+Recording.]
High analysis accuracy
Scenarios from red team exercise of DARPA Transparent Computing program

Dependency confusion rate:

| Scenario | Coarse-level | Fine-level |
|---|---|---|
| Screengrab | 90.50% | 0% |
| Cameragrab | 32% | 0% |
| Audiograb | 39.70% | 0% |
| NetRecon | 84.70% | 13% |
| Motivating Example | 67.00% | 0% |
Pruning effectiveness: ~94.2% reduction
Taint workload (#processes):

| Scenario | None | RAIN |
|---|---|---|
| Screengrab | 99 | 5 |
| Cameragrab | 141 | 19 |
| Audiograb | 310 | 11 |
| NetRecon | 138 | 13 |
| Motive Example | 720 | 34 |
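The ~94.2% figure can be reproduced from the process counts on this slide, assuming the first group of values (99, 141, 310, 138, 720) is the "None" series and the second (5, 19, 11, 13, 34) is the "RAIN" series, and that the reduction is computed over the totals. The variable names are illustrative.

```python
# Process counts read from the chart (series assignment is an assumption).
without_pruning = [99, 141, 310, 138, 720]  # "None"
with_rain = [5, 19, 11, 13, 34]             # "RAIN"

total_before = sum(without_pruning)  # 1408 processes
total_after = sum(with_rain)         # 82 processes
reduction = 1 - total_after / total_before

print(f"{total_before} -> {total_after} processes, reduction {reduction:.1%}")
# prints "1408 -> 82 processes, reduction 94.2%"
```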
Storage cost: ~4GB per day (1.5TB per year)
| Workload | Storage overhead (MB) |
|---|---|
| Per day desktop | 4000 |
| Libc compilation | 740 |
| Motive Example | 200.6 |
| NetRecon | 166.1 |
| Audiograb | 133.6 |
| Cameragrab | 105 |
| Screengrab | 113.9 |
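The yearly estimate follows from the per-day desktop figure. A quick check, assuming decimal units (1 TB = 1,000,000 MB) and a 365-day year:

```python
per_day_mb = 4000                  # "Per day desktop" storage overhead
days_per_year = 365                # assumption: desktop is used every day
mb_per_tb = 1_000_000              # assumption: decimal units

per_year_tb = per_day_mb * days_per_year / mb_per_tb
print(f"{per_year_tb:.2f} TB per year")  # prints "1.46 TB per year", roughly 1.5 TB
```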
Discussion
• Limitations
  • RAIN trusts the OS, so the kernel needs integrity protection.
  • Over-tainting issue
• Direction
  • Hypervisor-based RAIN
  • Further reduce storage overhead
Conclusion
• RAIN adopts a multi-level provenance system to facilitate fine-grained analysis that enables accurate attack investigation.
• RAIN has low runtime overhead and significantly reduced analysis cost.