trustworthy whole-system provenance for the …...florida institute for cyber security overview 6...
TRANSCRIPT
Florida Institute for Cyber Security
Trustworthy Whole-System Provenance for the Linux Kernel
Adam Bates, Dave (Jing) Tian,Thomas Moyer, and Kevin R. B. Butler
USENIX Security Symposium, Washington D.C., USA13 August, 2015
In association with…
Florida Institute for Cyber Security
Provenance Matters!
2
Florida Institute for Cyber Security
Provenance Matters!
3
Florida Institute for Cyber Security
Provenance Matters!
4
“40,000 Massachusetts defendants may be affected by chemists’s alleged misdeeds” - Morgan Windsor, CNN, Aug 2013
Florida Institute for Cyber Security
Provenance Matters!
5
“40,000 Massachusetts defendants may be affected by chemists’s alleged misdeeds” - Morgan Windsor, CNN, Aug 2013
But is our provenance secure?
Florida Institute for Cyber Security
Overview
6
• Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux.
• Provenance-Based Data Loss Prevention, to monitor and control the propagation of sensitive data in enterprise environments.
• Evaluation:
• Collection agent imposes 3%-8% runtime overhead
• Provenance queries return in under 3 milliseconds
Florida Institute for Cyber Security
Whole-System Provenance
7
• A complete description of system Agents…
• e.g., Users, Groups
• … controlling Activities…
• e.g., Processes, Forks
• … and their interactions with Controlled Data Types.
• e.g., Inodes, Sockets, IPC, Memory
əDef: prov·en·ance \pra-v -nan(t)s\ n:
Florida Institute for Cyber Security
• Provenance-Aware Adversary attempts to disable collection agent, tamper with logs, etc.
• Provenance-Aware Applications can be compromised, and may lie about system events.
• Kernel is trusted on install, but can later be attacked.
• PKI stores and distributes keys for Prov-Aware Hosts.
Threat Model
8ALL T
HE PROVENANCE
Florida Institute for Cyber Security
Design Goals
9
1. Completeness• Gapless descriptions of system activity
2. Tamperproof• Impervious to attacks launched in user space
3. Verifiable• Formal assurance of G1, G2
4. Authenticated Channel• Tamper-evident provenance transmission
5. Secure Disclosure• Validate annotations disclosed in user space
Reference Monitor Concept
Networked Provenance
Layered Provenance
Rationale…
Florida Institute for Cyber Security
Design
10
• LPM architecture mirrors Linux Security Modules.
• Kernel instrumented with 170 provenance hooks.
• Modules efficiently transmit provenance to user space with relay buffer.
user space
kernel space
Prov. Module
RelayBuffer
Prov. Hooks
System Provenance
ProvenanceRecorder
Kernel layer collection agent:
Florida Institute for Cyber Security
Design
11
user space
kernel space
Prov. Module
RelayBuffer
Prov. Hooks
System Provenance
ProvenanceRecorder
Kernel layer collection agent:user spaceText Editor
kernel spaceopen System Call
Look Up Inode
Error Checks
DAC Checks
LSM Hook
LPM Hook
Access Inode
Examine context.Does request pass policy?Grant or deny.
Examine context.Collect provenance.If successful, grant.
LSM Module
LPM Module
"Authorized?"Yes or No
"Prov collected?"Yes or No
Example control flow through an LPM provenance hook.
Florida Institute for Cyber Security 12
user space
kernel space
Neo4j
Prov. Module
RelayBuffer
SNAP
GZip
SQL
Prov. Hooks
System Provenance
ProvenanceRecorder
Design
• Recorders translate provenance stream for various storage backends.
• Support recording to file, relational DBs, graph DBs.
• Upcoming: Accumulo.
Support for provenance storage:
Florida Institute for Cyber Security 13
user space
kernel space
Neo4j
NF Hooks
Prov. Module
RelayBuffer
SNAP
GZip
SQL
Prov. Hooks
System Provenance
ProvenanceRecorder
Design
• Message Commitment Protocol enforced with Netfilter subsystem.
• LPM performs per-packet DSA signing and verification.
• Signatures are embedded in IP Options, ensuring (nearly) universal compatibility.
Support for networked provenance-aware systems:
Florida Institute for Cyber Security 14
user space
kernel space
Neo4j
NF Hooks
Prov. Module
RelayBuffer
SNAP
GZip
SQL
Prov. Hooks
System Provenance
ProvenanceRecorder
DesignSupport for networked provenance-aware systems:
user spaceWeb Browser
kernel spaceTCP Send Packet
IP Send Packet
Update IP Checksum
Netfilter Hook Iterate
IPTables Hook
LPM Hook
Network Card
Examine routing table.Does request pass policy?Grant or deny.
Sign IP HDR, Payload.Embed in IP Options.Update IP Checksum.
IPTables
LPM Module
"Ok with you?"Yes or No
"Prov embedded?"Yes or No
Example control flow for authenticated packet transmission.
Florida Institute for Cyber Security 15
user space
kernel space
Neo4j
NF Hooks
Prov. Module
RelayBuffer
SNAP
GZip
SQL
Prov. Hooks
Prov. Aware Applications
System Provenance Workflow Provenance
ProvenanceRecorder
DesignSupport for layered provenance-aware systems:
• Kernel provenance suffers from semantic gap problem.
• Layered provenance bridges the gap, but expands attack surface.
• Authenticity and integrity of workflow provenance must be validated, but how?
Florida Institute for Cyber Security 16
user space
kernel space
Neo4j
NF Hooks
Prov. Module
RelayBuffer
SNAP
GZip
SQL
Prov. Hooks
IMA
TPM
Prov. Aware Applications
System Provenance Workflow Provenance
Integrity Measurements
ProvenanceRecorder
DesignSupport for layered provenance-aware systems:
• LPM includes a gateway for upgrading low integrity provenance.
• Integrity Measurement Architecture (IMA) check verifies load time integrity of application.
• Only correctly validated provenance is recorded.
Florida Institute for Cyber Security
Analysis of Secure Deployment1. Completeness
• Provenance hooks observe all sensitive operations performed on controlled data types.
2. Tamperproof• SELinux preserves run-time kernel integrity• Secure Boot techniques prevent booting into another kernel
3. Verifiable• By mirroring LSM hooks, LPM inherits formal analysis that ensures
complete mediation of controlled data types.
4. Authenticated Channel• Message Commitment Protocol ensures integrity and identity.
5. Secure Disclosure• IMA check verifies load time integrity of Prov-Aware Applications.
Florida Institute for Cyber Security
Data Loss Prevention
18
Data Loss Prevention tools take the following forms:
• Regex-Based: Fails to recognize data transformations
• Manual Labelling: Not tamper proof, may fail to handle data fusions.
• Provenance-Based: All lineage information is recorded, any sensitive ancestry can be traced.
Used
UsedUsedWasGeneratedBy WasGeneratedBy
Birth_Dates:0
SSNs:0
Training_Data:0
PII_Data:0 PII_Data.gz:0join Birth_Dates SSNs > PII_Data gzip PII_Data
3 4
1
2
Provenance Graph: Two objects are fused together to create PII, then compressed.
Florida Institute for Cyber Security
Evaluation: Collection Costs
19
Benchmark VanillaKernel
LPM w/Provmon Overhead
Kernel Compilation 598 sec 612 sec 2.7%
Postmark 25 sec 27 sec 7.5%
BlastSequencing 376 sec 390 sec 4.8%
Overhead is highest on I/O intensive tasks with frequent file creation, deletion, and open.
Costs amortize over reads and writes.
Florida Institute for Cyber Security
Raw volume generated by
the kernel.
Evaluation: Collection Costs
20
Storage overhead is high, but consistent with other system layer provenance/audit tools. Compression techniques can be used to reduce storage burden.
0
1
2
3
4
5
0 2 4 6 8 10
Stor
age
Cos
t (G
Byte
s)
Time (Minutes)
Uncompressed StreamCompressed Stream
In-Memory Graph
Max. Space Efficiency
Max. Query Efficiency
Florida Institute for Cyber Security
Evaluation: Query Costs
21
PB-DLP Ancestry Queries for Inodes in a 6 million node graph. (Only inodes with over 50 ancestors were considered)
0.95
0.96
0.97
0.98
0.99
1
0 5 10 15 20 25
Cum
ulat
ive
Den
sity
Response Time (Milliseconds)
99% of queries return in 2.5 ms or less!
Worst case: 17,696 nodesReturns in 21 ms
Florida Institute for Cyber Security
0
1000
2000
3000
4000
5000
Iperf Performance
Thro
ughp
ut (
Mbp
s)
VanillaLPM
ProvmonBatch Sig
Evaluation: Network Prov.
22
IPerf TCP benchmarks of Message Commitment Protocol.
Alternatives: SSL or IPSec, which require app rewriting.
90% throughput reduction with message commitment protocol…
Batch signatures c o u l d r e d u c e performance cost.
Florida Institute for Cyber Security
Takeaways
23
• identify the requirements for trustworthy provenance in distributed, heterogeneous environments.
• design, implement, and deploy the first fully-realized provenance monitor.
• propose a mechanism for provenance-based data loss prevention that offers improved capabilities over existing enterprise systems.
In this work, we…
Florida Institute for Cyber Security
Questions?
24
Thank you for your time. [email protected]
LPM is available at http://linuxprovenance.org