trustworthy whole-system provenance for the …...florida institute for cyber security overview 6...

24
Florida Institute for Cyber Security Trustworthy Whole-System Provenance for the Linux Kernel Adam Bates, Dave (Jing) Tian, Thomas Moyer, and Kevin R. B. Butler USENIX Security Symposium, Washington D.C., USA 13 August, 2015 In association with…

Upload: others

Post on 23-Jun-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security

Trustworthy Whole-System Provenance for the Linux Kernel

Adam Bates, Dave (Jing) Tian,Thomas Moyer, and Kevin R. B. Butler

USENIX Security Symposium, Washington D.C., USA13 August, 2015

In association with…

Page 2: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security

Provenance Matters!

2

Page 3: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security

Provenance Matters!

3

Page 4: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security

Provenance Matters!

4

“40,000 Massachusetts defendants may be affected by chemists’s alleged misdeeds” - Morgan Windsor, CNN, Aug 2013

Page 5: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security

Provenance Matters!

5

“40,000 Massachusetts defendants may be affected by chemists’s alleged misdeeds” - Morgan Windsor, CNN, Aug 2013

But is our provenance secure?

Page 6: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security

Overview

6

• Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux.

• Provenance-Based Data Loss Prevention, to monitor and control the propagation of sensitive data in enterprise environments.

• Evaluation:

• Collection agent imposes 3%-8% runtime overhead

• Provenance queries return in under 3 milliseconds

Page 7: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security

Whole-System Provenance

7

• A complete description of system Agents…

• e.g., Users, Groups

• … controlling Activities…

• e.g., Processes, Forks

• … and their interactions with Controlled Data Types.

• e.g., Inodes, Sockets, IPC, Memory

əDef: prov·en·ance \pra-v -nan(t)s\ n:

Page 8: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security

• Provenance-Aware Adversary attempts to disable collection agent, tamper with logs, etc.

• Provenance-Aware Applications can be compromised, and may lie about system events.

• Kernel is trusted on install, but can later be attacked.

• PKI stores and distributes keys for Prov-Aware Hosts.

Threat Model

8ALL T

HE PROVENANCE

Page 9: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security

Design Goals

9

1. Completeness• Gapless descriptions of system activity

2. Tamperproof• Impervious to attacks launched in user space

3. Verifiable• Formal assurance of G1, G2

4. Authenticated Channel• Tamper-evident provenance transmission

5. Secure Disclosure• Validate annotations disclosed in user space

Reference Monitor Concept

Networked Provenance

Layered Provenance

Rationale…

Page 10: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security

Design

10

• LPM architecture mirrors Linux Security Modules.

• Kernel instrumented with 170 provenance hooks.

• Modules efficiently transmit provenance to user space with relay buffer.

user space

kernel space

Prov. Module

RelayBuffer

Prov. Hooks

System Provenance

ProvenanceRecorder

Kernel layer collection agent:

Page 11: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security

Design

11

user space

kernel space

Prov. Module

RelayBuffer

Prov. Hooks

System Provenance

ProvenanceRecorder

Kernel layer collection agent:user spaceText Editor

kernel spaceopen System Call

Look Up Inode

Error Checks

DAC Checks

LSM Hook

LPM Hook

Access Inode

Examine context.Does request pass policy?Grant or deny.

Examine context.Collect provenance.If successful, grant.

LSM Module

LPM Module

"Authorized?"Yes or No

"Prov collected?"Yes or No

Example control flow through an LPM provenance hook.

Page 12: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security 12

user space

kernel space

Neo4j

Prov. Module

RelayBuffer

SNAP

GZip

SQL

Prov. Hooks

System Provenance

ProvenanceRecorder

Design

• Recorders translate provenance stream for various storage backends.

• Support recording to file, relational DBs, graph DBs.

• Upcoming: Accumulo.

Support for provenance storage:

Page 13: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security 13

user space

kernel space

Neo4j

NF Hooks

Prov. Module

RelayBuffer

SNAP

GZip

SQL

Prov. Hooks

System Provenance

ProvenanceRecorder

Design

• Message Commitment Protocol enforced with Netfilter subsystem.

• LPM performs per-packet DSA signing and verification.

• Signatures are embedded in IP Options, ensuring (nearly) universal compatibility.

Support for networked provenance-aware systems:

Page 14: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security 14

user space

kernel space

Neo4j

NF Hooks

Prov. Module

RelayBuffer

SNAP

GZip

SQL

Prov. Hooks

System Provenance

ProvenanceRecorder

DesignSupport for networked provenance-aware systems:

user spaceWeb Browser

kernel spaceTCP Send Packet

IP Send Packet

Update IP Checksum

Netfilter Hook Iterate

IPTables Hook

LPM Hook

Network Card

Examine routing table.Does request pass policy?Grant or deny.

Sign IP HDR, Payload.Embed in IP Options.Update IP Checksum.

IPTables

LPM Module

"Ok with you?"Yes or No

"Prov embedded?"Yes or No

Example control flow for authenticated packet transmission.

Page 15: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security 15

user space

kernel space

Neo4j

NF Hooks

Prov. Module

RelayBuffer

SNAP

GZip

SQL

Prov. Hooks

Prov. Aware Applications

System Provenance Workflow Provenance

ProvenanceRecorder

DesignSupport for layered provenance-aware systems:

• Kernel provenance suffers from semantic gap problem.

• Layered provenance bridges the gap, but expands attack surface.

• Authenticity and integrity of workflow provenance must be validated, but how?

Page 16: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security 16

user space

kernel space

Neo4j

NF Hooks

Prov. Module

RelayBuffer

SNAP

GZip

SQL

Prov. Hooks

IMA

TPM

Prov. Aware Applications

System Provenance Workflow Provenance

Integrity Measurements

ProvenanceRecorder

DesignSupport for layered provenance-aware systems:

• LPM includes a gateway for upgrading low integrity provenance.

• Integrity Measurement Architecture (IMA) check verifies load time integrity of application.

• Only correctly validated provenance is recorded.

Page 17: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security

Analysis of Secure Deployment1. Completeness

• Provenance hooks observe all sensitive operations performed on controlled data types.

2. Tamperproof• SELinux preserves run-time kernel integrity• Secure Boot techniques prevent booting into another kernel

3. Verifiable• By mirroring LSM hooks, LPM inherits formal analysis that ensures

complete mediation of controlled data types.

4. Authenticated Channel• Message Commitment Protocol ensures integrity and identity.

5. Secure Disclosure• IMA check verifies load time integrity of Prov-Aware Applications.

Page 18: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security

Data Loss Prevention

18

Data Loss Prevention tools take the following forms:

• Regex-Based: Fails to recognize data transformations

• Manual Labelling: Not tamper proof, may fail to handle data fusions.

• Provenance-Based: All lineage information is recorded, any sensitive ancestry can be traced.

Used

UsedUsedWasGeneratedBy WasGeneratedBy

Birth_Dates:0

SSNs:0

Training_Data:0

PII_Data:0 PII_Data.gz:0join Birth_Dates SSNs > PII_Data gzip PII_Data

3 4

1

2

Provenance Graph: Two objects are fused together to create PII, then compressed.

Page 19: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security

Evaluation: Collection Costs

19

Benchmark VanillaKernel

LPM w/Provmon Overhead

Kernel Compilation 598 sec 612 sec 2.7%

Postmark 25 sec 27 sec 7.5%

BlastSequencing 376 sec 390 sec 4.8%

Overhead is highest on I/O intensive tasks with frequent file creation, deletion, and open.

Costs amortize over reads and writes.

Page 20: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security

Raw volume generated by

the kernel.

Evaluation: Collection Costs

20

Storage overhead is high, but consistent with other system layer provenance/audit tools. Compression techniques can be used to reduce storage burden.

0

1

2

3

4

5

0 2 4 6 8 10

Stor

age

Cos

t (G

Byte

s)

Time (Minutes)

Uncompressed StreamCompressed Stream

In-Memory Graph

Max. Space Efficiency

Max. Query Efficiency

Page 21: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security

Evaluation: Query Costs

21

PB-DLP Ancestry Queries for Inodes in a 6 million node graph. (Only inodes with over 50 ancestors were considered)

0.95

0.96

0.97

0.98

0.99

1

0 5 10 15 20 25

Cum

ulat

ive

Den

sity

Response Time (Milliseconds)

99% of queries return in 2.5 ms or less!

Worst case: 17,696 nodesReturns in 21 ms

Page 22: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security

0

1000

2000

3000

4000

5000

Iperf Performance

Thro

ughp

ut (

Mbp

s)

VanillaLPM

ProvmonBatch Sig

Evaluation: Network Prov.

22

IPerf TCP benchmarks of Message Commitment Protocol.

Alternatives: SSL or IPSec, which require app rewriting.

90% throughput reduction with message commitment protocol…

Batch signatures c o u l d r e d u c e performance cost.

Page 23: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security

Takeaways

23

• identify the requirements for trustworthy provenance in distributed, heterogeneous environments.

• design, implement, and deploy the first fully-realized provenance monitor.

• propose a mechanism for provenance-based data loss prevention that offers improved capabilities over existing enterprise systems.

In this work, we…

Page 24: Trustworthy Whole-System Provenance for the …...Florida Institute for Cyber Security Overview 6 • Linux Provenance Modules (LPM), for trustworthy provenance monitors in Linux

Florida Institute for Cyber Security

Questions?

24

Thank you for your time. [email protected]

LPM is available at http://linuxprovenance.org