Iris: A Scalable Cloud File System with Efficient Integrity Checks

Marten van Dijk, Ari Juels, Alina Oprea (RSA Labs); Emil Stefanov (UC Berkeley)

Upload: selena

Post on 23-Feb-2016



TRANSCRIPT

Page 1: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Iris: A Scalable Cloud File System with Efficient Integrity Checks

Marten van Dijk (RSA Labs), Ari Juels (RSA Labs), Alina Oprea (RSA Labs), Emil Stefanov (UC Berkeley)

Page 2: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Can you trust the cloud?

[Diagram: users and enterprises store data with cloud storage services: Amazon S3, EBS; Windows Azure Storage; SkyDrive; Dropbox; EMC Atmos; Mozy; iCloud; Google Storage.]

• Infrastructure bugs
• Malware
• Disgruntled employees

Page 3: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Iris File System

• Integrity verification (on the fly)
  – value read == value written (integrity)
  – value read == last value written (freshness)
  – data & metadata

• Proof of Retrievability (PoR/PDP)
  – Verifies: ALL of the data is on the cloud or recoverable
  – More on this later

• High performance (low overhead)
  – Hundreds of MB/s data rates
  – Designed for enterprises

Page 4: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Iris Deployment Scenario

[Diagram: enterprise clients connect through distributed portal(s) (appliances) to the cloud.]

• Portals: lightweight (1 to 5 portals)
• Cloud: heavyweight (TBs to PBs of data)

Page 5: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Overview: File System Tree

• Most file systems have a file-system tree.
• Contains:
  – Directory structure
  – File names
  – Timestamps
  – Permissions
  – Other attributes
• Efficiently laid out on disk (e.g., using a B-tree)

Page 6: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Overview: Merkle Trees

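• Parents contain hash of children.
• To verify an element (e.g., “y”) is in the tree: $O(\log n)$ nodes accessed

$h_a = H(B \parallel C)$, $h_b = H(D \parallel E)$, $h_c = H(\ldots)$

[Diagram: Merkle tree with root A; A has children B and C; B has children D and E; x and y are among the leaves.]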
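The membership check on this slide can be sketched as follows. This is a minimal illustration, not the Iris implementation: SHA-256, the power-of-two leaf count, and the helper names `build_levels`, `prove`, and `verify` are all assumptions made for the example.

```python
import hashlib

def H(data: bytes) -> bytes:
    # Hash function for the sketch; SHA-256 is an assumption, the slides do not name one.
    return hashlib.sha256(data).digest()

def build_levels(leaves):
    """Return all tree levels, from leaf hashes up to the root.
    The number of leaves must be a power of two in this sketch."""
    level = [H(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        # Each parent stores the hash of its two children.
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def prove(levels, index):
    """Collect one sibling hash per level on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof

def verify(leaf, proof, root):
    """Recompute the path to the root; only len(proof) = log2(n) hashes are touched."""
    node = H(leaf)
    for sibling, sibling_is_left in proof:
        node = H(sibling + node) if sibling_is_left else H(node + sibling)
    return node == root

leaves = [b"w", b"x", b"y", b"z"]
levels = build_levels(leaves)
root = levels[-1][0]
proof_y = prove(levels, 2)       # proof that "y" is in the tree
print(verify(b"y", proof_y, root))
```

The verifier only needs the root and one sibling hash per level, which is why the number of nodes accessed grows logarithmically with the tree size.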

Page 7: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Iris: Unified File System + Merkle Tree

• File system tree is also a Merkle tree

[Diagram: directory /u/ contains subdirectory v/ and files c, a, f; v/ contains files g, b, e. Layers shown: directory tree, file version tree, file blocks. A free list holds deleted subtrees.]

• Free List: stores deleted subtrees
• Binary
  – Balancing nodes
• Directory Tree
  – Root node: directory attributes
  – Leaves: subdirectories, files
• File Version Tree
  – Root node: file attributes
  – Leaves: file block version numbers

Page 8: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

File Version Tree

• Each file has a version tree
• Version numbers increase when blocks are modified.
• Version numbers propagate upwards to version tree root

[Diagram: binary version tree over blocks 0 to 7 (root range 0 : 7, with subranges such as 0 : 3, 4 : 7, 0 : 1, 4 : 5, 6 : 7). Every node starts at version v0 and is at v1 after a write.]

Page 9: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

File Version Tree

• Process repeats for every write
• Unique version numbers after each write
  – Helps ensure freshness

[Diagram: the same version tree over blocks 0 to 7 after a further write; the modified blocks and their ancestors move to v2 while the other nodes stay at v0 or v1.]
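The version propagation described on these two slides can be sketched as below. This is an illustrative model under stated assumptions: the heap-style array layout, the power-of-two block count, and the rule "new version = root version + 1" are choices made for the example, not details confirmed by the slides.

```python
class VersionTree:
    """Toy binary version tree stored as a heap array (hypothetical layout).

    Node 1 is the root, node i has children 2i and 2i+1, and the leaf for
    block b is node num_blocks + b. num_blocks must be a power of two.
    """

    def __init__(self, num_blocks):
        self.n = num_blocks
        self.version = [0] * (2 * num_blocks)   # every node starts at v0

    def write(self, block_indices):
        # Assumed rule: each write gets a fresh version, one above the root's.
        new_version = self.version[1] + 1
        for b in block_indices:
            node = self.n + b
            while node >= 1:                    # propagate up to the root
                self.version[node] = new_version
                node //= 2
        return new_version

    def block_version(self, b):
        return self.version[self.n + b]

    def root_version(self):
        return self.version[1]

tree = VersionTree(8)
tree.write(range(8))        # first write covers the whole file
tree.write([0, 1, 2])       # second write touches only blocks 0 to 2
print(tree.root_version(), tree.block_version(0), tree.block_version(7))
```

Because the root version changes on every write, a stale copy of any block or subtree no longer matches the current tree, which is the freshness property the slide refers to.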

Page 10: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Integrity Verification: MACs

• For each file, Iris generates a MAC file.
  – Later used to verify integrity of data blocks.
  – 4 KB blocks
• MAC is computed over:
  – file id, block index, version number, block data

$m_i = \mathrm{MAC}(\mathit{fid}, i, v_i, b_i)$

[Diagram: data file blocks b1, b2, b3, b4, b5, … (4 KB each) and the corresponding MAC file entries m1, m2, m3, m4, m5, … (20 bytes each).]
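A per-block MAC of this shape can be sketched with Python's standard library. HMAC-SHA1 is an assumption chosen only because it yields the 20-byte tag size shown on the slide; the slides do not say which MAC Iris uses, and the fixed-width header encoding and function names are hypothetical.

```python
import hashlib
import hmac
import struct

BLOCK_SIZE = 4096  # 4 KB blocks, as on the slide

def block_mac(key: bytes, file_id: int, index: int, version: int, block: bytes) -> bytes:
    # Fixed-width big-endian header so (fid, i, v_i) is encoded unambiguously.
    header = struct.pack(">QQQ", file_id, index, version)
    return hmac.new(key, header + block, hashlib.sha1).digest()

def verify_block(key, file_id, index, version, block, mac) -> bool:
    # Constant-time comparison of the recomputed and stored MACs.
    return hmac.compare_digest(block_mac(key, file_id, index, version, block), mac)

key = b"k" * 32
block = bytes(BLOCK_SIZE)
mac = block_mac(key, file_id=7, index=3, version=1, block=block)
print(len(mac), verify_block(key, 7, 3, 1, block, mac))
```

Binding the block index and version number into the MAC means a block moved to another position, or an old version of the same block, fails verification even though its bytes are otherwise valid.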

Page 11: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Merkle Tree Efficiency

• Many FS operations access paths in the tree
• Inefficient to access one path at a time
  – Paths share ancestor nodes
    • Accessing same nodes over and over
      – Unnecessary I/O
      – Redundant Merkle tree crypto
  – Latency bound
• Accessing paths in parallel?
  – Naïve techniques can lead to corruption
    • Same ancestor node accessed in separate threads
  – Need a Merkle tree cache
    • Very important part of our system

Page 12: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Merkle Tree Cache Challenges

• Nodes depend on each other
  – Parents contain hashes of children
  – Cannot evict parent before child
• Asynchronous
  – Inefficient: one thread per node/path
• Avoid unnecessary hashing
  – Nodes near the root of the tree often reused
• Efficient sequential file operations
  – Inefficient: accessing a full path per block adds logarithmic overhead
  – Adjacent nodes must stay “long enough” in cache.

Page 13: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Merkle Tree Cache

[State diagram: node states To Verify, Pinned, Unpinned, Compacting, Updating Hash, Ready to Write; transitions labeled reading, verifying, writing.]

Nodes are read into the tree in parallel.

Page 14: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

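Reading a Path

Path: “/u/v/b”

[Diagram: the path from /u/ through v/ to file b is read across the directory tree, file version tree, data file, and MAC file; the other entries (c, a, f, g, e) are shown alongside.]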

Page 15: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Merkle Tree Cache

[Cache state diagram repeated.]

When both siblings arrive, they are verified.

Top-down verification: parent verified before children.

Page 16: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Verification

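[Diagram: tree with node A, its children B and C, and B's children D and E; each sibling pair is verified against its already verified parent.]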

Page 17: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Merkle Tree Cache

[Cache state diagram repeated.]

Verified nodes enter “pinned” state.

Pinned nodes are used by async file system operations.

While used by at least one operation, nodes remain pinned.

Pinned nodes cannot be evicted.

Page 18: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Merkle Tree Cache

[Cache state diagram repeated.]

When a node is no longer used, it becomes “unpinned”.

Unpinned nodes are eligible for eviction.

When the cache is 75% full, eviction begins.

Page 19: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Merkle Tree Cache

[Cache state diagram repeated.]

Eviction Step #1: Adjacent nodes with identical version numbers are compacted.

Page 20: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Compacting

[Diagram: version tree over blocks 0 : 15 before and after compaction. Before: full binary tree (0 : 15; 0 : 7, 8 : 15; 0 : 3, 4 : 7, 8 : 11, 12 : 15; 8 : 9, 10 : 11, 12 : 13, 14 : 15) with nodes at v1 or v2. After: only 0 : 15, 4 : 7, 8 : 15, 8 : 9, and 14 : 15 remain.]

• Keep:
  – if version ≠ parent version
  – for balancing
• Stripped out redundant information

Often files are written sequentially and compact to a single node.
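The "keep only if version ≠ parent version" rule can be sketched as follows. This is a simplified model under assumptions: the `Node` class, the `build` helper (which gives an inner node the highest version below it), and the choice to ignore balancing nodes are all hypothetical and not taken from the slides.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    lo: int                      # first block covered by this node
    hi: int                      # last block covered by this node
    version: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def uniform(node, version):
    """True if every node in this subtree carries `version`."""
    if node is None:
        return True
    return (node.version == version
            and uniform(node.left, version)
            and uniform(node.right, version))

def compact(node):
    """Return a copy in which subtrees that only repeat their parent's version are dropped."""
    if node is None:
        return None
    kept = []
    for child in (node.left, node.right):
        if child is None or uniform(child, node.version):
            kept.append(None)            # implied by the parent: redundant
        else:
            kept.append(compact(child))  # differs somewhere below: keep
    return Node(node.lo, node.hi, node.version, kept[0], kept[1])

def count(node):
    return 0 if node is None else 1 + count(node.left) + count(node.right)

def build(lo, hi, version_of):
    """Build a full binary version tree over blocks lo..hi (assumed helper)."""
    if lo == hi:
        return Node(lo, hi, version_of(lo))
    mid = (lo + hi) // 2
    left, right = build(lo, mid, version_of), build(mid + 1, hi, version_of)
    return Node(lo, hi, max(left.version, right.version), left, right)

sequential = build(0, 15, lambda b: 2)                # whole file written at v2
mixed = build(0, 15, lambda b: 2 if b < 8 else 1)     # only first half rewritten
print(count(sequential), count(compact(sequential)), count(compact(mixed)))
```

In this model a sequentially written file collapses to its root node, which matches the slide's remark that such files often compact to a single node.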

Page 21: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Merkle Tree Cache

[Cache state diagram repeated.]

Eviction Step #2: Hashes are then updated in bottom-up order.

Page 22: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Merkle Tree Cache

[Cache state diagram repeated.]

Eviction Step #3: Nodes are written to cloud storage.

Page 23: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Merkle Tree Cache

[Cache state diagram repeated.]

Note: A node can be pinned at any time during eviction.

The path to the node becomes “pinned”.
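The node lifecycle walked through on the preceding slides can be summarized as a small state machine. This is a sketch under assumptions: the class and method names are hypothetical, reference counting is one plausible way to model "used by at least one operation", and a real cache would also pin the whole path to a node rather than the node alone.

```python
from enum import Enum, auto

class State(Enum):
    TO_VERIFY = auto()
    PINNED = auto()
    UNPINNED = auto()
    COMPACTING = auto()
    UPDATING_HASH = auto()
    READY_TO_WRITE = auto()

# Eviction steps in the order given on the slides.
EVICTION_STEPS = [State.UNPINNED, State.COMPACTING,
                  State.UPDATING_HASH, State.READY_TO_WRITE]

class CachedNode:
    def __init__(self, name):
        self.name = name
        self.state = State.TO_VERIFY   # just read from storage, not yet trusted
        self.users = 0                 # operations currently using the node

    def mark_verified(self):
        if self.state is not State.TO_VERIFY:
            raise ValueError("node already verified")
        self.state = State.PINNED
        self.users = 1                 # the operation that requested the node

    def pin(self):
        if self.state is State.TO_VERIFY:
            raise ValueError("cannot use an unverified node")
        self.users += 1
        self.state = State.PINNED      # cancels any eviction in progress

    def unpin(self):
        if self.users == 0:
            raise ValueError("node is not pinned")
        self.users -= 1
        if self.users == 0:
            self.state = State.UNPINNED

    def advance_eviction(self):
        """Move one eviction step forward; return True once the node can be written out."""
        if self.state not in EVICTION_STEPS:
            return False               # pinned or unverified nodes are never evicted
        i = EVICTION_STEPS.index(self.state)
        if i == len(EVICTION_STEPS) - 1:
            return True
        self.state = EVICTION_STEPS[i + 1]
        return False

def eviction_needed(cached, capacity, threshold=0.75):
    # Eviction begins when the cache is 75% full.
    return cached >= threshold * capacity

node = CachedNode("v/")
node.mark_verified()
node.unpin()
node.advance_eviction()        # Unpinned -> Compacting
node.pin()                     # pinned again in the middle of eviction
print(node.state)
```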

Page 24: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Merkle Tree Cache: Crucial for Real-World Workloads

• Iris benefits from locality
• Very small cache required to achieve high throughput
  – Cache size: 5 MB to 10 MB

Page 25: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Sequential Workloads

• Results
  – 250 to 300 MB/s
  – 100+ clients
• Cache
  – Minimal cache size (< 1 MB) to achieve high throughput
  – Reason: Nodes get compacted
  – Usually network bound

Page 26: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Random Workloads

• Results
  – Bound by disk seeks
• Cache
  – Minimal cache size (< 1 MB) to achieve seek-bound throughput
  – Cache only used to achieve parallelism to combat latency.
  – Reason: Very little locality.

Page 27: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Other Workloads

• Very workload dependent
• Specifically
  – Depends on number of seeks
• Iris is designed to reduce Merkle tree seek overhead via:
  – Compacting
  – Merkle tree cache

Page 28: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Proofs of Retrievability

• How can we be sure our data is still there?

• Iris continuously verifies that the cloud possesses all data

• First sublinear solution to the open problem of Dynamic Proofs of Retrievability

Page 29: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Proofs of Retrievability

• Iris verifies that cloud possesses 99.9% of data (with high probability).

• Remaining 0.1% can be recovered using Iris parity data structure.

• Custom designed error-correcting code (ECC) and parity data structure.
  – High throughput (300-550 MB/s).
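The "with high probability" claim can be illustrated with the standard spot-checking calculation. This is a generic sketch, not the analysis from Iris: it assumes blocks are challenged independently and uniformly at random (with replacement), and the slides do not say how many blocks Iris samples. The function names are hypothetical.

```python
import math

def detection_probability(corrupt_fraction: float, challenges: int) -> float:
    """Chance that at least one of `challenges` random block checks hits a bad block."""
    return 1.0 - (1.0 - corrupt_fraction) ** challenges

def challenges_needed(corrupt_fraction: float, target: float) -> int:
    """Smallest number of random checks that detects corruption with probability >= target."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - corrupt_fraction))

# If more than 0.1% of the data were missing, how many checks give 99% detection?
n = challenges_needed(0.001, 0.99)
print(n, detection_probability(0.001, n))
```

The number of checks depends only on the corruption fraction and the target confidence, not on the total amount of data, which is what makes sampling-based verification practical at terabyte scale. Corruption below the detection threshold is left to the error-correcting code.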

Page 30: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

ECC Challenges

• Update efficiency
  – Want high-throughput file system
  – On-the-fly
  – ECC should not be a bottleneck
  – Reed–Solomon codes are too slow.
• Hiding code structure
  – Adversary should not know which blocks to corrupt to make ECC fail.
  – Adversarially-secure ECC
• Variable-length encoding
  – Handles: blocks, file attributes, Merkle tree nodes, etc.

Page 31: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Iris Error Correcting Code

Pseudo-random Error-Correcting Code

[Diagram: each block on the file system (identified by stripe and offset) maps to positions (stripe, offset) in the ECC parity stripes.]

A keyed mapping takes each file system position to its corresponding parities.

The cloud does not know the key, so it can’t determine which 0.1% subset of data to corrupt to make the ECC fail.
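One way to picture a keyed mapping from a block position to a parity location is the sketch below. It is purely illustrative: the mapping formula is not preserved in this transcript, so HMAC-SHA256 as the pseudo-random function, the `parity_location` name, and the stripe parameters are all assumptions rather than the Iris construction.

```python
import hashlib
import hmac
import struct

def parity_location(key: bytes, position: int, num_stripes: int, stripe_len: int):
    """Map a file system block position to a (stripe, offset) pair in the parity structure.

    Without `key`, the output is unpredictable, so an adversary cannot tell
    which data blocks share a parity stripe.
    """
    digest = hmac.new(key, struct.pack(">Q", position), hashlib.sha256).digest()
    stripe = int.from_bytes(digest[:8], "big") % num_stripes
    offset = int.from_bytes(digest[8:16], "big") % stripe_len
    return stripe, offset

secret = b"portal-secret-key"          # held by the portal, never sent to the cloud
loc = parity_location(secret, 42, num_stripes=16, stripe_len=64)
print(loc)
```

The same position always maps to the same location under one key, so the portal can update parities on the fly, while a different key yields an unrelated layout.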

Page 32: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Iris Error Correcting Code

[Diagram repeated: file system blocks mapped to ECC parity stripes by stripe and offset.]

• Memory:
• Update time:
• Verification time:
  – Amortized cost

Page 33: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

ECC Update Efficiency

• Very fast
  – 300-550 MB/s
• Not a bottleneck in Iris

Page 34: Iris: A Scalable Cloud File  System with  Efficient Integrity Checks

Conclusion

• Presented Iris file system
  – Integrity
  – Proofs of retrievability / data possession
  – On the fly
• Very practical
  – Overall system throughput
    • 250-300 MB/s per portal
  – Scales to enterprises