efficient data dissemination and survivable data storage

Post on 04-Jan-2016

Efficient Data Dissemination and Survivable Data Storage

Lihao Xu

http://www.cs.wayne.edu/~lihao/

Ubiquitous Information Access

Key Building Blocks

• Storage

• Retrieval

• Dissemination

• Consumption


Error Correcting Codes

[Figure: a message of k symbols (1, 2, 3, …, k) is encoded into a codeword of n symbols (1, 2, 3, …, n − 1, n); m of the codeword symbols are used to recover the k-symbol message.]
MDS (Maximum Distance Separable) Codes

m = k
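For reference, the standard coding-theory statement behind m = k is the Singleton bound. It is not spelled out on the slide; the symbol d (minimum distance of the code) is introduced here only for illustration.

```latex
d \le n - k + 1 \quad \text{(Singleton bound)}, \qquad
d_{\mathrm{MDS}} = n - k + 1
```

An MDS code meets the bound with equality, so it tolerates the loss of any n − k codeword symbols, which is the same as saying that any k symbols suffice to recover the message.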

(n,k) MDS Codes

• Reed-Solomon (RS) Code
• (4,2) B-Code

(4,2) B-Code layout:

Column 1    Column 2    Column 3    Column 4
a           b           c           d
d+c         d+a         a+b         b+c
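The (4,2) B-Code layout above (columns a | d+c, b | d+a, c | a+b, d | b+c) can be checked with a few lines of Python. This is a minimal sketch, assuming that "+" on the slide means bitwise XOR and that the symbols are small integers. The names `encode`, `decode`, `DATA`, and `PARITY` are hypothetical and are not taken from the talk or from Hydra.

```python
from itertools import combinations

# Assumed reading of the slide: column j holds one data symbol and the XOR
# ("+") of two other data symbols.
DATA = ["a", "b", "c", "d"]
PARITY = [("d", "c"), ("d", "a"), ("a", "b"), ("b", "c")]


def encode(a, b, c, d):
    """Return the four columns as (data_symbol, parity_symbol) pairs."""
    sym = dict(zip(DATA, (a, b, c, d)))
    return [(sym[DATA[j]], sym[x] ^ sym[y]) for j, (x, y) in enumerate(PARITY)]


def decode(columns):
    """Recover (a, b, c, d) from a dict {column_index: (data, parity)}."""
    known = {DATA[j]: data for j, (data, _) in columns.items()}
    progress = True
    while len(known) < 4 and progress:
        progress = False
        for j, (_, parity) in columns.items():
            x, y = PARITY[j]
            # A parity symbol with one known operand reveals the other one.
            if x in known and y not in known:
                known[y] = known[x] ^ parity
                progress = True
            elif y in known and x not in known:
                known[x] = known[y] ^ parity
                progress = True
    if len(known) < 4:
        raise ValueError("not enough columns to decode")
    return tuple(known[s] for s in DATA)


message = (0x12, 0x34, 0x56, 0x78)
cols = encode(*message)
# Any 2 of the 4 columns should be enough (the MDS property with k = 2).
for i, j in combinations(range(4), 2):
    assert decode({i: cols[i], j: cols[j]}) == message
```

The decoder only ever XORs a known symbol with a parity symbol, which is why XOR-based codes of this kind need no finite-field arithmetic.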

Data Dissemination: Broadcast Scheduling

Data Dissemination

[Figure: a wireless server and wireless clients; the clients want items 1, 2, 1, and 3.]

Broadcast in a Cell

[Figure: the wireless server broadcasting within a cell to the wireless clients (want 1, want 2, want 1, want 3).]

Broadcast Model

• Model clients as random processes
• Desired item is random with probability $p_i$ for item $i$ of length $l_i$.

[Figure: wireless server and wireless clients.]

Scheduling Problem

• 2 items, $l_1 = l_2$
• Each item consists of k packets, k large
• Challenge: choose packet broadcast schedule to minimize wait for clients

S = 1 2 1 2

Prior Work

• Complexity of optimal schedules: Bar-Noy, Bhatia, Naor, Schieber, Foltz
• Complexity of computing optimal schedules: Kenyon, Schabanel
• Error correction/detection: Bestavros

Metric: Delivery Time

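Delivery time for item 1: $T^{deliv}_{S,1}$

[Figure: schedule S = 1 2 1 2 with the start instant $t_{init}$ marked.]

Delivery Time

$T^{deliv}_{S,i}(t_{init})$: Total amount of time spent waiting for item i when starting at time $t_{init}$ in schedule S.

$t_{init}$: Instant in time when client starts waiting for item.

[Figure: schedule S = 1 2 1 2 with two start instants $t_{init}$ and the corresponding $T^{deliv}_{S,i}(t_{init})$.]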

Expected Delivery Time (EDT)

$EDT(S, p_1, p_2, \ldots, p_n) = \sum_{i=1}^{n} p_i \, EDT_{S,i}$

$EDT_{S,i} = E_{t_{init}}\left[ T^{deliv}_{S,i}(t_{init}) \right]$

$t_{init}$ uniformly distributed over schedule S.

EDT Calculation

S = 1 2 1 2, with $P_1 = P_2 = 1/2$

• DT: 2 and 3/2
• $DT_1$ = 7/4
• EDT = 7/4
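The 7/4 figure can be reproduced with a small simulation. This is a sketch under stated assumptions, not the author's calculation: the channel is lossless, each item is k packets, a client may start listening mid-item and keeps whatever packets of its item it hears, and time is measured in item lengths. The function names `delivery_time` and `edt` are hypothetical.

```python
def delivery_time(schedule, item, k, start):
    """Slots until all k packets of `item` have been heard, from slot `start`."""
    needed = set(range(k))
    period = len(schedule)
    t = start
    while needed:
        it, idx = schedule[t % period]
        if it == item:
            needed.discard(idx)
        t += 1
    return t - start


def edt(schedule, probs, k):
    """Expected delivery time in item lengths, start slot uniform over the schedule."""
    period = len(schedule)
    total = 0.0
    for item, p in probs.items():
        mean = sum(delivery_time(schedule, item, k, s) for s in range(period)) / period
        total += p * mean
    return total / k


k = 100
# Schedule S = 1 2: all k packets of item 1, then all k packets of item 2.
schedule = [(1, j) for j in range(k)] + [(2, j) for j in range(k)]
result = edt(schedule, {1: 0.5, 2: 0.5}, k)
print(result)  # equals (7k - 1) / (4k), which tends to 7/4 as k grows
```

Under these assumptions a client that starts while its item is on the air needs about 2 item lengths (the rest of the item now, the beginning one cycle later), and a client that starts during the other item needs 3/2 on average, which averages to 7/4.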

Performance with Errors

• Data items consist of k packets
• What happens if a packet is lost?

Original: 1 2 3 4 5 . . . k

Transmitted: 1 2 3 4 5 . . . k, repeated on every broadcast of the item

Received: 1 2 3 4 . . . k (packet 5 missing)

[Figure: the client keeps listening until packet 5 is broadcast again.]

EDT = 3 !
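One way to arrive at EDT = 3 is sketched below. The loss model is an assumption made for illustration, not something stated on the slide: the schedule is 1 2 with k packets per item, there is no coding, and exactly one packet is dropped, namely the first copy of a uniformly chosen packet that the client hears. The names `delivery_time_one_loss` and `edt_one_loss` are hypothetical.

```python
def delivery_time_one_loss(k, start, lost):
    """Slots until the client holds all k packets of item 1 when the first
    copy of packet `lost` that it hears is dropped."""
    period = 2 * k               # item 1 occupies slots 0..k-1, item 2 the rest
    needed = set(range(k))
    dropped = False
    t = start
    while needed:
        pos = t % period
        if pos < k:              # item 1 is on the air; packet id equals pos
            if pos == lost and not dropped:
                dropped = True   # this copy never arrives
            else:
                needed.discard(pos)
        t += 1
    return t - start


def edt_one_loss(k):
    """Mean delivery time in item lengths over all start slots and lost packets."""
    period = 2 * k
    total = sum(delivery_time_one_loss(k, s, u)
                for s in range(period) for u in range(k))
    return total / (period * k) / k


k = 50
value = edt_one_loss(k)
print(value)  # 3 + 1/(2k) under this model, close to 3 for large k
```

Item 2 behaves identically by symmetry, so the average over both items is the same. The single lost packet costs a wait of up to a full extra cycle, which is what pushes the average from 7/4 to about 3.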

Solution – Coding

• Use k of n MDS code, n = 2k
• Now only need to wait for 1 additional packet

[Figure: original packets 1 2 3 4 5 . . . k; transmitted stream carrying packets 1 2 3 4 5 . . . k and coded packets numbered from k+1; received stream with packet 5 missing.]

EDT = 9/4

Solution – Coding

• Use k of n MDS code, m = 2(k+1)
• Now only need to wait for 1 additional packet

[Figure: original packets 1 2 3 4 5 . . . k; transmitted and received streams in which each broadcast of the item carries packets 1 2 3 4 5 . . . k followed by packet n.]

EDT = 7/4 + ε

General Solution

[Figure: original packets 1 2 3 4 5 . . . k; transmitted and received streams of packets 1 2 3 4 5 . . . k followed by packet n.]

Given loss probability p, what is the optimal n?

[Plots: one for k = 100 and p = 0.1, one for k = 100.]
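One ingredient of the trade-off behind "what is the optimal n?" is how likely a client is to collect k packets out of n when each packet is lost with probability p. The sketch below assumes independent losses and a k-of-n MDS code. It is not the author's optimization and does not reproduce the plots; `prob_decodable` is a hypothetical name.

```python
from math import comb


def prob_decodable(n, k, p):
    """Probability that at least k of n packets arrive when each packet is
    lost independently with probability p (binomial tail)."""
    q = 1.0 - p
    return sum(comb(n, j) * q**j * p**(n - j) for j in range(k, n + 1))


k, p = 100, 0.1
for n in (100, 110, 120, 130):
    print(n, prob_decodable(n, k, p))
```

A larger n makes a single pass more likely to succeed but lengthens every broadcast of the item, so the best n balances the two effects.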

Two-Channel Broadcasting

[Figure: two wireless servers broadcasting to wireless clients; the clients want items 1, 2, 1, and 3.]

Coordinating Schedule Data

• Use (2k, k) MDS code to eliminate data overlap
  • Channel 1 sends packets 1 through k (raw data)
  • Channel 2 sends packets k+1 through 2k
• Features
  • Each channel is self-sufficient
  • No overlap between channels

S1 = 1 2 1 2

S2 = 1 2 1 2 (same schedule, different data)

Two Broadcast Channels

• Scheduling for two channels
  • Two items with equal length and demand
  • Two synchronized channels of equal bandwidth
  • First channel’s schedule fixed at 1 2
• What is the optimal schedule for channel 2?

S1 = 1 2

S2 = ?

Some Schedules

Channel 1 is fixed at S1 = 1 2; candidate schedules for channel 2:

• Repeat: EDT = 1
• Swap: EDT = 1
• Shift: EDT = 1
• Reshuffle: EDT = 1
• Unequal Portions: EDT = 63/64
• Arbitrary: EDT < 63/64?

[Figure: slot diagrams of the six candidate schedules.]
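The EDT = 1 entries can be checked for two of the candidates with a short simulation. This is a sketch under assumptions, since the slot diagrams did not survive: "Repeat" is taken to mean channel 2 uses the same 1 2 order, "Swap" the opposite order 2 1. Both channels are lossless and synchronized, a (2k, k) MDS code is used, so any k distinct packets of an item suffice, and packets are numbered from 0 here. The names `two_channel_edt` and `block` are hypothetical.

```python
def two_channel_edt(ch1, ch2, k):
    """Mean delivery time in item lengths over all start slots and all items.

    ch1 and ch2 are cyclic schedules of equal length; each entry is
    (item, packet_id). A client listens to both channels at once and is
    done when it holds any k distinct packets of its item.
    """
    period = len(ch1)
    items = sorted({item for item, _ in ch1})
    total_slots = 0
    for item in items:
        for start in range(period):
            got = set()
            t = start
            while len(got) < k:
                for it, pkt in (ch1[t % period], ch2[t % period]):
                    if it == item:
                        got.add(pkt)
                t += 1
            total_slots += t - start
    return total_slots / (len(items) * period * k)


def block(item, first, k):
    """k consecutive coded packets of `item`, numbered first .. first + k - 1."""
    return [(item, first + j) for j in range(k)]


k = 100
ch1 = block(1, 0, k) + block(2, 0, k)       # channel 1: raw packets 0 .. k-1
repeat = block(1, k, k) + block(2, k, k)    # assumed "Repeat": same order
swap = block(2, k, k) + block(1, k, k)      # assumed "Swap": opposite order

edt_repeat = two_channel_edt(ch1, repeat, k)
edt_swap = two_channel_edt(ch1, swap, k)
print(edt_repeat, edt_swap)
```

With Swap, the wanted item is always on exactly one channel, so every client finishes in exactly one item length. With Repeat, clients either finish in half an item length or wait through the other item, and the average again comes out at about 1 (within 1/(4k) in this discrete model).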

Schedule Performance

• Symmetric Problem
  • Equal lengths
  • Equal demands
  • Equal bandwidth channels
  • Symmetric “fixed” schedule for 1st channel
• Asymmetric Solution
  • Asymmetric schedules can beat any symmetric schedule for the 2nd channel
  • How is this possible?

More to Explore …

• More servers/Channels
• Differing levels of synchronization
• Transmission Errors
• Streaming Data
• Bounds

[Figure: several wireless servers and wireless clients; the clients want items 1, 2, 1, and 3.]

Hydra: A Platform for SSS

Secure and Survivable Storage

• Availability

• Recoverability

• Persistence

• Confidentiality

• Integrity

• Scalability

• Efficiency

Secure and Survivable Storage

• Yahoo

• Ebay

• Amazon

• Google

• Banks

• Your Labs

• More …

Hydra

Hydra Design Goals

• Portable to various OS/FS

• Hardware independent

• Unix FS semantics maintained

• Low overhead in performance and storage

• Transport independent

• Easy to install, configure, scale, maintain and automate

Hydra and System

[Figure: layered system stacks showing where Hydra sits: App. / Hydra / FS / I/O in two of the stacks, and App. / FS/Hydra / I/O in the third.]

Basics of Hydra

(4,2) B-Code

Column 1    Column 2    Column 3    Column 4
a           b           c           d
d+c         d+a         a+b         b+c

Performance Test

2.4G P4, 512 MB, 80GB ATA/100 7200rpm, Redhat 9.0 (kernel 2.4.2.0)

Operations                 Throughput (Mbps)
File Read                  384
File Write                 200
Memory Copy                17572
(4,2) B-Code Encoding      5522
(4,2) B-Code Decoding      22866
(4,2) RS Encoding          286
(4,2) RS Decoding          216

Hydra Components

• Meta Data (hnode)

• Operations

• Monitor

Hydra Meta Data

• Code

• Symbol Location

• Data Layout

• Security Flag

• Access Rights

• Extensions

Hydra Operations

• Distribute (Write)

• Recover (Read)

• Detect

• Repair

• Restore

• Others

Hydra Monitor

• Connectivity

• Security

Hydra Applications

• Web Server

• CDN/P2P/Data Server

• Archiving

• Data Security: system activity logger, forensic, file integrity checker …

• Others

Acknowledgement

lihao@cs.wayne.edu
