On Managing Continuous Media Data

Edward Chang, Hector Garcia-Molina
Stanford University

2

Challenges

Large Volume of Data
  MPEG2 100 Minute Movie: 3-4 GBytes

Large Data Transfer Rate
  MPEG2: 4 to 6 Mbps
  HDTV: 19.2 Mbps

Just-in-Time Data Requirement

Simultaneous Users
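As a rough check of the volume figures, the sketch below converts a constant bitrate and a running time into a storage volume. The function name `stream_volume_gbytes` and the decimal convention (1 GByte = 10^9 bytes) are assumptions made for illustration and do not come from the slides.

```python
def stream_volume_gbytes(bitrate_mbps: float, minutes: float) -> float:
    """Storage needed for a constant-bitrate stream, in decimal GBytes."""
    bits = bitrate_mbps * 1e6 * minutes * 60
    return bits / 8 / 1e9


# MPEG2 at the two bitrates quoted on the slide, for a 100 minute movie.
for rate in (4, 6):
    size = stream_volume_gbytes(rate, 100)
    print(f"MPEG2 at {rate} Mbps, 100 min: {size:.1f} GBytes")
```

Under these assumptions a 100 minute MPEG2 movie comes to roughly 3 to 4.5 GBytes, in line with the range quoted on the slide.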

3

...Challenges

Traditional Optimization Objectives:
  Maximizing Throughput!
  Maximizing Throughput!!
  Maximizing Throughput!!!

How about Cost?

How about Initial Latency?

4

Related Work

USC (S. Ghandeharizadeh)
UCLA (R. Muntz)
UBC (Raymond Ng)
Bell Labs (B. Ozden)
IBM Tom Watson Labs (P. Yu)
etc.

5

Outline

Server (Single Disk)
  Revisiting Conventional Wisdom
  Minimizing Cost
  Minimizing Initial Latency

Server (Parallel Disks)
  Balancing Workload
  Minimizing Cost & Initial Latency

Client
  Handling VBR
  Supporting VCR-like Functions

6

Conventional Wisdom (for Single Disk)

Reducing Disk Latency leads to Better Disk Utilization

Reducing Disk Latency leads to Higher Throughput

Increasing Disk Utilization leads to Improved Cost Effectiveness

7

Is Conventional Wisdom Right?

Does Reducing Disk Latency lead to Better Disk Utilization?

Does Reducing Disk Latency lead to Higher Throughput?

Does Increasing Disk Utilization lead to Improved Cost Effectiveness?

8

Tseek: Disk Latency

TR: Disk Transfer Rate

DR: Display Rate

S: Segment Size (Peak Memory Use per Request)

T: Service Cycle Time

9

S = DR × T

T = N × (Tseek + S/TR)
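Substituting the second equation into the first and solving for S gives a closed form for the segment size. The intermediate steps, which the slides leave implicit, are sketched here; the derivation assumes TR > N × DR, since otherwise the disk cannot keep up with N streams.

```latex
\begin{aligned}
S &= DR \times T = DR \times N \times \left( T_{seek} + \frac{S}{TR} \right) \\
S \left( 1 - \frac{N \times DR}{TR} \right) &= N \times DR \times T_{seek} \\
S &= \frac{N \times TR \times DR \times T_{seek}}{TR - N \times DR}
\end{aligned}
```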

10

Disk Utilization

S = (N × TR × DR × Tseek) / (TR - N × DR)

S is directly proportional to Tseek

Dutil = (S/TR) / (S/TR + Tseek)

Dutil is Constant!
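The claim that utilization does not depend on seek time can be checked numerically. The sketch below evaluates the segment size and disk utilization formulas for three seek times; substituting S into Dutil reduces it algebraically to N × DR / TR. The stream count, rates, and seek times are hypothetical values chosen for illustration, and the function names are not from the slides.

```python
def segment_size(n, tr, dr, t_seek):
    """S = N*TR*DR*Tseek / (TR - N*DR); rates in MBytes/s, Tseek in seconds."""
    if tr <= n * dr:
        raise ValueError("disk transfer rate TR must exceed N * DR")
    return n * tr * dr * t_seek / (tr - n * dr)


def disk_utilization(n, tr, dr, t_seek):
    """Dutil = (S/TR) / (S/TR + Tseek): fraction of the IO time spent transferring."""
    transfer_time = segment_size(n, tr, dr, t_seek) / tr
    return transfer_time / (transfer_time + t_seek)


# Hypothetical setup: 40 streams, TR = 30 MBytes/s, DR = 0.5 MBytes/s.
for t_seek in (0.005, 0.010, 0.020):
    s = segment_size(40, 30.0, 0.5, t_seek)
    u = disk_utilization(40, 30.0, 0.5, t_seek)
    print(f"Tseek = {t_seek * 1000:.0f} ms: S = {s:.3f} MBytes, Dutil = {u:.4f}")
```

In this setup the segment size doubles each time the seek time doubles, while the utilization stays at N × DR / TR = 2/3.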

11

Is Conventional Wisdom Right?

Does Reducing Disk Latency lead to Better Disk Utilization? NO!

Does Reducing Disk Latency lead to Higher Throughput?

Does Increasing Disk Utilization lead to Improved Cost Effectiveness?

12

What Affects Throughput?

[Diagram: Disk Latency, Memory Utilization, Disk Utilization, and Throughput, with links marked × and ?]

13

Memory Requirement

We Examine Two Disk Scheduling Policies’ Memory Requirement
  Sweep (Elevator Policy): Enjoys the Minimum Seek Overhead
  Fixed-Stretch: Suffers from High Seek Overhead

14

Per User Peak Memory Use

S = (N × TR × DR × Tseek) / (TR - N × DR)

15

Sweep (Elevator)

Disk Latency: Minimum

IO Time Variability: Very High

16

Sweep (Elevator)

Memory Sharing: Poor

Total Memory Requirement: 2 * N * Ssweep

17

Fixed-Stretch

Disk Latency: High (because of Stretch)

IO Time Variability: None (because of Fixed)

18

Fixed-Stretch

Memory Sharing: Good

Total Memory Requirement: 1/2 * N * Sfs

19

Throughput

Sweep: 2 * N * Ssweep
  Available Memory = 40 MBytes
  N = 40

Fixed-Stretch: 1/2 * N * Sfs
  Available Memory = 40 MBytes
  N = 42 (Higher Throughput)

* Based on a Realistic Case Study Using Seagate Disks
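To see how a memory budget can favor the policy with the larger seek overhead, the sketch below searches for the largest N whose total memory requirement fits the budget, using the total memory expressions 2 * N * S (Sweep) and 1/2 * N * S (Fixed-Stretch). The disk parameters and per-policy seek times are hypothetical; they are not the Seagate figures from the case study, so the resulting stream counts are not expected to match the slide.

```python
def segment_size(n, tr, dr, t_seek):
    """S = N*TR*DR*Tseek / (TR - N*DR); rates in MBytes/s, Tseek in seconds."""
    return n * tr * dr * t_seek / (tr - n * dr)


def max_streams(memory_mb, tr, dr, t_seek, sharing_factor):
    """Largest N such that sharing_factor * N * S fits in memory_mb."""
    best = 0
    n = 1
    while n * dr < tr:  # bandwidth limit: N * DR must stay below TR
        total = sharing_factor * n * segment_size(n, tr, dr, t_seek)
        if total > memory_mb:
            break
        best = n
        n += 1
    return best


# Hypothetical disk: TR = 10 MBytes/s, DR = 0.1875 MBytes/s (1.5 Mbps).
# Hypothetical seek overheads: 8 ms under Sweep, 20 ms under Fixed-Stretch.
sweep_n = max_streams(40, 10.0, 0.1875, 0.008, sharing_factor=2.0)
fixed_stretch_n = max_streams(40, 10.0, 0.1875, 0.020, sharing_factor=0.5)
print(f"Sweep: N = {sweep_n}, Fixed-Stretch: N = {fixed_stretch_n}")
```

With these made-up parameters Fixed-Stretch supports more streams than Sweep in the same 40 MBytes, even though its seek overhead is 2.5 times higher, because its memory sharing factor is four times smaller.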

20

What Affects Throughput?

[Diagram: Disk Latency, Memory Utilization, Disk Utilization, and Throughput, with links marked × and ?]

21

Is Conventional Wisdom Right?

Does Reducing Disk Latency lead to Better Disk Utilization? NO!

Does Reducing Disk Latency lead to Higher Throughput? NO!

Does Increasing Disk Utilization lead to Improved Cost Effectiveness?

22

Per Stream Cost

23

Per-Stream Memory Cost

Cm × S = (Cm × N × TR × DR × Tseek) / (TR - N × DR)

24

Example

Disk Cost: $200 a unit

Memory Cost: $5 per MByte

Supporting N = 40 Requires 60 MBytes of Memory
  $200 + $300 = $500

Supporting N = 50 Requires 160 MBytes of Memory
  $200 + $800 = $1,000

For the same cost of $1,000, it’s better to buy 2 Disks and 120 MBytes to support N = 80 Users!

Memory Use is Critical
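The arithmetic of the example can be reproduced with a few lines of code. The prices and memory sizes below are the ones quoted in the example; the function name `system_cost` and the per-stream breakdown are added for illustration.

```python
DISK_COST = 200.0    # dollars per disk (from the example)
MEMORY_COST = 5.0    # dollars per MByte (from the example)


def system_cost(disks, memory_mb):
    """Total hardware cost of a configuration, in dollars."""
    return disks * DISK_COST + memory_mb * MEMORY_COST


# (label, disks, memory in MBytes, streams supported)
configs = [
    ("1 disk, N = 40", 1, 60, 40),
    ("1 disk, N = 50", 1, 160, 50),
    ("2 disks, N = 80", 2, 120, 80),
]
for label, disks, memory_mb, n in configs:
    cost = system_cost(disks, memory_mb)
    print(f"{label}: ${cost:,.0f} total, ${cost / n:.2f} per stream")
```

Pushing a single disk from 40 to 50 streams raises the per-stream cost from $12.50 to $20.00, while the two-disk configuration serves 80 streams at $12.50 each for the same $1,000.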

25

Is Conventional Wisdom Right?

Does Reducing Disk Latency lead to Better Disk Utilization? NO!

Does Reducing Disk Latency lead to Higher Throughput? NO!

Does Increasing Disk Utilization lead to Improved Cost Effectiveness? NO!

26

So What?

27

Outline

Server (Single Disk)
  Revisiting Conventional Wisdom
  Minimizing Cost
  Minimizing Initial Latency

Server (Parallel Disks)
  Balancing Workload
  Minimizing Cost & Initial Latency

Client
  Handling VBR
  Supporting VCR-like Functions

28

Initial Latency

What is it?
  The time from when a request arrives at the server until the data is available in the server’s main memory

Where is it important?
  Interactive applications (e.g., video games)
  Interactive features (e.g., fast-scan)

29

Sweep (Elevator)

30

Fixed-Stretch

Space Out IOs

31

Fixed-Stretch

32

Fixed-Stretch

33

Our Contribution: BubbleUp

Fixed-Stretch Enjoys Fine Throughput

BubbleUp Remedies Fixed-Stretch to Minimize Initial Latency

34

Schedule Office Work

8am: Host a Visitor
9am: Do Email
10am: Write Paper
11am: Write Paper
Noon: Lunch

35

BubbleUp

36

BubbleUp

Empty Slots are Always Next in Time

No Additional Memory Required
  Fill the Buffer up to the Segment Size

No Additional Disk Bandwidth Required
  The Disk Is Idle Otherwise

37

Evaluation

38

Fast-Scan

39

Fast-Scan

40

Data Placement Policies

Please refer to our publications

41

42

Chunk Allocation

Allocate Memory in Chunks
  A Chunk = k * S

Replicate the Last Segment of a Chunk at the Beginning of the Next Chunk

Example
  Chunk 1: s1, s2, s3, s4, s5
  Chunk 2: s5, s6, s7, s8, s9
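The overlap rule in the example can be sketched in a few lines. The function `allocate_chunks` is a hypothetical helper written for illustration; it only reproduces the replication pattern shown in the example and says nothing about where chunks are placed on disk.

```python
def allocate_chunks(segments, k):
    """Split segments into chunks of up to k segments, repeating each
    chunk's last segment at the start of the next chunk."""
    if k < 2:
        raise ValueError("k must be at least 2 so each chunk adds new segments")
    chunks = []
    start = 0
    while start < len(segments):
        chunks.append(segments[start:start + k])
        if start + k >= len(segments):
            break  # this chunk reached the end of the movie
        start += k - 1  # step back one segment to create the overlap
    return chunks


segments = [f"s{i}" for i in range(1, 10)]  # s1 .. s9
for i, chunk in enumerate(allocate_chunks(segments, 5), start=1):
    print(f"Chunk {i}: {', '.join(chunk)}")
```

With nine segments and k = 5 this yields the two chunks from the example, sharing s5.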

43

Chunk Allocation

Largest-Fit First

Best Fit (Last Chunk)

44

18 Segment Placement

45

Largest-Fit First

46

Best Fit

47

Outline

Server (Single Disk)
  Revisiting Conventional Wisdom
  Minimizing Cost
  Minimizing Initial Latency

Server (Parallel Disks)
  Balancing Workload
  Minimizing Cost & Initial Latency

Client
  Handling VBR
  Supporting VCR-like Functions

48

Unbalanced Workload

49

Balanced Workload

50

Per Stream Memory Use (Use M Disks Independently)

S = (N × TR × DR × Tseek) / (TR - N × DR)

M × N

51

Per Stream Memory Use (Use M Disks As One Disk)

M × N

52

…Continue

S = (N × TR × DR × Tseek) / (TR - N × DR)

S’ = (N × M × TR × M × DR × Tseek) / (TR × M - N × M × DR)

S’ = M × (N × TR × DR × Tseek) / (TR - N × DR) = M × S
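The factor of M can be confirmed numerically by treating the M striped disks as a single virtual disk with M times the transfer rate serving M × N streams. The sketch assumes the seek overhead Tseek is unchanged by striping; the parameter values and function names are hypothetical.

```python
def segment_size(n, tr, dr, t_seek):
    """S = N*TR*DR*Tseek / (TR - N*DR) for one disk."""
    return n * tr * dr * t_seek / (tr - n * dr)


def striped_segment_size(n, m, tr, dr, t_seek):
    """S' for M disks used as one virtual disk: M*N streams at rate M*TR."""
    return segment_size(n * m, tr * m, dr, t_seek)


# Hypothetical setup: N = 40 streams per disk, M = 4 disks.
s = segment_size(40, 10.0, 0.1875, 0.020)
s_striped = striped_segment_size(40, 4, 10.0, 0.1875, 0.020)
print(f"S = {s:.2f} MBytes, S' = {s_striped:.2f} MBytes, ratio = {s_striped / s:.1f}")
```

Each stream therefore needs M times the memory under fine-grained striping, which is the per-stream memory cost penalty of using the disks as one.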

53

Challenges

Using M Disks Independently:
  Unbalanced Workload
  Low Per-Stream Memory Cost

Using M Disks As One Virtual Disk (i.e., Employing Fine-Grained Striping):
  Balanced Workload
  High Per-Stream Memory Cost

54

Our Approach (2DB)

Use Disks Independently
  To Minimize Cost

Replicate Hot Movies (20% of Movies)
  To Balance Workload

Use BubbleUp
  To Minimize Initial Latency

55

2D BubbleUp (2DB)

Intelligent Data Placement

Efficient Request Scheduling

FODO, 1998

56

2DB Data Placement: Chunk Allocation

57

2DB Scheduling

Formally, This is a Bipartite Weighted Matching Problem
  Can be solved using the Hungarian method in O(V^3), where V = NM

We Use a Greedy Method to Reduce the Problem to a Bipartite Unweighted Matching Problem
  Can be solved in O(M^2)
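For readers unfamiliar with unweighted bipartite matching, the sketch below shows the textbook augmenting-path algorithm applied to a toy version of the problem: each request may be served by any disk holding a copy of its movie, and each disk has one free slot. This is a generic illustration, not the greedy O(M^2) method mentioned on the slide; the data layout and function names are assumptions.

```python
def max_bipartite_matching(candidates, num_disks):
    """candidates[r] lists the disks able to serve request r.
    Returns match_of_disk, where match_of_disk[d] is a request index or None."""
    match_of_disk = [None] * num_disks

    def try_assign(request, visited):
        # Try every candidate disk; displace its current request if that
        # request can be moved to another disk (an augmenting path).
        for disk in candidates[request]:
            if disk in visited:
                continue
            visited.add(disk)
            current = match_of_disk[disk]
            if current is None or try_assign(current, visited):
                match_of_disk[disk] = request
                return True
        return False

    for request in range(len(candidates)):
        try_assign(request, set())
    return match_of_disk


# Hypothetical example: 3 requests, 3 disks; replicated movies give
# requests 0 and 2 a choice of two disks each.
candidates = [[0, 1], [0], [1, 2]]
assignment = max_bipartite_matching(candidates, 3)
for disk, request in enumerate(assignment):
    print(f"disk {disk} serves request {request}")
```

Replication is what gives the matching room to work: request 1 can only use disk 0, so request 0 is moved to disk 1, its second copy.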

58

Why Does 2DB Work?

59

60

61

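n balls, n urns, finite n (maximum load of any urn):
  One random destination per ball: (ln n / ln ln n) × (1 + o(1))
  Two possible destinations per ball: ln ln n / ln 2 + O(1)

m balls, n urns, m > n and infinite m and n:
  d: number of possible destinations
  (ln ln n / ln d) × (1 + o(1)) + O(m/n)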
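The effect of having more than one possible destination can be observed with a small simulation. In the sketch below each ball samples d urns at random and goes into the least loaded of them; the sampling scheme, the seed, and n = 10,000 are illustrative assumptions rather than the authors' experimental setup.

```python
import random


def max_load(num_balls, num_urns, d, seed=0):
    """Throw balls one at a time; each picks d random urns and goes into
    the least loaded of them. Returns the load of the fullest urn."""
    rng = random.Random(seed)
    loads = [0] * num_urns
    for _ in range(num_balls):
        choices = [rng.randrange(num_urns) for _ in range(d)]
        target = min(choices, key=loads.__getitem__)
        loads[target] += 1
    return max(loads)


n = 10_000
for d in (1, 2, 3):
    print(f"d = {d}: maximum load {max_load(n, n, d)}")
```

Going from one destination to two lowers the maximum load sharply, while further choices help only a little, which mirrors the ln d term in the denominator. In the 2DB setting, replicated movies play the role of the extra destinations.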

62

What Does 2DB Cost?

Storage Cost
  Additional disk cost = % hot movies
  Typically 20% of movies are subscribed 80% of the time

Throughput
  Throughput is scaled back by a fraction to achieve balanced workload

63

Evaluation

2DB Achieves Balanced Workload with High Throughput
  Compared to, e.g., some dynamic load balancing schemes

2DB Incurs Low Additional Storage Cost

2DB Enjoys Minimum Initial Latency

64

Outline

Server (Single Disk)
  Revisiting Conventional Wisdom
  Minimizing Cost
  Minimizing Initial Latency

Server (Parallel Disks)
  Balancing Workload
  Minimizing Cost & Initial Latency

Client
  Handling VBR
  Supporting VCR-like Functions

65

Media Client

Most Studies Assume Dumb Clients

We Propose Smart Clients for
  Handling VBR
  Supporting VCR-like Functions

66

Handling VBR

Server Can Handle VBR
  Frame rate fluctuates but the moving average does not fluctuate as much
  Rates even out when N is large, which is typically the case
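The evening-out effect for large N can be illustrated with a toy simulation. The sketch draws each stream's rate independently and uniformly from 2 to 6 Mbps in every interval and reports the standard deviation of the aggregate rate relative to its mean. Independence and the uniform distribution are simplifying assumptions; real VBR streams are correlated over time.

```python
import random
import statistics


def relative_fluctuation(num_streams, intervals=500, seed=0):
    """Std/mean of the aggregate rate over a number of intervals, with each
    stream's rate drawn independently from 2-6 Mbps per interval."""
    rng = random.Random(seed)
    totals = [
        sum(rng.uniform(2.0, 6.0) for _ in range(num_streams))
        for _ in range(intervals)
    ]
    return statistics.pstdev(totals) / statistics.mean(totals)


for n in (1, 10, 100):
    print(f"N = {n:3d}: relative fluctuation {relative_fluctuation(n):.3f}")
```

Under these assumptions the relative fluctuation shrinks roughly in proportion to 1 / sqrt(N), so a server multiplexing many streams sees a much steadier aggregate load than any single stream suggests.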

67

...VBR

But, the Server Cannot Eliminate Bitrate Mismatch
  Packetization and Channel Delay can change the bitrate

The Solution Must Be at the Client Side!

68

Supporting VCR-like Functions

Pause
  Phone call interruptions
  Biological needs

Fast Forward
  Catching up with the program after a pause

Instant Replay

69

How to Pause A Movie?

Broadcast TV Cannot Be Paused

Pausing Via a Point-to-point Link Affects the Server’s Scheduling

Caching!!!
  Main Memory Caching? Too expensive! (19.2 Mbps * 20 min ≈ 2.9 GBytes)
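The size of the cache needed to absorb a pause follows directly from the bitrate and the pause length. The helper below is a hypothetical illustration using decimal units (1 GByte = 10^9 bytes); the 6 Mbps case is added for comparison with the MPEG2 rate quoted earlier in the talk.

```python
def pause_buffer_gbytes(bitrate_mbps, pause_minutes):
    """Data arriving during a pause, in decimal GBytes."""
    bits = bitrate_mbps * 1e6 * pause_minutes * 60
    return bits / 8 / 1e9


print(f"HDTV at 19.2 Mbps, 20 min pause: {pause_buffer_gbytes(19.2, 20):.2f} GBytes")
print(f"MPEG2 at 6 Mbps, 20 min pause: {pause_buffer_gbytes(6, 20):.2f} GBytes")
```

For HDTV the product comes to roughly 2.9 GBytes, a gigabyte-scale figure that makes holding the paused stream entirely in main memory impractical for a set-top box.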

70

Buffer Management

71

Challenges

Must Ensure Arriving Bits Do Not Overflow the Network Buffer

Must Ensure Decoder Buffer Does Not Underflow

Must Work for Any Off-the-shelf Disks, CPU Box

72

Our Contribution: MEDIC

MEDIC: MEmory & Disk Integrated Cache

MEDIC Manages IOs Between Memory and Disk Efficiently
  Only 4 MBytes main memory needed!!!
  Makes a set-top box affordable

MEDIC Adapts to Hardware Configuration

73

Demo

Regular Playback
Pause
Resume Regular Playback
Fast Forward
Instant Replay (not shown)

74

Visualize MEDIC

75

Conclusions (Contributions in Blue)

Server (Single Disk)
  Revisiting Conventional Wisdom
  Minimizing Cost
  Minimizing Initial Latency

Server (Parallel Disks)
  Balancing Workload
  Minimizing Cost & Initial Latency

Client
  Handling VBR
  Supporting VCR-like Functions

76

…Conclusions

Our Server Supports
  Low Latency Playback and Fast Forward

Our Client Supports
  Pause and Low Latency Instant Replay

Together, We Propose A Complete End-to-end Solution for Continuous Media Data Delivery!

77

Future Work

Enhancing MEDIC for Managing Heterogeneous Data, from Both Broadcast & Internet Channels
  Video Panoramas
  Interactive TV

Indexing Videos for Replay
  Video/Image databases
