On Managing Continuous Media Data Edward Chang Hector Garcia-Molina Stanford University

Post on 20-Dec-2015


Page 1

On Managing Continuous Media Data

Edward Chang, Hector Garcia-Molina
Stanford University

Page 2

Challenges

Large Volume of Data
    MPEG2 100 Minute Movie: 3-4 GBytes

Large Data Transfer Rate
    MPEG2: 4 to 6 Mbps
    HDTV: 19.2 Mbps

Just-in-Time Data Requirement
Simultaneous Users
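
The storage figure follows directly from the bitrate. A minimal sketch of that arithmetic, assuming a constant bitrate and decimal units (1 GByte = 10^9 bytes); the function name is illustrative and not from the slides. At 6 Mbps the result comes out slightly above the slide's rounded upper figure.

```python
def stream_size_gbytes(bitrate_mbps: float, minutes: float) -> float:
    """Storage needed for a constant-bitrate stream, in GBytes (10**9 bytes)."""
    bits = bitrate_mbps * 1_000_000 * minutes * 60
    return bits / 8 / 1_000_000_000


# A 100-minute MPEG2 movie at the two ends of the 4 to 6 Mbps range.
low = stream_size_gbytes(4, 100)
high = stream_size_gbytes(6, 100)
print(f"{low:.1f} to {high:.1f} GBytes")
```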

Page 3

...Challenges

Traditional Optimization Objectives:
    Maximizing Throughput!
    Maximizing Throughput!!
    Maximizing Throughput!!!

How about Cost?
How about Initial Latency?

Page 4

Related Work

USC (S. Ghandeharizadeh)
UCLA (R. Muntz)
UBC (Raymond Ng)
Bell Labs (B. Ozden)
IBM T. J. Watson Research Center (P. Yu)
etc.

Page 5

Outline

Server (Single Disk)
    Revisiting Conventional Wisdom
    Minimizing Cost
    Minimizing Initial Latency

Server (Parallel Disks)
    Balancing Workload
    Minimizing Cost & Initial Latency

Client
    Handling VBR
    Supporting VCR-like Functions

Page 6

Conventional Wisdom (for Single Disk)

Reducing Disk Latency leads to Better Disk Utilization

Reducing Disk Latency leads to Higher Throughput

Increasing Disk Utilization leads to Improved Cost Effectiveness

Page 7

Is Conventional Wisdom Right?

Does Reducing Disk Latency lead to Better Disk Utilization?

Does Reducing Disk Latency lead to Higher Throughput?

Does Increasing Disk Utilization lead to Improved Cost Effectiveness?

Page 8

Tseek: Disk Latency

TR: Disk Transfer Rate

DR: Display Rate

S: Segment Size (Peak Memory Use per Request)

T: Service Cycle Time

Page 9

S = DR × T

T = N × (Tseek + S/TR)
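
The two equations can be solved for S by substituting T = S / DR into the second one. A minimal sketch, assuming rates in MBytes per second and Tseek in seconds; the function name and the sample numbers are illustrative and not taken from the slides.

```python
def segment_size(n, tr, dr, t_seek):
    """Solve S = DR * T and T = N * (Tseek + S / TR) for S and T.

    n: number of streams, tr: disk transfer rate, dr: display rate,
    t_seek: disk latency per IO. Returns (S, T).
    """
    if n * dr >= tr:
        raise ValueError("N x DR must stay below TR for a solution to exist")
    s = n * tr * dr * t_seek / (tr - n * dr)
    t = s / dr
    return s, t


# Hypothetical numbers: 10 streams, 10 MBytes/s disk, 0.5 MBytes/s display rate.
s, t = segment_size(10, 10.0, 0.5, 0.02)
print(f"S = {s:.3f} MBytes, T = {t:.3f} s")
```

The guard reflects that the disk must deliver data faster than the N streams consume it; otherwise no finite segment size satisfies both equations.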

Page 10

Disk Utilization

S = (N × TR × DR × Tseek) / (TR - N × DR)

S is directly proportional to Tseek

Dutil = (S/TR) / (S/TR + Tseek)

Dutil is Constant!
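
The claim that utilization is constant can be checked by substituting the closed form for S into the utilization ratio. A short derivation using the slide's symbols; Tseek cancels, leaving a value that depends only on N, DR, and TR.

```latex
\begin{align*}
S &= \frac{N \cdot TR \cdot DR \cdot T_{seek}}{TR - N \cdot DR} \\
\frac{S}{TR} &= \frac{N \cdot DR \cdot T_{seek}}{TR - N \cdot DR} \\
\frac{S}{TR} + T_{seek} &= \frac{TR \cdot T_{seek}}{TR - N \cdot DR} \\
D_{util} &= \frac{S/TR}{S/TR + T_{seek}} = \frac{N \cdot DR}{TR}
\end{align*}
```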

Page 11

Is Conventional Wisdom Right?

Does Reducing Disk Latency lead to Better Disk Utilization? NO!

Does Reducing Disk Latency lead to Higher Throughput?

Does Increasing Disk Utilization lead to Improved Cost Effectiveness?

Page 12

What Affects Throughput?

[Diagram: Disk Latency, Memory Utilization, and Disk Utilization shown as factors of Throughput; one link is marked × and another is marked ?]

Page 13

Memory Requirement

We Examine the Memory Requirement of Two Disk Scheduling Policies
    Sweep (Elevator Policy): Enjoys the Minimum Seek Overhead
    Fixed-Stretch: Suffers from High Seek Overhead

Page 14

Per User Peak Memory Use

S = (N × TR × DR × Tseek) / (TR - N × DR)

Page 15

Sweep (Elevator)

Disk Latency: Minimum
IO Time Variability: Very High

Page 16

Sweep (Elevator)

Memory Sharing: Poor
Total Memory Requirement: 2 × N × Ssweep

Page 17

Fixed-Stretch

Disk Latency: High (because of Stretch)
IO Time Variability: None (because of Fixed)

Page 18

Fixed-Stretch

Memory Sharing: Good
Total Memory Requirement: 1/2 × N × Sfs

Page 19

Throughput

Sweep
    Total Memory Requirement: 2 × N × Ssweep
    Available Memory = 40 MBytes
    N = 40

Fixed-Stretch
    Total Memory Requirement: 1/2 × N × Sfs
    Available Memory = 40 MBytes
    N = 42 (Higher Throughput)

Note: Based on a Realistic Case Study Using Seagate Disks
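
The comparison can be explored by searching for the largest N whose total memory requirement fits the budget. A minimal sketch, assuming the total-memory factors from the slides (2 for Sweep, 1/2 for Fixed-Stretch) and entirely hypothetical disk parameters; it does not reproduce the Seagate case study, and the function names are illustrative.

```python
def segment_size(n, tr, dr, t_seek):
    """Per-stream segment size S = N*TR*DR*Tseek / (TR - N*DR)."""
    return n * tr * dr * t_seek / (tr - n * dr)


def max_streams(memory_mb, tr, dr, t_seek, memory_factor):
    """Largest N with memory_factor * N * S(N) <= memory_mb."""
    best = 0
    n = 1
    while n * dr < tr:  # the disk must outpace N streams
        total = memory_factor * n * segment_size(n, tr, dr, t_seek)
        if total > memory_mb:
            break  # total memory grows with N, so no larger N can fit
        best = n
        n += 1
    return best


# Hypothetical disk: 10 MBytes/s transfer rate, 0.5 MBytes/s display rate.
# Fixed-Stretch is given twice the per-IO latency of Sweep (an assumption).
n_sweep = max_streams(40, 10.0, 0.5, t_seek=0.01, memory_factor=2.0)
n_fixed = max_streams(40, 10.0, 0.5, t_seek=0.02, memory_factor=0.5)
print(n_sweep, n_fixed)  # 18 19
```

With these made-up numbers the policy with the higher latency still admits more streams, because its smaller memory factor outweighs the larger segment size.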

Page 20

What Affects Throughput?

[Diagram: Disk Latency, Memory Utilization, and Disk Utilization shown as factors of Throughput; one link is marked × and another is marked ?]

Page 21

Is Conventional Wisdom Right?

Does Reducing Disk Latency lead to Better Disk Utilization? NO!

Does Reducing Disk Latency lead to Higher Throughput? NO!

Does Increasing Disk Utilization lead to Improved Cost Effectiveness?

Page 22

Per Stream Cost

Page 23

Per-Stream Memory Cost

Cm × S = (Cm × N × TR × DR × Tseek) / (TR - N × DR)

Page 24

Example

Disk Cost: $200 per unit
Memory Cost: $5 per MByte

Supporting N = 40 Requires 60 MBytes of Memory
    $200 + $300 = $500

Supporting N = 50 Requires 160 MBytes of Memory
    $200 + $800 = $1,000

For the same cost of $1,000, it's better to buy 2 Disks and 120 MBytes to support N = 80 Users!

Memory Use is Critical
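
The example's arithmetic can be laid out as a per-stream cost comparison. A minimal sketch using the prices and configurations quoted on the slide; the function and variable names are illustrative.

```python
DISK_COST = 200          # dollars per disk, from the slide
MEMORY_COST_PER_MB = 5   # dollars per MByte, from the slide


def system_cost(disks, memory_mb):
    """Total hardware cost in dollars for a disk and memory configuration."""
    return disks * DISK_COST + memory_mb * MEMORY_COST_PER_MB


# (disks, memory in MBytes, streams supported) as given in the example.
configs = {
    "1 disk, 60 MBytes": (1, 60, 40),
    "1 disk, 160 MBytes": (1, 160, 50),
    "2 disks, 120 MBytes": (2, 120, 80),
}
for label, (disks, memory_mb, streams) in configs.items():
    cost = system_cost(disks, memory_mb)
    print(f"{label}: ${cost}, ${cost / streams:.2f} per stream")
```

Pushing one disk from 40 to 50 streams raises the per-stream cost from $12.50 to $20.00, while two disks at 40 streams each stay at $12.50.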

Page 25

Is Conventional Wisdom Right?

Does Reducing Disk Latency lead to Better Disk Utilization? NO!

Does Reducing Disk Latency lead to Higher Throughput? NO!

Does Increasing Disk Utilization lead to Improved Cost Effectiveness? NO!

Page 26

So What?

Page 27

Outline

Server (Single Disk)
    Revisiting Conventional Wisdom
    Minimizing Cost
    Minimizing Initial Latency

Server (Parallel Disks)
    Balancing Workload
    Minimizing Cost & Initial Latency

Client
    Handling VBR
    Supporting VCR-like Functions

Page 28

Initial Latency

What is it?
    The time from when a request arrives at the server until the data is available in the server's main memory

Where is it important?
    Interactive applications (e.g., video games)
    Interactive features (e.g., fast-scan)

Page 29

Sweep (Elevator)

Page 30

Fixed-Stretch

Space Out IOs

Page 31

Fixed-Stretch

Page 32

Fixed-Stretch

Page 33

Our Contribution: BubbleUp

Fixed-Stretch Enjoys Fine Throughput

BubbleUp Remedies Fixed-Stretch to Minimize Initial Latency

Page 34

Schedule Office Work

8am: Host a Visitor
9am: Do Email
10am: Write Paper
11am: Write Paper
Noon: Lunch

Page 35

BubbleUp

Page 36

BubbleUp

Empty Slots are Always Next in Time

No Additional Memory Required
    Fill the Buffer up to the Segment Size

No Additional Disk Bandwidth Required
    The Disk Is Idle Otherwise

Page 37

Evaluation

Page 38

Fast-Scan

Page 39

Fast-Scan

Page 40

Data Placement Policies

Please refer to our publications

Page 41

Page 42

Chunk Allocation

Allocate Memory in Chunks
    A Chunk = k × S

Replicate the Last Segment of a Chunk at the Beginning of the Next Chunk

Example
    Chunk 1: s1, s2, s3, s4, s5
    Chunk 2: s5, s6, s7, s8, s9
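
The replication rule can be sketched as a small function that lays out segments into chunks of k, repeating the last segment of each chunk at the start of the next. The function name and the handling of a short final chunk are assumptions for illustration; the slides only give the 9-segment example.

```python
def build_chunks(num_segments, k):
    """Group segments s1..sN into chunks of k segments each.

    Every chunk after the first begins with a copy of the last segment
    of the previous chunk. The final chunk may hold fewer than k segments.
    """
    if num_segments < 1:
        raise ValueError("need at least one segment")
    if k < 2:
        raise ValueError("k must be at least 2 so each chunk adds a new segment")
    chunks = []
    start = 1
    while True:
        end = min(start + k - 1, num_segments)
        chunks.append([f"s{i}" for i in range(start, end + 1)])
        if end == num_segments:
            break
        start = end  # replicate the last segment at the head of the next chunk
    return chunks


for number, chunk in enumerate(build_chunks(9, 5), start=1):
    print(f"Chunk {number}: {', '.join(chunk)}")
```

With 9 segments and k = 5 this prints the two chunks shown in the example.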

Page 43

Chunk Allocation

Largest-Fit First
Best Fit (Last Chunk)

Page 44

18 Segment Placement

Page 45

Largest-Fit First

Page 46

Best Fit

Page 47

Outline

Server (Single Disk)
    Revisiting Conventional Wisdom
    Minimizing Cost
    Minimizing Initial Latency

Server (Parallel Disks)
    Balancing Workload
    Minimizing Cost & Initial Latency

Client
    Handling VBR
    Supporting VCR-like Functions

Page 48

Unbalanced Workload

Page 49

Balanced Workload

Page 50

Per Stream Memory Use (Use M Disks Independently)

S = (N × TR × DR × Tseek) / (TR - N × DR)

M × N

Page 51

Per Stream Memory Use (Use M Disks As One Disk)

M × N

Page 52

…Continue

S = (N × TR × DR × Tseek) / (TR - N × DR)

S’ = (N × M × TR × M × DR × Tseek) / (TR × M - N × M × DR)

S’ = M × (N × TR × DR × Tseek) / (TR - N × DR) = M × S
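
The scaling result can be verified numerically: treating M disks as one virtual disk multiplies both the transfer rate and the number of streams by M, and the segment size grows by the same factor. A minimal sketch with hypothetical parameter values; the function name is illustrative.

```python
def segment_size(n, tr, dr, t_seek):
    """Per-stream segment size S = N*TR*DR*Tseek / (TR - N*DR)."""
    return n * tr * dr * t_seek / (tr - n * dr)


# Hypothetical values: 10 streams per disk, 4 disks, 10 MBytes/s per disk.
n, m, tr, dr, t_seek = 10, 4, 10.0, 0.5, 0.02

s_independent = segment_size(n, tr, dr, t_seek)       # each disk on its own
s_striped = segment_size(n * m, tr * m, dr, t_seek)   # M disks as one disk

print(s_independent, s_striped, s_striped / s_independent)
```

The ratio equals M, which is why striping raises the per-stream memory cost.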

Page 53

Challenges

Using M Disks Independently:
    Unbalanced Workload
    Low Per-Stream Memory Cost

Using M Disks As One Virtual Disk (i.e., Employing Fine-Grained Striping):
    Balanced Workload
    High Per-Stream Memory Cost

Page 54

Our Approach (2DB)

Use Disks Independently
    To Minimize Cost

Replicate Hot Movies (20% of Movies)
    To Balance Workload

Use BubbleUp
    To Minimize Initial Latency

Page 55

2D BubbleUp (2DB)

Intelligent Data Placement
Efficient Request Scheduling
FODO, 1998

Page 56

2DB Data Placement: Chunk Allocation

Page 57

2DB Scheduling

Formally, This is a Bipartite Weighted Matching Problem
    Can be solved using the Hungarian method in O(V^3), where V = NM

We Use a Greedy Method to Reduce the Problem to a Bipartite Unweighted Matching Problem
    Can be solved in O(M^2)
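
For readers unfamiliar with unweighted bipartite matching, here is a generic augmenting-path sketch. It is not the authors' greedy reduction or their O(M^2) procedure, which the slides do not spell out; the modelling of one request per disk per time slot, the function name, and the sample data are all assumptions for illustration.

```python
def max_bipartite_matching(candidates, num_disks):
    """Match requests to disks, at most one request per disk.

    candidates[r] lists the disks that hold the data request r needs
    (more than one when the movie is replicated).
    Returns a dict mapping each matched request to its disk.
    """
    disk_owner = [None] * num_disks

    def try_assign(request, visited):
        for disk in candidates[request]:
            if disk in visited:
                continue
            visited.add(disk)
            # Take a free disk, or move its current owner to another disk.
            if disk_owner[disk] is None or try_assign(disk_owner[disk], visited):
                disk_owner[disk] = request
                return True
        return False

    for request in range(len(candidates)):
        try_assign(request, set())
    return {owner: disk for disk, owner in enumerate(disk_owner) if owner is not None}


# Request 0 can use disk 0 or 1, request 1 only disk 0, request 2 disk 1 or 2.
matching = max_bipartite_matching([[0, 1], [0], [1, 2]], num_disks=3)
print(matching)
```

In this example request 0 is moved from disk 0 to disk 1 so that request 1, which has no alternative, can be served in the same slot.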

Page 58

Why Does 2DB Work?

Page 59

Page 60

Page 61

n balls n urns, finite n:
    (ln n / ln ln n) × (1 + o(1))
    ln ln n / ln 2 + O(1)

m balls n urns, m > n and infinite m and n:
    d: number of possible destinations
    (ln ln n / ln d) × (1 + o(1)) + O(m/n)
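
The two finite-n expressions match the classical maximum-load results for throwing each ball into one random urn versus into the less loaded of two random urns. A small simulation sketch of that effect, assuming uniformly random choices and a fixed seed; it illustrates the balls-and-urns model only and is not a model of the 2DB scheduler.

```python
import random


def max_load(num_balls, num_urns, d, rng):
    """Throw balls one by one; each goes to the least loaded of d random urns."""
    loads = [0] * num_urns
    for _ in range(num_balls):
        choices = [rng.randrange(num_urns) for _ in range(d)]
        target = min(choices, key=lambda urn: loads[urn])
        loads[target] += 1
    return max(loads)


rng = random.Random(0)
n = 10_000
one_choice = max_load(n, n, 1, rng)
two_choices = max_load(n, n, 2, rng)
print(f"d = 1: max load {one_choice}; d = 2: max load {two_choices}")
```

Giving each ball even two possible destinations flattens the fullest urn noticeably, which is the intuition behind replicating hot movies on more than one disk.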

Page 62

What Does 2DB Cost?

Storage Cost
    Additional disk cost = % hot movies
    Typically 20% of movies are subscribed 80% of the time

Throughput
    Throughput is scaled back by a fraction to achieve balanced work

Page 63

Evaluation

2DB Achieves Balanced Workload with High Throughput
    Compared to, e.g., some dynamic load balancing schemes

2DB Incurs Low Additional Storage Cost

2DB Enjoys Minimum Initial Latency

Page 64

Outline

Server (Single Disk)
    Revisiting Conventional Wisdom
    Minimizing Cost
    Minimizing Initial Latency

Server (Parallel Disks)
    Balancing Workload
    Minimizing Cost & Initial Latency

Client
    Handling VBR
    Supporting VCR-like Functions

Page 65

Media Client

Most Studies Assume Dumb Clients

We Propose Smart Clients for
    Handling VBR
    Supporting VCR-like Functions

Page 66

Handling VBR

Server Can Handle VBR
    Frame rate fluctuates but the moving average does not fluctuate as much
    Rates even out when N is large, which is typically the case
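
The smoothing effect of a large N can be illustrated with a small simulation. A minimal sketch, assuming independent streams whose per-interval rate is drawn uniformly between 2 and 6 Mbps; these distributions and the function name are hypothetical and not from the slides.

```python
import random
import statistics


def aggregate_cv(num_streams, num_samples, rng):
    """Coefficient of variation of the summed bitrate of independent streams."""
    totals = [
        sum(rng.uniform(2.0, 6.0) for _ in range(num_streams))
        for _ in range(num_samples)
    ]
    return statistics.pstdev(totals) / statistics.mean(totals)


rng = random.Random(1)
cv_single = aggregate_cv(1, 2000, rng)
cv_many = aggregate_cv(100, 2000, rng)
print(f"N = 1: {cv_single:.3f}, N = 100: {cv_many:.3f}")
```

Under the independence assumption the relative fluctuation of the aggregate shrinks roughly with the square root of N, so the server sees a much steadier load than any single stream suggests.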

Page 67

...VBR

But the Server Cannot Eliminate Bitrate Mismatch
    Packetization and Channel Delay can change the bitrate

The Solution Must Be at the Client Side!

Page 68

Supporting VCR-like Functions

Pause
    Phone call interruptions
    Biological needs

Fast Forward
    Catching up with the program after a pause

Instant Replay

Page 69

How to Pause A Movie?

Broadcast TV Cannot Be Paused

Pausing Via a Point-to-point Link Affects the Server's Scheduling

Caching!!!
    Main Memory Caching?
    Too expensive! (19.2 Mbps × 20 min ≈ 2.9 GBytes)

Page 70

Buffer Management

Page 71

Challenges

Must Ensure Arriving Bits Do Not Overflow the Network Buffer

Must Ensure Decoder Buffer Does Not Underflow

Must Work with Any Off-the-shelf Disk and CPU Box

Page 72

Our Contribution: MEDIC

MEDIC: MEmory & Disk Integrated Cache

MEDIC Manages IOs Between Memory and Disk Efficiently
    Only 4 MBytes of main memory needed!!!
    Makes a set-top box affordable

MEDIC Adapts to Hardware Configuration

Page 73

Demo

Regular Playback
Pause
Resume Regular Playback
Fast Forward
Instant Replay (not shown)

Page 74

Visualize MEDIC

Page 75

Conclusions (Contributions in Blue)

Server (Single Disk)
    Revisiting Conventional Wisdom
    Minimizing Cost
    Minimizing Initial Latency

Server (Parallel Disks)
    Balancing Workload
    Minimizing Cost & Initial Latency

Client
    Handling VBR
    Supporting VCR-like Functions

Page 76

…Conclusions

Our Server Supports
    Low Latency Playback and Fast Forward

Our Client Supports
    Pause and Low Latency Instant Replay

Together, We Propose A Complete End-to-end Solution for Continuous Media Data Delivery!

Page 77

Future Work

Enhancing MEDIC for Managing Heterogeneous Data, from Both Broadcast & Internet Channels
    Video Panoramas
    Interactive TV

Indexing Videos for Replay
    Video/Image databases