Massively Distributed Database Systems Broadcasting - Data on air Spring 2014 Ki-Joune Li http://isel.cs.pusan.ac.kr/~lik Pusan National University


Post on 27-Dec-2015



Massively Distributed Database Systems

Broadcasting - Data on air
Spring 2014
Ki-Joune Li
http://isel.cs.pusan.ac.kr/~lik
Pusan National University


Why Broadcasting?

• Simple
• Data access pattern: mostly asymmetric
• Scalability: well suited to massively distributed environments
• Examples
  • DMB
  • TPEG


TPEG – Transport Protocol Experts Group

• A protocol for broadcasting traffic information


TPEG – Message format


TPEG Service Contents Example


TPEG Service


Air Update – Map Data Update


Basic Idea – Broadcast Disks

Disk               Broadcast
-----------------  -----------------------------------------------
Disk access time   Frequency (broadcasting period)
Block              Packet
Memory hierarchy   Multiple broadcasting disks (paper 1)
File structure     Message format (paper 2)
Indexing           Indexing broadcasting (paper 3)
Query processing   Query processing for broadcasting data (paper 4)


Key papers and documents

• S. Acharya et al., “Broadcast Disks: Data Management for Asymmetric Communication Environments”, ACM SIGMOD 1995, pp. 199-210
• T. Imielinski, S. Viswanathan, and B.R. Badrinath, “Data on Air: Organization and Access”, IEEE TKDE, Vol. 9, No. 3, 1997, pp. 353-372
• J. Xu et al., “Energy Efficient Indexing for Querying Location Dependent Data in Mobile Broadcasting Environments”, ICDE 2003, pp. 239-250
• B. Zheng et al., “Spatial Queries in Wireless Broadcast Systems”, Wireless Networks, Vol. 10, 2004, pp. 723-736
• tisa.org, TPEG, http://www.tisa.org/assets/Uploads/Public/TISA14001TPEGWhatisitallabout2014.pdf


Paper #1 – Broadcast Disks, SIGMOD 1995


Key Ideas

• Broadcasting as a disk
  • How to organize the broadcast message
  • Flat message as a disk
  • Messages with different frequencies as multiple disks
• Two issues
  • How to organize the message – server side
  • How to maintain the cache – client side


Message Format

• Given three data items A, B, and C to broadcast with different access probabilities, three message formats are possible:
  • Flat format
  • Skewed format
  • Multiple disks format


Performance Measures

• What is the goal?
  • To minimize the average waiting time (expected delay)
• Example
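The expected delay of a broadcast layout can be worked out with a short sketch. It assumes requests arrive uniformly at random over one cyclic broadcast and that delay is counted in slots until the requested item starts. The three layouts (`ABC`, `AABC`, `ABAC`) and the access probabilities are illustrative assumptions, since the slide's own figures are not reproduced here.

```python
from fractions import Fraction

def expected_delay(program, access_prob):
    """Expected wait (in broadcast slots) until the requested item starts.

    Assumption: requests arrive uniformly at random over one cyclic broadcast.
    """
    n = len(program)
    total = Fraction(0)
    for item, prob in access_prob.items():
        slots = [i for i, x in enumerate(program) if x == item]
        # Gap from each occurrence to the next one, wrapping around the cycle.
        gaps = [((slots[(k + 1) % len(slots)] - s) % n) or n
                for k, s in enumerate(slots)]
        # A request lands in a gap of length g with probability g/n
        # and then waits g/2 on average.
        total += prob * Fraction(sum(g * g for g in gaps), 2 * n)
    return total

# Hypothetical layouts for items A, B, and C.
FLAT = "ABC"
SKEWED = "AABC"
MULTI_DISK = "ABAC"

# Hypothetical access probabilities.
uniform = {k: Fraction(1, 3) for k in "ABC"}
skewed_demand = {"A": Fraction(9, 10), "B": Fraction(1, 20), "C": Fraction(1, 20)}

for name, program in [("flat", FLAT), ("skewed", SKEWED), ("multi-disk", MULTI_DISK)]:
    print(name,
          float(expected_delay(program, uniform)),
          float(expected_delay(program, skewed_demand)))
```

Under these assumptions the flat layout is best when all items are equally popular, while the multi-disk layout wins once item A dominates the demand. Spreading the repeats of A evenly also beats clustering them, because the delay grows with the square of each gap.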


Message Formatting Method - Server

• Algorithm
  1. Sort and classify pages by access probability
  2. Determine the relative frequency of each disk (page)
  3. Partition each disk into a set of chunks
  4. Define the message format with multiple disks
• Example
  • 4 pages/cycle
  • Relative frequencies: F(T1)=1, F(T2)=2, F(T3)=4
  • LCM = 4 minor cycles
  • Length(T3)/LCM = 2
  • Major cycle = S*LCM
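The four steps can be sketched as a small generator of the broadcast program. This is a minimal sketch, not necessarily the slide's exact example: the function name `broadcast_program`, the page numbers, and the disk sizes (1, 2, and 8 pages with relative frequencies 4, 2, and 1, fastest disk first) are assumptions chosen so that each minor cycle holds 4 pages and the major cycle holds 4 minor cycles.

```python
from functools import reduce
from math import gcd

def lcm_all(values):
    """Least common multiple of a list of positive integers."""
    return reduce(lambda a, b: a * b // gcd(a, b), values, 1)

def broadcast_program(disks, rel_freq):
    """Interleave the disks into one major cycle.

    disks    : list of page lists, already sorted from hottest to coldest
    rel_freq : relative broadcast frequency of each disk (same order)
    """
    max_chunks = lcm_all(rel_freq)          # number of minor cycles
    chunked = []
    for pages, freq in zip(disks, rel_freq):
        num_chunks = max_chunks // freq     # slower disks get more chunks
        size = -(-len(pages) // num_chunks) # ceiling division
        # Trailing chunks may be short or empty if sizes do not divide evenly.
        chunked.append([pages[k * size:(k + 1) * size]
                        for k in range(num_chunks)])
    program = []
    for i in range(max_chunks):             # one minor cycle per iteration
        for chunks in chunked:
            program.extend(chunks[i % len(chunks)])
    return program

# Hypothetical input: 11 pages on three disks.
disks = [[1], [2, 3], [4, 5, 6, 7, 8, 9, 10, 11]]
program = broadcast_program(disks, rel_freq=[4, 2, 1])
print(program)
```

In this sketch page 1 appears in every minor cycle, pages 2 and 3 in every second one, and each page of the slowest disk once per major cycle.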


Caching Policy at Client

• Replacement policy
  • Not LRU
• Point 1: Caching the hottest pages is problematic. If the server considers a page hot, it broadcasts that page frequently, so caching it is not really necessary.
• Point 2: The server's policy is to minimize the average delay, which is not the same as serving local demands.


Caching Policy at Client

• For a given item A, we need to consider
  • Broadcasting frequency (X) and
  • Local access probability (P)
• Replacement in terms of
  • PIX (P/X) instead of LRU
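A minimal sketch of PIX-based replacement: the cache evicts the page with the lowest ratio of local access probability P to broadcast frequency X. The function name `pix_victim` and the numbers below are illustrative assumptions.

```python
def pix_victim(cached_pages, access_prob, bcast_freq):
    """Return the cached page with the lowest P/X value (the eviction victim)."""
    return min(cached_pages, key=lambda page: access_prob[page] / bcast_freq[page])

# Hypothetical values: page "A" is accessed more often than "B",
# but it is also broadcast far more often.
access_prob = {"A": 0.01, "B": 0.005}   # P: local access probability
bcast_freq = {"A": 0.01, "B": 0.001}    # X: share of the broadcast

victim = pix_victim(["A", "B"], access_prob, bcast_freq)
print(victim)
```

Here page A is evicted even though it is accessed twice as often as B, because A comes around on the broadcast quickly while a miss on B is expensive.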


Paper #2 – Data on Air: Organization and Access, TKDE 9(3), 1997


Key Ideas

• Disk access – disk access time
• Two different measures
  • Latency and
  • Energy consumption
• Data access time in Data on Air
  • Tuning time: amount of time spent by a client listening to the channel (determines power consumption)
  • Latency: time elapsed from the moment a client requests data to the point of completing the data download
  • Tuning time and latency together characterize data access time


Broadcast data format

[Figure: a bcast as a sequence of buckets; each bucket carries a bucket ID, a bcast ptr, an idx ptr, and a bucket type]

• Without an index, a client needs a full scan of the bcast
• Issue
  • How to organize the index and where to place it
  • For reducing tuning time and latency


Data Access

[Figure: timeline of a bcast with an index segment]

1. Client joins the broadcast
2. Wait until the index arrives
3. Wait until data bucket arrives
4. Read data
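The four steps can be traced in a small simulation that reports latency and tuning time in buckets. It is a sketch under simplifying assumptions that are not from the slides: one index segment sits at the start of every bcast, the client reads the whole index segment, and the names `access`, `n_index`, and `data_keys` are hypothetical.

```python
def access(n_index, data_keys, key, start):
    """Simulate one client access; return (latency, tuning_time) in buckets.

    n_index   : number of index buckets at the start of each bcast (assumed)
    data_keys : keys of the data buckets, in broadcast order
    key       : key the client is looking for
    start     : bucket position (0-based, within a bcast) where the client joins
    """
    bcast_len = n_index + len(data_keys)

    # Step 1: read the current bucket to learn where the next index starts.
    tuning = 1
    now = start + 1

    # Step 2: doze until the next index segment, then read it.
    next_index = -(-now // bcast_len) * bcast_len   # next multiple of bcast_len
    tuning += n_index

    # Step 3: doze until the data bucket named by the index arrives.
    data_pos = next_index + n_index + data_keys.index(key)

    # Step 4: read the data bucket.
    tuning += 1
    done = data_pos + 1

    return done - start, tuning

keys = ["k0", "k1", "k2", "k3", "k4", "k5", "k6", "k7"]
latency, tuning = access(n_index=2, data_keys=keys, key="k3", start=7)
print(latency, tuning)
```

The client is awake for only a few buckets even though it waits much longer, which is the point of separating tuning time from latency.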


Where to place Index

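• No index
• Single index
• (1,m) index

What is the difference? With (1,m) indexing the index is broadcast m times per bcast, so the client waits less for the next index, which may improve performance.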
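To make the difference concrete, the sketch below compares expected latency under a simplified model: the client waits on average half an index-plus-data segment for the next index, then half a bcast for its data bucket. The model, the function names, and the sizes (1000 data buckets, 10 index buckets) are assumptions for illustration only.

```python
import math

def latency_1m(data, index, m):
    """Expected latency (in buckets) when the index is sent m times per bcast."""
    probe_wait = (index + data / m) / 2   # wait for the next index segment
    bcast_wait = (m * index + data) / 2   # wait for the data bucket
    return probe_wait + bcast_wait

def latency_no_index(data):
    """Without an index the client listens until the item passes by,
    so latency and tuning time are both about half a bcast."""
    return data / 2

def best_m(data, index):
    """Brute-force search for the m that minimizes expected latency."""
    return min(range(1, data + 1), key=lambda m: latency_1m(data, index, m))

data, index = 1000, 10                     # hypothetical bucket counts
print("no index     :", latency_no_index(data))
print("single index :", latency_1m(data, index, 1))
print("(1,m), best m:", best_m(data, index), latency_1m(data, index, best_m(data, index)))
print("sqrt(data/index):", math.sqrt(data / index))
```

In this model no index gives the lowest latency but the client must listen the whole time, a single index cuts tuning time at the cost of latency, and (1,m) recovers much of the latency while keeping tuning time low. The best m found by search matches the square root of data size over index size.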


How to organize

• Full duplication vs. Relevant Duplication


No replication


Entire Path Replication


Distributed Index