bigtable: a distributed storage system for structured...
TRANSCRIPT
![Page 1: Bigtable: A Distributed Storage System for Structured Dataranger.uta.edu/~sjiang/CSE6350-spring-19/Presentations-materials/Topic... · •Bigtable is a distributed multi-dimensional](https://reader031.vdocuments.mx/reader031/viewer/2022041521/5e2e826da100e4121d5c236b/html5/thumbnails/1.jpg)
BIGTABLE: A DISTRIBUTED STORAGE SYSTEM FOR STRUCTURED DATA
Presenter: Qiping Wei
Introduction
Bigtable is a distributed storage system for managing structured data.
– Designed to scale to petabytes of data
– Many projects at Google store data in it
  • Web indexing, Google Earth, Google Finance, ...
  • Different data sizes: URLs, web pages, satellite imagery, ...
  • Different latency requirements: from backend bulk processing to real-time data serving
– Provides a flexible, high-performance solution
– Relies on the Chubby lock service
– Uses the Google File System (GFS) to store data items
[Figure: data model diagram labeled with row key, column key, and timestamp.]
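The three labels can be read as the parts of a cell's key. Below is a minimal Python sketch, assuming a plain in-memory dictionary stands in for the table; the `Table` class, its method names, and the sample keys are hypothetical and are not Bigtable's API.

```python
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for the (row key, column key, timestamp) -> value map.
Cell = Tuple[str, str, int]


class Table:
    def __init__(self) -> None:
        self._cells: Dict[Cell, bytes] = {}

    def put(self, row: str, column: str, timestamp: int, value: bytes) -> None:
        self._cells[(row, column, timestamp)] = value

    def get_latest(self, row: str, column: str) -> Optional[bytes]:
        # Among all versions of the cell, return the one with the largest timestamp.
        versions = [(ts, v) for (r, c, ts), v in self._cells.items()
                    if r == row and c == column]
        return max(versions)[1] if versions else None

    def sorted_cells(self):
        # Row key ascending, then column key, then newest timestamp first.
        return sorted(self._cells.items(),
                      key=lambda kv: (kv[0][0], kv[0][1], -kv[0][2]))


table = Table()
table.put("fruit.apple", "color", 1, b"green")
table.put("fruit.apple", "color", 2, b"red")
table.put("fruit.banana", "color", 1, b"yellow")
```

Here `table.get_latest("fruit.apple", "color")` picks the version written with timestamp 2, because several versions of the same cell can coexist.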
Property: reads of short row ranges are efficient
Row key | … | Value
AA | … | Value
AZ | … | Value
BA | … | Value
BU | … | Value
CB | … | Value
CM | … | Value
… | … | …
Tablet: a row range; the unit of distribution.
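Because rows are kept sorted by key and each tablet covers one contiguous row range, locating the tablet for a key can be a binary search over tablet boundaries. The sketch below assumes each tablet is described by the last row key it holds; the boundary values and tablet names are made up for illustration.

```python
import bisect

# Hypothetical tablets, each identified by the last row key it holds (inclusive).
# The final tablet is open-ended, so its end key is a sentinel above any real key.
TABLET_END_KEYS = ["AZ", "BU", "\uffff"]
TABLET_NAMES = ["tablet-0", "tablet-1", "tablet-2"]


def find_tablet(row_key: str) -> str:
    # bisect_left returns the first tablet whose end key is >= row_key.
    return TABLET_NAMES[bisect.bisect_left(TABLET_END_KEYS, row_key)]


def tablets_for_range(start_key: str, end_key: str) -> list:
    # A short row range touches one tablet, or a few adjacent ones.
    first = bisect.bisect_left(TABLET_END_KEYS, start_key)
    last = bisect.bisect_left(TABLET_END_KEYS, end_key)
    return TABLET_NAMES[first:last + 1]
```

A read of the range AA to AZ stays inside a single tablet in this layout, which is why short row ranges are cheap to read.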
Good locality is obtained by placing data that is accessed together close together.
To achieve it, choose row keys so that related keys sort next to each other.
E.g., two row keys: apple, banana
Revised row keys: fruit.apple, fruit.banana
• Get a lot of useful data with a single read request.
• Reduce the number of read requests.
• Get faster read operations.
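A small sketch of why a shared key prefix helps: in sorted order, all keys with the same prefix are contiguous, so one range read returns them together. The `veg.` rows and the values are hypothetical, added only to show that unrelated rows are skipped.

```python
import bisect

# Hypothetical sorted rows; the "veg." rows are made-up neighbours.
ROWS = sorted([
    ("fruit.apple", "red"),
    ("fruit.banana", "yellow"),
    ("veg.carrot", "orange"),
    ("veg.leek", "green"),
])
KEYS = [key for key, _ in ROWS]


def scan_prefix(prefix: str):
    # Keys sharing a prefix are adjacent in sorted order, so a single range
    # read over [prefix, prefix + highest character) returns all of them.
    start = bisect.bisect_left(KEYS, prefix)
    end = bisect.bisect_left(KEYS, prefix + "\uffff")
    return ROWS[start:end]
```

With the original keys `apple` and `banana`, other rows could sort between them, and the same data might need more than one read.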
Architecture
• Bigtable has three major components:
– One master server
  • Assigns tablets to tablet servers
  • Detects the addition and expiration of tablet servers
  • Balances the load of tablet servers
  • Collects garbage in GFS
  • Handles schema changes
– Many tablet servers
  • Manage a set of tablets
  • Handle read/write requests from clients
  • Store data in GFS
– A library linked into every client
  • Communicates directly with tablet servers for reads and writes
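The master's assignment role can be sketched as a toy class. The least-loaded placement policy, the `Master` class, and the server names below are assumptions for illustration only; the slides do not say how the master chooses a server.

```python
class Master:
    """Toy master: tracks live tablet servers and assigns tablets to them."""

    def __init__(self):
        self.assignments = {}  # server name -> set of tablet names

    def add_server(self, server):
        # Detecting a new tablet server makes it eligible for assignments.
        self.assignments.setdefault(server, set())

    def assign(self, tablet):
        # Assumed policy: pick the server that currently holds the fewest tablets.
        server = min(self.assignments,
                     key=lambda s: (len(self.assignments[s]), s))
        self.assignments[server].add(tablet)
        return server

    def expire_server(self, server):
        # When a server expires, its tablets are reassigned to the survivors.
        orphans = self.assignments.pop(server)
        return {tablet: self.assign(tablet) for tablet in sorted(orphans)}
```

Reads and writes do not appear in this class because, as the last bullet says, clients send them to tablet servers directly.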
[Figure: architecture diagram. The Bigtable client library sends read/write requests to the tablet servers. The one master assigns tablets, detects the addition and expiration of tablet servers, balances load, handles schema changes, and collects garbage. Each tablet server manages a set of tablets, handles read/write requests, and stores data in GFS.]
Data items are stored either in log files or in SSTables.
GFS provides data reliability by keeping multiple replicas of the data.
[Figure: the memtable resides in memory; SSTables and log files reside in GFS.]
Immutable: an SSTable's data cannot be modified after it is written.
This property has many benefits:
• It simplifies various parts of Bigtable, e.g., caches are easy to maintain and concurrency control is efficient to implement
• Tablets can be split quickly
• Data can be restored
Assuming the KV item exists, the search starts from the memtable and then goes through the SSTables, from the lowest level to the highest, until the item is found.
The steps for each SSTable are:
• Check the in-memory index first.
• Find the appropriate block.
• Check the Bloom filter to see whether the KV item may be there.
• If yes, read the block from disk and get the value.
• If no, repeat the steps above on the next SSTable until the block is found and the value is retrieved.
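The steps above can be sketched as follows. This is a toy model under stated assumptions: the `BloomFilter` and `SSTable` classes are hypothetical, blocks are Python lists rather than disk blocks, and "low level to high level" is interpreted as newest SSTable to oldest.

```python
import bisect
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class SSTable:
    """Immutable sorted run split into blocks, with an in-memory block index."""

    def __init__(self, items, block_size=2):
        pairs = sorted(items.items())
        self.blocks = [pairs[i:i + block_size]
                       for i in range(0, len(pairs), block_size)]
        self.index = [block[-1][0] for block in self.blocks]  # last key per block
        self.bloom = BloomFilter()
        for key, _ in pairs:
            self.bloom.add(key)
        self.block_reads = 0  # counts simulated disk reads

    def get(self, key):
        block_no = bisect.bisect_left(self.index, key)  # index -> candidate block
        if block_no == len(self.blocks):
            return None                                 # key is past the last block
        if not self.bloom.might_contain(key):
            return None                                 # definitely absent: no disk read
        self.block_reads += 1                           # read the block "from disk"
        return dict(self.blocks[block_no]).get(key)


def read(key, memtable, sstables):
    # Memtable first, then SSTables in the order given (assumed newest first).
    if key in memtable:
        return memtable[key]
    for sstable in sstables:
        value = sstable.get(key)
        if value is not None:
            return value
    return None
```

The Bloom filter only saves work on misses: a positive answer still requires reading the block, because the filter can report a key that is not actually present.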
Benefits of keeping the keys sorted:
• Supports range search
• Reduces index size
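Both benefits can be seen in a short sketch. Because entries are sorted, the index only needs one key per block instead of one per entry, and a range search is a single seek followed by a sequential scan. The block size and the generated keys are assumptions chosen for illustration.

```python
import bisect

BLOCK_SIZE = 4  # assumed number of entries per block

# Hypothetical sorted entries: key000 .. key099.
entries = [(f"key{i:03d}", f"value{i}") for i in range(100)]
blocks = [entries[i:i + BLOCK_SIZE] for i in range(0, len(entries), BLOCK_SIZE)]

# Sparse index: only the first key of each block, not every key.
sparse_index = [block[0][0] for block in blocks]


def range_search(start_key, end_key):
    """Return entries with start_key <= key < end_key."""
    # Seek to the last block whose first key is <= start_key (or block 0).
    block_no = max(bisect.bisect_right(sparse_index, start_key) - 1, 0)
    result = []
    for block in blocks[block_no:]:
        for key, value in block:
            if key >= end_key:
                return result  # sorted order: nothing later can match
            if key >= start_key:
                result.append((key, value))
    return result
```

With 100 entries and 4 entries per block, the index holds 25 keys rather than 100. An unsorted layout would need an index entry for every key and could not stop a scan early.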
• "These updates": the updates that have been committed to the commit log.
• How is an update performed?
  • It is an ordinary write operation.
  • It relies on the way a key is searched: from top to bottom.
  • The search returns the latest version of the key.
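A minimal sketch of this update path, assuming a Python list stands in for the commit log and dictionaries stand in for the memtable and SSTables; the `Tablet` class and its fields are hypothetical names.

```python
class Tablet:
    """Toy tablet: a commit log, a memtable, and older SSTables."""

    def __init__(self):
        self.commit_log = []  # append-only list standing in for the log file
        self.memtable = {}    # key -> (sequence number, value)
        self.sstables = []    # older data, newest table first
        self._seq = 0

    def write(self, key, value):
        # An update is an ordinary write: log it first, then apply it in memory.
        self._seq += 1
        self.commit_log.append((self._seq, key, value))
        self.memtable[key] = (self._seq, value)

    def read(self, key):
        # Search top to bottom; the first hit is the latest version.
        for layer in [self.memtable, *self.sstables]:
            if key in layer:
                return layer[key][1]
        return None
```

The old value is never overwritten in place. It stays in its SSTable and is simply shadowed by the newer entry found earlier in the search.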
• Why do they exist?
– Limited memtable size
– Immutable SSTables
– Multiple versions allowed
• Why do they exist in multiple SSTables?
– SSTables from different levels can have overlapping ranges
• Minor compaction: converts the memtable into a new SSTable.
• Major compaction:
  – Merges multiple SSTables into one new, large SSTable.
  – The resulting SSTable contains no deletion information or deleted data.
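The two compactions can be sketched with sorted tuples standing in for SSTables. The sequence numbers, the use of `None` as the deletion marker, and the function signatures are assumptions made for this illustration.

```python
import heapq

TOMBSTONE = None  # assumed marker for a deletion record


def minor_compaction(memtable, seq):
    """Turn the memtable into a sorted, immutable SSTable tagged with `seq`."""
    return tuple((key, seq, value) for key, value in sorted(memtable.items()))


def major_compaction(sstables):
    """Merge sorted SSTables into one: newest version per key, no tombstones."""
    # Order by key, then by descending sequence number so the newest comes first.
    merged = heapq.merge(*sstables, key=lambda entry: (entry[0], -entry[1]))
    result, last_key = [], object()
    for key, seq, value in merged:
        if key == last_key:
            continue  # an older version of a key that was already handled
        last_key = key
        if value is not TOMBSTONE:
            result.append((key, seq, value))
    return tuple(result)


old = minor_compaction({"a": 1, "b": 2, "c": 3}, seq=1)
new = minor_compaction({"b": 20, "c": TOMBSTONE}, seq=2)
compacted = major_compaction([old, new])
```

In this run `compacted` keeps `a` from the old table and the newer `b`, while `c` disappears entirely: neither the deleted value nor its deletion marker survives the merge.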
[Figure: a minor compaction writes the in-memory memtable to a new SSTable in GFS; a major compaction merges SSTables in GFS into one SSTable.]
Benefits of major compaction:
• Bounds the number of SSTables
• Reclaims resources used by deleted data
• Removes overlapping ranges, which supports range search
[Figure: a read operation consults the memtable in memory and the SSTables in GFS.]
Delete Op
1. Write a special deletion record.
2. Insert the record.
3. Minor compaction.
4. Major compaction.
[Figure: the delete path through the commit log, the memtable in memory, and the SSTables in GFS.]
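The four steps of a delete can be traced in a toy store. This sketch assumes step 1 targets the commit log and step 2 targets the memtable, which is one reading of the figure labels; the `TabletStore` class and the `TOMBSTONE` marker are hypothetical.

```python
TOMBSTONE = "<deleted>"  # hypothetical marker for the special deletion record


class TabletStore:
    def __init__(self, sstable):
        self.commit_log = []
        self.memtable = {}
        self.sstables = [dict(sstable)]  # newest first

    def delete(self, key):
        self.commit_log.append(("delete", key))  # 1. write the deletion record
        self.memtable[key] = TOMBSTONE           # 2. insert it into the memtable

    def read(self, key):
        for layer in [self.memtable, *self.sstables]:
            if key in layer:
                value = layer[key]
                return None if value == TOMBSTONE else value
        return None

    def minor_compaction(self):                  # 3. memtable -> new SSTable
        self.sstables.insert(0, dict(self.memtable))
        self.memtable = {}

    def major_compaction(self):                  # 4. merge, dropping deleted data
        merged = {}
        for sstable in reversed(self.sstables):  # oldest first, newer overwrites
            merged.update(sstable)
        self.sstables = [{k: v for k, v in merged.items() if v != TOMBSTONE}]
```

Until step 4 runs, the deleted value is still stored in the older SSTable; reads hide it only because the deletion record is found first.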
Conclusion
• Bigtable is a distributed, multi-dimensional sorted map indexed by a row key, a column key, and a timestamp.
• Sorting has many benefits:
  – Supports range search
  – Reduces index size
  – Supports read/write operations
  – Allows row keys to be chosen to get good locality
• Bigtable provides high data reliability through GFS.
• The immutability and compaction of SSTables simplify Bigtable and improve its performance.
References
• F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, R. E. Gruber. Bigtable: A Distributed Storage System for Structured Data. OSDI, 2006.
• Lecture: the Google Bigtable. https://www.slideshare.net/romain_jacotin/undestand-google-bigtable-is-as-easy-as-playing-lego-bricks-lecture-by-romain-jacotinhe. October 2014.