bigtable: a distributed storage system for structured data 1
TRANSCRIPT
![Page 1: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/1.jpg)
In the name of Allah
Bigtable: A Distributed Storage System for Structured Data
PRESENTERS:
BAHAREH HABIBI
SHILAN HABIBI
SAMAN FOROUZANDEH
Bigtable: A Distributed Storage System for Structured Data 1
![Page 2: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/2.jpg)
Before we begin …
BigTable
Sawzall
MapReduce
Bloom Filters
![Page 3: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/3.jpg)
Table of Contents
Introduction
Data Model
API
Building Blocks
Implementation
Refinements
Performance Evaluation
Real Applications
Lessons
Related Work
Conclusions
![Page 4: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/4.jpg)
Introduction
What is Bigtable?
A distributed storage system for managing structured data at Google
Used by > 60 Google products
Google Analytics
Google Reader
Personalized Search
Orkut
![Page 5: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/5.jpg)
Introduction
Goals
Wide applicability
Scalability
High performance
High availability
Bigtable and Database
Bigtable does not support a full relational data model
![Page 6: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/6.jpg)
Data Model
A Bigtable is a sparse, distributed, persistent multi-dimensional sorted map
Distributed multi-dimensional sparse map: (row, column, timestamp) → cell contents
Example: Webtable
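The map described above can be pictured with a small in-memory sketch. This is only an illustration of the (row, column, timestamp) → cell contents shape, not Bigtable's implementation; the class name `SparseTable` and its methods are hypothetical. The row and column keys follow the Webtable example.

```python
from collections import defaultdict


class SparseTable:
    """Toy in-memory stand-in for the (row, column, timestamp) -> contents map."""

    def __init__(self):
        # row key -> column key -> {timestamp: value}; absent cells take no space
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, timestamp, value):
        self._rows[row][column][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest version at or before `timestamp` (newest overall if None)."""
        versions = self._rows.get(row, {}).get(column, {})
        candidates = [t for t in versions if timestamp is None or t <= timestamp]
        return versions[max(candidates)] if candidates else None

    def scan(self, start_row, end_row):
        """Yield (row, columns) in sorted row order for start_row <= row < end_row."""
        for row in sorted(self._rows):
            if start_row <= row < end_row:
                yield row, {c: dict(v) for c, v in self._rows[row].items()}


table = SparseTable()
table.put("com.cnn.www", "contents:", 3, "<html>v3")
table.put("com.cnn.www", "contents:", 5, "<html>v5")
table.put("com.cnn.www", "anchor:cnnsi.com", 9, "CNN")
```

Reading `table.get("com.cnn.www", "contents:")` returns the newest version, while passing a timestamp returns the newest version at or before it.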
![Page 7: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/7.jpg)
Data Model
Rows
row keys are arbitrary strings up to 64 KB
every read or write of data in a single row is atomic (regardless of the number of columns)
row ranges are dynamically partitioned into tablets
![Page 8: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/8.jpg)
Data Model
Column Families
column keys are grouped into sets called column families
usually of the same type
number of column families should be small
number of columns is unbounded
access control is at the column family level
![Page 9: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/9.jpg)
Data Model
Timestamps
Each cell in a Bigtable can contain multiple versions of the
same data
Versions are indexed by 64-bit integer timestamps
Garbage-collection settings per column family:
keep only the last n versions of a cell, or
keep only new-enough versions
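The two garbage-collection settings can be sketched as two small filters over the versions of one cell. This is a minimal illustration, assuming a cell is a plain `{timestamp: value}` dictionary; the function names are hypothetical.

```python
def keep_last_n(versions, n):
    """versions: {timestamp: value}. Keep only the n newest versions."""
    newest = sorted(versions, reverse=True)[:n]
    return {t: versions[t] for t in newest}


def keep_newer_than(versions, now, max_age):
    """Keep only versions whose timestamp is within max_age of now."""
    return {t: v for t, v in versions.items() if now - t <= max_age}


cell = {100: "a", 200: "b", 300: "c", 400: "d"}
last_three = keep_last_n(cell, 3)
recent = keep_newer_than(cell, now=450, max_age=200)
```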
![Page 10: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/10.jpg)
Data Model
Rows
Columns
Timestamps
![Page 11: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/11.jpg)
API
Metadata operations
Create/delete tables or column families
Change metadata
Writes (atomic)
Bigtable does not support general transactions across row keys
Client-supplied Sawzall scripts cannot write to Bigtable, but they can do filtering, summarization, and transformation
Bigtable can be used with MapReduce
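The atomic single-row write can be illustrated with a sketch in which every mutation to a row is applied under that row's own lock, so a reader never sees a half-applied change. This is an assumption-laden toy, not the Bigtable client API: the `Table` class, `apply`, and `read_row` are hypothetical names, and nothing here spans more than one row.

```python
import threading
from collections import defaultdict


class Table:
    def __init__(self):
        self._rows = defaultdict(dict)             # row key -> {column: value}
        self._locks = defaultdict(threading.Lock)  # one lock per row key
        self._locks_guard = threading.Lock()       # protects the lock table itself

    def _lock_for(self, row):
        with self._locks_guard:
            return self._locks[row]

    def apply(self, row, sets=None, deletes=()):
        """Apply all changes to one row atomically; other rows are not involved."""
        with self._lock_for(row):
            updated = dict(self._rows[row])
            updated.update(sets or {})
            for column in deletes:
                updated.pop(column, None)
            self._rows[row] = updated  # publish the new row state in one step

    def read_row(self, row):
        with self._lock_for(row):
            return dict(self._rows[row])


t = Table()
t.apply("com.cnn.www", sets={"anchor:www.c-span.org": "CNN", "anchor:www.abc.com": "ABC"})
t.apply("com.cnn.www", sets={"contents:": "<html>"}, deletes=["anchor:www.abc.com"])
```

Because each call to `apply` touches exactly one row, there is no way to express a transaction across row keys, which mirrors the restriction on the slide.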
![Page 12: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/12.jpg)
Building Blocks
Google File System (GFS): used to store log and data files
Scheduler (cluster management system): used to manage jobs and resources
SSTable: file format used internally to store Bigtable data
Chubby: distributed lock service, highly available with five active replicas
0.0047% unavailability averaged over 14 Bigtable clusters
0.0326% unavailability for the most affected cluster
![Page 13: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/13.jpg)
Implementation
What is a tablet?
A Bigtable cluster stores a number of tables
Each table consists of a set of tablets
Each tablet managed by a specific tablet server
As a table grows, it is automatically split into multiple tablets, each about 100-200 MB in size by default
Tablet servers handle read/write requests for their tablets
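A size-triggered split can be sketched as follows. This is a simplified illustration under stated assumptions: a tablet is modeled as a sorted list of `(row_key, size_in_bytes)` pairs, the 200 MB threshold is taken from the upper end of the default range above, and the split point is chosen where the running size reaches half of the total. The function name is hypothetical.

```python
MB = 1024 * 1024


def split_tablet(rows, max_bytes=200 * MB):
    """rows: list of (row_key, size_in_bytes) sorted by row_key.
    Returns [rows] if the tablet is small enough, else two contiguous row ranges."""
    total = sum(size for _, size in rows)
    if total <= max_bytes or len(rows) < 2:
        return [rows]
    # Cut after the first row at which the running size reaches half of the total.
    running, cut = 0, 1
    for i, (_, size) in enumerate(rows):
        running += size
        if running >= total / 2:
            cut = max(1, min(i + 1, len(rows) - 1))
            break
    return [rows[:cut], rows[cut:]]


big = [("a", 60 * MB), ("b", 60 * MB), ("c", 60 * MB), ("d", 60 * MB)]
small = [("a", 10 * MB)]
halves = split_tablet(big)
```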
![Page 14: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/14.jpg)
Implementation
Bigtable: Servers
The master manages the assignment of tablets to tablet servers
[Figure: the Bigtable master assigns tablets to tablet server 1 and tablet server 2]
![Page 15: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/15.jpg)
Implementation
Tablet Location
A three-level hierarchy of tablets is used to store tablet locations
The root tablet is never split
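The three-level lookup can be sketched with sorted end keys and binary search. In Bigtable the first level is a Chubby file pointing at the root tablet, the root tablet indexes the METADATA tablets, and each METADATA tablet indexes user tablets. The `RangeIndex` class, the server names, and the key ranges below are hypothetical and only illustrate the shape of the lookup.

```python
import bisect


class RangeIndex:
    """Maps a key to the entry whose end key is the first one >= that key."""

    def __init__(self, entries):
        # entries: list of (end_key, target), sorted by end_key
        self._end_keys = [end for end, _ in entries]
        self._targets = [target for _, target in entries]

    def lookup(self, key):
        i = bisect.bisect_left(self._end_keys, key)
        if i == len(self._end_keys):
            raise KeyError(key)
        return self._targets[i]


# Level 3: METADATA tablets, each indexing a range of user tablets.
meta_a = RangeIndex([("g", "tabletserver-1"), ("m", "tabletserver-2")])
meta_b = RangeIndex([("t", "tabletserver-3"), ("\uffff", "tabletserver-1")])
# Level 2: the root tablet indexes the METADATA tablets and is never split.
root = RangeIndex([("m", meta_a), ("\uffff", meta_b)])
# Level 1: a well-known pointer to the root tablet (a Chubby file in Bigtable).
chubby_root_pointer = root


def locate(row_key):
    """Follow root -> METADATA tablet -> tablet server for a row key."""
    metadata_tablet = chubby_root_pointer.lookup(row_key)
    return metadata_tablet.lookup(row_key)
```

Keeping the root tablet unsplit is what bounds the hierarchy at three levels: there is always exactly one place to start.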
![Page 16: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/16.jpg)
Implementation
Tablet Assignment
A master server is responsible for assigning tablets to tablet
servers
The master server also:
detects addition and expiration of tablet servers
balances tablet server loads
initiates garbage collection of files in GFS
reassigns tablets when a tablet server is lost
If the master server dies, a new master server is started
![Page 17: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/17.jpg)
Implementation
Tablet Serving
The persistent state of a tablet is stored in GFS
[Figure: tablet representation. In memory: the memtable. In GFS: the tablet log and the SSTable files. Write ops and read ops are shown against these components.]
![Page 18: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/18.jpg)
Implementation
Compactions
Minor Compactions
memtable size reaches a threshold
memtable is frozen
new memtable is created
frozen memtable is converted into a new SSTable
Merging Compactions
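The compaction steps can be sketched with a toy tablet that holds a log, a memtable, and a list of immutable sorted runs standing in for SSTables. This is a simplified model under stated assumptions: the threshold counts entries rather than bytes, everything lives in process memory rather than GFS, and the class and method names are hypothetical.

```python
class Tablet:
    def __init__(self, memtable_limit):
        self.memtable_limit = memtable_limit
        self.log = []        # commit log (stored in GFS in Bigtable)
        self.memtable = {}   # recent writes, in memory
        self.sstables = []   # immutable sorted runs, oldest first

    def write(self, key, value):
        self.log.append((key, value))   # log first, then apply to the memtable
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.minor_compaction()

    def minor_compaction(self):
        frozen, self.memtable = self.memtable, {}            # freeze, start a new one
        self.sstables.append(tuple(sorted(frozen.items())))  # frozen -> new SSTable
        self.log.clear()   # simplification: these entries no longer need replay

    def read(self, key):
        if key in self.memtable:                 # newest data first
            return self.memtable[key]
        for sstable in reversed(self.sstables):  # then newest SSTable to oldest
            for k, v in sstable:
                if k == key:
                    return v
        return None

    def merging_compaction(self):
        merged = {}
        for sstable in self.sstables:   # oldest first, so newer values win
            merged.update(sstable)
        merged.update(self.memtable)
        self.memtable = {}
        self.log.clear()
        self.sstables = [tuple(sorted(merged.items()))]


tablet = Tablet(memtable_limit=2)
tablet.write("a", 1)
tablet.write("b", 2)   # reaches the threshold and triggers a minor compaction
tablet.write("a", 3)   # lands in the fresh memtable
```

Reads see a merged view: `"a"` comes from the memtable and `"b"` from the SSTable. A merging compaction then folds both into a single SSTable.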
![Page 19: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/19.jpg)
Refinements
A number of refinements were required for Bigtable implementations to achieve high:
performance
availability
reliability
![Page 20: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/20.jpg)
Refinements
Locality groups
Clients can group multiple column families together into a
locality group
A separate SSTable is generated for each locality group
Segregating column families which are not typically accessed
together enables more efficient reads
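A sketch of how locality groups segregate data: each column family is mapped to one group, and each group gets its own sorted run, so a read that needs only one group never touches the others. The function name and the group names below are hypothetical; the column families follow the Webtable example.

```python
def build_sstables(rows, locality_groups):
    """rows: {row_key: {"family:qualifier": value}}.
    locality_groups: {group_name: set of column families}.
    Returns one sorted list of ((row, column), value) per locality group."""
    family_to_group = {f: g for g, fams in locality_groups.items() for f in fams}
    sstables = {g: [] for g in locality_groups}
    for row_key in sorted(rows):
        for column in sorted(rows[row_key]):
            family = column.split(":", 1)[0]
            group = family_to_group[family]
            sstables[group].append(((row_key, column), rows[row_key][column]))
    return sstables


rows = {
    "com.cnn.www": {"language:": "EN", "checksum:": "abc123", "contents:": "<html>..."},
}
groups = {"metadata": {"language", "checksum"}, "content": {"contents"}}
sstables = build_sstables(rows, groups)
```

A reader interested only in page metadata scans `sstables["metadata"]` and skips the much larger page contents.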
![Page 21: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/21.jpg)
Refinements
Compression
Clients can control whether compression is used on a locality
group
Many clients use a two-pass compression scheme
The first pass uses Bentley and McIlroy's scheme
![Page 22: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/22.jpg)
Refinements
Caching & Bloom Filters
Tablet servers use two levels of caching to improve read performance
Scan caching is useful for data that tends to be read repeatedly
Block caching is useful when reads tend to be close to recently read data
Bloom filters reduce disk seeks by allowing a reader to ask whether an SSTable might contain any data for a given row/column pair
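A Bloom filter answers "definitely not present" or "possibly present", which is why it can save a disk seek for most lookups of missing keys. The sketch below is a generic Bloom filter, not Bigtable's; the sizes, the SHA-256 based hashing, and the key format `row/column` are assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes positions by salting the key with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        """False means the key was never added; True means it may have been."""
        return all(self.bits[pos] for pos in self._positions(key))


sstable_filter = BloomFilter()
sstable_filter.add("com.cnn.www/contents:")
empty_filter = BloomFilter()
```

A lookup consults the filter first and reads the SSTable from disk only when `might_contain` returns `True`. False positives are possible, false negatives are not.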
![Page 23: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/23.jpg)
Refinements
Speeding Up Tablet Recovery
When a tablet is moved to another tablet server:
A minor compaction is performed
The tablet server stops serving the tablet
Another minor compaction is performed (usually fast)
Then the tablet is moved without requiring any log entry recovery
![Page 24: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/24.jpg)
Refinements
Exploiting Immutability
Because SSTables are immutable, various parts of the Bigtable system have been simplified:
no synchronization of file system accesses is needed when reading from SSTables
permanently removing deleted data is handled entirely through garbage collection
splitting tablets is efficient because child tablets can share the SSTables of the parent tablet
![Page 25: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/25.jpg)
Performance Evaluation
Google set up a Bigtable cluster with N tablet servers to measure performance and scalability as N is varied
tablet servers were configured to use 1 GB of memory
each machine had two 400 GB IDE hard drives, two dual-core 2 GHz chips, and a single gigabit Ethernet link
N client machines generated the Bigtable load used for tests
Every machine ran a GFS server.
![Page 26: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/26.jpg)
Performance Evaluation
Single tablet-server performance
Number of 1000-byte values read/written per second, per tablet server; columns give the # of tablet servers

| Experiment | 1 | 50 | 250 | 500 |
|---|---|---|---|---|
| Random Reads | 1212 | 593 | 479 | 241 |
| Random Reads (mem) | 10811 | 8511 | 8000 | 6250 |
| Random Writes | 8850 | 3745 | 3425 | 2000 |
| Sequential Reads | 4425 | 2463 | 2625 | 2469 |
| Sequential Writes | 8547 | 3623 | 2451 | 1905 |
| Scans | 15385 | 10526 | 9524 | 7843 |
![Page 27: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/27.jpg)
Performance Evaluation
Scaling : Aggregate throughput increases by over a factor of
100 as the number of tablet servers is increased from 1 to 500.
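The scaling claim can be checked against the per-tablet-server rates reported for 1 and 500 servers: aggregate throughput at 500 servers is the per-server rate times 500, and the scaling factor is that aggregate divided by the single-server rate. The short calculation below uses only those reported figures; the variable names are illustrative.

```python
# (rate with 1 tablet server, rate per server with 500 tablet servers)
per_server = {
    "Random Reads": (1212, 241),
    "Random Reads (mem)": (10811, 6250),
    "Random Writes": (8850, 2000),
    "Sequential Reads": (4425, 2469),
    "Sequential Writes": (8547, 1905),
    "Scans": (15385, 7843),
}
N = 500

# Aggregate throughput at N servers relative to a single server.
speedup = {
    name: (at_500 * N) / at_1 for name, (at_1, at_500) in per_server.items()
}

for name, factor in speedup.items():
    print(f"{name}: aggregate throughput grows by about {factor:.0f}x")
```

With these rounded figures the factors range from roughly 100 (random reads from disk, which come out just under 100) to nearly 290 (random reads from memory). Per-server throughput drops as N grows, so none of the benchmarks reaches the ideal factor of 500.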
![Page 28: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/28.jpg)
Real Applications
As of August 2006:
388 non-test Bigtable clusters
about 24,500 tablet servers

| # of Tablet Servers | # of Clusters |
|---|---|
| 0 .. 19 | 259 |
| 20 .. 49 | 47 |
| 50 .. 99 | 20 |
| 100 .. 499 | 50 |
| > 500 | 12 |
![Page 29: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/29.jpg)
Real Applications
This table provides some data about a few of the tables currently in use
Table size (measured before compression) and # Cells indicate approximate sizes
![Page 30: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/30.jpg)
Real Applications
Google Analytics
Google Analytics is supported by 2 Bigtables
200 TB raw click table
20 TB summary table
![Page 31: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/31.jpg)
Real Applications
Google Earth
Google Earth is supported by 2 Bigtables
70 TB images table, compression turned off
500 GB index table
![Page 32: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/32.jpg)
Real Applications
Personalized Search
Personalized Search is supported by 1 Bigtable
one row per user id
separate column family for each type of action
![Page 33: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/33.jpg)
Lessons learned
Large distributed systems are vulnerable to many types of failures
memory and network corruption
hung machines
extended and asymmetric network partitions
bugs in other systems (e.g., Chubby)
overflow of GFS quotas
planned and unplanned hardware maintenance
To address the problems experienced
some protocols have been changed
some assumptions have been modified
![Page 34: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/34.jpg)
Lessons learned
It is important to delay adding new features until it is clear how the new features will be used
It is important to support proper system-level monitoring
it allowed detection and fixing of many issues
it also enables tracking clusters to answer common questions
![Page 35: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/35.jpg)
Related Work
The Boxwood project's goal is to provide infrastructure for
building higher-level services such as file systems or
databases
while the goal of Bigtable is to directly support client
applications that wish to store data
![Page 36: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/36.jpg)
Related Work
C-Store and Bigtable share many characteristics
shared-nothing architecture
two different data structures
however, these two systems differ significantly in their APIs and performance optimization
![Page 37: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/37.jpg)
Conclusions
Bigtable is a distributed system for storing structured data at Google
in production since April 2005
seven person-years to design and implement
more than 60 projects using it as of August 2006
users like the performance and high availability
Users can scale their applications' capacity by simply adding more machines to their system
![Page 38: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/38.jpg)
Conclusions
Google has begun deploying Bigtable as a service to product groups
Google has gained significant advantages by building its own storage solution
has control over implementation and infrastructure
can remove bottlenecks and inefficiencies as they arise
![Page 39: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/39.jpg)
Strengths
Implemented and usable
Optimization
Performance Evaluation
Used by > 60 Google products
![Page 40: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/40.jpg)
Weaknesses
Complexity
Chubby
Master
Network
![Page 41: Bigtable: A Distributed Storage System for Structured Data 1](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649edc5503460f94bed818/html5/thumbnails/41.jpg)
Thanks for your attention