compaction, compaction everywhere
DESCRIPTION
Compaction is a consequence of the Log-Structured Merge-Tree engine used by Cassandra. Starting with the SizeTieredCompactionStrategy, we added the LeveledCompactionStrategy and, recently, the DateTieredCompactionStrategy. Compaction has always required some care and feeding. In this talk Aaron Morton, Co-Founder and Principal Consultant at The Last Pickle, will discuss the different strategies, their options, and when to use them.
TRANSCRIPT
Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
SOUTH BAY CASSANDRA USERS NOVEMBER 2014
COMPACTION, COMPACTION, EVERYWHERE.
Aaron Morton @aaronmorton
Co-Founder & Principal Consultant
About The Last Pickle.
Work with clients to deliver and improve Apache Cassandra based solutions.
Apache Cassandra Committer, DataStax MVP, Apache
Usergrid Committer. Based in New Zealand & USA.
Compaction? STCS LCS DTCS
Compaction?
Because reasons.
No compaction?
Row fragmentation would result in dramatically increased
read latency.
No compaction?
Increased file count would increase memory usage.
No compaction?
Overwrites and deletions would result in wasted disk
space.
Compaction?
Yes.
Compaction?
Log Structured Merge Tree.
Compaction?
Creating new files when flushing to disk improves performance and reduces
complexity.
Compaction?
SSTable 1
foo: dishwasher (ts 10): tomato, purple (ts 10): cromulent
SSTable 2
foo: frink (ts 20): flayven, monkey (ts 10): embiggins
SSTable 3
SSTable 4
foo: dishwasher (ts 15): tomacco
SSTable 5
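The row fragments above can be merged the way a compaction would, sketched here in Python. This is a minimal illustration, assuming each SSTable is a plain dictionary and that the cell with the highest write timestamp wins; the names `merge_sstables` and `compacted` are hypothetical, and tombstones and timestamp ties are ignored.

```python
# Each SSTable maps row key -> {column name: (write timestamp, value)}.
# The data mirrors the slide; SSTables 3 and 5 hold no fragment of "foo".
sstable_1 = {"foo": {"dishwasher": (10, "tomato"), "purple": (10, "cromulent")}}
sstable_2 = {"foo": {"frink": (20, "flayven"), "monkey": (10, "embiggins")}}
sstable_4 = {"foo": {"dishwasher": (15, "tomacco")}}


def merge_sstables(*sstables):
    """Merge row fragments, keeping the newest cell for each column."""
    merged = {}
    for sstable in sstables:
        for row_key, columns in sstable.items():
            row = merged.setdefault(row_key, {})
            for name, (ts, value) in columns.items():
                # Keep the cell with the highest write timestamp.
                if name not in row or ts > row[name][0]:
                    row[name] = (ts, value)
    return merged


compacted = merge_sstables(sstable_1, sstable_2, sstable_4)
```

After the merge, a read of row foo touches one structure instead of three, and the overwritten dishwasher value from SSTable 1 no longer takes up space.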
Demo.
nodetool cfhistograms foo bar
SSTables per Read
1 sstables: 149
2 sstables: 62
3 sstables: 65
4 sstables: 50
5 sstables: 45
6 sstables: 44
7 sstables: 76
8 sstables: 72
10 sstables: 305
12 sstables: 390
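One way to read the histogram above is to summarise it, sketched here in Python. This assumes each bucket label is an exact SSTable count, which is a simplification; `summarize` and the `threshold` parameter are hypothetical names.

```python
# SSTables-per-read histogram copied from the demo output above:
# {sstables touched by a read: number of reads}
histogram = {1: 149, 2: 62, 3: 65, 4: 50, 5: 45, 6: 44, 7: 76, 8: 72, 10: 305, 12: 390}


def summarize(histogram, threshold):
    """Return (total reads, mean SSTables per read, fraction of reads >= threshold)."""
    total = sum(histogram.values())
    mean = sum(n * count for n, count in histogram.items()) / total
    heavy = sum(count for n, count in histogram.items() if n >= threshold) / total
    return total, mean, heavy


total_reads, mean_sstables, heavy_fraction = summarize(histogram, threshold=10)
```

With these numbers, a little over half of the reads touched 10 or more SSTables, which is the row fragmentation problem compaction is meant to reduce.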
Compaction? STCS LCS DTCS
SizeTieredCompactionStrategy
The first compaction strategy.
Group files of a similar size for compaction.
SizeTieredCompactionStrategy
Works well when data is written to initially and then
only read from.
STCS - After flush.
Tier 0 ( < 50 MB) Tier 1 ~125MB
STCS - Compaction Starts
Tier 0 ( < 50 MB) Tier 1 ~125MB
STCS - New SSTable
Tier 0 ( < 50 MB) Tier 1 ~125MB
STCS - Purge old SSTables
Tier 0 ( < 50 MB) Tier 1 ~125MB
STCS - Compaction Starts (again)
Tier 0 ( < 50 MB) Tier 1 ~125MB
STCS - Final State
Tier 0 ( < 50 MB) Tier 1 ~200MB Tier 2 ~800MB
STCS - min_sstable_size
Maximum size of SSTables in the “small” bucket.
Default 50 MB
STCS - bucket_low
Lower bound of the bucket size compared to the average size in the bucket.
Default 0.5
STCS - bucket_high
Upper bound of the bucket size compared to the average size in the bucket.
Default 1.5
STCS - cold_reads_to_omit
Maximum percentage of reads that the SSTables ignored by STCS may be responsible for.
Default 0
min_compaction_threshold
Compact buckets with at least this many SSTables.
Default 4
max_compaction_threshold
Compact no more than this many SSTables in a bucket.
Default 32
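How these options interact can be sketched in Python. This is a simplified illustration of size-tiered bucketing, not Cassandra's implementation: `stcs_buckets` and `compaction_candidates` are hypothetical names, the example sizes are made up, and cold_reads_to_omit is not modelled.

```python
MB = 1024 * 1024


def stcs_buckets(sizes, bucket_low=0.5, bucket_high=1.5, min_sstable_size=50 * MB):
    """Group SSTable sizes (in bytes) into buckets of similar size."""
    buckets = []  # each entry is [average size, [member sizes]]
    for size in sorted(sizes):
        for bucket in buckets:
            avg = bucket[0]
            # Within bucket_low..bucket_high of the bucket average.
            similar = avg * bucket_low < size < avg * bucket_high
            # Everything under min_sstable_size shares the "small" bucket.
            both_small = size < min_sstable_size and avg < min_sstable_size
            if similar or both_small:
                bucket[1].append(size)
                bucket[0] = sum(bucket[1]) / len(bucket[1])
                break
        else:
            buckets.append([size, [size]])
    return [members for _, members in buckets]


def compaction_candidates(buckets, min_compaction_threshold=4, max_compaction_threshold=32):
    """Keep buckets with enough SSTables, capped at the maximum per compaction."""
    return [b[:max_compaction_threshold] for b in buckets if len(b) >= min_compaction_threshold]


sizes = [10 * MB, 12 * MB, 20 * MB, 45 * MB, 120 * MB, 130 * MB, 800 * MB]
buckets = stcs_buckets(sizes)
candidates = compaction_candidates(buckets)
```

In this example the four files under 50 MB land in the small bucket and become the only compaction candidate; the two files around 125 MB share a bucket but wait until it holds four.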
Compaction? STCS LCS DTCS
LeveledCompactionStrategy
Based on LevelDB from the Chromium team.
http://leveldb.org/
LeveledCompactionStrategy
Works well with overwrites and tombstones.
Provides low read latency.
LeveledCompactionStrategy
“Uses twice the disk IO”
DataStax Blogs
“Leveled Compaction in Apache Cassandra”
“When to Use Leveled Compaction”
LCS - “It’s going to be all levels Jerry”
Level    Number of Files
0        Unlimited*
1        100
2        1000
3        10000
LCS in nodetool cfstats
Column Family: HappyPandaCF
SSTable count: 21
SSTables in each level: [1, 7, 13, 0, 0, 0, 0, 0, 0]

Column Family: SadPandaCF
SSTable count: 710
SSTables in each level: [1, 10, 117/100, 582, 0, 0, 0, 0, 0]
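The difference between the two column families can be spotted mechanically, sketched here in Python. This assumes the `count/limit` notation (as in `117/100`) marks a level holding more SSTables than its target; `parse_levels` and `levels_behind` are hypothetical helper names.

```python
def parse_levels(line):
    """Return a list of (level, sstable count, limit or None) tuples."""
    inside = line[line.index("[") + 1 : line.index("]")]
    levels = []
    for level, entry in enumerate(inside.split(",")):
        # "117/100" splits into count "117" and limit "100"; "7" has no limit.
        count, _, limit = entry.strip().partition("/")
        levels.append((level, int(count), int(limit) if limit else None))
    return levels


def levels_behind(line):
    """Return the levels that report more SSTables than their limit."""
    return [lvl for lvl, count, limit in parse_levels(line)
            if limit is not None and count > limit]


happy = "SSTables in each level: [1, 7, 13, 0, 0, 0, 0, 0, 0]"
sad = "SSTables in each level: [1, 10, 117/100, 582, 0, 0, 0, 0, 0]"
```

Here `levels_behind(happy)` is empty, while `levels_behind(sad)` reports level 2, a hint that leveled compaction is not keeping up for SadPandaCF.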
LCS - Starting out
level 0
LCS - New File in Level 1
level 0 level 1
LCS - Later, Compact L0 With Overlapping L1
level 0 level 1
LCS - Another File in L1
level 0 level 1
LCS - Level 1 Full, compact overlapping
level 0 level 1
LCS - New Files in Level 2
level 0 level 1 level 2
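The "compact with overlapping" step in the slides above can be sketched in Python. This is a minimal illustration, assuming each SSTable covers a closed token range; the `SSTable` tuple, the `overlapping` helper, and the token values are all hypothetical.

```python
from typing import List, NamedTuple


class SSTable(NamedTuple):
    name: str
    first_token: int
    last_token: int


def overlapping(candidate: SSTable, next_level: List[SSTable]) -> List[SSTable]:
    """Return the SSTables in the next level whose token range overlaps the candidate."""
    return [
        s for s in next_level
        if s.first_token <= candidate.last_token and candidate.first_token <= s.last_token
    ]


# One level 0 file and four non-overlapping level 1 files.
level_0 = SSTable("l0-a", 150, 420)
level_1 = [
    SSTable("l1-a", 0, 199),
    SSTable("l1-b", 200, 399),
    SSTable("l1-c", 400, 599),
    SSTable("l1-d", 600, 799),
]

to_compact = [level_0] + overlapping(level_0, level_1)
```

Only the level 1 files that share tokens with the level 0 file are rewritten; `l1-d` is left alone, which keeps the files within level 1 free of overlap.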
LCS - sstable_size_in_mb
Maximum* size of each SSTable at all levels.
Default 160 MB
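Setting this option could look like the following Python sketch, which only builds a CQL statement as a string. The helper name `lcs_alter_statement` is hypothetical, and the keyspace `foo` and table `bar` are borrowed from the nodetool demo earlier as placeholders.

```python
def lcs_alter_statement(keyspace, table, sstable_size_in_mb=160):
    """Build an ALTER TABLE statement that switches a table to LCS."""
    options = {
        "class": "LeveledCompactionStrategy",
        "sstable_size_in_mb": str(sstable_size_in_mb),
    }
    body = ", ".join(f"'{key}': '{value}'" for key, value in options.items())
    return f"ALTER TABLE {keyspace}.{table} WITH compaction = {{{body}}};"


statement = lcs_alter_statement("foo", "bar")
```

The resulting string can be pasted into cqlsh. Changing the strategy on an existing table typically triggers a lot of compaction work, so it is worth trying on one node or a test cluster first.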
Compaction? STCS LCS DTCS
DateTieredCompactionStrategy
CASSANDRA-6602
In 2.0.11 and 2.1.1
“Experimental”
DTCS - Compact Newest Time Bucket
4 hours    20 hours    80 hours    > 365 days
DTCS - New File in First Bucket
4 hours    20 hours    80 hours    > 365 days
DTCS - Promoted to Later Bucket
4 hours    20 hours    80 hours    > 365 days
DTCS - base_time_seconds
Target size.
Multiplied by min_sstable_size.
DTCS - timestamp_resolution
What TimeUnit you are using for your WriteTime.
DTCS - max_sstable_age_days
Do not compact SSTables where the youngest WRITETIME is older than this.
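The way timestamp_resolution and max_sstable_age_days work together can be sketched in Python. This is an illustration only: `too_old_to_compact` is a hypothetical name, and the defaults used here (365 days, microsecond write times) are assumptions rather than values stated on the slides.

```python
SECONDS_PER_DAY = 86400

# How many timestamp units make up one second for each resolution.
UNITS_PER_SECOND = {"SECONDS": 1, "MILLISECONDS": 10**3, "MICROSECONDS": 10**6}


def too_old_to_compact(youngest_writetime, now_seconds,
                       max_sstable_age_days=365,
                       timestamp_resolution="MICROSECONDS"):
    """True when the youngest write in an SSTable is older than the age limit."""
    youngest_seconds = youngest_writetime / UNITS_PER_SECOND[timestamp_resolution]
    age_days = (now_seconds - youngest_seconds) / SECONDS_PER_DAY
    return age_days > max_sstable_age_days


now = 1_416_000_000  # a moment in November 2014, in seconds since the epoch
old_sstable = (now - 400 * SECONDS_PER_DAY) * 10**6    # youngest write 400 days ago
recent_sstable = (now - 10 * SECONDS_PER_DAY) * 10**6  # youngest write 10 days ago
```

If the client writes timestamps in milliseconds while the strategy assumes microseconds, the computed ages are wrong by a factor of 1000, which is why timestamp_resolution has to match the WRITETIME unit in use.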
Thanks.
Aaron Morton @aaronmorton
Co-Founder & Principal Consultant
www.thelastpickle.com