Getting InnoDB compression ready for Facebook scale
![Page 1: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/1.jpg)
InnoDB Compression: Getting it ready for Facebook scale
Nizam Ordulu, Engineer, database engineering @Facebook, 4/11/12
![Page 2: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/2.jpg)
Why use compression
![Page 3: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/3.jpg)
Why use compression
▪ Save disk space.
▪ Buy fewer servers.
▪ Buy better disks (SSD) without too much increase in cost.
▪ Reduce IOPS.
![Page 4: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/4.jpg)
Database Size
![Page 5: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/5.jpg)
IOPS
![Page 6: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/6.jpg)
Sysbench Benchmarks
![Page 7: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/7.jpg)
Sysbench: Default table schema for sysbench
CREATE TABLE `sbtest` (
`id` int(10) unsigned NOT NULL auto_increment,
`k` int(10) unsigned NOT NULL default '0',
`c` char(120) NOT NULL default '',
`pad` char(60) NOT NULL default '',
PRIMARY KEY (`id`),
KEY `k` (`k`)
);
![Page 8: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/8.jpg)
In-memory benchmark: Configuration
▪ Buffer pool size = 1G.
▪ 16 tables.
▪ 250K rows on each table.
▪ Uncompressed db size = 1.1G.
▪ Compressed db size = 600M.
▪ In-memory benchmark.
▪ 16 threads.
![Page 9: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/9.jpg)
In-memory benchmark: Load time
[Bar chart: load time in seconds (axis 0 to 80) for mysql-uncompressed, mysql-compressed, fb-mysql-uncompressed, and fb-mysql-compressed.]
![Page 10: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/10.jpg)
In-memory benchmark: Database size after load
[Bar chart: database size in M (axis 0 to 1200) for mysql-uncompressed, mysql-compressed, fb-mysql-uncompressed, and fb-mysql-compressed.]
![Page 11: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/11.jpg)
In-memory benchmark: Transactions per second for reads (oltp.lua, read-only)
[Bar chart: read-only transactions per second (axis 0 to 8000) for mysql-uncompressed, mysql-compressed, fb-mysql-uncompressed, and fb-mysql-compressed.]
![Page 12: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/12.jpg)
In-memory benchmark: Inserts per second (insert.lua)
[Bar chart: inserts per second (axis 0 to 60000) for mysql-uncompressed, mysql-compressed, fb-mysql-uncompressed, and fb-mysql-compressed. The fb-mysql-compressed bar is labeled 4X.]
![Page 13: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/13.jpg)
IO-bound benchmark for inserts: Inserts per second (insert.lua)
[Bar chart: inserts per second (axis 0 to 60000) for mysql-uncompressed, mysql-compressed, fb-mysql-uncompressed, and fb-mysql-compressed. The fb-mysql-compressed bar is labeled 3.8X.]
![Page 14: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/14.jpg)
InnoDB Compression
![Page 15: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/15.jpg)
InnoDB Compression: Basics
▪ 16K pages are compressed to 1K, 2K, 4K, or 8K blocks.
▪ Block size is specified during table creation.
▪ 8K is safest if data is not too compressible.
▪ Blobs and varchars increase compressibility.
▪ In-memory workloads may require larger buffer pool.
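A rough way to see why the block size choice depends on the data is to compress a 16K buffer with zlib and check which of the listed block sizes it would fit. This is only a sketch: the helper name `smallest_block_kb` is hypothetical, and a real InnoDB compressed page also holds headers, a page directory, and the modification log, which this ignores.

```python
import os
import zlib

PAGE_SIZE = 16 * 1024  # uncompressed page size from the slide


def smallest_block_kb(page, candidates=(1, 2, 4, 8)):
    """Return the smallest candidate block size (in KB) that can hold the
    zlib-compressed page, or None if it fits in none of them."""
    compressed_len = len(zlib.compress(page))
    for kb in candidates:
        if compressed_len <= kb * 1024:
            return kb
    return None


# A very repetitive page compresses into the smallest block.
repetitive_page = b"facebook" * (PAGE_SIZE // 8)
# Random bytes are effectively incompressible and fit in no block size.
random_page = os.urandom(PAGE_SIZE)

print(smallest_block_kb(repetitive_page))
print(smallest_block_kb(random_page))
```

Real table data sits between these two extremes, which is why 8K is the conservative choice when compressibility is unknown.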
![Page 16: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/16.jpg)
InnoDB Compression: Example
CREATE TABLE `sbtest1` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`k` int(10) unsigned NOT NULL DEFAULT '0',
`c` char(120) NOT NULL DEFAULT '',
`pad` char(60) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `k_1` (`k`)
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;
![Page 17: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/17.jpg)
InnoDB Compression: Page Modification Log (mlog)
▪ InnoDB does not recompress a page on every update.
▪ Updates are appended to the modification log.
▪ The mlog is located at the bottom of the compressed page.
▪ When the mlog is full, the page is recompressed.
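The append-then-recompress behavior can be illustrated with a toy model. This is not InnoDB's page format: the class `CompressedPage`, its fields, and the record layout are all hypothetical, and a real server would split the page on a compression failure instead of raising.

```python
import zlib


class CompressedPage:
    """Toy model: a fixed-size block holding a zlib stream plus an
    append-only modification log in the remaining free space."""

    def __init__(self, block_size, records=()):
        self.block_size = block_size
        self.records = list(records)  # uncompressed copy of the rows
        self.mlog = []                # updates since the last compression
        self.compress_ops = 0
        self._recompress()

    def _recompress(self):
        # Fold every record into a fresh zlib stream and empty the mlog.
        self.compressed = zlib.compress(b"".join(self.records))
        self.compress_ops += 1
        if len(self.compressed) > self.block_size:
            raise ValueError("compression failure: data does not fit")
        self.mlog = []

    def mlog_free(self):
        used = len(self.compressed) + sum(len(r) for r in self.mlog)
        return self.block_size - used

    def insert(self, record):
        self.records.append(record)
        if len(record) <= self.mlog_free():
            self.mlog.append(record)  # cheap path: no recompression
        else:
            self._recompress()        # mlog is full: recompress the page


page = CompressedPage(block_size=8 * 1024)
for i in range(200):
    page.insert(f"row-{i:04d}-".encode() + b"x" * 50)

# 200 inserts trigger far fewer than 200 compression operations.
print(page.compress_ops, len(page.records))
```

The point of the design is visible in the counter: most inserts only append to the log, and the expensive compression runs only when the log space runs out.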
![Page 18: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/18.jpg)
InnoDB Compression: Page Modification Log Example
![Page 19: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/19.jpg)
InnoDB Compression: Page Modification Log Example
![Page 20: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/20.jpg)
InnoDB Compression: Page Modification Log Example
![Page 21: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/21.jpg)
InnoDB Compression: Page Modification Log Example
![Page 22: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/22.jpg)
InnoDB Compression: Compression failures are bad
▪ Compression failures:
▪ waste CPU cycles,
▪ cause mutex contention.
![Page 23: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/23.jpg)
InnoDB Compression: Unzip LRU
▪ A compressed block is decompressed when it is read.
▪ Both the compressed and the uncompressed copy are kept in memory.
▪ Any update on the page is applied to both copies.
▪ When it is time to evict a page:
▪ Evict an uncompressed copy if the system is IO-bound.
▪ Evict a page from the normal LRU if the system is CPU-bound.
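The two eviction choices above can be sketched with two ordered maps. This is a simplified illustration: `ToyBufferPool` and its methods are hypothetical names, and the `io_bound` flag is passed in by the caller because the slide does not describe how the server decides whether it is IO-bound or CPU-bound.

```python
import zlib
from collections import OrderedDict


class ToyBufferPool:
    def __init__(self):
        self.lru = OrderedDict()        # page_id -> compressed copy, oldest first
        self.unzip_lru = OrderedDict()  # page_id -> uncompressed copy, oldest first

    def read(self, page_id, compressed):
        """Keep the compressed copy and make sure an uncompressed copy exists."""
        self.lru[page_id] = compressed
        self.lru.move_to_end(page_id)
        if page_id not in self.unzip_lru:
            self.unzip_lru[page_id] = zlib.decompress(compressed)
        self.unzip_lru.move_to_end(page_id)
        return self.unzip_lru[page_id]

    def evict(self, io_bound):
        if io_bound and self.unzip_lru:
            # IO-bound: drop only the uncompressed copy, so the compressed
            # copy stays cached and a later read costs CPU, not disk IO.
            page_id, _ = self.unzip_lru.popitem(last=False)
            return ("uncompressed copy", page_id)
        # CPU-bound: drop the oldest page entirely (both copies).
        page_id, _ = self.lru.popitem(last=False)
        self.unzip_lru.pop(page_id, None)
        return ("whole page", page_id)


pool = ToyBufferPool()
pool.read(1, zlib.compress(b"a" * 100))
pool.read(2, zlib.compress(b"b" * 100))
print(pool.evict(io_bound=True))   # ('uncompressed copy', 1)
print(pool.evict(io_bound=False))  # ('whole page', 1)
```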
![Page 24: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/24.jpg)
InnoDB Compression: Compressed pages written to redo log
▪ Compressed pages are written to redo log.
▪ Reasons for doing this:
▪ Reuse redo logs even if the zlib version changes.
▪ Protect against nondeterminism in compression.
▪ Cost: increased redo log writes.
▪ Cost: increased checkpoint frequency.
![Page 25: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/25.jpg)
InnoDB Compression: Official advice on tuning compression
If the number of “successful” compression operations (COMPRESS_OPS_OK) is a high percentage of the total number of compression operations (COMPRESS_OPS), then the system is likely performing well. If the ratio is low, then InnoDB is reorganizing, recompressing, and splitting B-tree nodes more often than is desirable. In this case, avoid compressing some tables, or increase KEY_BLOCK_SIZE for some of the compressed tables. You might turn off compression for tables that cause the number of “compression failures” in your application to be more than 1% or 2% of the total. (Such a failure ratio might be acceptable during a temporary operation such as a data load).
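The ratio described in this advice is simple arithmetic on the two counters. The sketch below assumes the counters have already been read from the server; the function names and the 2% default threshold (taken from the "1% or 2%" wording above) are illustrative choices, not part of any MySQL API.

```python
def compression_failure_pct(compress_ops, compress_ops_ok):
    """Percentage of compression operations that failed."""
    if compress_ops == 0:
        return 0.0
    return 100.0 * (compress_ops - compress_ops_ok) / compress_ops


def advice(compress_ops, compress_ops_ok, threshold_pct=2.0):
    """Apply the rule of thumb quoted above to one table's counters."""
    pct = compression_failure_pct(compress_ops, compress_ops_ok)
    if pct > threshold_pct:
        return "consider a larger KEY_BLOCK_SIZE or no compression"
    return "compression is likely performing well"


print(compression_failure_pct(1000, 990))  # 1.0
print(advice(1000, 990))
print(advice(1000, 590))  # 41% failures
```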
![Page 26: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/26.jpg)
Facebook Improvements
![Page 27: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/27.jpg)
Facebook Improvements: Finding bugs and testing new features
▪ Expanded mtr test suite with crash-recovery and stress tests.
▪ Simulated compression failures.
▪ Fixed the bugs revealed by the tests and production servers.
![Page 28: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/28.jpg)
Facebook Improvements: Table-level compression statistics
▪ Added the following columns to table_statistics:
▪ COMPRESS_OPS,
▪ COMPRESS_OPS_OK,
▪ COMPRESS_USECS,
▪ UNCOMPRESS_OPS,
▪ UNCOMPRESS_USECS.
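With per-table time counters available, the average cost of one operation follows directly. The sketch below assumes one row of these five columns has been fetched for a table; the class and method names are hypothetical and the sample numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class TableCompressionStats:
    # Field names mirror the columns listed above.
    compress_ops: int
    compress_ops_ok: int
    compress_usecs: int
    uncompress_ops: int
    uncompress_usecs: int

    def avg_compress_usecs(self):
        """Mean microseconds spent per compression operation."""
        if self.compress_ops == 0:
            return 0.0
        return self.compress_usecs / self.compress_ops

    def avg_uncompress_usecs(self):
        """Mean microseconds spent per decompression operation."""
        if self.uncompress_ops == 0:
            return 0.0
        return self.uncompress_usecs / self.uncompress_ops


# Illustrative numbers only, not measurements from the talk.
stats = TableCompressionStats(1000, 950, 500_000, 4000, 200_000)
print(stats.avg_compress_usecs(), stats.avg_uncompress_usecs())
```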
![Page 29: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/29.jpg)
Facebook Improvements: Removal of compressed pages from redo log
▪ Removed compressed page images from redo log.
▪ Introduced a new log record for compression.
![Page 30: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/30.jpg)
Facebook Improvements: Adaptive padding
▪ Put less data on each page to prevent compression failures.
▪ pad = 16K – (maximum data size allowed on the uncompressed copy)
![Page 31: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/31.jpg)
Facebook Improvements: Adaptive padding
![Page 32: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/32.jpg)
Facebook Improvements: Adaptive padding
![Page 33: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/33.jpg)
Facebook Improvements: Adaptive padding
▪ Algorithm to determine pad per table:
▪ Increase the pad until the compression failure rate reaches the specified level.
▪ Decrease padding if the failure rate is too low.
▪ Adapts to the compressibility of data over time.
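The feedback loop described above might look roughly like the following. Only the pad formula (pad = 16K minus the maximum data size) comes from the slides; the function names, the target and low-water failure rates, the step size, and the pad cap are all assumptions chosen for illustration.

```python
PAGE_SIZE = 16 * 1024


def max_data_size(pad):
    """Maximum bytes allowed on the uncompressed copy for a given pad."""
    return PAGE_SIZE - pad


def adjust_pad(pad, failures, ops, target_pct=5.0, low_pct=1.0,
               step=128, max_pad=PAGE_SIZE // 2):
    """Return the new pad for a table after one window of compression ops.

    Hypothetical policy: grow the pad while failures exceed the target,
    shrink it when failures are rare, otherwise leave it alone.
    """
    if ops == 0:
        return pad
    failure_pct = 100.0 * failures / ops
    if failure_pct > target_pct:
        return min(pad + step, max_pad)  # leave more free space per page
    if failure_pct < low_pct:
        return max(pad - step, 0)        # reclaim space when it is safe
    return pad


pad = 0
# Feed the same poorly compressible window repeatedly: the pad keeps growing.
for _ in range(3):
    pad = adjust_pad(pad, failures=41, ops=100)
print(pad, max_data_size(pad))  # 384 16000
```

Because the adjustment runs continuously, the pad follows the data: if rows become more compressible, the failure rate drops below the low-water mark and the pad shrinks again.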
![Page 34: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/34.jpg)
Facebook Improvements: Adaptive padding on insert benchmark
▪ Padding value for sbtable is 2432.
▪ Compression failure rate:
▪ mysql: 41%.
▪ fb-mysql: 5%.
[Bar chart: inserts per second (axis 0 to 35000) for mysql-compressed and fb-mysql-compressed.]
![Page 35: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/35.jpg)
Facebook Improvements: Compression ops in insert benchmark
[Bar chart: number of compression operations (axis 0 to 1400000), split into compress_ops_ok and compress_ops_fail, for mysql-compressed and fb-mysql-compressed.]
![Page 36: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/36.jpg)
Facebook Improvements: Time spent for compression ops in insert benchmark
[Bar chart: compress_time(s) and decompress_time(s) (axis 0 to 1200) for mysql-compressed and fb-mysql-compressed.]
![Page 37: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/37.jpg)
Facebook Improvements: Other improvements
▪ Reduced the amount of empty allocated pages from 10-15% to 2-5%.
▪ Cache memory allocations for:
▪ compression buffers,
▪ decompression buffers,
▪ buffer page descriptors.
▪ Hardware-accelerated checksum for compressed pages.
▪ Removed adler32 calls from zlib functions.
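The slides do not say how the adler32 calls were removed. One way to see what is being saved, using only stock zlib, is to compare the default zlib container (which appends an Adler-32 of the input) with a raw deflate stream (which carries no checksum), and to verify integrity with a separate checksum instead. In this sketch `zlib.crc32` is a stand-in for the page checksum; it is not the hardware-accelerated routine the slide refers to.

```python
import zlib


def deflate_zlib(data):
    """Default zlib container: 2-byte header, deflate data, 4-byte Adler-32."""
    return zlib.compress(data)


def deflate_raw(data):
    """Raw deflate stream: negative wbits means no header and no Adler-32."""
    compressor = zlib.compressobj(wbits=-15)
    return compressor.compress(data) + compressor.flush()


def inflate_raw(blob):
    return zlib.decompress(blob, wbits=-15)


data = b"innodb" * 1000
wrapped = deflate_zlib(data)
raw = deflate_raw(data)

# The container's trailer is exactly the Adler-32 of the uncompressed input.
print(wrapped[-4:] == zlib.adler32(data).to_bytes(4, "big"))  # True
# Dropping the header and trailer saves 6 bytes and the checksum pass.
print(len(wrapped) - len(raw))  # 6

# Integrity is then checked once, by a checksum kept outside the stream.
page_checksum = zlib.crc32(raw)
print(inflate_raw(raw) == data and zlib.crc32(raw) == page_checksum)  # True
```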
![Page 38: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/38.jpg)
Facebook Improvements: Future work
▪ Make page_zip_compress() more efficient.
▪ Test larger page sizes: 32K, 64K.
▪ Prefix compression.
▪ Other compression algorithms: Snappy, QuickLZ, etc.
▪ 3X compression in production.
![Page 39: Getting innodb compression_ready_for_facebook_scale](https://reader036.vdocuments.mx/reader036/viewer/2022062304/558e821e1a28ab8c528b457b/html5/thumbnails/39.jpg)
Questions