InnoDB: Status, Architecture, and
Latest Enhancements
OSCON Data, July 19, 2012
Calvin Sun, Sr. Manager, Oracle
The following is intended to outline our general product
direction. It is intended for information purposes only, and may
not be incorporated into any contract. It is not a commitment
to deliver any material, code, or functionality, and should not
be relied upon in making purchasing decisions.
The development, release, and timing of any features or
functionality described for Oracle’s products remains at the
sole discretion of Oracle.
Brief InnoDB History
Timeline, 1990 to 2010 (events marked on the slide):
- Development of MySQL started by Monty & David
- InnoDB started by Dr. Heikki Tuuri
- InnoDB first shipped with MySQL
- Oracle acquired Innobase Oy
- Sun Microsystems acquired MySQL
- Oracle acquired Sun Microsystems
Make MySQL a Better MySQL
Oracle’s commitment
- Be the technology leader for InnoDB/MySQL
- Community involvement
- Strategic, integrated product direction
- Align Engineering/Product teams
- The best technical support. Period.
InnoDB team now has 15+ developers in 9 countries
- Australia, Bulgaria, Canada, China, Finland, India, Japan, Sweden, US
Close collaboration with other MySQL engineering teams
- MRR/ICP, Online DDLs, Performance and scalability improvements, Performance schema, and more
Releases Since Last OSCON
MySQL 5.6.3, October 2011
- Improve LRU flushing
- Increase max size of redo log files
- Preload buffer pool
- Separate tablespace(s) for the InnoDB undo log
MySQL 5.6.4, December 2011
- InnoDB full-text search
- Support 4k, 8k page sizes
- Special handling of read-only transactions
MySQL 5.6.5, April 2012
MySQL 5.6.6, Summer 2012
- Online DDLs
- Direct access to InnoDB via Memcached
- Transportable tablespaces
- Tune InnoDB persistent statistics
- Tune adaptive flushing
- Improved neighbor flushing
MySQL Server
[Architecture diagram. Labels: CSV, Enterprise, Management Tools and Utilities, MySQL Enterprise Monitor, MySQL Query Analyzer, MySQL Workbench, Backup & Recovery, Security, Replication, Cluster, Partitioning, INFORMATION_SCHEMA, PERFORMANCE_SCHEMA]
InnoDB Design
Modeled on Gray & Reuter’s “Transaction Processing: Concepts & Techniques”
• Next key locking
Also emulated the Oracle architecture
• Multi-version concurrency control (MVCC)
• Undo info in the database, not the logs
• Tablespaces for data & index storage
Added unique subsystems/features
• Doublewrite buffer
• Change buffering
• Adaptive hash index
InnoDB Architecture: Runtime Model
Memory
- Log buffer
- Buffer pool
- Misc buffer
Buffer pool contents
- data
- index
- undo
- adaptive hash index
Background threads
- master
- read io
- write io
- ibuf io
- log io
- lock timeout
- monitor
- purge
- and more
Files
InnoDB Architecture: Database Files
System tablespace (ibdata files)
- internal data dictionary
- undo logs
- change buffer
- InnoDB tables, OR with innodb_file_per_table, separate .ibd files
MySQL data directory
- .frm files
- .ibd files
NoSQL to InnoDB via Memcached API
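[Diagram: two access paths from the application into one mysqld process]
- SQL path: Application, SQL, MySQL Server, Handler API, InnoDB Storage Engine
- Memcached path: Application, Memcached protocol, Memcached plugin, innodb_memcache with local cache (optional), InnoDB API, InnoDB Storage Engine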
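To make the Memcached path concrete, here is a minimal sketch of the memcached text-protocol framing a client would send to the plugin. The helper names are hypothetical and not part of any MySQL or memcached library. Which InnoDB table and columns a key maps to depends on the plugin's configuration, which is not shown.

```python
def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Frame a text-protocol 'set': header line, data block, trailing CRLF."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Frame a text-protocol 'get' for a single key."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(raw: bytes) -> dict:
    """Parse 'VALUE <key> <flags> <bytes>\\r\\n<data>\\r\\n' items up to 'END'."""
    items = {}
    pos = 0
    while True:
        eol = raw.index(b"\r\n", pos)
        line = raw[pos:eol]
        if line == b"END":
            return items
        _, key, _flags, nbytes = line.split(b" ")[:4]
        start = eol + 2
        size = int(nbytes)
        items[key.decode("ascii")] = raw[start:start + size]
        pos = start + size + 2  # skip the data block and its trailing CRLF
```

In practice these bytes would be written to a TCP socket on whatever host and port the plugin listens on (memcached's conventional default is 11211, assumed here), while the same rows remain reachable through SQL.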
MySQL 5.6 June – Sysbench OLTP_RW
[Chart: queries per second (axis 0 to 12000) versus threads (24, 32, 64, 96, 128); series: MySQL 5.6.5 and MySQL 5.6 June]
• Up to 2.8x higher performance
• Removal of false cacheline sharing
• New flushing algorithm + page cleaner thread
• Removal of LOCK_open
Test setup:
• Sysbench R/W
• 8 x Socket / 6-core Intel Xeon 7540, 2 GHz
• 512 GB RAM
• SSD
Performance and Scalability
IO
• Multi-threaded purge
• Cleaner thread to flush dirty pages
• Improve neighbor flushing, LRU flushing, and adaptive flushing
• Increase max size of redo log files
• Reduce contention for file extension
• Separate tablespace(s) for InnoDB undo log
Mutex
• Kernel mutex split
• Use rw_locks for page_hash
Performance and Scalability (2)
Optimizer
• MRR/ICP support
• InnoDB persistent statistics
Others
• Removal of false cacheline sharing
• Optimization for read-only transactions
• Improve thread scheduling
• Use hardware checksums (CRC32)
• Improve scalability with many partitions
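As an illustration of the checksum item, the sketch below computes CRC-32C (the Castagnoli polynomial) in plain software. The slide says only "CRC32"; that the hardware-assisted variant is CRC-32C is my understanding rather than something stated in the deck. The page layout here (payload followed by a 4-byte checksum) is hypothetical and is not InnoDB's actual page format.

```python
import struct


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (reflected polynomial 0x82F63B78); slow but dependency-free."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def seal_page(payload: bytes) -> bytes:
    """Append the checksum to a payload (hypothetical layout)."""
    return payload + struct.pack("<I", crc32c(payload))


def page_is_intact(page: bytes) -> bool:
    """Recompute the checksum and compare it with the stored one."""
    payload = page[:-4]
    (stored,) = struct.unpack("<I", page[-4:])
    return crc32c(payload) == stored
```

A hardware implementation produces the same values; the gain is in CPU cost per page, not in the result.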
Availability & Usability
Online DDLs
Transportable tablespaces
Configurable data dictionary cache
Dump/restore buffer pool for fast start up
CREATE TABLESPACE for file-per-table redirection
Monitoring & diagnostics
• Information schema metrics table
• Information schema for InnoDB system tables
• Information schema for InnoDB buffer pool
Transportable Tablespaces
A frequently requested feature since MySQL 4.1
Challenge: Resolving dependencies on…
- Change buffer
- Undo logs
- Crash recovery
- Data dictionary
Solutions:
- Make tablespaces ‘clean’ on export (FLUSH TABLES t1 FOR EXPORT;)
- Adjustments on import
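The export and import flow can be sketched as an ordered checklist. This is a sketch under assumptions: the function name is hypothetical, and the DISCARD TABLESPACE and IMPORT TABLESPACE statements and the .cfg metadata file come from my understanding of the MySQL 5.6 feature, not from the slide, which names only FLUSH TABLES ... FOR EXPORT.

```python
def transport_steps(table: str, create_table_sql: str) -> list:
    """Return (where, action) pairs for moving one file-per-table tablespace."""
    return [
        # Quiesce the table so its .ibd file is 'clean' on disk.
        ("source", f"FLUSH TABLES {table} FOR EXPORT;"),
        # Copy while the export lock is still held.
        ("shell", f"copy {table}.ibd and {table}.cfg out of the source database directory"),
        ("source", "UNLOCK TABLES;"),
        # The destination needs an identical table definition.
        ("destination", create_table_sql),
        ("destination", f"ALTER TABLE {table} DISCARD TABLESPACE;"),
        ("shell", f"place {table}.ibd and {table}.cfg in the destination database directory"),
        # Import performs the adjustments needed to attach the file.
        ("destination", f"ALTER TABLE {table} IMPORT TABLESPACE;"),
    ]
```

The ordering matters: the files are copied before UNLOCK TABLES, since the export lock is what keeps the tablespace clean while it is being copied.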
Online ALTER in MySQL 5.6
ADD INDEX
ADD/DROP COLUMN
ALTER ROW_FORMAT, KEY_BLOCK_SIZE
ALTER COLUMN NULLABLE, NOT_NULLABLE
RENAME COLUMN
RENAME TABLE
ADD/DROP FOREIGN KEY
CREATE PRIMARY KEY, rebuild clustered index
How does Add Index Evolve?
InnoDB 5.1 built-in: rebuilds the entire table, row by row, to create a new secondary index
Fast index creation in InnoDB 5.1 Plugin & 5.5: builds just the new indexes, not the entire table, but
- Does not allow concurrent writes to the table
- Does not copy history (old transactions must not use the created index even after it is completed)
Index creation is truly online in MySQL 5.6
New ALTER TABLE Syntax
ALTER TABLE ... , algorithm, concurrency
algorithm:
/* empty */
| ALGORITHM [=] DEFAULT
| ALGORITHM [=] INPLACE
| ALGORITHM [=] COPY
concurrency:
/* empty */
| LOCK [=] DEFAULT
| LOCK [=] NONE
| LOCK [=] SHARED
| LOCK [=] EXCLUSIVE
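As a small illustration of the grammar above, the following sketch composes an ALTER TABLE statement and rejects values outside the listed ALGORITHM and LOCK options. The helper name and the example table and index names are hypothetical.

```python
ALGORITHMS = {"DEFAULT", "INPLACE", "COPY"}
LOCKS = {"DEFAULT", "NONE", "SHARED", "EXCLUSIVE"}


def alter_table(table: str, action: str, algorithm: str = None, lock: str = None) -> str:
    """Build 'ALTER TABLE <table> <action>[, ALGORITHM=...][, LOCK=...];'."""
    parts = [action]
    if algorithm is not None:
        if algorithm.upper() not in ALGORITHMS:
            raise ValueError(f"unknown ALGORITHM: {algorithm}")
        parts.append(f"ALGORITHM={algorithm.upper()}")
    if lock is not None:
        if lock.upper() not in LOCKS:
            raise ValueError(f"unknown LOCK: {lock}")
        parts.append(f"LOCK={lock.upper()}")
    return f"ALTER TABLE {table} " + ", ".join(parts) + ";"


# Example: request an in-place index build that permits concurrent DML.
statement = alter_table("t1", "ADD INDEX idx_c1 (c1)", algorithm="INPLACE", lock="NONE")
```

Stating both clauses explicitly is useful because the server then reports an error if the operation cannot run that way, rather than quietly choosing a more restrictive mode.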
Online Add Index
Statement: CREATE INDEX index_name ON table_name
Check phase
- Check whether this online DDL is supported
- Lock: shared metadata lock
- Concurrent SELECT, DELETE, INSERT, UPDATE
Prepare phase
- Create log (files); logging starts
- System table (metadata) update
- Lock: metadata lock that blocks writes
- Concurrent reads only
Build phase
- Source: table (clustered index)
- Scan clustered index, extract index entries, sort/merge, index build
- DML logging; apply the log at the end of index build
- Lock: shared metadata lock
- Concurrent SELECT, DELETE, INSERT, UPDATE
Final phase
- Update internal structures
- Drop old table (if creating a primary key)
- Lock: exclusive metadata lock
- No concurrent DML allowed
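The core idea of the build phase, logging concurrent DML and applying that log once the index build finishes, can be shown with a toy model. This is a simplified sketch, not InnoDB's implementation: the table is a dict keyed by primary key, the index is a sorted list of (key, primary key) pairs, and all names are hypothetical.

```python
import bisect


def build_index_online(table: dict, column: str, concurrent_dml: list) -> list:
    """Build a secondary index while DML keeps arriving.

    table: primary key -> row dict (stands in for the clustered index).
    concurrent_dml: ("insert", pk, row) or ("delete", pk, None) operations
    that arrive during the build. An update is modeled as delete + insert.
    """
    # Scan the clustered index as of build start, extract entries, sort.
    snapshot = dict(table)
    entries = sorted((row[column], pk) for pk, row in snapshot.items())

    # Concurrent DML changes the table immediately and is recorded in a log.
    row_log = []
    for op, pk, row in concurrent_dml:
        if op == "insert":
            table[pk] = row
        elif op == "delete":
            row = table.pop(pk)
        else:
            raise ValueError(f"unsupported operation: {op}")
        row_log.append((op, row[column], pk))

    # Apply the log at the end so the new index catches up with the table.
    for op, key, pk in row_log:
        if op == "insert":
            bisect.insort(entries, (key, pk))
        else:
            entries.remove((key, pk))
    return entries
```

After the log is applied, the index matches what a fresh build over the current table contents would produce, which is why only the short final phase needs to block DML.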
Online ALTER vs. OSC Scripts
Inside server vs. external scripts
No need to copy entire table for some operations
Allow tight integration and better optimization
Better control and monitoring
Does not rely on replication, trigger, etc.
More New Features
InnoDB full-text search
Direct access to InnoDB via Memcached
Support 4k, 8k page sizes
InnoDB Full-Text Search
InnoDB full-text index as an inverted index
Support all query types supported by MyISAM:
• Natural language search
• Query expansion
• Boolean search
Plus:
• Proximity search: a special case of boolean search
• Create full-text index with parallel tokenization and sorting
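To show what "inverted index" means here, and why proximity search is a special case of boolean search, this is a toy sketch in plain Python. It is not InnoDB's on-disk structure. The tokenizer, function names, and the choice to measure distance in token positions are assumptions made for illustration, and stopwords and minimum token length are ignored.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list:
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs: dict) -> dict:
    """Map each word to {doc_id: [positions]} instead of doc -> words."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(tokenize(text)):
            index[word].setdefault(doc_id, []).append(pos)
    return index


def boolean_and(index: dict, words: list) -> set:
    """Documents containing every word (boolean '+w1 +w2')."""
    sets = [set(index.get(w, {})) for w in words]
    return set.intersection(*sets) if sets else set()


def proximity(index: dict, w1: str, w2: str, max_distance: int) -> set:
    """Boolean AND, then keep documents where the words occur close together."""
    hits = set()
    for doc_id in boolean_and(index, [w1, w2]):
        if any(abs(p1 - p2) <= max_distance
               for p1 in index[w1][doc_id]
               for p2 in index[w2][doc_id]):
            hits.add(doc_id)
    return hits
```

Because the index stores positions as well as document ids, proximity needs no extra pass over the documents: it filters the boolean result using data the index already holds.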
Parallel FTS Index Creation
Storage engine                      Time
MyISAM                              11 min 48 sec
InnoDB (default, pll_degree = 2)    7 min 25 sec
InnoDB, pll_degree = 4              5 min 35 sec
InnoDB, pll_degree = 8              4 min 9 sec
InnoDB, pll_degree = 16             3 min 40 sec
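The relative gains are easier to read as ratios. The short calculation below converts the timings above to seconds and computes each configuration's speedup over the MyISAM baseline; the dictionary keys are just labels chosen for this sketch.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    return minutes * 60 + seconds


# Index creation times from the table above.
timings = {
    "MyISAM": to_seconds(11, 48),
    "InnoDB pll_degree=2": to_seconds(7, 25),
    "InnoDB pll_degree=4": to_seconds(5, 35),
    "InnoDB pll_degree=8": to_seconds(4, 9),
    "InnoDB pll_degree=16": to_seconds(3, 40),
}


def speedup_over(baseline: str, data: dict) -> dict:
    """Speedup of each entry relative to the baseline, rounded to 2 places."""
    base = data[baseline]
    return {name: round(base / secs, 2) for name, secs in data.items()}


speedups = speedup_over("MyISAM", timings)
```

Going from degree 2 to degree 16 multiplies the parallelism by eight but only about doubles the speed (445 s down to 220 s), so the returns diminish as the degree rises.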
InnoDB Full-Text Search
References
• http://www.drdobbs.com/database/full-text-search-with-innodb/231902587
• http://blogs.innodb.com/wp/2011/07/innodb-full-text-search-tutorial/
• http://blogs.innodb.com/wp/2011/07/overview-and-getting-started-with-innodb-fts/
InnoDB Roadmap
Performance and Scalability
• Reduce lock contentions: index->lock contention;
btr_search_latch
• Optimize in-memory workloads
• Improve bulk insert performance
• Additional information schema tables and performance statistics
InnoDB Roadmap
Improve table compression
Support GIS indexing
Global data dictionary
Continue improvements of online DDLs
Native partitioning in InnoDB
Improve InnoDB reliability
More NoSQL access to InnoDB
InnoDB Roadmap
Optimized for SSD
• Flexible tablespaces management
• Make InnoDB TRIM friendly
• Additional tuning
• Easy to configure
References
MySQL Developer Zone
- http://dev.mysql.com/
MySQL Glossary
- http://dev.mysql.com/doc/refman/5.6/en/glossary.html
The State of the Dolphin
- https://blogs.oracle.com/MySQL/
Transactions on InnoDB
- http://blogs.innodb.com/
DimitriK's (dim) Weblog
- http://dimitrik.free.fr/blog/index.html
Thanks for attending!
MySQL Connect http://www.oracle.com/mysqlconnect/
September 29-30, San Francisco