MySQL usage of web applications from 1 user to 100 million
Peter Boros
RAMP conference 2013
www.percona.com
Why MySQL?
● It's easy to start small, basic installation well under 15 minutes.
● Very popular, supported by a lot of frameworks.
● Development focus is “under the engine cover”, not necessarily SQL level features.
● Keeps up with performance requirements, very large deployments exist.
Agenda
● The architecture, which is the best for every application (the silver bullet)
● Backups
● Online schema changes
● High availability and scalability
● Sharding basics
Backups: early days
● mysqldump
  ● Single-threaded logical backup
● mydumper
  ● Table-level parallelism
● Easy to implement; logical backups are easy to manage, but restore takes a lot of time.
● Capable of point-in-time recovery
Logical backup tuning
● Parallel backup
  ● Table level
  ● Parallel SELECT INTO OUTFILE
● Restore tables with only primary keys, add secondary keys later
  ● In case of MyISAM, disable indexes
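Table-level parallelism can be sketched in Python roughly as follows. This is a minimal sketch, not how mydumper works internally: the function names, the output file naming and the worker count are hypothetical, and each `mysqldump --single-transaction` process opens its own transaction, so the tables are not guaranteed to be consistent with each other.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def dump_command(database, table):
    """Build a mysqldump command line that dumps a single table."""
    return ["mysqldump", "--single-transaction", database, table]


def run_dump(cmd, out_path):
    """Run one dump and write its output to out_path (needs a reachable MySQL server)."""
    with open(out_path, "wb") as out:
        subprocess.run(cmd, stdout=out, check=True)
    return out_path


def parallel_dump(database, tables, workers=4, runner=run_dump):
    """Dump each table in its own worker; returns the output paths in table order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(runner, dump_command(database, t), f"{database}.{t}.sql")
            for t in tables
        ]
        return [f.result() for f in futures]
```

Calling `parallel_dump("sbtest", ["sbtest1", "sbtest2"])` would run two dumps concurrently; the `runner` parameter exists only so the scheduling logic can be exercised without a database.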
Binary backups
● When logical backup takes too long
  ● “My daily backup takes almost 30 hours”
● When mean time to restore matters
● Snapshot based backups
  ● Volume manager, intelligent storage, EBS
● Percona XtraBackup
Copy on write snapshot internals
O O O O O
O O O O O
F F F F F
F F F F F
lvcreate -s ...
O: original block
F: free block
Initially snapshot is empty and it doesn't take up disk space
LVM snapshot internals
O → C O O O O
O O O O O
F → O F F F F
F F F F F
lvcreate -s ...
O: original block
F: free block
C: changed block
Before a block changes, its original content (O) is copied to a free block in the snapshot
The changed content (C) is then written to the original block
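The copy-on-write behaviour in the two diagrams can be modelled with a few lines of Python. This is a toy model under stated assumptions, not LVM's on-disk format: the class and variable names are hypothetical, and all writes to the volume are routed through the snapshot object so it can save the original block first.

```python
class CowSnapshot:
    """Toy copy-on-write snapshot over a list of blocks."""

    def __init__(self, origin):
        self.origin = origin      # live volume, modified in place
        self.saved = {}           # block index -> original content, filled lazily

    def write(self, index, data):
        # First write to a block after the snapshot: preserve the original.
        if index not in self.saved:
            self.saved[index] = self.origin[index]
        self.origin[index] = data

    def read_snapshot(self, index):
        # Unchanged blocks are still read from the origin volume.
        return self.saved.get(index, self.origin[index])


volume = ["O0", "O1", "O2", "O3"]
snap = CowSnapshot(volume)          # snapshot starts empty
snap.write(0, "C0")                 # O0 is copied aside, then C0 is written
snapshot_view = [snap.read_snapshot(i) for i in range(len(volume))]
```

In this model the snapshot only grows when a block is written for the first time, and that first write costs an extra copy.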
Snapshot performance
http://www.mysqlperformanceblog.com/2013/07/09/lvm-read-performance-during-snapshots/
XtraBackup
[Figure, three animation steps: pages in the datafile are copied into the backup while the redo log is copied continuously. Changes made during the backup appear in the redo log as "LSN x, C(1)" and "LSN y, C(2)". In the last step, apply-log rolls the backup forward. Legend: O: original page, C: changed page during backup, blue: copied by XtraBackup.]
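The idea in these diagrams (copy pages while they keep changing, copy the redo log alongside, then roll forward) can be illustrated with a small Python simulation. This is a simplified sketch, not XtraBackup's implementation: the data structures, LSN values and the `apply_log` function are hypothetical stand-ins, and real redo records do not carry whole page contents.

```python
def apply_log(backup_pages, redo_log):
    """Roll the copied pages forward using the copied redo records."""
    for lsn, page_no, content in sorted(redo_log):
        page_lsn, _ = backup_pages[page_no]
        if lsn > page_lsn:               # page copy is older than this change
            backup_pages[page_no] = (lsn, content)
    return backup_pages


# Live datafile: page number -> (LSN of last change, content).
datafile = {0: (0, "O"), 1: (0, "O"), 2: (0, "O"), 3: (0, "O")}
redo_log = []       # redo records copied during the backup: (lsn, page, content)
backup = {}

backup[0] = datafile[0]                  # page 0 copied in its original state
datafile[0] = (10, "C1")                 # page 0 changes after it was copied
redo_log.append((10, 0, "C1"))
backup[1] = datafile[1]
backup[2] = datafile[2]
datafile[3] = (20, "C2")                 # page 3 changes before it is copied
redo_log.append((20, 3, "C2"))
backup[3] = datafile[3]                  # copied already containing the change

restored = apply_log(backup, redo_log)
```

Before the roll forward, the backup holds a mix of old and new pages; afterwards it matches the datafile as of the last copied redo record.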
Agenda
● The architecture, which is the best for every application (the silver bullet)
● Backups
● Online schema changes
● High availability and scalability
● Sharding basics
Early days
● Either no high availability (relying on backups), or basic high availability with shared storage or DRBD + pacemaker
● MySQL's built-in replication: high availability and scaling reads
Some other HA options
● Dual master with manual failover
● Replication managers
  ● MMM, MHA, PRM
● Percona XtraDB Cluster
Pacemaker + DRBD
● Storage block level replication
● Can be synchronous and asynchronous (using asynchronous here is not recommended)
● Write IO response time increased
● The standby server is cold
  ● MySQL is not running, the data storage is not mounted
● Crash recovery at non-graceful failover
Dual masters with manual failover
● Popular because of simplicity
● Database runs on both instances
  ● Database cache is cold on the standby instance
● If both instances are written, you have to deal with conflict resolution
  ● auto_increment_increment and auto_increment_offset don't resolve every conflict (think about UPDATEs)
● https://launchpad.net/my-vip-flip
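A short Python illustration of the auto_increment point above: with an increment of 2 and offsets 1 and 2, the two masters generate disjoint ids, yet concurrent UPDATEs of the same row can still leave them with different data. This is a sketch under assumptions: the row, the values and the helper names are made up, and the UPDATEs are modelled as absolute assignments replayed in arrival order.

```python
from itertools import islice


def auto_increment_ids(offset, increment):
    """Yield the ids a server generates with the given offset and increment."""
    value = offset
    while True:
        yield value
        value += increment


def apply_updates(start, updates):
    """Apply 'SET col = value' style updates in order; the last one wins."""
    value = start
    for new_value in updates:
        value = new_value
    return value


# Two masters: increment = 2, offsets 1 and 2 -> disjoint id sequences.
master_a_ids = list(islice(auto_increment_ids(1, 2), 5))
master_b_ids = list(islice(auto_increment_ids(2, 2), 5))

# Both masters update the same row at the same time, then replay each other.
final_on_a = apply_updates(100, [150, 80])   # local update, then B's
final_on_b = apply_updates(100, [80, 150])   # local update, then A's
```

The INSERT ids never overlap, but `final_on_a` and `final_on_b` end up different, which is the kind of conflict the offset settings cannot prevent.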
Cold cache issue
● If the buffer pool is cold on the standby side
  ● Server becomes IO bound
  ● It could mean that the database is unusable for a while
  ● This can increase failover time
● We solved this by continuously keeping the standby's cache warm
  ● http://www.mysqlperformanceblog.com/2013/04/16/is-your-mysql-buffer-pool-warm-make-it-sweat/
  ● https://archive.fosdem.org/2013/schedule/event/bp_hot_slave/
How we solved this @ Groupon
Workload warmed up with itself
Workload warmed up with the previous chunk
Replication managers
● Introducing an automation layer on top of replication
  ● Automating VIP switch (if VIP exists)
  ● Automating choosing the slave to promote
  ● Automating repositioning the other slaves
● Cold cache issue can make things much worse when failover is automated
  ● https://github.com/blog/1261-github-availability-this-week
Replication manager topology I.
● Used by MMM
● Dedicated standby master
Replication manager topology II.
● Used by PRM/MHA
● On master failure, the most recent slave is promoted to master
Replication and durability
● MySQL's built-in replication is asynchronous
● Binary logs are written to disk in an asynchronous fashion
  ● sync_binlog
● From MySQL 5.5: semi-sync replication guarantees that at least one slave has the data in the relay log when the client gets commit acknowledgement
http://www.mysqlperformanceblog.com/2012/06/14/comparing-percona-xtradb-cluster-with-semi-sync-replication-cross-wan/
Replication issues
● Replication is asynchronous, data can be different on the master and slaves
● A user with SUPER privilege can write to a database running with read_only=1
● Verifying integrity: pt-table-checksum
● Fixing replication: pt-table-sync
  ● Syncs based on pt-table-checksum's results table
● New pt-table-checksum in Percona Toolkit 2.x, uses resources in a smart way
Replication integrity check
[root@ps1 ~]# pt-table-checksum h=localhost,u=percona,p=percona --chunk-size=10000
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
07-10T09:03:07      0      0        0       1       0   0.278 mysql.columns_priv
07-10T09:03:08      0      0        2       1       0   0.275 mysql.db
07-10T09:03:08      0      0        0       1       0   0.291 mysql.event
07-10T09:03:08      0      0        0       1       0   0.281 mysql.func
07-10T09:03:09      0      0       40       1       0   0.278 mysql.help_category
07-10T09:03:09      0      0      467       1       0   0.279 mysql.help_keyword
07-10T09:03:09      0      0     1048       1       0   0.292 mysql.help_relation
07-10T09:03:09      0      0      510       1       0   0.287 mysql.help_topic
07-10T09:03:10      0      0        0       1       0   0.284 mysql.host
07-10T09:03:10      0      0        0       1       0   0.286 mysql.ndb_binlog_index
07-10T09:03:10      0      0        0       1       0   0.276 mysql.plugin
07-10T09:03:11      0      0        0       1       0   0.274 mysql.proc
07-10T09:03:11      0      0        0       1       0   0.281 mysql.procs_priv
07-10T09:03:11      0      0        2       1       0   0.289 mysql.proxies_priv
07-10T09:03:12      0      0        0       1       0   0.276 mysql.servers
07-10T09:03:12      0      0        0       1       0   0.291 mysql.tables_priv
07-10T09:03:12      0      0        0       1       0   0.293 mysql.time_zone
07-10T09:03:12      0      0        0       1       0   0.288 mysql.time_zone_leap_second
07-10T09:03:13      0      0        0       1       0   0.275 mysql.time_zone_name
07-10T09:03:13      0      0        0       1       0   0.297 mysql.time_zone_transition
07-10T09:03:13      0      0        0       1       0   0.282 mysql.time_zone_transition_type
07-10T09:03:14      0      0        7       1       0   0.282 mysql.user
07-10T09:04:40      0      0   100000      12       0   0.544 sbtest.sbtest1
Replication integrity check
mysql> update sbtest1 set c='nasty_update' order by rand() limit 1;
Query OK, 1 row affected, 2 warnings (0.11 sec)
Rows matched: 1  Changed: 1  Warnings: 2
[root@ps1 ~]# pt-table-checksum h=localhost,u=percona,p=percona --chunk-size=10000 --databases=sbtest
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
07-10T09:07:13      0      1   100000      12       0   0.554 sbtest.sbtest1
mysql> SELECT db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks
    -> FROM percona.checksums
    -> WHERE (
    ->  master_cnt <> this_cnt
    ->  OR master_crc <> this_crc
    ->  OR ISNULL(master_crc) <> ISNULL(this_crc))
    -> GROUP BY db, tbl;
+--------+---------+------------+--------+
| db     | tbl     | total_rows | chunks |
+--------+---------+------------+--------+
| sbtest | sbtest1 |      10000 |      1 |
+--------+---------+------------+--------+
1 row in set (0.00 sec)
Replication integrity check
mysql> select * from percona.checksums where tbl='sbtest1' and this_crc!=master_crc\G
*************************** 1. row ***************************
            db: sbtest
           tbl: sbtest1
         chunk: 2
    chunk_time: 0.017746
   chunk_index: PRIMARY
lower_boundary: 10001
upper_boundary: 20000
      this_crc: 897f461b
      this_cnt: 10000
    master_crc: 49b06b87
    master_cnt: 10000
            ts: 2013-07-10 09:07:12
1 row in set (0.00 sec)
mysql> select * into outfile '/tmp/sbtest1.master.tsv' from sbtest1 where id between 10001 and 20000;
Query OK, 10000 rows affected (0.02 sec)
mysql> select * into outfile '/tmp/sbtest1.slave.tsv' from sbtest1 where id between 10001 and 20000;
Query OK, 10000 rows affected (0.02 sec)
Replication integrity check
[root@ps1 tmp]# diff sbtest1.master.tsv sbtest1.slave.tsv
2881c2881
< 12881 49817 52549114044-71952811643-35917427238-45653512101-54649391075-62222650976-38499344639-32676735660-91138464428-11354436353 45479228511-64905225468-01443207380-44574500934-76313687534
---
> 12881 49817 nasty_update 45479228511-64905225468-01443207380-44574500934-76313687534
mysql> show create table sbtest1\G
*************************** 1. row ***************************
       Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `k` int(10) unsigned NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=100001 DEFAULT CHARSET=latin1 MAX_ROWS=1000000
1 row in set (0.00 sec)
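The chunk-based comparison shown in this walkthrough can be illustrated in Python. This is only a sketch of the idea: it checksums both copies in one process, whereas pt-table-checksum runs its checksum queries on the server; the CRC32 choice, the function names and the sample rows are assumptions made for the example.

```python
import zlib


def chunk_checksums(rows, chunk_size):
    """Return {(lower_id, upper_id): (crc, count)} for rows sorted by id."""
    result = {}
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        payload = "\n".join("\t".join(map(str, row)) for row in chunk)
        crc = zlib.crc32(payload.encode("utf-8"))
        result[(chunk[0][0], chunk[-1][0])] = (crc, len(chunk))
    return result


def differing_chunks(master_rows, slave_rows, chunk_size):
    """List chunk boundaries whose checksum or row count differs."""
    master = chunk_checksums(master_rows, chunk_size)
    slave = chunk_checksums(slave_rows, chunk_size)
    return [bounds for bounds in master if master[bounds] != slave.get(bounds)]


master_rows = [(i, f"value-{i}") for i in range(1, 31)]
slave_rows = list(master_rows)
slave_rows[12] = (13, "nasty_update")      # id 13 drifted on the slave
bad = differing_chunks(master_rows, slave_rows, chunk_size=10)
```

Only the chunk containing the drifted row is reported, so the row-by-row diff can be limited to that id range, as in the `between 10001 and 20000` queries above.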
Percona XtraDB Cluster
● Real parallel, synchronous replication
● Using write set replication
● Based on academic research, state of the art replication
● Galera: wsrep provider library made by Codership
● Percona XtraDB Cluster: Percona Server + Galera
PXC Single data center
● Any node is writable
● Application servers typically use some load balancer to connect
  ● No VIPs needed
● Each node sends each write set to all the other nodes in parallel
● Bidirectional arrows represent group communication
PXC virtually synchronous replication
Multi data center PXC
Real nodes in every DC
2 nodes + arbitrator
PXC common caveats
● InnoDB only
  ● MyISAM support is “experimental”
● On node failure, writes are stalled for the duration of the suspect timeout
● Conflicting transactions can be an issue when writing to multiple nodes
● Long running transactions are more painful than in the single node case
● Always test carefully with your workload!
Agenda
● The architecture, which is the best for every application (the silver bullet)
● Backups
● Online schema changes
● High availability and scalability
● Sharding basics
Possible methods
● Using replication, and altering boxes one by one
● Using pt-online-schema-change
● Using wsrep_osu_method='RSU' in PXC
  ● Built-in method in Percona XtraDB Cluster
Online schema changes using replication
● ALTER the standby server (SET sql_log_bin=0)
● ALTER all the slaves
● Flip to a slave or to the passive master or promote something which has the new schema manually
● Flip the former active master
Agenda
● The architecture, which is the best for every application (the silver bullet)
● Backups
● Online schema changes
● High availability and scalability
● Sharding basics
Sharding types
● Table level sharding
● Geographical sharding
● Static sharding
● Dynamic sharding
● Static and dynamic
Sharding pros and cons
✔ Write scalability
✗ No cross-shard transactions
✗ No cross-shard joins
✗ Application level logic to implement and maintain
Static sharding
● Simple hash / modulo function
● No central dictionary needed
● What if we have to add a node?
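The "what if we have to add a node?" question can be made concrete with a small Python experiment. This is a sketch under assumptions: CRC32 stands in for whatever hash function an application would use, and the key range and shard counts are made up for illustration.

```python
import zlib


def shard_for(key, shard_count):
    """Map a key to a shard with a stable hash and a modulo."""
    return zlib.crc32(str(key).encode("utf-8")) % shard_count


def moved_fraction(keys, old_count, new_count):
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(1 for k in keys if shard_for(k, old_count) != shard_for(k, new_count))
    return moved / len(keys)


user_ids = range(1, 10001)
fraction = moved_fraction(user_ids, old_count=4, new_count=5)
```

No dictionary is needed to find a key's shard, but with a plain modulo most keys change shards when going from 4 to 5 nodes (about 80% in expectation for a uniform hash), so adding a node means moving most of the data.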
Rebalancing shards
● Adding a new node
● Adding more powerful nodes
● Replacing old nodes with more powerful ones
● Change in usage patterns
Dynamic sharding
● Central dictionary service
● Knows where data should go and where data is
● Key-value store, easy to scale
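A central dictionary can be sketched as a key-to-shard map. This is a toy in-memory model, not a production design: the class, the shard names and the "emptiest shard" placement rule are hypothetical, and a real service would keep the map in a replicated key-value store.

```python
class ShardDirectory:
    """Toy central dictionary: key -> shard name, held in an in-memory dict."""

    def __init__(self, shards):
        self.shards = list(shards)
        self.location = {}        # where each known key currently lives

    def locate(self, key):
        """Return the shard for key, assigning new keys to the emptiest shard."""
        if key not in self.location:
            counts = {s: 0 for s in self.shards}
            for shard in self.location.values():
                counts[shard] += 1
            self.location[key] = min(self.shards, key=lambda s: counts[s])
        return self.location[key]

    def move(self, key, new_shard):
        """Record a rebalancing move; copying the data itself is out of scope."""
        if new_shard not in self.shards:
            self.shards.append(new_shard)
        self.location[key] = new_shard


directory = ShardDirectory(["shard1", "shard2"])
first = directory.locate("user:1")
second = directory.locate("user:2")
directory.move("user:1", "shard3")      # new node added, one user relocated
```

Unlike the modulo approach, adding `shard3` here relocates only the keys that are explicitly moved; the price is an extra lookup and a service that has to stay available.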
Should I shard?
● If you don't know for sure, no.
● A lot of big applications are operating without sharding.
● Really big applications are doing a mix of static and dynamic sharding.
http://37signals.com/svn/posts/1509-mr-moore-gets-to-punt-on-sharding
Help writes before sharding: archiving
● Archive less frequently accessed data
  ● More compact indexes
  ● Better cache utilization
  ● Faster writes
● A good archiving strategy based on data access patterns can postpone sharding
Help writes before sharding: partitioning
● Can be helpful if you have a single larger table
● A partitioned table is practically multiple physical tables under the hood
  ● Smaller tables -> smaller indexes -> faster writes
  ● Faster reads when the query filters by the partitioning function
  ● Slower reads when it does not
● It has its limitations as well
http://www.mysqlperformanceblog.com/2010/12/11/mysql-partitioning-can-save-you-or-kill-you/
Agenda
● The architecture, which is the best for every application (the silver bullet)
● Backups
● Online schema changes
● High availability and scalability
● Sharding basics
Silver bullet
Doesn't exist.
Q&A
Thanks for your attention.
See you next time