Migrating to XtraDB Cluster
DESCRIPTION
The presentation provides the steps to follow when migrating to XtraDB Cluster. Percona provides an in-depth review of your database and recommends appropriate changes by performing a complete MySQL health check, in which we identify inefficiencies, find problems before they occur, and ensure that your MySQL database is in the best condition.

TRANSCRIPT
Migrating to XtraDB Cluster
Jay Janssen, MySQL Consulting Lead
Percona Live University, Toronto
March 22nd, 2013
Overview of XtraDB Cluster
• Percona Server 5.5 + the Galera (Codership) synchronous replication add-on
• "Cluster of MySQL nodes"
  – have all the data, all the time
  – readable and writeable
• Established cluster:
  – synchronizes new nodes
  – handles node failures
  – handles node resync
  – split-brain protection (quorum)
Company Confidential December 2010
XtraDB Cluster FAQ
• Standard MySQL replication
  – into or out of the cluster
• Write scalable, to a point
  – all writes still hit all nodes
• LAN/WAN architectures
  – write latency is ~1 RTT
• MyISAM is experimental
  – big list of caveats
  – Galera is designed and built for InnoDB
What you really want to know
• Is it production worthy?
  – There are several production users of Galera/PXC
  – You should really evaluate your workload to see if it's a good fit for Galera/PXC
• What are the limitations of using Galera?
  – http://www.codership.com/wiki/doku.php?id=limitations
CONFIGURING XTRADB CLUSTER
Cluster Replication Config
• Configured via wsrep_provider_options
• Can run on a separate network from mysqld
• Default cluster replication port is 4567 (TCP)
  – supports multicast
  – supports SSL
• A starting node needs to know one cluster node's IP
  – you can list all the nodes you know, and it will find one that is a member of the cluster
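As a sketch of how the replication network can be pinned to a dedicated interface and secured with SSL: the option names below are real Galera provider options, but the addresses and certificate paths are placeholder assumptions, not values from this deck.

```
[mysqld]
# bind Galera group communication to the replication network and enable SSL
wsrep_provider_options="gmcast.listen_addr=tcp://192.168.70.2:4567; socket.ssl_cert=/etc/mysql/cert.pem; socket.ssl_key=/etc/mysql/key.pem"
```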
Other intra-cluster communication
• Happens outside of Galera replication (gcomm)
• SST
  – full state transfers
  – a donor is picked from the running cluster and gives a full backup to the joiner node
  – might be blocking (various methods allowed)
  – default: TCP 4444
• IST
  – incremental state transfers
  – default: wsrep port + 1 (TCP 4568)
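Putting the port list together, a firewall configuration sketch for one node (iptables syntax; the source subnet is a placeholder assumption):

```
# MySQL clients
iptables -A INPUT -s 192.168.70.0/24 -p tcp --dport 3306 -j ACCEPT
# Galera group communication (gcomm)
iptables -A INPUT -s 192.168.70.0/24 -p tcp --dport 4567 -j ACCEPT
# SST (full state transfer)
iptables -A INPUT -s 192.168.70.0/24 -p tcp --dport 4444 -j ACCEPT
# IST (incremental state transfer)
iptables -A INPUT -s 192.168.70.0/24 -p tcp --dport 4568 -j ACCEPT
```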
Essential Galera settings
• [mysqld]
  – wsrep_provider = /usr/lib64/libgalera_smm.so
  – wsrep_cluster_name - identify the cluster
  – wsrep_cluster_address - where to find the cluster
  – wsrep_node_address - tell Galera what IP to use for replication/SST/IST
  – wsrep_sst_method - how to synchronize nodes
  – binlog_format = ROW
  – innodb_autoinc_lock_mode = 2
  – innodb_locks_unsafe_for_binlog = 1 - performance
Other Galera Settings
• [mysqld]
  – wsrep_node_name - identify this node
  – wsrep_provider_options - cluster communication options
    • wsrep_provider_options="gcache.size=<gcache size>"
    • http://www.codership.com/wiki/doku.php?id=galera_parameters
  – wsrep_node_incoming_address=<node mysql IP>
  – wsrep_slave_threads - apply writesets in parallel
    • http://www.codership.com/wiki/doku.php?id=mysql_options_0.8
Example configuration

[mysqld]
datadir=/var/lib/mysql
binlog_format=ROW

wsrep_cluster_name=trimethylxanthine
wsrep_cluster_address=gcomm://192.168.70.2,192.168.70.3,192.168.70.4

# Only use this before the cluster is formed
# wsrep_cluster_address=gcomm://

wsrep_node_name=percona1
wsrep_node_address=192.168.70.2
wsrep_provider=/usr/lib64/libgalera_smm.so

wsrep_sst_method=xtrabackup
wsrep_sst_auth=backupuser:password

wsrep_slave_threads=2

innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2

innodb_buffer_pool_size=128M
innodb_log_file_size=64M
CONVERTING STANDALONE MYSQL TO XTRADB CLUSTER
Method 1 - Single Node
• Migrating a single server:
  – stop MySQL
  – replace the packages
  – add the essential Galera settings
  – start MySQL
• A stateless, peerless node will form its own cluster
  – if an empty cluster address is given (gcomm://)
• That node is the baseline data for the cluster
• Easiest from Percona Server 5.5
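The essential settings added before the restart might look like the fragment below (names and paths reuse this deck's example configuration; the empty gcomm:// is what makes the lone node bootstrap its own cluster):

```
[mysqld]
wsrep_provider=/usr/lib64/libgalera_smm.so
wsrep_cluster_name=trimethylxanthine
# empty address: this node forms a brand-new cluster by itself
wsrep_cluster_address=gcomm://
wsrep_sst_method=xtrabackup
binlog_format=ROW
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1
```

Once other nodes have joined, the address should be changed to list real cluster members so a restart of this node rejoins instead of re-bootstrapping.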
Method 2 - Blanket changeover
• All at once (with downtime):
  – stop all writes, then stop all nodes after replication is synchronized
  – skip-slave-start / RESET SLAVE
  – start the first node - it forms the initial cluster
  – start the others with wsrep_sst_method=skip
• The slaves will join the cluster, skipping SST
• Afterwards, change wsrep_sst_method to something other than skip
Method 3 - No-downtime changeover
• No downtime:
  – form a new cluster from one slave
  – that node replicates from the old master
    • log-slave-updates on this node
  – test it like any other slave
  – move more slave nodes to the cluster
  – cut writes over to the cluster
  – absorb the master into the cluster
    • with a non-skip SST
OPERATIONAL CONSIDERATIONS
Monitoring
• SHOW GLOBAL STATUS LIKE 'wsrep%';
• Cluster integrity - these should be the same across all nodes:
  – wsrep_cluster_conf_id - configuration version
  – wsrep_cluster_size - number of active nodes
  – wsrep_cluster_status - should be Primary
• Node status:
  – wsrep_ready - indicator that the node is healthy
  – wsrep_local_state_comment - status message
  – wsrep_flow_control_paused/sent - replication lag feedback
  – wsrep_local_send_q_avg - possible network bottleneck
• http://www.codership.com/wiki/doku.php?id=monitoring
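These status variables are easy to wire into a lightweight check. A minimal sketch (not the clustercheck script that ships with PXC; the OK/FAIL contract is invented for illustration) that parses the tab-separated output of `mysql -NB -e "SHOW GLOBAL STATUS LIKE 'wsrep%'"`:

```shell
# Reads tab-separated "variable<TAB>value" pairs on stdin and decides
# whether this node is a healthy, usable cluster member.
wsrep_healthy() {
  local name value ready="" status="" comment=""
  while IFS=$'\t' read -r name value; do
    case "$name" in
      wsrep_ready)               ready=$value ;;
      wsrep_cluster_status)      status=$value ;;
      wsrep_local_state_comment) comment=$value ;;
    esac
  done
  # healthy = ready, in the Primary component, and not currently a donor
  if [ "$ready" = "ON" ] && [ "$status" = "Primary" ] \
     && [[ "$comment" != *Donor* ]]; then
    echo "OK"
  else
    echo "FAIL"
  fi
}

# Canned example; in production, pipe in:
#   mysql -NB -e "SHOW GLOBAL STATUS LIKE 'wsrep%'" | wsrep_healthy
printf 'wsrep_ready\tON\nwsrep_cluster_status\tPrimary\nwsrep_local_state_comment\tSynced\n' \
  | wsrep_healthy   # → OK
```

A load balancer health check can call something like this and drop the node from rotation on FAIL.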
Realtime Wsrep status
Maintenance
• Rolling package updates
• Schema changes
  – potential for blocking the whole cluster
  – Galera supports a rolling schema upgrade feature
    • http://www.codership.com/wiki/doku.php?id=rolling_schema_upgrade
    • isolates DDL to individual cluster nodes
    • won't work if replication events become incompatible
  – pt-online-schema-change
• Prefer IST over SST
  – be sure you know when IST will and won't work!
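A rolling schema upgrade run, one node at a time, might look like the sketch below. wsrep_OSU_method is the Galera variable that switches between total-order (TOI) and rolling (RSU) DDL; the table and column names are made up for illustration.

```
-- on each node in turn:
SET SESSION wsrep_OSU_method='RSU';  -- isolate the DDL to this node
ALTER TABLE orders ADD COLUMN note VARCHAR(255);  -- hypothetical change
SET SESSION wsrep_OSU_method='TOI';  -- back to total-order execution
```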
Architecture
• How many nodes should I have?
  – >= 3 nodes for quorum purposes
    • 50% is not a quorum
  – garbd - Galera Arbitrator Daemon
    • contributes as a voting node for quorum
    • does not store data, but does replicate
• What gear should I get?
  – writes are as fast as your slowest node
  – standard MySQL + InnoDB choices
  – garbd could be on a cloud server
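An invocation sketch for garbd, reusing the group name and a node address from this deck's example configuration (flag spellings from the garbd help output):

```
garbd --group trimethylxanthine \
      --address gcomm://192.168.70.2:4567 \
      --daemon
```

With two data nodes plus garbd, the loss of either data node still leaves 2 of 3 votes, so the survivor keeps quorum.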
APPLICATION WORKLOADS
How (Virtually) Synchronous Writes Work
• Source node - pessimistic locking
  – InnoDB transaction locking
• Cluster replication - optimistic locking
  – before the source returns from commit:
    • the writeset replicates to all nodes, and a GTID is chosen
    • the source certifies it
      – PASS: the source applies
      – FAIL: the source gets a deadlock error (local certification failure, LCF)
  – other nodes:
    • receive, certify, apply (or drop)
    • certification is deterministic on all nodes
  – apply can abort open transactions (brute-force abort, BFA)
    • first commit wins!
Why does the Application care?
• Workload dependent!
• Writing to all nodes simultaneously and evenly:
  – increases deadlock errors on data hot spots
• Can be avoided by:
  – writing to only one node at a time
    • all pessimistic locking happens on one node
  – writing each data subset on only a single node
    • e.g., different databases, tables, rows, etc.
    • different nodes can handle writes for different datasets
    • pessimistic locking for that subset happens on only one node
Workloads that work best with Galera
• Multi-node writing
  – low data hotspots
  – auto_increment_offset/increment handling is OK
    • Galera sets them automatically by default
• Small transactions
  – large transactions expose serialization points in replication and certification
• Tables
  – with primary keys
  – InnoDB
  – avoid triggers, foreign keys, etc. - supported, but problematic
APPLICATION CLUSTER HA
Application to Cluster Connects
• For writes:
  – best practice: (any) single node
• For reads:
  – all nodes, load-balanced
    • can be hashed to hit hot caches
    • geo-affinity for WAN setups
  – replication lag is still possible, but minimal; avoidable with wsrep_causal_reads (session|global)
• Be sure to monitor that nodes are functioning members of the cluster!
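The causal-reads safeguard on a read-after-write path, as a sketch (the table and values are hypothetical):

```
-- make this session wait until the node has applied everything
-- committed cluster-wide before serving the read
SET SESSION wsrep_causal_reads = ON;
SELECT balance FROM accounts WHERE id = 42;
```

This trades a little read latency for the guarantee that a read routed to a different node than the preceding write still sees that write.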
Load balancing and Node status
• Health check:
  – TCP 3306
  – SHOW GLOBAL STATUS
    • wsrep_ready = ON
    • wsrep_local_state_comment !~ m/Donor/
  – /usr/bin/clustercheck
• Maintain separate rotations:
  – reads
    • round-robin or least-connected across all available nodes
  – writes
    • a single node, with backups used on failure
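clustercheck is typically exposed to the load balancer as an HTTP check on port 9200 via xinetd; a configuration sketch (the service name mysqlchk is the usual convention, and the only_from subnet is a placeholder):

```
service mysqlchk
{
    disable         = no
    flags           = REUSE
    socket_type     = stream
    port            = 9200
    wait            = no
    user            = nobody
    server          = /usr/bin/clustercheck
    log_on_failure  += USERID
    only_from       = 192.168.70.0/24
    per_source      = UNLIMITED
}
```

This is what the `check port 9200` directives in the HAProxy sample config below this slide are probing.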
Load Balancing Technologies
• glbd - Galera Load Balancer
  – similar to Pen; can utilize multiple cores
  – http://www.codership.com/products/galera-loadbalancer
• HAProxy
  – httpchk to monitor node status
  – http://www.percona.com/doc/percona-xtradb-cluster/haproxy.html
• Watch out for a lot of TIME_WAIT connections!
HAProxy Sample config
# Random reads connection (any node)
listen all *:3306
  server db1 10.2.46.120:3306 check port 9200
  server db2 10.2.46.121:3306 check port 9200
  server db3 10.2.46.122:3306 check port 9200

# Writer connection (first available node)
listen writes *:4306
  server db1 10.2.46.120:3306 track all/db1
  server db2 10.2.46.121:3306 track all/db2 backup
  server db3 10.2.46.122:3306 track all/db3 backup
Resources
• XtraDB Cluster homepage and documentation:
  – http://www.percona.com/software/percona-xtradb-cluster/
• Galera documentation:
  – http://www.codership.com/wiki/doku.php
• PXC tutorial (self-guided or at a conference):
  – https://github.com/jayjanssen/percona-xtradb-cluster-tutorial
• http://www.mysqlperformanceblog.com/category/xtradb-cluster/
THANK YOU
Jay Janssen
@jayjanssen
http://www.percona.com/software/percona-xtradb-cluster