Migrating to XtraDB Cluster
Jay Janssen, MySQL Consulting Lead
Percona Live University, Toronto
March 22nd, 2013
Overview of XtraDB Cluster
• Percona Server 5.5 + Galera (Codership) synchronous replication add-on
• “Cluster of MySQL nodes”
  – Have all the data, all the time
  – Readable and writeable
• Established cluster:
  – Synchronizes new nodes
  – Handles node failures
  – Handles node resync
  – Split-brain protection (quorum)
Company Confidential December 2010
XtraDB Cluster FAQ
• Standard MySQL replication
  – into or out of the cluster
• Write scalable to a point
  – all writes still hit all nodes
• LAN/WAN architectures
  – write latency ~1 RTT
• MyISAM experimental
  – big list of caveats
  – designed and built for InnoDB
What you really want to know
• Is it production worthy?
  – Several production users of Galera/PXC
  – You should really evaluate your workload to see if it’s a good fit for Galera/PXC
• What are the limitations of using Galera?
  – http://www.codership.com/wiki/doku.php?id=limitations
CONFIGURING XTRADB CLUSTER
Cluster Replication Config
• Configured via wsrep_provider_options
• Can be a separate network from mysqld
• Default cluster replication port is 4567 (tcp)
  – Supports multicast
  – Supports SSL
• Starting node needs to know one cluster node IP
  – you can list all the nodes you know and it will find one that is a member of the cluster
Other intra-cluster communication
• Outside of Galera replication (gcomm)
• SST
  – full state transfers
  – Donor picked from running cluster, gives full backup to joiner node
  – Might be blocking (various methods allowed)
  – default tcp 4444
• IST
  – incremental state transfers
  – default wsrep port + 1 (tcp 4568)
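The port layout on this slide can be written down as a small helper, for example as input to firewall rules. This is a minimal sketch: `cluster_ports` is a hypothetical helper name, the defaults are the ones quoted on the slides, and 3306 is the standard MySQL client port.

```python
def cluster_ports(wsrep_port=4567, sst_port=4444, mysql_port=3306):
    """Return the TCP ports a cluster node needs reachable, keyed by purpose."""
    return {
        "mysql": mysql_port,        # client connections
        "replication": wsrep_port,  # Galera group communication (gcomm)
        "ist": wsrep_port + 1,      # incremental state transfer: wsrep port + 1
        "sst": sst_port,            # full state transfer
    }


if __name__ == "__main__":
    for purpose, port in sorted(cluster_ports().items()):
        print(f"{purpose}: tcp/{port}")
```

The IST port is derived rather than hard-coded, so moving the replication port moves it too.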
Essential Galera settings
• [mysqld]
  – wsrep_provider = /usr/lib64/libgalera_smm.so
  – wsrep_cluster_name - Identify the cluster
  – wsrep_cluster_address - Where to find the cluster
  – wsrep_node_address - tell Galera what IP to use for replication/SST/IST
  – wsrep_sst_method - How to synchronize nodes
  – binlog_format = ROW
  – innodb_autoinc_lock_mode=2
  – innodb_locks_unsafe_for_binlog=1 - performance
Other Galera Settings
• [mysqld]
  – wsrep_node_name - Identify this node
  – wsrep_provider_options - cluster comm opts
    • wsrep_provider_options="gcache.size=<gcache size>"
    • http://www.codership.com/wiki/doku.php?id=galera_parameters
  – wsrep_node_incoming_address=<node mysql IP>
  – wsrep_slave_threads - apply writesets in parallel
    • http://www.codership.com/wiki/doku.php?id=mysql_options_0.8
Example configuration
[mysqld]
datadir=/var/lib/mysql
binlog_format=ROW

wsrep_cluster_name=trimethylxanthine
wsrep_cluster_address=gcomm://192.168.70.2,192.168.70.3,192.168.70.4

# Only use this before the cluster is formed
# wsrep_cluster_address=gcomm://

wsrep_node_name=percona1
wsrep_node_address=192.168.70.2
wsrep_provider=/usr/lib64/libgalera_smm.so

wsrep_sst_method=xtrabackup
wsrep_sst_auth=backupuser:password

wsrep_slave_threads=2

innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2

innodb_buffer_pool_size=128M
innodb_log_file_size=64M
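Only a few lines of this configuration differ from node to node, so they can be generated. The following is a sketch, not part of the original deck: `render_wsrep_settings` is a hypothetical helper, and the node names percona2 and percona3 are assumed (only percona1 appears in the example).

```python
def render_wsrep_settings(node_name, nodes, cluster_name, bootstrap=False):
    """Render the node-specific wsrep lines of a [mysqld] section.

    nodes maps node name -> replication IP. bootstrap=True emits the empty
    gcomm:// address, which is only for use before the cluster is formed.
    """
    peers = "" if bootstrap else ",".join(nodes.values())
    return "\n".join([
        "[mysqld]",
        f"wsrep_cluster_name={cluster_name}",
        f"wsrep_cluster_address=gcomm://{peers}",
        f"wsrep_node_name={node_name}",
        f"wsrep_node_address={nodes[node_name]}",
    ])


# percona1 and the IPs come from the example; percona2/percona3 are assumed names.
NODES = {
    "percona1": "192.168.70.2",
    "percona2": "192.168.70.3",
    "percona3": "192.168.70.4",
}

if __name__ == "__main__":
    print(render_wsrep_settings("percona2", NODES, "trimethylxanthine"))
```

The remaining settings (provider path, SST method, InnoDB options) are identical on every node and can stay in a shared file.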
CONVERTING STANDALONE MYSQL TO XTRADB CLUSTER
Method 1 - Single Node
• Migrating a single server:
  – stop MySQL
  – replace the packages
  – add essential Galera settings
  – start MySQL
• A stateless, peerless node will form its own cluster
  – if an empty cluster address is given (gcomm://)
• That node is the baseline data for the cluster
• Easiest from Percona Server 5.5
Method 2 - Blanket changeover
• All at once (with downtime):
  – Stop all writes, stop all nodes after replication is synchronized
  – skip-slave-start / RESET SLAVE
  – Start first node - initial cluster
  – Start the others with wsrep_sst_method=skip
    • The slaves will join the cluster, skipping SST
• Afterwards, change wsrep_sst_method to a value other than skip
Method 2 - Blanket changeover
• No downtime
  – Form new cluster from one slave
  – Node replicates from old master
    • log-slave-updates on this node
  – Test like any other slave
  – Move more slave nodes to cluster
  – Cut writes over to the cluster
  – Absorb master into cluster
    • Non-skip SST
OPERATIONAL CONSIDERATIONS
Monitoring
• SHOW GLOBAL STATUS LIKE 'wsrep%';
• Cluster integrity - same across all nodes
  – wsrep_cluster_conf_id - configuration version
  – wsrep_cluster_size - number of active nodes
  – wsrep_cluster_status - should be Primary
• Node Status
  – wsrep_ready - indicator that the node is healthy
  – wsrep_local_state_comment - status message
  – wsrep_flow_control_paused/sent - replication lag feedback
  – wsrep_local_send_q_avg - possible network bottleneck
• http://www.codership.com/wiki/doku.php?id=monitoring
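The "same across all nodes" rule can be checked mechanically once the status of each node has been collected. A minimal sketch, assuming the SHOW GLOBAL STATUS output has already been fetched into one dict per node (the fetching itself and the name `check_cluster_integrity` are not from the deck):

```python
INTEGRITY_VARS = (
    "wsrep_cluster_conf_id",
    "wsrep_cluster_size",
    "wsrep_cluster_status",
)


def check_cluster_integrity(statuses):
    """Return a list of problems found; an empty list means the nodes agree.

    statuses maps node name -> {status variable: value as a string}.
    """
    problems = []
    # Each integrity variable must have one single value across all nodes.
    for var in INTEGRITY_VARS:
        values = {status.get(var) for status in statuses.values()}
        if len(values) > 1:
            problems.append(f"{var} differs across nodes: {sorted(map(str, values))}")
    # Every node should report that it belongs to the Primary component.
    for node, status in statuses.items():
        if status.get("wsrep_cluster_status") != "Primary":
            problems.append(f"{node}: wsrep_cluster_status is not Primary")
    return problems
```

A node that disagrees with its peers on any of the three values has most likely dropped out of the cluster or sits in a minority partition.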
Realtime Wsrep status
Maintenance
• Rolling package updates
• Schema changes
  – potential for blocking the whole cluster
  – Galera supports a rolling schema upgrade feature
    • http://www.codership.com/wiki/doku.php?id=rolling_schema_upgrade
    • Isolates DDL to individual cluster nodes
    • Won’t work if replication events become incompatible
  – pt-online-schema-change
• Prefer IST over SST
  – be sure you know when IST will and won’t work!
Architecture
• How many nodes should I have?
  – >= 3 nodes for quorum purposes
    • 50% is not a quorum
  – garbd - Galera Arbitrator Daemon
    • Contributes as a voting node for quorum
    • Does not store data, but does replicate
• What gear should I get?
  – Writes as fast as your slowest node
  – Standard MySQL + InnoDB choices
  – garbd could be on a cloud server
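The "50% is not a quorum" rule is plain majority arithmetic. A small sketch, assuming every node (and garbd) carries exactly one vote; `has_quorum` is a hypothetical helper name:

```python
def has_quorum(reachable_votes, total_votes):
    """A partition keeps quorum only with a strict majority of all votes."""
    return 2 * reachable_votes > total_votes


if __name__ == "__main__":
    # Two nodes, network split 1/1: neither side has a majority.
    print("2 nodes, 1 reachable:", has_quorum(1, 2))
    # Two data nodes plus garbd is three votes: a side with two of them survives.
    print("2 nodes + garbd, 2 reachable:", has_quorum(2, 3))
```

This is why a two-node cluster gains from adding garbd: the arbitrator supplies the third vote without storing data.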
APPLICATION WORKLOADS
How (Virtually) Synchronous Writes Work
• Source node - pessimistic locking
  – InnoDB transaction locking
• Cluster repl - optimistic locking
  – Before source returns commit:
    • replicates to all nodes, GTID chosen
    • source certifies
      – PASS: source applies
      – FAIL: source deadlock error (LCF, local certification failure)
  – Other nodes
    • receive, certify, apply (or drop)
    • Certification deterministic on all nodes
  – Apply can abort open trxs (BFA, brute force abort)
    • First commit wins!
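The "first commit wins" behaviour can be illustrated with a toy model. This is a deliberately simplified sketch, not Galera's implementation: the `Certifier` class, its key format, and the `seen_seqno` parameter are all invented here to show why certification is deterministic when every node processes the same writesets in the same order.

```python
class Certifier:
    """Toy certification index: row key -> seqno of the last writeset that changed it."""

    def __init__(self):
        self.last_seqno = 0
        self.last_write = {}

    def certify(self, keys, seen_seqno):
        """Certify a writeset touching `keys`, built on a snapshot at `seen_seqno`.

        Returns (seqno, passed). Every writeset gets a position in the global
        order; it passes only if no key was changed after its snapshot.
        """
        self.last_seqno += 1
        seqno = self.last_seqno
        if any(self.last_write.get(key, 0) > seen_seqno for key in keys):
            return seqno, False  # conflict: an earlier commit already won
        for key in keys:
            self.last_write[key] = seqno
        return seqno, True


if __name__ == "__main__":
    node = Certifier()
    # Two transactions on different nodes update the same row from the same snapshot.
    print(node.certify({("accounts", 1)}, seen_seqno=0))  # first one passes
    print(node.certify({("accounts", 1)}, seen_seqno=0))  # second one fails
```

Because the outcome depends only on the ordered stream of writesets, each node reaches the same pass or fail decision without further communication.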
Why does the Application care?
• Workload dependent!
• Write to all nodes simultaneously and evenly:
  – Increase of deadlock errors on data hot spots
• Can be avoided by
  – Writing to only one node at a time
    • all pessimistic locking happens on one node
  – Data subsets written only on a single node
    • e.g., different databases, tables, rows, etc.
    • different nodes can handle writes for different datasets
    • pessimistic locking for that subset only on one node
Workloads that work best with Galera
• Multi-node writing
  – Low data hotspots
  – Auto-increment-offset/increment is ok
    • Galera sets automatically by default
• Small transactions
  – Expose serialization points in replication and certification
• Tables
  – With PKs
  – InnoDB
  – Avoid triggers, FKs, etc. -- supported, but problematic
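To see why the auto-increment offset/increment scheme is safe for multi-node writing, it helps to list the values each node would generate. A short sketch of the arithmetic only (the helper name `auto_increment_ids` is hypothetical): with the increment set to the cluster size and a distinct offset per node, the generated sequences never overlap.

```python
def auto_increment_ids(offset, increment, count):
    """First `count` values produced with the given auto-increment offset and increment."""
    return [offset + i * increment for i in range(count)]


if __name__ == "__main__":
    cluster_size = 3
    for node in range(1, cluster_size + 1):
        # Node N uses offset N; all nodes share increment = cluster size.
        print(f"node {node}:", auto_increment_ids(node, cluster_size, 4))
```

The values are unique across nodes but not contiguous, so applications should not depend on gap-free ids.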
APPLICATION CLUSTER HA
Application to Cluster Connects
• For writes:
  – Best practice: (any) single node
• For reads:
  – All nodes load-balanced
    • Can be hashed to hit hot caches
    • Geo-affinity for WAN setups
  – Replication lag still possible, but minimal. Avoidable with wsrep_causal_reads (session|global).
• Be sure to monitor that nodes are functioning members of the cluster!
Load balancing and Node status
• Health check:
  – TCP 3306
  – SHOW GLOBAL STATUS
    • wsrep_ready = ON
    • wsrep_local_state_comment !~ m/Donor/?
    • /usr/bin/clustercheck
• Maintain separate rotations:
  – Reads
    • RR or least-connected across all available nodes
  – Writes
    • Single node with backups on failure
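The health-check rules can be expressed as a function that maps a node's status to an HTTP status code, which is the form an HTTP-checking load balancer consumes. A minimal sketch, assuming the status variables have already been read into a dict; `node_http_status` and its `allow_donor` switch are hypothetical and are not the actual clustercheck script.

```python
def node_http_status(status, allow_donor=False):
    """Return 200 if the node should receive traffic, otherwise 503.

    status maps wsrep status variable names to their string values.
    allow_donor=True keeps a donor node in rotation, which may be acceptable
    when the SST method in use does not block the donor.
    """
    ready = status.get("wsrep_ready") == "ON"
    is_donor = status.get("wsrep_local_state_comment", "").startswith("Donor")
    if ready and (allow_donor or not is_donor):
        return 200
    return 503


if __name__ == "__main__":
    print(node_http_status({"wsrep_ready": "ON", "wsrep_local_state_comment": "Synced"}))
    print(node_http_status({"wsrep_ready": "ON", "wsrep_local_state_comment": "Donor/Desynced"}))
```

A plain TCP check on 3306 would pass in both cases, which is why the status-based check is worth the extra step.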
Load Balancing Technologies
• glbd - Galera Load Balancer
  – similar to Pen, can utilize multiple cores
  – http://www.codership.com/products/galera-loadbalancer
• HAProxy
  – httpchk to monitor node status
  – http://www.percona.com/doc/percona-xtradb-cluster/haproxy.html
• Watch out for a lot of TIME_WAIT conns!
HAProxy Sample config
# Random Reads connection (any node)
listen all *:3306
  server db1 10.2.46.120:3306 check port 9200
  server db2 10.2.46.121:3306 check port 9200
  server db3 10.2.46.122:3306 check port 9200

# Writer connection (first available node)
listen writes *:4306
  server db1 10.2.46.120:3306 track all/db1
  server db2 10.2.46.121:3306 track all/db2 backup
  server db3 10.2.46.122:3306 track all/db3 backup
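The routing policy this sample config expresses (reads on any healthy node, writes on the first healthy node with the rest as backups) can be sketched in a few lines. This models the policy only and is not how HAProxy is implemented; `read_pool` and `pick_writer` are hypothetical names.

```python
def read_pool(nodes, healthy):
    """All healthy nodes, in configured order, are eligible for reads."""
    return [node for node in nodes if node in healthy]


def pick_writer(nodes, healthy):
    """Writes go to the first healthy node; later nodes act as backups."""
    for node in nodes:
        if node in healthy:
            return node
    return None  # no healthy node: writes cannot be served


if __name__ == "__main__":
    nodes = ["db1", "db2", "db3"]
    healthy = {"db2", "db3"}  # db1 failed its health check
    print("reads:", read_pool(nodes, healthy))
    print("writer:", pick_writer(nodes, healthy))
```

Keeping the node order fixed means every load balancer instance picks the same writer, which avoids accidental multi-node writing.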
Resources
• XtraDB Cluster homepage and documentation:
  – http://www.percona.com/software/percona-xtradb-cluster/
• Galera Documentation:
  – http://www.codership.com/wiki/doku.php
• PXC tutorial (self-guided or at a conference):
  – https://github.com/jayjanssen/percona-xtradb-cluster-tutorial
• http://www.mysqlperformanceblog.com/category/xtradb-cluster/
THANK YOU
Jay Janssen
@jayjanssen
http://www.percona.com/software/percona-xtradb-cluster