Maximizing SQL reviews and tuning with pt-query-digest
DESCRIPTION
PalominoDB's Mark Filipi feels that pt-query-digest is one of the more valuable components of the Percona Toolkit available as OSS to DBAs. In this talk, Mark will teach with an eye towards real-world test cases, output reviews and anecdotal production experience.
TRANSCRIPT
Maximizing SQL reviews
with pt-query-digest
PALOMINODB OPERATIONAL EXCELLENCE
FOR DATABASES
Mark Filipi
www.palominodb.com
What is pt-query-digest
Analyzes MySQL queries from slow, general
and binary log files.
Processlist, tcpdump and more also available.
What is pt-query-digest
Fingerprints each query pattern, calculating
statistics for rows examined, returned, etc.
Prints report containing important query
information
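To make "fingerprinting" concrete, here is a minimal sketch of the idea: literals are replaced with placeholders so that queries differing only in their values collapse into one pattern. The function name and the exact rules are assumptions for illustration; the real tool applies more rules than this.

```python
import re

def fingerprint(sql: str) -> str:
    """Rough approximation of query fingerprinting (not the tool's exact rules)."""
    s = sql.strip().lower()
    # Quoted string literals become a placeholder.
    s = re.sub(r"'(?:[^'\\]|\\.)*'", "?", s)
    s = re.sub(r'"(?:[^"\\]|\\.)*"', "?", s)
    # Standalone numeric literals become a placeholder
    # (digits inside identifiers such as t0_r1 are left alone).
    s = re.sub(r"\b\d+\b", "?", s)
    # IN lists of any length collapse to a single form.
    s = re.sub(r"\bin\s*\(\s*\?(?:\s*,\s*\?)*\s*\)", "in (?+)", s)
    # Runs of whitespace collapse to one space.
    return re.sub(r"\s+", " ", s)

a = fingerprint("SELECT `user_metrics`.* FROM `user_metrics` "
                "WHERE `user_metrics`.`user_id` = 21745101 LIMIT 1")
b = fingerprint("select `user_metrics`.*   from `user_metrics` "
                "where `user_metrics`.`user_id` = 42 limit 1")
print(a)
print(a == b)
```

Both calls yield the same pattern, which is why a report can say "1701 calls" for one query even though every call carried a different user_id.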
Why use pt-query-digest
1) Easily locate slowest queries
2) See most executed queries
3) Find bottlenecks for tuning opportunities.
How to use pt-query-digest
# pt-query-digest slow.log.1 > slow.txt
slow.log.1: 1% 39:05 remain
slow.log.1: 2% 38:05 remain
slow.log.1: 3% 37:26 remain
...
Alternate usage
# pt-query-digest slow.log.1
--type binlog
--type genlog
--type tcpdump
tcpdump -s 65535 -x -nn -q -tttt -i any -c 1000 port 3306 \
> mysql.tcp.txt
pt-query-digest --type tcpdump mysql.tcp.txt
Default output
# 2281.2s user time, 9.9s system time, 765.31M rss, 855.46M vsz
# Current date: Wed Apr 24 11:42:05 2013
# Hostname: XXXXXXXXXX
# Files: slow.log.1
# Overall: 5.50M total, 191 unique, 63.64 QPS, 0.78x concurrency _________
# Time range: 2013-04-23 04:02:12 to 2013-04-24 04:02:12
# Attribute total min max avg 95% stddev median
# ============ ======= ======= ======= ======= ======= ======= =======
# Exec time 67604s 46us 59s 12ms 185us 555ms 138us
# Lock time 263s 0 52ms 47us 57us 36us 44us
# Rows sent 1.35M 0 478.42k 0.26 0 285.55 0
# Rows examine 5.90G 0 679.89k 1.13k 34.95 23.75k 34.95
# Rows affecte 6.05k 0 908 0.00 0 0.79 0
# Rows read 5.89G 0 679.89k 1.12k 33.28 23.75k 33.28
# Bytes sent 8.37G 11 6.07M 1.60k 1.53k 3.71k 1.53k
# Tmp tables 2 0 1 0.00 0 0.00 0
# Tmp disk tbl 0 0 0 0 0 0 0
# Tmp tbl size 3.20M 0 1.60M 0.61 0 966.98 0
# Query size 4.77G 6 8.57k 931.73 918.49 43.35 918.49
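The QPS and concurrency figures in the "Overall" line can be reproduced from the numbers in the report. The sketch below assumes QPS is calls divided by the length of the time range and concurrency is total execution time divided by that same length; the function name is made up for illustration, and the call count uses the rounded 5.50M, so the result differs slightly from the printed 63.64.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

def qps_and_concurrency(calls, exec_time_s, start, end):
    """Return (queries per second, average concurrency) over the log window."""
    window_s = (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()
    return calls / window_s, exec_time_s / window_s

# Values taken from the overall section of the report (5.50M is rounded).
overall_qps, overall_conc = qps_and_concurrency(
    5_500_000, 67604, "2013-04-23 04:02:12", "2013-04-24 04:02:12"
)
print(f"{overall_qps:.2f} QPS, {overall_conc:.2f}x concurrency")
```

A concurrency of 0.78x reads as: on average, a bit less than one logged query was executing at any moment during the 24 hours.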
Default output - cont
# Profile
# Rank Query ID Response time Calls R/Call Apdx V/M Item
# ==== ================== ================ ======= ======= ==== ===== ====
# 1 0xE944A3E3D50BEF59 23978.8323 35.5% 1701 14.0969 0.27 20.62 SELECT user_metrics
# 2 0x9205E91301282E2B 18855.0849 27.9% 10530 1.7906 0.50 0.20 SELECT devices
# 3 0x540C99F3A94F6D5A 3753.6122 5.6% 96 39.1001 0.04 8.52 SELECT user_metrics
# 4 0x14DB3EB1D01A663C 1927.7399 2.9% 180 10.7097 0.34 18.82 SELECT levels
# 5 0xD3E830FA246FCC0B 1873.8963 2.8% 106 17.6783 0.25 21.10 SELECT users
# 6 0xB50D900EE7C2C24B 1813.5766 2.7% 119 15.2401 0.23 20.47 SELECT devices
# 7 0x54D33962E5247A38 1701.7326 2.5% 142 11.9840 0.25 17.88 SELECT given_items
# 8 0xD9BED65DDCEF98B2 1472.2035 2.2% 138 10.6681 0.31 21.89 SELECT users app_friends
weekly_stats
# 9 0x1044CCDF33122B46 1440.0400 2.1% 5237851 0.0003 1.00 1.43 SELECT unlock_dialogs
unlock_dialog_translations
# 10 0x8ACF8E33226B0CD0 1427.1346 2.1% 88 16.2174 0.25 23.13 SELECT users
...
# 46 0xE82376CB7E12680C 46.4620 0.1% 10 4.6462 0.45 15.80 UPDATE user_metrics
# MISC 0xMISC 1778.1279 2.6% 247232 0.0072 NS 0.0 <161 ITEMS>
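The profile is ranked by total response time, which is simply calls multiplied by time per call. A short sketch of that arithmetic, using the first two rows and the overall Exec time total of 67604s from the report (the helper names are illustrative only):

```python
TOTAL_EXEC_S = 67604  # overall "Exec time" total from the report header

def response_time(calls, r_per_call_s):
    """Total response time contributed by one query pattern."""
    return calls * r_per_call_s

def share_pct(response_s, total_s=TOTAL_EXEC_S):
    """Percentage of all logged execution time."""
    return 100.0 * response_s / total_s

rank1 = response_time(1701, 14.0969)    # SELECT user_metrics
rank2 = response_time(10530, 1.7906)    # SELECT devices
print(round(rank1, 1), round(share_pct(rank1), 1))
print(round(rank2, 1), round(share_pct(rank2), 1))
```

The same arithmetic explains why rank 9 makes the list at all: a per-call time of a fraction of a millisecond still adds up to about 1440s over 5.2 million calls.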
Default output - cont
# Query 1: 0.04 QPS, 0.55x concurrency, ID 0xE944A3E3D50BEF59 at byte 4921476349
# This item is included in the report because it matches --limit.
# Scores: Apdex = 0.27 [1.0], V/M = 20.62
# Query_time sparkline: | ^_|
# Time range: 2013-04-23 04:05:30 to 16:15:23
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 0 1701
# Exec time 35 23979s 1s 58s 14s 57s 17s 4s
# Lock time 0 43ms 14us 102us 25us 28us 3us 23us
# Rows sent 0 1.66k 1 1 1 1 0 1
# Rows examine 0 1.66k 1 1 1 1 0 1
# Rows affecte 0 0 0 0 0 0 0 0
# Rows read 0 2.44k 0 3 1.47 2.90 0.81 0.99
# Bytes sent 0 3.28M 1.97k 1.98k 1.98k 1.96k 0.00 1.96k
# Tmp tables 0 0 0 0 0 0 0 0
# Tmp disk tbl 0 0 0 0 0 0 0 0
# Tmp tbl size 0 0 0 0 0 0 0 0
# Query size 0 157.45k 92 95 94.79 92.72 0.24 92.72
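The V/M score is the variance-to-mean ratio of execution time, so it can be approximated from the stddev and avg columns. The sketch below uses the rounded stddev of 17s and the R/Call of 14.0969s from the profile, so it only lands near the printed 20.62; the function name is an assumption for illustration.

```python
def variance_to_mean(stddev_s, mean_s):
    """Index of dispersion: variance divided by mean."""
    return stddev_s ** 2 / mean_s

# Query 1: stddev is rounded in the report, so the result is approximate.
q1_vm = variance_to_mean(17.0, 14.0969)
print(round(q1_vm, 2))
```

A high V/M, as here, means response times are erratic: the same query pattern takes anywhere from 1s to 58s, which points at contention or caching effects rather than a uniformly bad plan.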
Query output
# Query 1: 0.04 QPS, 0.55x concurrency, ID 0xE944A3E3D50BEF59 at byte 4921476349
# This item is included in the report because it matches --limit.
# Scores: Apdex = 0.27 [1.0], V/M = 20.62
# Query_time sparkline: | ^_|
# Time range: 2013-04-23 04:05:30 to 16:15:23
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 0 1701
# Exec time 35 23979s 1s 58s 14s 57s 17s 4s
# Lock time 0 43ms 14us 102us 25us 28us 3us 23us
# Rows sent 0 1.66k 1 1 1 1 0 1
# Rows examine 0 1.66k 1 1 1 1 0 1
# Rows affecte 0 0 0 0 0 0 0 0
# Rows read 0 2.44k 0 3 1.47 2.90 0.81 0.99
# Bytes sent 0 3.28M 1.97k 1.98k 1.98k 1.96k 0.00 1.96k
# Tmp tables 0 0 0 0 0 0 0 0
# Tmp disk tbl 0 0 0 0 0 0 0 0
# Tmp tbl size 0 0 0 0 0 0 0 0
# Query size 0 157.45k 92 95 94.79 92.72 0.24 92.72
Query output - cont
# String:
# Databases slow_db
# Hosts
# InnoDB trxID F8696E3EF (1/0%), F8696E401 (1/0%)... 1699 more
# Last errno 0
# Users slow_user
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms
# 100ms
# 1s ################################################################
# 10s+ ##############################
# Tables
# SHOW TABLE STATUS FROM `slow_db` LIKE 'user_metrics'\G
# SHOW CREATE TABLE `slow_db`.`user_metrics`\G
# EXPLAIN /*!50100 PARTITIONS*/
SELECT `user_metrics`.* FROM `user_metrics` WHERE `user_metrics`.`user_id` = 21745101 LIMIT
1\G
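The Query_time distribution is a histogram with one row per power of ten. A minimal sketch of how an execution time maps to a row, assuming each label marks the lower edge of its decade and that anything under 1us falls in the first row (that edge handling is an assumption, not taken from the tool):

```python
import math

LABELS = ["1us", "10us", "100us", "1ms", "10ms", "100ms", "1s", "10s+"]

def bucket(query_time_s):
    """Return the histogram row label for an execution time in seconds."""
    if query_time_s < 1e-6:
        return LABELS[0]
    # log10(1e-6) == -6, so shifting by 6 puts 1us in row 0.
    idx = math.floor(math.log10(query_time_s)) + 6
    return LABELS[max(0, min(idx, len(LABELS) - 1))]

# Query 1 ranged from 1s to 58s with a 4s median.
print(bucket(1.0), bucket(4.0), bucket(58.0))
```

This matches the shape shown for Query 1: every call lands in the 1s or 10s+ rows and nothing below.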
Explain output
mysql> explain
    -> SELECT `user_metrics`.* FROM `user_metrics` WHERE `user_metrics`.`user_id` = 21745101 LIMIT 1;
+----+-------------+--------------+-------+-------------------------------+-------------------------------+---------+-------+------+-------+
| id | select_type | table        | type  | possible_keys                 | key                           | key_len | ref   | rows | Extra |
+----+-------------+--------------+-------+-------------------------------+-------------------------------+---------+-------+------+-------+
|  1 | SIMPLE      | user_metrics | const | index_user_metrics_on_user_id | index_user_metrics_on_user_id | 4       | const |    1 |       |
+----+-------------+--------------+-------+-------------------------------+-------------------------------+---------+-------+------+-------+
1 row in set (0.00 sec)
Query output - #2
# Query 2: 3.31 QPS, 5.93x concurrency, ID 0x9205E91301282E2B at byte 1378442353
# This item is included in the report because it matches --limit.
# Scores: Apdex = 0.50 [1.0], V/M = 0.20
# Query_time sparkline: | _^_|
# Time range: 2013-04-23 08:34:15 to 09:27:12
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 0 10530
# Exec time 27 18855s 856ms 14s 2s 2s 591ms 2s
# Lock time 0 318ms 14us 4ms 30us 44us 62us 23us
# Rows sent 0 171 0 22 0.02 0 0.25 0
# Rows examine 94 5.58G 555.59k 555.89k 555.76k 535.27k 0 535.27k
# Rows affecte 0 0 0 0 0 0 0 0
# Rows read 94 5.58G 555.59k 555.89k 555.76k 535.27k 0 535.27k
# Bytes sent 0 10.13M 1005 5.34k 1008.50 964.41 51.78 964.41
# Tmp tables 0 0 0 0 0 0 0 0
# Tmp disk tbl 0 0 0 0 0 0 0 0
# Tmp tbl size 0 0 0 0 0 0 0 0
# Query size 0 808.73k 75 79 78.65 76.28 0.19 76.28
Query output #2 - cont
# String:
# Databases slow_db
# Hosts
# InnoDB trxID F87B2DD00 (1/0%), F87B2DD04 (1/0%)... 10528 more
# Last errno 0
# Users slow_db
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms
# 100ms #
# 1s ################################################################
# 10s+ #
# Tables
# SHOW TABLE STATUS FROM `slow_db` LIKE 'devices'\G
# SHOW CREATE TABLE `slow_db`.`devices`\G
# EXPLAIN /*!50100 PARTITIONS*/
SELECT `devices`.* FROM `devices` WHERE `devices`.`current_user_id` = 24261223\G
Explain output - #2
mysql> EXPLAIN /*!50100 PARTITIONS*/
-> SELECT `devices`.* FROM `devices` WHERE `devices`.`current_user_id` = 24261223\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: devices
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 524634
Extra: Using where
1 row in set (0.00 sec)
Explain output - #2
mysql> show create table devices\G
*************************** 1. row ***************************
Table: devices
Create Table: CREATE TABLE `devices` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`os` enum('ios') NOT NULL,
`uuid` varchar(255) NOT NULL,
`current_user_id` int(11) NOT NULL,
`platform_id` int(11) DEFAULT NULL,
`ios_device` varchar(255) DEFAULT NULL,
`ios_version` varchar(255) DEFAULT NULL,
`merged_at` datetime DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`apn_token` varchar(255) DEFAULT NULL,
`apn_token_updated_at` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `index_devices_on_os_uuid` (`os`,`uuid`),
KEY `index_devices_on_apn_token` (`apn_token`)
) ENGINE=InnoDB AUTO_INCREMENT=1155873 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
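The table definition has no index on current_user_id, which is consistent with the full scan (type: ALL, key: NULL) in the EXPLAIN for Query 2. The sketch below illustrates the effect of adding such an index. It uses an in-memory SQLite database as a stand-in because it runs anywhere; SQLite's planner and its EXPLAIN QUERY PLAN output are not MySQL's, the table is cut down to three columns, and the index name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced stand-in for the devices table (only the columns needed here).
conn.execute(
    "CREATE TABLE devices ("
    " id INTEGER PRIMARY KEY,"
    " uuid TEXT NOT NULL,"
    " current_user_id INTEGER NOT NULL)"
)

QUERY = "SELECT * FROM devices WHERE current_user_id = 24261223"

def plan(connection):
    """Return the planner's description of how QUERY would be executed."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    return " | ".join(row[3] for row in rows)  # 4th column holds the detail text

before = plan(conn)
# Hypothetical index name. A possible MySQL equivalent would be:
#   ALTER TABLE devices ADD INDEX index_devices_on_current_user_id (current_user_id);
conn.execute(
    "CREATE INDEX index_devices_on_current_user_id ON devices (current_user_id)"
)
after = plan(conn)

print("before:", before)
print("after: ", after)
```

On the real server, the check would be to re-run the EXPLAIN from the report after the index exists and confirm that type is no longer ALL and rows drops from roughly 524634.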
Query output #3 - compare
# Query 9: 88.32 QPS, 0.02x concurrency, ID 0x1044CCDF33122B46 at byte 4560650581
# Time range: 2013-04-23 04:02:12 to 20:30:40
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 95 5237851
# Exec time 2 1440s 129us 31s 274us 167us 20ms 138us
# Lock time 95 252s 41us 52ms 48us 57us 36us 44us
# Rows sent 0 0 0 0 0 0 0 0
# Rows examine 2 174.83M 35 35 35 35 0 35
# Rows affecte 0 0 0 0 0 0 0 0
# Rows read 2 169.83M 26 34 34.00 33.28 0.06 33.28
# Bytes sent 93 7.86G 1.57k 1.57k 1.57k 1.57k 0 1.57k
# Tmp tables 0 0 0 0 0 0 0 0
# Tmp disk tbl 0 0 0 0 0 0 0 0
# Tmp tbl size 0 0 0 0 0 0 0 0
# Query size 95 4.56G 932 936 934.17 918.49 0.00 918.49
Query output #3
# Query_time distribution
# 1us
# 10us
# 100us ################################################################
# 1ms #
# 10ms #
# 100ms #
# 1s #
# 10s+ #
# Tables
# SHOW TABLE STATUS FROM `slow_db` LIKE 'unlock_dialogs'\G
# SHOW CREATE TABLE `slow_db`.`unlock_dialogs`\G
# SHOW TABLE STATUS FROM `slow_db` LIKE 'unlock_dialog_translations'\G
# SHOW CREATE TABLE `slow_db`.`unlock_dialog_translations`\G
# EXPLAIN /*!50100 PARTITIONS*/
SELECT `unlock_dialogs`.`id` AS t0_r0, `unlock_dialogs`.`level_id` AS t0_r1,
`unlock_dialogs`.`games_played` AS t0_r2, `unlock_dialogs`.`screen` AS t0_r3,
`unlock_dialogs`.`text` AS t0_r4, `unlock_dialogs`.`character_id` AS t0_r5,
`unlock_dialogs`.`orientation` AS t0_r6, `unlock_dialogs`.`sort_order` AS t0_r7,
`unlock_dialogs`.`created_at` AS t0_r8, `unlock_dialogs`.`updated_at` AS t0_r9,
`unlock_dialog_translations`.`id` AS t1_r0, `unlock_dialog_translations`.`unlock_dialog_id` AS
t1_r1, `unlock_dialog_translations`.`locale` AS t1_r2, `unlock_dialog_translations`.`text` AS
t1_r3, `unlock_dialog_translations`.`created_at` AS t1_r4,
`unlock_dialog_translations`.`updated_at` AS t1_r5 FROM `unlock_dialogs` LEFT OUTER JOIN
`unlock_dialog_translations` ON `unlock_dialog_translations`.`unlock_dialog_id` =
`unlock_dialogs`.`id` WHERE `unlock_dialogs`.`games_played` = 366 AND
`unlock_dialog_translations`.`locale` IN ('en', 'en')\G
Query Explain #3
mysql> EXPLAIN ....
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: unlock_dialogs
partitions: NULL
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 35
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: unlock_dialog_translations
partitions: NULL
type: ref
possible_keys: idx_dialog_locale
key: idx_dialog_locale
key_len: 5
ref: slow_db.unlock_dialogs.id
rows: 1
Extra: Using where
2 rows in set (0.01 sec)
MySQL Variables to set
--slow-query-log
Logs SQL statements that run for more than long_query_time
--log-queries-not-using-indexes
--long-query-time
- Use logrotate to keep N slow query logs
MySQL Variables to set
Other options:
– log_slow_admin_statements
– log_slow_slave_statements
– (Percona) log_slow_verbosity ( query_plan / innodb / profiling )
– (Percona) log_slow_filter ( full_scan / tmp_table / filesort_on_disk / etc)
http://dev.mysql.com/doc/refman/5.5/en/slow-query-log.html
http://www.percona.com/doc/percona-server/5.5/diagnostics/slow_extended_55.html
Extended Verbosity
# Time: 120325 2:29:54
# User@Host: web[web] @ web1 [192.168.1.138]
# Thread_id: 1217328 Schema: dbname_prod Last_errno: 0 Killed: 0
# Query_time: 1.248839 Lock_time: 0.001044 Rows_sent: 98 Rows_examined: 146 Rows_affected: 0 Rows_read: 1
# Bytes_sent: 215048 Tmp_tables: 0 Tmp_disk_tables: 0 Tmp_table_sizes: 0
# InnoDB_trx_id: 71BE9460
# QC_Hit: No Full_scan: No Full_join: No Tmp_table: No Tmp_table_on_disk: No
# Filesort: No Filesort_on_disk: No Merge_passes: 0
# InnoDB_IO_r_ops: 9 InnoDB_IO_r_bytes: 147456 InnoDB_IO_r_wait: 1.240737
# InnoDB_rec_lock_wait: 0.000000 InnoDB_queue_wait: 0.000000
# InnoDB_pages_distinct: 43
SET timestamp=1332667794;
SELECT ....
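The extended header is a series of "Name: value" pairs on comment lines, which makes it easy to pull apart. The sketch below is an illustrative parser (not how pt-query-digest itself parses the log) applied to the entry above; it shows that almost all of the query time was spent waiting on InnoDB reads.

```python
import re

PAIR = re.compile(r"(\w+):\s+(\S+)")

def parse_header(lines):
    """Collect Name: value pairs from the '#' lines of one slow log entry."""
    attrs = {}
    for line in lines:
        if not line.startswith("#"):
            continue  # SET timestamp and the statement itself
        if line.startswith(("# Time:", "# User@Host:")):
            continue  # free-form lines, not simple pairs
        for name, value in PAIR.findall(line):
            attrs[name] = value
    return attrs

entry = [
    "# Time: 120325 2:29:54",
    "# User@Host: web[web] @ web1 [192.168.1.138]",
    "# Query_time: 1.248839 Lock_time: 0.001044 Rows_sent: 98 Rows_examined: 146 Rows_affected: 0 Rows_read: 1",
    "# QC_Hit: No Full_scan: No Full_join: No Tmp_table: No Tmp_table_on_disk: No",
    "# InnoDB_IO_r_ops: 9 InnoDB_IO_r_bytes: 147456 InnoDB_IO_r_wait: 1.240737",
    "SET timestamp=1332667794;",
]
attrs = parse_header(entry)
io_wait_share = float(attrs["InnoDB_IO_r_wait"]) / float(attrs["Query_time"])
print(f"{io_wait_share:.1%} of query time spent waiting on reads")
```

With no full scan, no temp table and no filesort, an I/O wait share this high suggests the pages simply were not in the buffer pool.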
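pt-query-digest variables to consider
--order-by - attribute and aggregate to sort by
(default Query_time:sum; Query_time:cnt sorts by count)
--limit - default 95%:20 (top 95% of response time or
top 20 queries, whichever comes first); a plain # of queries is also allowed
--explain - connects to mysql to execute and
output EXPLAIN for each query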
-u, --ask-pass
Caution: Explain at query execution may
be different than pt-query-digest's explain
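As an illustration of how these options combine, the sketch below assembles a command line without running it. The helper, the log file name and the DSN values are placeholders; only the option names come from the slide above.

```python
import shlex

def digest_command(log_path, order_by=None, limit=None, explain_dsn=None, ask_pass=False):
    """Build an argv list for pt-query-digest; nothing is executed here."""
    argv = ["pt-query-digest"]
    if order_by:
        argv += ["--order-by", order_by]
    if limit:
        argv += ["--limit", str(limit)]
    if explain_dsn:
        argv += ["--explain", explain_dsn]  # DSN such as h=...,u=...
    if ask_pass:
        argv.append("--ask-pass")
    argv.append(log_path)
    return argv

# Ten most frequently executed query patterns.
cmd = digest_command("slow.log.1", order_by="Query_time:cnt", limit=10)
print(shlex.join(cmd))

# Same report with EXPLAIN output; host and user are placeholders.
cmd_explain = digest_command(
    "slow.log.1", explain_dsn="h=localhost,u=user", ask_pass=True
)
print(shlex.join(cmd_explain))
```

Sorting by count surfaces queries like the unlock_dialogs join, which looks harmless per call but dominates the call volume.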
Limitations
• Memory
o Beware the OOM killer
• CPU
o Cycles aren't free
• Log collection period
• Non-mysql issues
What about RDS?
RDS slow log
mysql> select * from mysql.slow_log limit 1;
+---------------------+------------------------------------------------------------------+-----
-------+-----------+-----------+---------------+------------------+----------------+-------
----+------------+-------------------------------------------------------------------------
----------------------------+
| start_time | user_host |
query_time | lock_time | rows_sent | rows_examined | db | last_insert_id |
insert_id | server_id | sql_text
|
+---------------------+------------------------------------------------------------------+-----
-------+-----------+-----------+---------------+------------------+----------------+-------
----+------------+-------------------------------------------------------------------------
----------------------------+
| 2013-03-12 11:19:28 | user[user] @ ip-not-your-host.ec2.internal [not.your.host] | 00:00:02
| 00:00:00 | 1 | 1 | atvi_codlive_web | 0 | 0 |
1456316482 | SELECT `ranks`.* FROM `ranks` WHERE `title` = 'mw3' AND `id` = 39 ORDER BY
`ranks`.`id` ASC LIMIT 1 |
+---------------------+------------------------------------------------------------------+-----
-------+-----------+-----------+---------------+------------------+----------------+-------
----+------------+-------------------------------------------------------------------------
----------------------------+
RDS slow log
mysql> SELECT CONCAT( '# Time: ', DATE_FORMAT(start_time, '%y%m%d %H%i%s'), '\n', '# User@Host:
', user_host, '\n', '# Query_time: ', TIME_TO_SEC(query_time), ' Lock_time: ',
TIME_TO_SEC(lock_time), ' Rows_sent: ', rows_sent, ' Rows_examined: ', rows_examined,
'\n', sql_text, ';' ) FROM mysql.slow_log limit 1;
+----------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------+
| CONCAT( '# Time: ', DATE_FORMAT(start_time, '%y%m%d %H%i%s'), '\n', '# User@Host: ',
user_host, '\n', '# Query_time: ', TIME_TO_SEC(query_time), ' Lock_time: ',
TIME_TO_SEC(lock_time), ' Rows_sent: ', rows_sent, ' Rows_examined: ', rows_examined,
'\n', |
+----------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------+
| # Time: 130312 111928
# User@Host: user[user] @ ip-not-your-host.ec2.internal [not.your.host]
# Query_time: 2 Lock_time: 0 Rows_sent: 1 Rows_examined: 1
SELECT `ranks`.* FROM `ranks` WHERE `title` = 'mw3' AND `id` = 39 ORDER BY `ranks`.`id` ASC
LIMIT 1; |
+----------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------+
1 row in set (0.00 sec)
RDS slow log
mysql -u user -p -h host.rds.amazonaws.com -D mysql -s -r \
  -e "SELECT CONCAT( '# Time: ', DATE_FORMAT(start_time, '%y%m%d %H%i%s'), '\n',
      '# User@Host: ', user_host, '\n',
      '# Query_time: ', TIME_TO_SEC(query_time),
      ' Lock_time: ', TIME_TO_SEC(lock_time),
      ' Rows_sent: ', rows_sent,
      ' Rows_examined: ', rows_examined, '\n',
      sql_text, ';' ) FROM mysql.slow_log" \
  > /tmp/mysql.slow_log.log
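The CONCAT above rebuilds a slow-log-style entry from each row of mysql.slow_log. The same transformation is sketched below in Python so the output format is easy to inspect; the function names and the dict layout are assumptions, and the sample row is the one shown earlier.

```python
from datetime import datetime

def time_to_sec(hms):
    """Mirror of TIME_TO_SEC for an 'HH:MM:SS' string."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def to_slow_log_entry(row):
    """Format one mysql.slow_log row the way the CONCAT expression does."""
    ts = datetime.strptime(row["start_time"], "%Y-%m-%d %H:%M:%S")
    lines = [
        "# Time: " + ts.strftime("%y%m%d %H%M%S"),
        "# User@Host: " + row["user_host"],
        "# Query_time: {} Lock_time: {} Rows_sent: {} Rows_examined: {}".format(
            time_to_sec(row["query_time"]),
            time_to_sec(row["lock_time"]),
            row["rows_sent"],
            row["rows_examined"],
        ),
        row["sql_text"] + ";",
    ]
    return "\n".join(lines)

sample = {
    "start_time": "2013-03-12 11:19:28",
    "user_host": "user[user] @ ip-not-your-host.ec2.internal [not.your.host]",
    "query_time": "00:00:02",
    "lock_time": "00:00:00",
    "rows_sent": 1,
    "rows_examined": 1,
    "sql_text": "SELECT `ranks`.* FROM `ranks` WHERE `title` = 'mw3' "
                "AND `id` = 39 ORDER BY `ranks`.`id` ASC LIMIT 1",
}
entry = to_slow_log_entry(sample)
print(entry)
```

Because the table stores query_time as a TIME value, the rebuilt entries carry whole seconds only, so sub-second detail is lost in the resulting report.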