optimizer overviewoow2014

MySQL 5.7: What’s New in the Parser and the Optimizer?

Chaithra M G, Software Engineer, MySQL, Oracle

October 17, 2014

Copyright © 2014, Oracle and/or its affiliates. All rights reserved.

Upload: mysql-user-camp

Post on 13-Jul-2015

TRANSCRIPT

Page 2: Optimizer overviewoow2014

Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

Page 3: Optimizer overviewoow2014

MySQL Optimizer

SELECT a, b FROM t1, t2, t3 WHERE t1.a = t2.b AND t2.b = t3.c AND t2.d > 20 AND t2.d < 30;

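[Figure: inside the MySQL Server, the parser hands the query to the optimizer. The optimizer combines cost-based optimizations and heuristics, using a cost model fed by table/index info (data dictionary) and statistics (storage engines). The resulting plan joins t1, t2 and t3 with two JOIN operations, using table scan, range scan and ref access as access methods.]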

Page 4: Optimizer overviewoow2014

MySQL Optimizer: Design Principles

• Best out of the box performance

• Easy to use, minimum tuning needed

• When you need to understand: EXPLAIN and optimizer trace

• Flexibility through optimizer switches, hints and plugins

• Fast evolving

Page 5: Optimizer overviewoow2014

MySQL 5.7 Parser and Optimizer Improvements

• Parser and optimizer refactoring

• Improved cost model: better record estimation for JOIN

• Improved cost model: configurable cost constants

• Query rewrite plugin

• Explain on a running query

Page 6: Optimizer overviewoow2014

MySQL 5.7 Optimizer Improvements

• Generated columns

• UNION ALL queries no longer use temporary tables

• Improved optimizations for queries with IN expressions

• Optimized full text search

Page 7: Optimizer overviewoow2014

5.7 Parser and Optimizer Refactoring

[Figure: SQL DML query -> Parser -> Resolver (semantic check, name resolution) -> Optimizer (logical transformations; cost-based optimizer: join order and access methods; plan refinement) -> Query execution plan -> Query execution -> Query result. Query execution reads from the storage engine (InnoDB, MyISAM).]

• Improves readability, maintainability and stability

– Cleanly separates the parsing, optimizing, and execution stages

– Allows for easier feature additions, with less risk

Page 8: Optimizer overviewoow2014

MySQL 5.7: Parser Refactoring

• Challenge:

– Overly complex, hard to add new syntax

• Solution:

– Create an internal parse tree bottom-up

– Create an AST (Abstract Syntax Tree) from the parse tree and the user's context.

– Have syntax rules that are more precisely defined and are closer to the SQL standard.

– More precise error messages

– Better support for larger syntax rules in the future

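[Figure: processing pipeline. The lexical scanner (lexer), the GNU Bison-generated parser (bottom-up parsing style) and the contextualization step make up the new parser, which produces the AST. The AST is consumed by the resolver, optimizer and executor; the executor talks to the storage engine (SE).]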

Page 9: Optimizer overviewoow2014

Motivation for Changing the Cost Model

• Adapt to new hardware architectures

– SSD, larger memories, caches

• Allow storage engines to provide accurate and dynamic cost estimates

– Is the data in RAM, on SSD, or on HDD?

• More maintainable cost model implementation

– Avoid hard-coded constants

– Refactoring of existing cost model code

• Tunable/configurable

• Replace heuristics with cost based decisions

Page 10: Optimizer overviewoow2014

Cost Model: Main Focus in 5.7

Address the following pain points in current cost model:

• Hard-coded cost constants

– Not possible to adjust for different hardware

• Imprecise cardinality/records per key estimates from SE

– Integer value gives too low precision

• Inaccurate record estimation for JOIN

– Too high fan out

• Hard to obtain detailed cost numbers
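
To make the precision point concrete, here is a small sketch with made-up numbers. The table size, the number of distinct keys and the outer row count are hypothetical, and rounding to an integer is only an assumption about how a storage engine might report records per key.

```python
# Hypothetical index statistics; none of these numbers come from the slides.
table_rows = 1000
distinct_keys = 700
outer_rows = 10_000  # rows coming from the previous table in the join

exact_records_per_key = table_rows / distinct_keys       # a fractional value
integer_records_per_key = round(exact_records_per_key)   # assumed integer report

# Estimated rows produced when one ref lookup is done per outer row.
estimate_exact = outer_rows * exact_records_per_key
estimate_integer = outer_rows * integer_records_per_key

print(f"fractional estimate: {estimate_exact:.0f} rows")
print(f"integer estimate:    {estimate_integer} rows")
```

With these inputs the integer statistic hides roughly 30% of the rows the join is expected to produce, which is the kind of error that can tip a join-order decision.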

Page 11: Optimizer overviewoow2014

MySQL 5.6: Record Estimates for JOIN

• t1 JOIN t2

• Total cost = cost(access method t1) + Prefix_rows_t1 * cost(access method t2)

• Prefix_rows_t1 is the number of records read from t1

– Overestimated when WHERE conditions apply -> suboptimal join order

[Figure: without condition filtering. The access method on t1 feeds t2 directly, so Prefix_rows_t1 = number of records read from t1.]

Page 12: Optimizer overviewoow2014

MySQL 5.7 Improved Record Estimates for JOIN

• t1 JOIN t2

• Prefix_rows_t1 takes into account the entire query condition

– More accurate record estimate -> improved JOIN order

[Figure: with condition filtering. A condition filter sits between the access method on t1 and t2, so Prefix_rows_t1 = records passing the table conditions on t1, which can be fewer than the number of records read from t1.]
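
The two-table cost formula, with and without the condition filter, can be sketched as follows. This is only an illustration: the function name `join_cost`, the `filter_fraction` parameter and all cost numbers are hypothetical and are not taken from the server source.

```python
def join_cost(cost_access_t1, rows_read_t1, cost_access_t2, filter_fraction=1.0):
    """Total cost = cost(access t1) + Prefix_rows_t1 * cost(access t2).

    filter_fraction is the share of t1 rows expected to pass the WHERE
    conditions; 1.0 reproduces the estimate without condition filtering.
    """
    prefix_rows_t1 = rows_read_t1 * filter_fraction
    return cost_access_t1 + prefix_rows_t1 * cost_access_t2


# Hypothetical numbers: reading t1 costs 2000 units and returns 10 000 rows,
# one lookup in t2 costs 1 unit, and 1% of the t1 rows pass the conditions.
cost_without_filter = join_cost(2000.0, 10_000, 1.0)
cost_with_filter = join_cost(2000.0, 10_000, 1.0, filter_fraction=0.01)

print(cost_without_filter, cost_with_filter)
```

Only the second term shrinks when the filter is applied, so the effect is largest for join orders whose first table is heavily filtered.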

Page 13: Optimizer overviewoow2014

MySQL 5.7 Improved Record Estimates for JOIN

• 10 000 rows in the emp table

• 100 rows in the office table

• 100 rows with first_name='John' AND hire_date BETWEEN '2012-01-01' AND '2012-06-01'

CREATE TABLE emp (
  id INTEGER NOT NULL PRIMARY KEY,
  office_id INTEGER NOT NULL,
  first_name VARCHAR(20),
  hire_date DATE NOT NULL,
  KEY office (office_id)
) ENGINE=InnoDB;

CREATE TABLE office (
  id INTEGER NOT NULL PRIMARY KEY,
  officename VARCHAR(20)
) ENGINE=InnoDB;

Page 14: Optimizer overviewoow2014

MySQL 5.7 Improved Record Estimates for JOIN

SELECT office_name
FROM office JOIN employee ON office.id = employee.office
WHERE employee.name LIKE 'John' AND
hire_date BETWEEN '2014-01-01' AND '2014-06-01';

Explain for 5.6: Total Cost = cost(scan office) + 100 * cost(ref_access emp)

Table     Type    Possible keys  Key      Ref              Rows  Filtered  Extra
office    ALL     PRIMARY        NULL     NULL             100   100.00    NULL
employee  ref     office         office   office.id        99    100.00    Using where

Explain for 5.7: Total Cost = cost(scan emp) + 9991*1.23% * cost(eq_ref_access office)

Table     Type    Possible keys  Key      Ref              Rows  Filtered  Extra
employee  ALL     NULL           NULL     NULL             9991  1.23      NULL
office    eq_ref  PRIMARY        PRIMARY  employee.office  1     100.00    Using where

JOIN ORDER HAS CHANGED!
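
As a reading aid for the two EXPLAIN outputs, the multiplier in front of the second table's access cost can be recomputed from the Rows and Filtered columns. The helper name `rows_after_filter` is made up for this sketch; the input numbers are copied from the slide.

```python
def rows_after_filter(rows, filtered_percent):
    """Estimated rows that survive the table conditions: Rows * Filtered%."""
    return rows * filtered_percent / 100.0


# (Rows, Filtered) of the first table in each plan, copied from the slide.
prefix_rows_56 = rows_after_filter(100, 100.00)  # office is read first
prefix_rows_57 = rows_after_filter(9991, 1.23)   # employee is read first

print(f"5.6 plan: about {prefix_rows_56:.0f} lookups into the second table")
print(f"5.7 plan: about {prefix_rows_57:.0f} lookups into the second table")
```

The lookups differ in kind as well as in number: per the Rows column, each ref lookup in the 5.6 plan is expected to return 99 rows, while each eq_ref lookup in the 5.7 plan returns 1.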

Page 15: Optimizer overviewoow2014

MySQL 5.7 Improved Record Estimation for JOIN

Performance Improvements: DBT-3 (SF 10)

[Figure: bar chart of execution time relative to 5.6, in percent (y-axis from 0 to 120), for DBT-3 queries Q3, Q7, Q8, Q9 and Q12, comparing 5.6 and 5.7.]

5 out of 22 queries get an improved query plan

Page 16: Optimizer overviewoow2014

MySQL 5.7 Additional Cost Data in JSON Explain

mysql> EXPLAIN FORMAT=JSON SELECT SUM(o_totalprice) FROM orders WHERE o_orderdate BETWEEN '1994-01-01' AND '1994-12-31';

{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "3118848.00"
    },
    "table": {
      "table_name": "orders",
      "access_type": "ALL",
      "possible_keys": [
        "i_o_orderdate"
      ],
      "rows_examined_per_scan": 15000000,
      "rows_produced_per_join": 4489990,
      "filtered": 29.933,
      "cost_info": {
        "read_cost": "2220850.00",
        "eval_cost": "897998.00",
        "prefix_cost": "3118848.00",
        "data_read_per_join": "582M"
      },
      "used_columns": [
        "o_totalprice",
        "o_orderDATE"
      ],
      "attached_condition": "(`dbt3`.`orders`.`o_orderDATE` between '1994-01-01' and '1994-12-31')"
    }
  }
}

• Total query cost of a query block

• Cost per table

• Cost of sorting operation

• Cost of reading data

• Cost of evaluating conditions

• Cost of prefix join

• Rows examined/produced per join

• Used columns

• Data read per join: (# of rows) * (record width) in bytes
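
The relationships between the cost fields can be checked against the numbers in this output. The sketch below re-enters the per-table `cost_info` object by hand; reading "582M" as 582 * 1024 * 1024 bytes is an assumption, and the per-row figures are inferences from this one example, not documented constants.

```python
import json

# Per-table cost_info object, re-typed from the EXPLAIN FORMAT=JSON output.
cost_info = json.loads("""
{
  "read_cost": "2220850.00",
  "eval_cost": "897998.00",
  "prefix_cost": "3118848.00",
  "data_read_per_join": "582M"
}
""")
rows_produced_per_join = 4489990

read_cost = float(cost_info["read_cost"])
eval_cost = float(cost_info["eval_cost"])
prefix_cost = float(cost_info["prefix_cost"])

# For this single-table query, prefix cost = read cost + evaluation cost.
costs_add_up = (read_cost + eval_cost == prefix_cost)

# Evaluation cost charged per produced row.
eval_cost_per_row = eval_cost / rows_produced_per_join

# Average record width, ASSUMING "582M" means 582 * 1024 * 1024 bytes.
bytes_per_row = 582 * 1024 * 1024 / rows_produced_per_join

print(f"read + eval == prefix: {costs_add_up}")
print(f"eval cost per row: {eval_cost_per_row:.2f}")
print(f"approx. bytes per row: {bytes_per_row:.0f}")
```

Because the query has a single table, the prefix cost also equals the query_cost reported for the whole query block.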

Page 17: Optimizer overviewoow2014

MySQL 5.7 Visual Explain in MySQL Workbench

Page 18: Optimizer overviewoow2014

MySQL 5.7: Why Query Rewrite Plugin?

• Problem

– Optimizer chooses a suboptimal plan

– Users can change the query plan by adding hints or rewriting the query

– However, the database application code cannot be changed

• Solution: query rewrite plugin!

labs.mysql.com

Page 19: Optimizer overviewoow2014

MySQL 5.7: Query Rewrite Plugin

• New pre-parse and post-parse query rewrite APIs

– Users can write their own plugins

• Provides a post-parse query rewrite plugin

– Rewrite problematic queries without the need to make application changes

– Add hints

– Modify join order

– Many more …

• Improve problematic queries from ORMs, third-party apps, etc.

• ~Zero performance overhead for queries that are not rewritten

labs.mysql.com

Page 20: Optimizer overviewoow2014

MySQL 5.7: How Do Rewrites Happen?

For query:

SELECT * FROM t1 JOIN t2 ON t1.keycol = t2.keycol WHERE col1 = 42 AND col2 = 2

Pattern is:

SELECT * FROM t1 JOIN t2 ON t1.keycol = t2.keycol WHERE col1 = ? AND col2 = ?

Replacement is:

SELECT * FROM t1 STRAIGHT_JOIN t2 FORCE INDEX (col1) ON t1.keycol = t2.keycol WHERE col1 = ? AND col2 = ?

Replace parameter markers in the replacement with the actual literals:

SELECT * FROM t1 STRAIGHT_JOIN t2 FORCE INDEX (col1) ON t1.keycol = t2.keycol WHERE col1 = 42 AND col2 = 2
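
The marker substitution on this slide can be mimicked with a toy, whitespace-token matcher. This is a sketch for illustration only: the real plugin works on digests and parse trees, and the `rewrite` function and its token-by-token comparison are assumptions made for this example.

```python
def rewrite(query, pattern, replacement):
    """Return the rewritten query, or None when the query does not match."""
    query_tokens = query.split()
    pattern_tokens = pattern.split()
    if len(query_tokens) != len(pattern_tokens):
        return None

    literals = []
    for q_tok, p_tok in zip(query_tokens, pattern_tokens):
        if p_tok == "?":
            literals.append(q_tok)   # remember the literal behind the marker
        elif q_tok != p_tok:
            return None              # structure differs: no rewrite

    replacement_tokens = replacement.split()
    if replacement_tokens.count("?") != len(literals):
        return None                  # marker counts must agree in this toy

    # Fill the markers of the replacement in order of appearance.
    remaining = iter(literals)
    return " ".join(next(remaining) if tok == "?" else tok
                    for tok in replacement_tokens)


QUERY = ("SELECT * FROM t1 JOIN t2 ON t1.keycol = t2.keycol "
         "WHERE col1 = 42 AND col2 = 2")
PATTERN = ("SELECT * FROM t1 JOIN t2 ON t1.keycol = t2.keycol "
           "WHERE col1 = ? AND col2 = ?")
REPLACEMENT = ("SELECT * FROM t1 STRAIGHT_JOIN t2 FORCE INDEX (col1) "
               "ON t1.keycol = t2.keycol WHERE col1 = ? AND col2 = ?")

rewritten = rewrite(QUERY, PATTERN, REPLACEMENT)
print(rewritten)
```

A query with a different shape, such as `SELECT * FROM t3`, makes `rewrite` return `None`, so it would pass through unchanged.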

Page 21: Optimizer overviewoow2014

MySQL 5.7: How Does Rule Matching Happen?

Match and execute a rule in three steps:

1. Hash lookup using the query digest computed during parsing

– Finds patterns with the same digest

2. Parse tree structure comparison

– To filter out hash collisions

– Will not detect differences in literals

3. Compare literal constants

– In practice done during rewrite
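
Steps 1 and 2 can be approximated in a few lines. This is a loose sketch, not the server's algorithm: `normalize` is a crude stand-in for the statement digest text (it only collapses whitespace and masks integer literals), SHA-256 is an arbitrary choice of hash, and step 2 compares normalized text instead of parse trees. Step 3 is left out.

```python
import hashlib
import re


def normalize(sql):
    """Crude stand-in for a digest text: collapse whitespace, mask integers."""
    return re.sub(r"\b\d+\b", "?", " ".join(sql.split()))


def digest(sql):
    """Hash of the normalized statement text."""
    return hashlib.sha256(normalize(sql).encode("utf-8")).hexdigest()


PATTERN = ("SELECT * FROM t1 JOIN t2 ON t1.keycol = t2.keycol "
           "WHERE col1 = ? AND col2 = ?")

# Rules indexed by the digest of their pattern.
rules = {digest(PATTERN): [PATTERN]}


def find_rule(query):
    candidates = rules.get(digest(query), [])        # step 1: hash lookup
    for pattern in candidates:
        if normalize(pattern) == normalize(query):   # step 2: rule out collisions
            return pattern
    return None


matched = find_rule("SELECT * FROM t1 JOIN t2 ON t1.keycol = t2.keycol "
                    "WHERE col1 = 42 AND col2 = 2")
unmatched = find_rule("SELECT * FROM t3")
print(matched is not None, unmatched is None)
```

Queries whose digest is not in the table fall out after a single dictionary lookup, so non-matching queries do almost no extra work.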

Page 22: Optimizer overviewoow2014

MySQL 5.7 Communicating with the Plugin

query_rewrite.rewrite_rules table:

Row 1

Pattern: SELECT name, department_name FROM employee JOIN department USING ( department_id ) WHERE salary > ?
Pattern_database: employees
Replacement: SELECT name, department_name FROM employee STRAIGHT_JOIN department USING ( department_id ) WHERE salary > ?
Enabled: Y
Message: NULL

Row 2

Pattern: SELECT name, department_name FROXM employee JOIN department USING ( department_id ) WHERE salary > ?
Pattern_database: employees
Replacement: SELECT name FROM employee STRAIGHT_JOIN department USING ( department_id ) WHERE salary > ?
Enabled: N
Message: Parse error in pattern: … near … at line 1

Row 3

Pattern: SELECT name, department_name FROM employee JOIN department USING ( department_id ) WHERE salary > ?
Pattern_database: employees
Replacement: SELECT name, department_name FROXM employee STRAIGHT_JOIN department USING ( department_id ) WHERE salary > ?
Enabled: N
Message: Parse error in replacement: … near … at line 1

Page 23: Optimizer overviewoow2014

MySQL 5.7 Query Rewrite Plug-in: Server’s POV

• Query comes in

– Plugin(s) is asked if it wants digests (it does)

• Query is parsed

• Plugin is invoked

• If the rules table has changed, the rules are refreshed

• Pattern matching

• If there is a match, the query is rewritten and the server raises an SQL note

Page 24: Optimizer overviewoow2014

MySQL 5.7 Query Rewrite Plugin: Performance Impact

What is the Cost of Rewriting queries?

• Designed for rewriting problematic queries only!

• ~Zero cost for queries that are not rewritten

– Statement digest is computed for the performance schema anyway

• Cost for queries that are rewritten is insignificant compared to the performance gain

– Cost of generating the query + reparsing: max ~5% performance overhead

– Performance gain: potentially x times

Page 25: Optimizer overviewoow2014

MySQL 5.7 Explain on a Running Query

EXPLAIN [FORMAT=(JSON|TRADITIONAL)] [EXTENDED] FOR CONNECTION <id>;

• Shows the plan of the query running on connection <id>

• Useful for diagnosing long-running queries

• Plan isn’t available while the query plan is still being created

• Applicable to SELECT/INSERT/DELETE/UPDATE

Page 26: Optimizer overviewoow2014

MySQL 5.7 Generated Columns

• Column generated from the expression

• VIRTUAL: computed when read, not stored, not indexable

• STORED: computed when inserted/updated, stored in SE, indexable

• Useful for:

– Functional index: create a stored column, add a secondary index

– Materialized cache for complex conditions

– Simplify query expression

labs.mysql.com

CREATE TABLE order_lines (
  `order` INTEGER,
  lineno INTEGER,
  price DECIMAL(10,2),
  qty INTEGER,
  sum_price DECIMAL(10,2) GENERATED ALWAYS AS (qty * price) STORED
);

Kudos to Andrey Zhakov for his contribution!

Page 27: Optimizer overviewoow2014

MySQL 5.7: Avoid Creating Temporary Table for UNION ALL

SELECT * FROM table_a UNION ALL SELECT * FROM table_b;

• 5.6: Always materialize results of UNION ALL in temporary tables

• 5.7: Do not materialize in temporary tables unless needed for sorting; rows are sent directly to the client

• 5.7: Less memory and disk consumption

Page 28: Optimizer overviewoow2014

MySQL 5.7: Optimizations for IN Expressions

• 5.6: Certain queries with IN predicates can’t use index scans or range scans even though all the columns in the query are indexed

• 5.6: Range optimizer ignores lists of rows

• 5.6: Query needs to be rewritten to a de-normalized form

SELECT a, b FROM t1 WHERE ( a = 0 AND b = 0 ) OR ( a = 1 AND b = 1 )

• 5.7: IN queries with row value expressions are executed using range scans

• 5.7: Explain output: index/table scans change to range scans

CREATE TABLE t1 (a INT, b INT, c INT, KEY x(a, b));

SELECT a, b FROM t1 WHERE (a, b) IN ((0, 0), (1, 1));
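
The manual rewrite that 5.6 required can be generated mechanically. The helper below is a hypothetical sketch (the name `in_to_or` is made up) that expands a row-value IN list into the de-normalized OR form; it handles integer literals only and does no SQL escaping.

```python
def in_to_or(columns, rows):
    """Expand (c1, c2, ...) IN ((v11, v12, ...), ...) into OR-ed AND groups."""
    groups = []
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("row width must match the column list")
        # Integer literals only; strings would need proper SQL quoting.
        terms = [f"{col} = {int(val)}" for col, val in zip(columns, row)]
        groups.append("( " + " AND ".join(terms) + " )")
    return " OR ".join(groups)


where = in_to_or(["a", "b"], [(0, 0), (1, 1)])
print("SELECT a, b FROM t1 WHERE " + where)
```

In 5.7 this step is unnecessary, because the row-value IN form itself is eligible for a range scan.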

Page 29: Optimizer overviewoow2014

MySQL 5.7: Optimizations for IN Expressions

• A table has 10 000 rows, 2 match the WHERE condition

SELECT a, b FROM t1 WHERE (a, b) IN ((0, 0), (1, 1));

Before:

*************************** 1. row ***************************
select_type: SIMPLE
table: t1
type: index
key: x
key_len: 10
ref: NULL
rows: 10000
Extra: Using where; Using index

After:

*************************** 1. row ***************************
select_type: SIMPLE
table: t1
type: range
key: x
key_len: 10
ref: NULL
rows: 2
Extra: Using where; Using index

Page 30: Optimizer overviewoow2014

MySQL 5.7: Optimization for Full Text Search

SELECT COUNT(*) FROM innodb_table WHERE MATCH(text) AGAINST ('for the this that' IN NATURAL LANGUAGE MODE) > 0.5;

• Recognizes more situations where the ‘index only’ access method can be used: no need to access the base table, only the FT index

– When the MATCH expression is part of a '>' expression

• 2.5 GB data

– 4X performance improvement!

Page 31: Optimizer overviewoow2014

MySQL 5.7: Optimization for Full Text Search

SELECT COUNT(*) FROM innodb_table WHERE MATCH(text) AGAINST ('for the this that' IN NATURAL LANGUAGE MODE) > 0.5;

Before:

*************************** 1. row ***************************
select_type: SIMPLE
table: innodb_table
type: fulltext
key: ft_idx
key_len: 0
ref: NULL
rows: 1
Extra: Using where

After:

*************************** 1. row ***************************
select_type: SIMPLE
table: innodb_table
type: fulltext
key: ft_idx
key_len: 10
ref: const
rows: 1
Extra: Using where; Ft_hints: rank > 0.500000; Using index
