query processing and query optimization

Post on 11-Jan-2017

  • Query processing & Query optimization

    Master: Dr. Alesheikh

    Student: Mohsen Yousefzadeh (9413374)

  • What is Query Processing?

    The activities involved in retrieving data from the database are
    collectively called query processing.

    It is a three-step process that transforms a high-level query
    (SQL) into an equivalent and more efficient lower-level query
    (in relational algebra).

  • Query

    A query is a statement written by the user in a high-level
    language, here SQL.

  • Scanning, Parsing, and Validating

    [Diagram: Query → Scanning, parsing, and validating]

    Scanner: identifies the query tokens, such as SQL keywords,
    attribute names, and relation names, that appear in the text of
    the query.

    Parser: checks the query syntax to determine whether it follows
    the grammar rules of the query language.

    Validator: checks that all attribute and relation names are valid
    and semantically meaningful names in the schema of the database
    being queried.
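
    The three roles can be sketched in a few lines of Python. This is
    only an illustration under stated assumptions: the grammar is cut
    down to SELECT ... FROM ... with comma-separated names, and the
    CATALOG dictionary, the function names, and the token format are
    all hypothetical, not taken from any real DBMS.

```python
import re

# Hypothetical catalog: relation name -> set of attribute names.
CATALOG = {
    "EMPLOYEE": {"Ssn", "Lname", "Bdate"},
    "PROJECT": {"Pnumber", "Pname"},
}
KEYWORDS = {"SELECT", "FROM"}
TOKEN_RE = re.compile(r"\s*(?:(?P<name>[A-Za-z_][A-Za-z_0-9]*)|(?P<punct>[,;]))")


def scan(text):
    """Scanner: split the query text into (kind, value) tokens."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        word = match.group("name")
        if word is None:
            tokens.append(("PUNCT", match.group("punct")))
        elif word.upper() in KEYWORDS:
            tokens.append(("KEYWORD", word.upper()))
        else:
            tokens.append(("NAME", word))
        pos = match.end()
    return tokens


def parse(tokens):
    """Parser: accept only  SELECT name {, name} FROM name {, name} [;]"""
    pos = 0

    def expect(kind, value=None):
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of query")
        tok_kind, tok_value = tokens[pos]
        if tok_kind != kind or (value is not None and tok_value != value):
            raise SyntaxError(f"unexpected token {tok_value!r}")
        pos += 1
        return tok_value

    def name_list():
        nonlocal pos
        names = [expect("NAME")]
        while pos < len(tokens) and tokens[pos] == ("PUNCT", ","):
            pos += 1
            names.append(expect("NAME"))
        return names

    expect("KEYWORD", "SELECT")
    attributes = name_list()
    expect("KEYWORD", "FROM")
    relations = name_list()
    if pos < len(tokens) and tokens[pos] == ("PUNCT", ";"):
        pos += 1
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos][1]!r}")
    return attributes, relations


def validate(attributes, relations):
    """Validator: every name must exist in the (hypothetical) catalog."""
    for relation in relations:
        if relation not in CATALOG:
            raise NameError(f"unknown relation {relation}")
    for attribute in attributes:
        if not any(attribute in CATALOG[r] for r in relations):
            raise NameError(f"unknown attribute {attribute}")
```

    A query such as "SELECT Lname FROM EMPLOYEE;" passes all three
    stages, "SELECT FROM EMPLOYEE" is rejected by the parser, and
    "SELECT Salary FROM EMPLOYEE" is rejected by the validator.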

  • Query optimizer

    [Diagram: Query → Scanning, parsing, and validating → Query
    optimizer]

    A query typically has many possible execution strategies, and the
    process of choosing a suitable one for processing a query is
    known as query optimization.

    The query optimizer module has the task of producing a good
    execution plan. It selects the plan with the lowest estimated
    cost.

  • Optimizer

    [Diagram: Query → Scanning, parsing, and validating → Query
    optimizer]

    For example: "Find all female senators who own a business."
    This query is actually a composition of two subqueries. "Find all
    female senators" is a selection query. "Find all senators who own
    a business" is a join query, because we combine two tables to
    process it.

  • SENATOR:  name | Soc-sec | gender | District (polygon)

    BUSINESS: B-name | owner | Soc-sec | Location (point)

    The question is in which order these subqueries should be
    processed: select before join, or join before select?

    Remember that a join is a multiscan query, whereas a select is a
    single-scan query. The select should therefore be done first, so
    that the join only has to process the rows that survive the
    selection.
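
    The effect of the ordering can be seen by counting comparisons on
    a toy data set. The sketch below is only an illustration: the
    sample rows are invented, the spatial columns are left out, the
    join is a naive nested loop, and it assumes that Soc-sec in
    BUSINESS holds the owner's number and is the join column.

```python
# Hypothetical sample rows (spatial columns omitted).
SENATOR = [
    {"name": "A", "soc_sec": 1, "gender": "F"},
    {"name": "B", "soc_sec": 2, "gender": "M"},
    {"name": "C", "soc_sec": 3, "gender": "F"},
    {"name": "D", "soc_sec": 4, "gender": "M"},
]
BUSINESS = [
    {"b_name": "X", "soc_sec": 1},
    {"b_name": "Y", "soc_sec": 2},
    {"b_name": "Z", "soc_sec": 4},
]


def select_female(rows, counter):
    """Single scan: one comparison per input row."""
    out = []
    for row in rows:
        counter["comparisons"] += 1
        if row["gender"] == "F":
            out.append(row)
    return out


def nested_loop_join(left, right, counter):
    """Multiscan: the right table is scanned once per left row."""
    out = []
    for l_row in left:
        for r_row in right:
            counter["comparisons"] += 1
            if l_row["soc_sec"] == r_row["soc_sec"]:
                out.append({**l_row, **r_row})
    return out


def join_then_select():
    counter = {"comparisons": 0}
    joined = nested_loop_join(SENATOR, BUSINESS, counter)
    return select_female(joined, counter), counter["comparisons"]


def select_then_join():
    counter = {"comparisons": 0}
    females = select_female(SENATOR, counter)
    return nested_loop_join(females, BUSINESS, counter), counter["comparisons"]
```

    Both orderings return the same rows, but on this data the
    select-first ordering needs fewer comparisons, because the join
    scans BUSINESS once per female senator rather than once per
    senator.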

  • Two main techniques are employed during query optimization.

    The first technique is based on heuristic rules for ordering the
    operations in a query execution strategy.

    A heuristic is a rule that works well in most cases but is not
    guaranteed to work well in every case. The rules typically
    reorder the operations in a query tree.

  • The second technique involves systematically estimating the cost
    of different execution strategies and choosing the execution plan
    with the lowest cost estimate.

    These techniques are usually combined in a query optimizer.

  • Using Heuristics in Query Optimization

    The scanner and parser of an SQL query first generate a data
    structure that corresponds to an initial query representation,
    which is then optimized according to heuristic rules. This leads
    to an optimized query representation.

    One of the main heuristic rules is to apply SELECT and PROJECT
    operations before applying the JOIN, because SELECT and PROJECT
    reduce the size of their input and so keep the intermediate
    results small.

  • Example of Transforming a Query

    Consider the following query Q on the database:

    Q: Find the last names of employees born after 1957 who work on
    a project named 'Aquarius'.

    This query can be specified in SQL as follows:

    Q: SELECT Lname
       FROM EMPLOYEE, WORKS_ON, PROJECT
       WHERE Pname = 'Aquarius' AND Pnumber = Pno AND Essn = Ssn
             AND Bdate > '1957-12-31';

  • (a) Initial query tree for SQL query Q

    [Query tree, root to leaves: π Lname; σ Pname='Aquarius' AND
    Pnumber=Pno AND Essn=Ssn AND Bdate>'1957-12-31'; × with PROJECT;
    × of EMPLOYEE and WORKS_ON]

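  • (b) Moving SELECT operations down the query tree.

    [Query tree: π Lname at the root; σ Pnumber=Pno and σ Essn=Ssn
    each placed directly above a × node; σ Pname='Aquarius' directly
    above PROJECT; σ Bdate>'1957-12-31' directly above EMPLOYEE;
    WORKS_ON as the third leaf]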

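  • (c) Replacing CARTESIAN PRODUCT and SELECT with JOIN operations.

    [Query tree: π Lname at the root; joins ⋈ Pnumber=Pno and
    ⋈ Essn=Ssn in place of the × and σ pairs; σ Pname='Aquarius'
    directly above PROJECT; σ Bdate>'1957-12-31' directly above
    EMPLOYEE; WORKS_ON as the third leaf]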

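  • (d) Moving PROJECT operations down the query tree.

    [Query tree: π Lname at the root, above ⋈ Essn=Ssn. One input of
    that join is π Essn over ⋈ Pnumber=Pno, which joins π Pnumber
    (over σ Pname='Aquarius' on PROJECT) with π Essn, Pno (over
    WORKS_ON). The other input is π Ssn, Lname over
    σ Bdate>'1957-12-31' on EMPLOYEE]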
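
    The gain from these transformations can be checked on a small
    example. The sketch below evaluates query Q twice, once following
    the initial tree (a) and once following the final tree (d), and
    reports the largest intermediate result in each case. The sample
    rows and function names are invented for illustration; sets are
    used as a simple stand-in for the joins, and dates are compared as
    ISO-format strings.

```python
from itertools import product

# Hypothetical sample data using the attribute names from query Q.
EMPLOYEE = [
    {"Ssn": 1, "Lname": "Smith", "Bdate": "1965-01-09"},
    {"Ssn": 2, "Lname": "Wong", "Bdate": "1955-12-08"},
    {"Ssn": 3, "Lname": "Zelaya", "Bdate": "1968-01-19"},
]
WORKS_ON = [
    {"Essn": 1, "Pno": 10},
    {"Essn": 2, "Pno": 10},
    {"Essn": 3, "Pno": 20},
]
PROJECT = [
    {"Pnumber": 10, "Pname": "Aquarius"},
    {"Pnumber": 20, "Pname": "ProductX"},
]


def initial_plan():
    """Tree (a): full Cartesian product, one big SELECT, then PROJECT."""
    rows = [{**e, **w, **p} for e, w, p in product(EMPLOYEE, WORKS_ON, PROJECT)]
    selected = [
        r for r in rows
        if r["Pname"] == "Aquarius"
        and r["Pnumber"] == r["Pno"]
        and r["Essn"] == r["Ssn"]
        and r["Bdate"] > "1957-12-31"
    ]
    return sorted(r["Lname"] for r in selected), len(rows)


def optimized_plan():
    """Tree (d): SELECT and PROJECT pushed down, products replaced by joins."""
    # pi Pnumber (sigma Pname='Aquarius' (PROJECT))
    pnumbers = {p["Pnumber"] for p in PROJECT if p["Pname"] == "Aquarius"}
    # pi Essn (... join on Pnumber=Pno with pi Essn,Pno (WORKS_ON))
    essns = {w["Essn"] for w in WORKS_ON if w["Pno"] in pnumbers}
    # pi Ssn,Lname (sigma Bdate>'1957-12-31' (EMPLOYEE))
    young = [(e["Ssn"], e["Lname"]) for e in EMPLOYEE if e["Bdate"] > "1957-12-31"]
    # pi Lname (... join on Essn=Ssn ...)
    result = sorted(lname for ssn, lname in young if ssn in essns)
    largest_intermediate = max(len(pnumbers), len(essns), len(young))
    return result, largest_intermediate
```

    On this data both plans return the same last name, but the initial
    plan materializes all 18 combinations of the three tables, while
    the optimized plan never holds more than 2 rows at any step.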

  • Using Selectivity and Cost Estimates in Query Optimization

    A query optimizer does not depend solely on heuristic rules. It
    also estimates and compares the costs of executing a query using
    different execution strategies and algorithms, and it then
    chooses the strategy with the lowest cost estimate.

  • Cost Components for Query Execution

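    1. Access cost to secondary storage: transferring data blocks
       between disk and main memory

    2. Disk storage cost: storing intermediate files generated by
       the execution strategy

    3. Computation cost: in-memory operations such as searching,
       sorting, and merging records

    4. Memory usage cost: the number of main memory buffers needed

    5. Communication cost: shipping the query and its results
       between sites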
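
    One simple way to combine these components is a weighted sum, with
    the optimizer keeping the plan whose total is smallest. The sketch
    below assumes exactly that. The field names, the weights, and the
    per-plan numbers are all hypothetical, and real optimizers derive
    such estimates from catalog statistics rather than from constants.

```python
from dataclasses import dataclass


@dataclass
class PlanEstimate:
    name: str
    block_accesses: int   # 1. access cost to secondary storage
    temp_blocks: int      # 2. disk storage cost for intermediate files
    cpu_operations: int   # 3. computation cost
    buffers: int          # 4. memory usage cost
    bytes_shipped: int    # 5. communication cost


# Hypothetical weights that put all components on one scale.
WEIGHTS = {
    "block_accesses": 1.0,
    "temp_blocks": 1.0,
    "cpu_operations": 0.001,
    "buffers": 0.1,
    "bytes_shipped": 0.0001,
}


def estimated_cost(plan, weights=WEIGHTS):
    """Weighted sum of the five cost components."""
    return sum(weight * getattr(plan, field) for field, weight in weights.items())


def choose_plan(plans):
    """Return the candidate plan with the lowest estimated cost."""
    return min(plans, key=estimated_cost)


# Hypothetical estimates for two candidate strategies.
CANDIDATES = [
    PlanEstimate("join_then_select", 1200, 300, 50000, 10, 0),
    PlanEstimate("select_then_join", 400, 50, 12000, 10, 0),
]
```

    With these made-up numbers, choose_plan(CANDIDATES) picks the
    select_then_join plan, mainly because it needs far fewer block
    accesses.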

  • Query code generator

    [Diagram: Query → Scanning, parsing, and validating → Query
    optimizer → Query code generator]

    The code generator generates the code to execute the plan chosen
    by the optimizer.

  • Runtime database processor

    [Diagram: Query → Scanning, parsing, and validating → Query
    optimizer → Query code generator → Runtime database processor]

    The runtime database processor has the task of running
    (executing) the query code, whether in compiled or interpreted
    mode, to produce the query result.

  • The generated code can be:

    Executed directly (interpreted mode)

    Stored and executed later whenever needed (compiled mode)

    The term optimization is actually a misnomer, because in some
    cases the chosen execution plan is not the optimal (or absolute
    best) strategy; it is just a reasonably efficient strategy for
    executing the query.

  • Result of query

    [Diagram: Query → Scanning, parsing, and validating → Query
    optimizer → Query code generator → Runtime database processor →
    Result of query]