fo optimization2

Author: timothy-wood

Post on 03-Jun-2018


  • 8/12/2019 Fo Optimization2

    1/32

    Query processing and optimization

    Reading (5th edition): Chapters 6.1-6.3, 15.1-15.3, 15.7-15.8.2

    Jose M. Peña

    [email protected]


    ER diagram

    Relational model

    MySQL


    Relation schema

    Attribute   Domain
    PNumber     yymmdd-xxxx
    Name        Textual string less than 30 chars
    Address     Textual string less than 30 chars
    Telephone   rrr - nn nn nn
    E-mail      aaaaannn
    Age         Positive integer


    Relation (state)

    PNumber      Name                 Address      Telephone     E-mail    Age
    123456-7890  Anders Andersson     Rydsvägen 1  013-11 22 33  andan111  25
    112233-4455  Veronika Pettersson  Alstersg 2   013-22 33 44  verpe222  27

    Tuple = list of values in the corresponding domains, or NULL


    Key constraints

    Relation = set of tuples.

    Then, no duplicates are allowed.

    Then, every tuple is uniquely identifiable (superkey, candidate key, primary key, which are all time-invariant).

    PNumber      Name                 Address      Telephone     E-mail    Age
    123456-7890  Anders Andersson     Rydsvägen 1  013-11 22 33  andan111  25
    112233-4455  Veronika Pettersson  Alstersg 2   013-22 33 44  verpe222  27
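The uniqueness idea behind keys can be sketched in a few lines of Python, assuming a relation state is held as a list of dicts. The helper name `is_unique_on` and the third tuple are made up for illustration. A single state can only refute a key, since a key must hold for every state of the relation.

```python
def is_unique_on(relation, attributes):
    """True if no two tuples of this state agree on all the given attributes."""
    seen = set()
    for t in relation:
        value = tuple(t[a] for a in attributes)
        if value in seen:
            return False
        seen.add(value)
    return True

person = [
    {"PNumber": "123456-7890", "Name": "Anders Andersson", "Age": 25},
    {"PNumber": "112233-4455", "Name": "Veronika Pettersson", "Age": 27},
    # Hypothetical third tuple, added only to make Age non-unique.
    {"PNumber": "998877-6655", "Name": "Anna Andersson", "Age": 25},
]

print(is_unique_on(person, ["PNumber"]))  # PNumber identifies every tuple
print(is_unique_on(person, ["Age"]))      # two tuples share Age = 25
```

Any attribute set that contains `PNumber` also passes the check, which is the sense in which such a set is a superkey rather than a minimal (candidate) key.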


    Integrity constraints

    Entity integrity constraint = no primary key value is NULL.

    Foreign key = attributes FK of R1 that reference the primary key PK of R2, where (i) domain(FK) = domain(PK) and (ii) every value of FK in R1 refers to an existing tuple in R2 or is NULL.

    Referential integrity constraint = conditions (i) and (ii) above hold.
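As a rough illustration, both constraints can be checked on a relation state in Python. This is a sketch under assumptions: relations are lists of dicts, NULL is represented by `None`, and the `employee`/`department` data and function names are hypothetical. Condition (i), equality of domains, is not checked here.

```python
def entity_integrity_ok(relation, pk):
    """No tuple may have NULL (None) in any primary-key attribute."""
    return all(t[a] is not None for t in relation for a in pk)

def referential_integrity_ok(r1, fk, r2, pk):
    """Every FK value in r1 is NULL or equals the PK value of some tuple in r2."""
    existing = {tuple(t[a] for a in pk) for t in r2}
    for t in r1:
        value = tuple(t[a] for a in fk)
        if any(v is None for v in value):
            continue  # a NULL foreign key is allowed
        if value not in existing:
            return False
    return True

# Hypothetical data: employee.Dept references department.DNo.
department = [{"DNo": 1}, {"DNo": 2}]
employee = [
    {"PNumber": "123456-7890", "Dept": 1},
    {"PNumber": "112233-4455", "Dept": None},
]

print(entity_integrity_ok(employee, ["PNumber"]))
print(referential_integrity_ok(employee, ["Dept"], department, ["DNo"]))
```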


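    Select

    Selects the tuples of a relation satisfying some condition over its attributes.

    $\sigma_{\text{condition}}(R)$, where the condition is a Boolean combination of comparisons between the attributes $A_1, A_2, A_3$ and the constants $X, Y, Z$, such as $A_1 = X$.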


    Example: select

    STUDENT:
    PNum         Name     Address         TelNr
    112233-4455  Elin     Rydsvägen 1     112233
    223344-5566  Nisse    Alstersgatan 3  223344
    334455-6677  Nisse    Rydsvägen 3     334455
    113322-1122  Pelle    Rydsvägen 2     113322
    552233-1144  Monika   Rydsvägen 4     443322
    442211-2222  Patrik   Rydsvägen 6     111122
    334433-1111  Camilla  Alstersgatan 1  665544

    $\sigma_{(\text{Name} = \text{'Nisse'} \text{ AND } \text{TelNr} = \text{'334455'}) \text{ OR } \text{Name} = \text{'Camilla'}}(\text{STUDENT})$

    PNum         Name     Address         TelNr
    334455-6677  Nisse    Rydsvägen 3     334455
    334433-1111  Camilla  Alstersgatan 1  665544
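The select example can be reproduced with a small Python sketch. It assumes a relation is a list of named tuples and a condition is a Python predicate; the `select` helper is illustrative, not part of any library.

```python
from collections import namedtuple

Student = namedtuple("Student", ["PNum", "Name", "Address", "TelNr"])

STUDENT = [
    Student("112233-4455", "Elin", "Rydsvägen 1", "112233"),
    Student("223344-5566", "Nisse", "Alstersgatan 3", "223344"),
    Student("334455-6677", "Nisse", "Rydsvägen 3", "334455"),
    Student("113322-1122", "Pelle", "Rydsvägen 2", "113322"),
    Student("552233-1144", "Monika", "Rydsvägen 4", "443322"),
    Student("442211-2222", "Patrik", "Rydsvägen 6", "111122"),
    Student("334433-1111", "Camilla", "Alstersgatan 1", "665544"),
]

def select(relation, condition):
    """Keep the tuples of the relation for which the condition holds."""
    return [t for t in relation if condition(t)]

result = select(
    STUDENT,
    lambda t: (t.Name == "Nisse" and t.TelNr == "334455") or t.Name == "Camilla",
)

for t in result:
    print(t.PNum, t.Name)
```

Only the second Nisse and Camilla satisfy the condition, so the result has two tuples, as in the table above.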


    Project

    Projects a relation over some attributes.

    The result must be a relation = duplicates are removed.

    $\pi_{A_1, A_2, A_3}(R)$


    Example: project

    STUDENT:
    PNum         Name   Address         TelNr
    112233-4455  Elin   Rydsvägen 1     112233
    223344-5566  Nisse  Alstersgatan 3  223344
    334455-6677  Nisse  Rydsvägen 3     334455

    $\pi_{\text{PNum}, \text{Name}}(\text{STUDENT})$

    PNum         Name
    112233-4455  Elin
    223344-5566  Nisse
    334455-6677  Nisse

    $\pi_{\text{Name}}(\text{STUDENT})$ = ?
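A short Python sketch of projection with duplicate removal, assuming relations are lists of dicts; the `project` helper is illustrative only. Projecting onto Name alone shows the effect of duplicate removal, because two students are called Nisse.

```python
STUDENT = [
    {"PNum": "112233-4455", "Name": "Elin", "Address": "Rydsvägen 1", "TelNr": "112233"},
    {"PNum": "223344-5566", "Name": "Nisse", "Address": "Alstersgatan 3", "TelNr": "223344"},
    {"PNum": "334455-6677", "Name": "Nisse", "Address": "Rydsvägen 3", "TelNr": "334455"},
]

def project(relation, attributes):
    """Keep only the given attributes and drop duplicate result tuples."""
    seen, result = set(), []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:
            seen.add(row)
            result.append(row)
    return result

print(project(STUDENT, ["PNum", "Name"]))  # three tuples, PNum keeps them distinct
print(project(STUDENT, ["Name"]))          # two tuples: Elin and Nisse
```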


    Union, intersection and difference

    $R \cup S$, $R \cap S$, $R - S$

    R and S must be union compatible, i.e. have the same number of attributes and with the same domains.

    The result must be a relation = duplicates are removed (union).


    Example: Intersection

    STUDENT:
    PNum         Name   Address         TelNr
    112233-4455  Elin   Rydsvägen 1     112233
    223344-5566  Nisse  Alstersgatan 3  223344
    334455-6677  Nisse  Rydsvägen 3     334455

    EMPLOYEE:
    PNum         Name    Office address  TelNr
    884455-4455  Monika  Teknikringen 1  111112
    223344-5566  Nisse   Alstersgatan 3  223344
    668877-7766  Patrik  Teknikringen 3  332211

    $\text{STUDENT} \cap \text{EMPLOYEE}$:
    PNum         Name   Address         TelNr
    223344-5566  Nisse  Alstersgatan 3  223344
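Because a relation is a set of tuples, the three operations map directly onto Python's built-in set type. A minimal sketch, assuming tuples are plain Python tuples in the attribute order of the tables above:

```python
STUDENT = {
    ("112233-4455", "Elin", "Rydsvägen 1", "112233"),
    ("223344-5566", "Nisse", "Alstersgatan 3", "223344"),
    ("334455-6677", "Nisse", "Rydsvägen 3", "334455"),
}
EMPLOYEE = {
    ("884455-4455", "Monika", "Teknikringen 1", "111112"),
    ("223344-5566", "Nisse", "Alstersgatan 3", "223344"),
    ("668877-7766", "Patrik", "Teknikringen 3", "332211"),
}

union = STUDENT | EMPLOYEE         # duplicates collapse automatically
intersection = STUDENT & EMPLOYEE  # tuples present in both relations
difference = STUDENT - EMPLOYEE    # students who are not employees

print(len(union), len(intersection), len(difference))
```

The union has five tuples rather than six because the Nisse tuple that occurs in both relations is kept only once.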


    Cartesian product

    R:
    Name           STATE
    Los Angeles    Calif
    Oakland        Calif
    Atlanta        Ga
    San Fransisco  Calif
    Boston         Mass

    S:
    Key  City
    5    San Fransisco
    7    Oakland
    8    Boston

    R x S:
    Name           STATE  Key  City
    Los Angeles    Calif  5    San Fransisco
    Los Angeles    Calif  7    Oakland
    Los Angeles    Calif  8    Boston
    Oakland        Calif  5    San Fransisco
    Oakland        Calif  7    Oakland
    Oakland        Calif  8    Boston
    Atlanta        Ga     5    San Fransisco
    Atlanta        Ga     7    Oakland
    Atlanta        Ga     8    Boston
    San Fransisco  Calif  5    San Fransisco
    San Fransisco  Calif  7    Oakland
    San Fransisco  Calif  8    Boston
    Boston         Mass   5    San Fransisco
    Boston         Mass   7    Oakland
    Boston         Mass   8    Boston
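For illustration, the same product can be computed with `itertools.product` from the Python standard library, assuming tuples are plain Python tuples. Every tuple of R is paired with every tuple of S, so the result has 5 x 3 = 15 tuples.

```python
from itertools import product

R = [
    ("Los Angeles", "Calif"),
    ("Oakland", "Calif"),
    ("Atlanta", "Ga"),
    ("San Fransisco", "Calif"),
    ("Boston", "Mass"),
]
S = [(5, "San Fransisco"), (7, "Oakland"), (8, "Boston")]

# Concatenate each pair of tuples into one result tuple.
r_x_s = [r + s for r, s in product(R, S)]

print(len(r_x_s))
print(r_x_s[0])
```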


    Join

    Joins two tuples from two relations if they satisfy some condition over their attributes.

    Join = Cartesian product followed by selection.

    Tuples with NULL in the condition attributes do not appear in the result.

    Recall: Join only on foreign key-primary key attributes.

    R.A1=S.B3 AND R.A5


    Example: join

    R:
    Name           STATE
    Los Angeles    Calif
    Oakland        Calif
    Atlanta        Ga
    San Fransisco  Calif
    Boston         Mass

    S:
    Key  City
    5    San Fransisco
    7    Oakland
    8    Boston

    $R \bowtie_{R.\text{Name} = S.\text{City}} S$:
    Name           STATE  Key  City
    Oakland        Calif  7    Oakland
    San Fransisco  Calif  5    San Fransisco
    Boston         Mass   8    Boston
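The definition "join = Cartesian product followed by selection" can be sketched directly in Python. The assumptions here are that tuples are plain Python tuples and NULL is `None`; the helpers `equals` and `theta_join` and the extra tuple `(9, None)` are hypothetical, the latter added only to show that NULLs never match.

```python
from itertools import product

R = [
    ("Los Angeles", "Calif"),
    ("Oakland", "Calif"),
    ("Atlanta", "Ga"),
    ("San Fransisco", "Calif"),
    ("Boston", "Mass"),
]
S = [
    (5, "San Fransisco"),
    (7, "Oakland"),
    (8, "Boston"),
    (9, None),  # hypothetical extra tuple with a NULL city
]

def equals(a, b):
    """SQL-style equality: a comparison involving NULL never holds."""
    return a is not None and b is not None and a == b

def theta_join(r, s, condition):
    """Cartesian product followed by selection on the join condition."""
    return [rt + st for rt, st in product(r, s) if condition(rt, st)]

# Join condition R.Name = S.City.
joined = theta_join(R, S, lambda rt, st: equals(rt[0], st[1]))

for t in joined:
    print(t)
```

A real system would avoid building the full product; this naive version only mirrors the definition.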


    Name STATE Key City

    Los Angeles Calif 5 San Fransisco

    Los Angeles Calif 7 Oakland

    Los Angeles Calif 8 Boston

    Oakland Calif 5 San Fransisco

    Oakland Calif 7 Oakland

    Oakland Calif 8 Boston

    Atlanta Ga 5 San Fransisco

    Atlanta Ga 7 Oakland

    Atlanta Ga 8 Boston

    San Fransisco Calif 5 San Fransisco

    San Fransisco Calif 7 Oakland

    San Fransisco Calif 8 Boston

    Boston Mass 5 San Fransisco

    Boston Mass 7 Oakland

    Boston Mass 8 Boston


    Example: join

    R:
    Name           Area
    Los Angeles    2
    Oakland        9
    Atlanta        7
    San Fransisco  11
    Boston         16

    S:
    Key  City
    5    San Fransisco
    7    Oakland
    8    Boston

    $R \bowtie_{R.\text{Area} \ldots} S$:
    Name         Area  Key  City
    Los Angeles  2     5    San Fransisco
    Los Angeles  2     7    Oakland
    Los Angeles  2     8    Boston


    Name Area Key City

    Los Angeles 2 5 San Fransisco

    Los Angeles 2 7 Oakland

    Los Angeles 2 8 Boston

    Oakland 9 5 San Fransisco

    Oakland 9 7 Oakland

    Oakland 9 8 Boston

    Atlanta 7 7 Oakland

    Atlanta 7 8 Boston

    San Fransisco 11 5 San Fransisco

    San Fransisco 11 7 Oakland

    San Fransisco 11 8 Boston

    Boston 16 5 San Fransisco

    Boston 16 7 Oakland

    Boston 16 8 Boston


    Variants of join

    Theta join = join.

    Equijoin = join with only equality conditions.

    Natural join = equijoin in which one of each pair of duplicate attributes is removed (attributes in the conditions must have the same name).

    Unless otherwise specified, natural join joins all the attributes with the same name in R and S.

    $R *_{A} S$
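A natural join can be sketched in Python by representing tuples as dicts, so that attributes carry names. The relations below are hypothetical (the city data with S.City renamed to Name), and `natural_join` is an illustrative helper. It joins on all attribute names the two relations share and keeps each shared attribute once.

```python
def natural_join(r, s):
    """Equijoin on all attributes with the same name; shared attributes appear once."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    result = []
    for rt in r:
        for st in s:
            # NULL (None) in a join attribute never matches.
            if all(rt[a] is not None and rt[a] == st[a] for a in common):
                result.append({**rt, **st})
    return result

# Hypothetical relations sharing the attribute Name.
R = [
    {"Name": "Oakland", "State": "Calif"},
    {"Name": "Atlanta", "State": "Ga"},
    {"Name": "Boston", "State": "Mass"},
]
S = [{"Key": 7, "Name": "Oakland"}, {"Key": 8, "Name": "Boston"}]

for t in natural_join(R, S):
    print(t)
```

If the relations share no attribute name, the condition is vacuously true and the sketch degenerates to a Cartesian product.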


    Query trees

    Tree that represents a relational algebra expression.

    Leaves = base tables.

    Internal nodes = relational algebra operators applied to the node's children.

    The tree is executed from leaves to root.

    Example: List the last name of the employees born after 1957 who work on a project named 'Aquarius'.

    SELECT E.LNAME
    FROM EMPLOYEE E, WORKS_ON W, PROJECT P
    WHERE P.PNAME = 'Aquarius' AND P.PNUMBER = W.PNO AND W.ESSN = E.SSN AND E.BDATE > '1957-12-31'

    Canonical query tree

    SELECT attributes
    FROM A, B, C
    WHERE condition

    $\pi_{\text{attributes}}$
        |
    $\sigma_{\text{condition}}$
        |
        X
       / \
      X   C
     / \
    A   B

    Construct the canonical query tree as follows:

    Cartesian product of the FROM-tables

    Select with WHERE-condition

    Project to the SELECT-attributes
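The three construction steps can be written as a small Python function. This is only a sketch: the tree is encoded as nested tuples, and the node labels `"X"`, `"select"` and `"project"` are an ad hoc choice made for this example.

```python
def canonical_tree(select_attributes, from_tables, where_condition):
    """Build the canonical query tree as nested tuples (operator, ..., child)."""
    # 1. Cartesian product of the FROM-tables (left-deep).
    tree = from_tables[0]
    for table in from_tables[1:]:
        tree = ("X", tree, table)
    # 2. Select with the WHERE-condition.
    tree = ("select", where_condition, tree)
    # 3. Project to the SELECT-attributes.
    return ("project", select_attributes, tree)

tree = canonical_tree(["attributes"], ["A", "B", "C"], "condition")
print(tree)
```

Reading the nested tuples from the inside out corresponds to executing the tree from the leaves to the root.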


    Equivalent query trees


    Overview

    [Figure: Users 1 to 4 each send queries and updates to the database management system and receive answers. The database management system processes the queries and updates and handles access to the stored data in the physical database. The database is a model of the real world.]


    Query processing

    StarsIn( movieTitle, movieYear, starName )
    MovieStar( name, address, gender, birthdate )

    SELECT movieTitle
    FROM StarsIn
    WHERE starName IN (
        SELECT name
        FROM MovieStar
        WHERE birthdate LIKE '%1960');

    Canonical query tree (usually very inefficient)
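To make the query concrete, here is a hedged sketch that runs it with Python's built-in `sqlite3` module. SQLite is used only because it ships with Python, and all rows are invented for illustration. The sketch also runs a join formulation of the same query, which returns the same titles under the assumption that `name` is a key of MovieStar.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StarsIn (movieTitle TEXT, movieYear INTEGER, starName TEXT);
    CREATE TABLE MovieStar (name TEXT PRIMARY KEY, address TEXT,
                            gender TEXT, birthdate TEXT);
    -- Hypothetical sample rows.
    INSERT INTO StarsIn VALUES
        ('Film A', 1995, 'Star One'),
        ('Film B', 2001, 'Star Two'),
        ('Film C', 1999, 'Star One');
    INSERT INTO MovieStar VALUES
        ('Star One', 'Address 1', 'F', '12/03/1960'),
        ('Star Two', 'Address 2', 'M', '05/07/1972');
""")

# The query from the slide, with ORDER BY added for a deterministic order.
subquery_form = conn.execute("""
    SELECT movieTitle FROM StarsIn
    WHERE starName IN (SELECT name FROM MovieStar
                       WHERE birthdate LIKE '%1960')
    ORDER BY movieTitle
""").fetchall()

# An equivalent join formulation (valid because name is unique in MovieStar).
join_form = conn.execute("""
    SELECT movieTitle FROM StarsIn, MovieStar
    WHERE starName = name AND birthdate LIKE '%1960'
    ORDER BY movieTitle
""").fetchall()

print(subquery_form)
print(join_form)
```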


    Parsing and validating

    Control of used relations
        Have to be declared in FROM
        Must exist in the database

    Control and resolve attributes
        Attributes must exist in the relations

    Type checking
        Attributes that are compared must be of the same type


    Query optimizer: Heuristic

    Heuristic: Use joins instead of Cartesian product + selections, and do selection and projection as soon as possible, in order to keep the intermediate tables as small as possible, because:

    If the tables do not fit in memory, then we need to perform fewer disc accesses

    If the tables fit in memory, then we use less memory

    If the tables have to be sorted, joined, etc., then we use less computation power

    Projection first: $\sigma_{\text{ENTRY\_DATE} > \text{2001-08-30}}(\pi_{\text{ORDER\_ID}, \text{ENTRY\_DATE}}(\text{ORDER}))$

    Step              Tuples  Bytes per tuple  Total
    ORDER             n = 6   4+4+27 (= 35)    210 bytes
    after projection  n = 6   4+27 (= 31)      186 bytes
    after selection   n = 2   4+27 (= 31)      62 bytes

    Selection first: $\pi_{\text{ORDER\_ID}, \text{ENTRY\_DATE}}(\sigma_{\text{ENTRY\_DATE} > \text{2001-08-30}}(\text{ORDER}))$

    Step              Tuples  Bytes per tuple  Total
    ORDER             n = 6   4+4+27 (= 35)    210 bytes
    after selection   n = 2   4+4+27 (= 35)    70 bytes
    after projection  n = 2   4+27 (= 31)      62 bytes
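The size arithmetic for the two orderings can be checked with a few lines of Python. The tuple counts and byte widths are taken from the slide; the helper `sizes` and the idea of summing the bytes produced after the base table are illustrative additions.

```python
def sizes(steps):
    """steps: list of (tuples, bytes per tuple); returns total bytes per step."""
    return [n * width for n, width in steps]

# Projection first: ORDER, then projection, then selection.
project_first = sizes([(6, 4 + 4 + 27), (6, 4 + 27), (2, 4 + 27)])
# Selection first: ORDER, then selection, then projection.
select_first = sizes([(6, 4 + 4 + 27), (2, 4 + 4 + 27), (2, 4 + 27)])

# Bytes produced by the operators themselves (everything after reading ORDER).
project_first_cost = sum(project_first[1:])
select_first_cost = sum(select_first[1:])

print(project_first, project_first_cost)
print(select_first, select_first_cost)
```

Both plans end with the same 62-byte result, but selecting first produces a 70-byte intermediate table where projecting first produces one of 6 x 31 = 186 bytes.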


    Query optimizer: Heuristic

    Algorithm:

    1. Break up conjunctive select into cascade

    2. Move down select as far as possible in the tree

    3. Rearrange select operations: The most restrictive should be executed first

    4. Convert Cartesian product followed by selection into join

    5. Move down project operations as far as possible in the tree. Create new projections so that only the required attributes are involved in the tree

    Most restrictive (step 3): Fewest tuples? Smallest size? Smallest selectivity? The DBMS catalog contains the required info.
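Steps 1, 2 and 4 can be illustrated with a naive Python sketch. It assumes the WHERE clause is a plain conjunction written with `alias.attribute` references, and it uses simple regular expressions rather than a real SQL parser; all function names are made up for this example. Conjuncts that mention one table can be pushed down to that table as selections, and conjuncts that mention two tables become join conditions.

```python
import re

def split_conjuncts(where):
    """Step 1: break a conjunctive condition into its individual conjuncts."""
    return [c.strip() for c in re.split(r"\s+AND\s+", where)]

def referenced_tables(conjunct):
    """Table aliases used as 'alias.attribute' (naive: ignores string literals)."""
    return set(re.findall(r"\b([A-Za-z_]\w*)\.", conjunct))

def classify(where):
    """Single-table conjuncts are pushed-down selections; the rest are join conditions."""
    selections, joins = {}, []
    for conjunct in split_conjuncts(where):
        tables = referenced_tables(conjunct)
        if len(tables) == 1:
            selections.setdefault(tables.pop(), []).append(conjunct)
        else:
            joins.append(conjunct)
    return selections, joins

where = ("P.PNAME = 'Aquarius' AND P.PNUMBER = W.PNO "
         "AND W.ESSN = E.SSN AND E.BDATE > '1957-12-31'")
selections, joins = classify(where)

print(selections)
print(joins)
```

Step 3 (ordering the selections) would additionally need statistics from the DBMS catalog, which this sketch does not model.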


    Equivalence rules


    Exercises

    True or false?

    Optimize the queries below:

    SELECT *
    FROM ol_order_line, it_item
    WHERE ol_item_id = it_item_id
    AND ol_order_id = 1001


    Execution plans

    Execution plan: Optimized query tree extended with access methods and algorithms to implement the operations.
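Most systems can display the execution plan they choose. As a hedged illustration, the sketch below asks SQLite (via Python's built-in `sqlite3` module) for its plan for the exercise query. SQLite stands in for MySQL here only because it needs no server; the table definitions, the `it_name` column and the index are assumptions made for this example, and the wording of the plan differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema for the exercise tables.
    CREATE TABLE it_item (it_item_id INTEGER PRIMARY KEY, it_name TEXT);
    CREATE TABLE ol_order_line (ol_order_id INTEGER, ol_item_id INTEGER);
    CREATE INDEX ol_order_idx ON ol_order_line (ol_order_id);
""")

plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT *
    FROM ol_order_line, it_item
    WHERE ol_item_id = it_item_id
      AND ol_order_id = 1001
""").fetchall()

for row in plan:
    print(row[-1])  # the last column is the textual description of the step
```

Each printed line names an access method for one table, for example whether it is scanned in full or searched through an index or its primary key.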