Query Processing and Query Optimization

Query Processing & Query Optimization. Master: Dr. Alesheikh. Student: Mohsen Yousefzadeh (9413374)

Upload: shadowfax5885

Post on 11-Jan-2017


Page 1: Query Processing and Query Optimization

Query Processing & Query Optimization

Master: Dr. Alesheikh

Student: Mohsen Yousefzadeh (9413374)

Page 2: Query Processing and Query Optimization

What is Query Processing?

The activities involved in retrieving data from the database are called query processing.

It is a three-step process that transforms a high-level query (SQL) into an equivalent and more efficient lower-level query (in relational algebra).

Page 3: Query Processing and Query Optimization

Query

A query is the statement written by the user in a high-level language, using SQL.

Page 4: Query Processing and Query Optimization

Scanning, Parsing, and Validating

Diagram: Query → Scanning, parsing, and validating

Scanner: identifies the query tokens, such as SQL keywords, attribute names, and relation names, that appear in the text of the query.

Parser: checks the syntax of the query against the grammar rules of the query language.

Validating: checks that all attribute and relation names are valid and semantically meaningful names in the schema of the database being queried.
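The scanning and validating steps can be sketched in a few lines of Python. This is only an illustrative sketch, not how any particular DBMS implements them: the catalog contents, the keyword list, and the function names `scan` and `validate` are all assumptions made for the example, and the parsing (grammar-checking) step is left out.

```python
import re

# Assumed catalog for illustration: relation name -> attribute names.
CATALOG = {
    "EMPLOYEE": {"Lname", "Ssn", "Bdate"},
    "PROJECT": {"Pname", "Pnumber"},
}
# Assumed (incomplete) keyword list, enough for the example query.
KEYWORDS = {"SELECT", "FROM", "WHERE", "AND"}

# One alternative per token kind; the group name becomes the token kind.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<name>[A-Za-z_]\w*)"
    r"|(?P<string>'[^']*')"
    r"|(?P<number>\d+)"
    r"|(?P<symbol>[=<>,;*]))"
)

def scan(text):
    """Scanner: split the query text into (kind, value) tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at or after position {pos}")
        kind = match.lastgroup
        value = match.group(kind)
        if kind == "name" and value.upper() in KEYWORDS:
            kind = "keyword"
        tokens.append((kind, value))
        pos = match.end()
    return tokens

def validate(tokens):
    """Validation: return the names that are not found in the catalog."""
    known = set(CATALOG)
    for attributes in CATALOG.values():
        known |= attributes
    return [value for kind, value in tokens if kind == "name" and value not in known]

tokens = scan("SELECT Lname FROM EMPLOYEE WHERE Bdate > '1957-12-31';")
unknown_names = validate(tokens)  # empty list: every name is in the catalog
```

A query that mentions a name missing from the catalog, such as `SELECT Salary FROM EMPLOYEE`, would make `validate` return that name, which is the point where a real system would reject the query.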


Page 5: Query Processing and Query Optimization

Query optimizer

Diagram: Query → Scanning, parsing, and validating → Query optimizer

A query typically has many possible execution strategies, and the process of choosing a suitable one for processing a query is known as query optimization.

The query optimizer module has the task of producing a good execution plan. It selects the execution plan with the lowest estimated cost.

Page 6: Query Processing and Query Optimization

Optimizer

Diagram: Query → Scanning, parsing, and validating → Query optimizer

For example: "Find all female senators who own a business."

This query is actually a composition of two subqueries. "Find all female senators" is a selection query. "Find all senators who own a business" is a join query, because two tables are combined to process it.

Page 7: Query Processing and Query Optimization

SENATOR(name, Soc-sec, gender, District(polygon))

BUSINESS(B-name, owner, Soc-sec, Location(point))

The question is in which order these subqueries should be processed: select before join, or join before select?

Remember that a join is a multiscan query, while a select is a single-scan query. Performing the select first reduces the number of SENATOR rows that the join has to process, so select should be done before join.
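The effect of the ordering can be illustrated with a small Python sketch. The rows below are made up for the example, the join is assumed to match on the social security number, and a naive nested-loop join is used only so that comparisons are easy to count. Real systems use more elaborate join algorithms.

```python
# Made-up sample rows; only the columns needed for the example are kept.
senators = [
    {"name": "A", "soc_sec": 1, "gender": "F"},
    {"name": "B", "soc_sec": 2, "gender": "M"},
    {"name": "C", "soc_sec": 3, "gender": "F"},
    {"name": "D", "soc_sec": 4, "gender": "M"},
]
businesses = [
    {"b_name": "X", "soc_sec": 1},
    {"b_name": "Y", "soc_sec": 2},
    {"b_name": "Z", "soc_sec": 9},
]

def nested_loop_join(left, right):
    """Join on soc_sec (assumed join attribute) and count row comparisons."""
    comparisons = 0
    result = []
    for l in left:
        for r in right:
            comparisons += 1
            if l["soc_sec"] == r["soc_sec"]:
                result.append({**l, **r})
    return result, comparisons

def join_then_select():
    joined, comparisons = nested_loop_join(senators, businesses)
    return [row for row in joined if row["gender"] == "F"], comparisons

def select_then_join():
    females = [s for s in senators if s["gender"] == "F"]
    return nested_loop_join(females, businesses)

rows_a, comparisons_a = join_then_select()
rows_b, comparisons_b = select_then_join()
```

Both orders return the same single row (senator A with business X), but on this sample the join-first order performs 12 comparisons while the select-first order performs 6.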


Page 8: Query Processing and Query Optimization

There are two main techniques employed during query optimization.

The first technique is based on heuristic rules for ordering the operations in a query execution strategy.

A heuristic is a rule that works well in most cases but is not guaranteed to work well in every case. The rules typically reorder the operations in a query tree.

Page 9: Query Processing and Query Optimization

The second technique involves systematically estimating the cost of different execution strategies and choosing the execution plan with the lowest cost estimate.

These techniques are usually combined in a query optimizer.
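As a rough sketch of the cost-based technique, the Python below estimates the cost of the two orderings from the senator example and picks the cheaper one. The cost formulas are simplifying assumptions made for illustration (cost is counted as rows examined by a nested-loop join), and the function names and input numbers are hypothetical.

```python
def estimate_costs(n_senators, n_businesses, selectivity):
    """Estimated rows examined per strategy, under the assumed cost model.

    selectivity is the estimated fraction of senators passing the selection.
    """
    return {
        # Join every senator with every business, then filter.
        "join_then_select": n_senators * n_businesses,
        # Scan senators once, then join only the selected fraction.
        "select_then_join": n_senators + selectivity * n_senators * n_businesses,
    }

def choose_plan(costs):
    """Return the name of the strategy with the lowest estimated cost."""
    return min(costs, key=costs.get)

costs = estimate_costs(n_senators=100, n_businesses=1000, selectivity=0.25)
best = choose_plan(costs)
```

With these assumed inputs the estimates are 100000 for join-then-select and 25100 for select-then-join, so the second strategy is chosen. The quality of the choice depends entirely on how good the estimates are.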


Page 10: Query Processing and Query Optimization

Using Heuristics in Query Optimization

The scanner and parser of an SQL query first generate a data structure that corresponds to an initial query representation, which is then optimized according to heuristic rules. This leads to an optimized query representation.

One of the main heuristic rules is to apply SELECT and PROJECT operations before applying the JOIN.

Page 11: Query Processing and Query Optimization

Example of Transforming a Query

Consider the following query Q on the database.

Q: Find the last names of employees born after 1957 who work on a project named 'Aquarius'.

This query can be specified in SQL as follows:

Q: SELECT Lname
   FROM EMPLOYEE, WORKS_ON, PROJECT
   WHERE Pname='Aquarius' AND Pnumber=Pno AND Essn=Ssn
   AND Bdate > '1957-12-31';

Page 12: Query Processing and Query Optimization

(a) Initial query tree for SQL query Q

πLname
  σPname='Aquarius' AND Pnumber=Pno AND Essn=Ssn AND Bdate>'1957-12-31'
    ×
      ×
        EMPLOYEE
        WORKS_ON
      PROJECT

Page 13: Query Processing and Query Optimization

(b) Moving SELECT operations down the query tree.

πLname
  σPnumber=Pno
    ×
      σEssn=Ssn
        ×
          σBdate>'1957-12-31'
            EMPLOYEE
          WORKS_ON
      σPname='Aquarius'
        PROJECT

Page 14: Query Processing and Query Optimization

(c) Replacing CARTESIAN PRODUCT and SELECT with JOIN operations.

πLname
  ⋈Essn=Ssn
    ⋈Pnumber=Pno
      σPname='Aquarius'
        PROJECT
      WORKS_ON
    σBdate>'1957-12-31'
      EMPLOYEE

Page 15: Query Processing and Query Optimization

(d) Moving PROJECT operations down the query tree.

πLname
  ⋈Essn=Ssn
    πEssn
      ⋈Pnumber=Pno
        πPnumber
          σPname='Aquarius'
            PROJECT
        πEssn,Pno
          WORKS_ON
    πSsn,Lname
      σBdate>'1957-12-31'
        EMPLOYEE
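To check that the transformed tree is equivalent to the initial one, both can be evaluated on a tiny in-memory database. The Python below is a minimal sketch under stated assumptions: the sample rows are invented for illustration, and the helper functions `select`, `project`, `product`, and `join` are simple list-based stand-ins for the relational algebra operators, not an efficient implementation.

```python
# Invented sample rows, restricted to the attributes used by query Q.
EMPLOYEE = [
    {"Ssn": 1, "Lname": "Smith", "Bdate": "1965-01-09"},
    {"Ssn": 2, "Lname": "Wong", "Bdate": "1955-12-08"},
    {"Ssn": 3, "Lname": "Zelaya", "Bdate": "1968-01-19"},
]
WORKS_ON = [{"Essn": 1, "Pno": 10}, {"Essn": 2, "Pno": 10}, {"Essn": 3, "Pno": 20}]
PROJECT = [{"Pnumber": 10, "Pname": "Aquarius"}, {"Pnumber": 20, "Pname": "Orion"}]

def select(rows, predicate):
    return [row for row in rows if predicate(row)]

def project(rows, attrs):
    """Keep only attrs and drop duplicate rows, preserving order."""
    seen, result = set(), []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result

def product(left, right):
    return [{**l, **r} for l in left for r in right]

def join(left, right, predicate):
    return [{**l, **r} for l in left for r in right if predicate({**l, **r})]

def initial_plan():
    # One big selection over the Cartesian product of all three relations.
    rows = product(product(EMPLOYEE, WORKS_ON), PROJECT)
    rows = select(rows, lambda r: r["Pname"] == "Aquarius"
                  and r["Pnumber"] == r["Pno"]
                  and r["Essn"] == r["Ssn"]
                  and r["Bdate"] > "1957-12-31")  # ISO dates compare as strings
    return project(rows, ["Lname"])

def optimized_plan():
    # Selections and projections are applied before each join.
    p = project(select(PROJECT, lambda r: r["Pname"] == "Aquarius"), ["Pnumber"])
    w = project(WORKS_ON, ["Essn", "Pno"])
    pw = project(join(p, w, lambda r: r["Pnumber"] == r["Pno"]), ["Essn"])
    e = project(select(EMPLOYEE, lambda r: r["Bdate"] > "1957-12-31"), ["Ssn", "Lname"])
    return project(join(pw, e, lambda r: r["Essn"] == r["Ssn"]), ["Lname"])

result_initial = initial_plan()
result_optimized = optimized_plan()
```

On this sample data both plans return the single last name Smith. The initial plan builds all 18 combined rows before filtering, whereas the optimized plan only ever joins small, already filtered inputs.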

Page 16: Query Processing and Query Optimization

Using Selectivity and Cost Estimates in Query Optimization

A query optimizer does not depend solely on heuristic rules. It also estimates and compares the costs of executing a query using different execution strategies and algorithms, and it then chooses the strategy with the lowest cost estimate.

Page 17: Query Processing and Query Optimization

Cost Components for Query Execution

1. Access cost to secondary storage

2. Disk storage cost

3. Computation cost

4. Memory usage cost

5. Communication cost
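One simple way to picture how these components might be combined is a weighted sum per plan. The sketch below is an assumption for illustration only: the class name `PlanCost`, the weighting scheme, and the numbers are hypothetical, and real optimizers use their own cost models.

```python
from dataclasses import dataclass, astuple

@dataclass
class PlanCost:
    """Estimated cost of one execution plan, one field per component."""
    secondary_storage_access: float
    disk_storage: float
    computation: float
    memory_usage: float
    communication: float

    def total(self, weights=(1.0, 1.0, 1.0, 1.0, 1.0)):
        # Weighted sum of the five components, in field order.
        return sum(w * c for w, c in zip(weights, astuple(self)))

def cheapest(plans, weights=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Return the name of the plan with the lowest weighted total cost."""
    return min(plans, key=lambda name: plans[name].total(weights))

# Hypothetical estimates for two candidate plans.
plans = {
    "plan_a": PlanCost(100, 10, 5, 2, 0),
    "plan_b": PlanCost(40, 10, 20, 4, 0),
}
best_overall = cheapest(plans)
best_cpu_only = cheapest(plans, weights=(0, 0, 1, 0, 0))
```

With equal weights, plan_b (total 74) beats plan_a (total 117). If only computation cost is weighted, plan_a wins instead, which shows how the choice of plan depends on which components the cost model emphasizes.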


Page 18: Query Processing and Query Optimization

Query code generator

Diagram: Query → Scanning, parsing, and validating → Query optimizer → Query code generator

The code generator generates the code to execute the plan chosen by the optimizer.

Page 19: Query Processing and Query Optimization

Runtime database processor

Diagram: Query → Scanning, parsing, and validating → Query optimizer → Query code generator → Runtime database processor

The runtime database processor has the task of running (executing) the query code, whether in compiled or interpreted mode, to produce the query result.

Page 20: Query Processing and Query Optimization

Code can be:

Executed directly (interpreted mode)

Stored and executed later whenever needed (compiled mode)

The term optimization is actually a misnomer because in some cases the chosen execution plan is not the optimal (or absolute best) strategy; it is just a reasonably efficient strategy for executing the query.

Page 21: Query Processing and Query Optimization

Result of query

Diagram: Query → Scanning, parsing, and validating → Query optimizer → Query code generator → Runtime database processor → Result of query

Page 22: Query Processing and Query Optimization