chapter 13 query processing melissa jamili cs 157b november 11, 2004

28
Chapter 13 Query Chapter 13 Query Processing Processing Melissa Jamili CS 157B November 11, 2004

Upload: jocelyn-cole

Post on 31-Dec-2015

230 views

Category:

Documents


7 download

TRANSCRIPT

Page 1: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Chapter 13 Query ProcessingChapter 13 Query Processing

Melissa Jamili

CS 157B

November 11, 2004

Page 2: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

OutlineOutline

OverviewMeasures of query costSelection operation– Basic algorithms– Selections using indices– Selections involving comparisons– Implementation of complex selections

Sorting

Page 3: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

OverviewOverview

Query processing – the range of activities involved in extracting data from a database.

– Includes: translation of queries, query-optimizing transformations, evaluation of queries.

Page 4: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Overview (cont.)Overview (cont.)

Steps involved in processing a query:

1) Parsing and translation

2) Optimization

3) Evaluation

Page 5: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Step 1. Parsing and translationStep 1. Parsing and translation

System checks the syntax of the query.

Creates a parse-tree representation of the query.

Translates the query into a relational-algebra expression.

Page 6: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Step 2. OptimizationStep 2. Optimization

Each relational-algebra operation can be executed by one of several different algorithms. Ex. Every tuple can be searched linearly

or a B+ tree can be used to index the tuples.

A query optimizer must know the cost of each operation.

Page 7: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Step 3. EvaluationStep 3. Evaluation

A relational-algebra expression and evaluation primitive are needed.– evaluation primitive - relational-algebra

expression annotated with instructions specifying how to evaluate each operation.

Page 8: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Step 3. Evaluation (cont.)Step 3. Evaluation (cont.)

Query execution plan or query-evaluation plan – sequence of primitive operations that can be used to evaluate the expression.

Π balance (σ balance < 2500 (account))

Page 9: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Step 3. Evaluation (cont.)Step 3. Evaluation (cont.)

Then the query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers to the query.

Page 10: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Steps in query processingSteps in query processing

Page 11: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Measures of Query CostMeasures of Query Cost

Measured in terms of different resources, including disk accesses and CPU execution time

Measured in response time for a query-evaluation plan (clock time to execute the plan)

Page 12: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Measures of Query Cost (cont.)Measures of Query Cost (cont.)

Disk accesses, number of block transfers from disk, are usually the most important cost

To calculate add these numbers:1) # of seek operations performed2) # of blocks read3) # of blocks writtenAfter multiplying them by the average seek time,

average transfer time, and average transfer time, respectively.

Page 13: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Selection OperationSelection Operation

File scan is the lowest-level operator to access data.

File scans are search algorithms that locate and retrieve records that fulfill a selection condition.

Allows an entire relation to be read where the relation is stored in a single, dedicated file.

Page 14: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Basic AlgorithmsBasic Algorithms

Two basic scan algorithms to implement the selection operation:

A1 (linear search). – The system scans each file block and tests all records to

see whether they satisfy the selection condition.– System terminates if the required record is found,

without looking at the other records. A2 (binary search).

– System performs the binary search on the blocks of the file.

– File is ordered on an attribute, and selection condition is an equality comparison.

Page 15: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Selections Using IndicesSelections Using Indices

Index structures referred to a access paths, since they provide a path through which data can be located and accessed.

Primary index – allows records to be read in an order that corresponds to the physical order in the file.

Secondary index – any index that is not a primary index.

Page 16: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Selections Using Indices (cont.)Selections Using Indices (cont.)

A3 (primary index, equality on key).– Use the index to retrieve a single record

that satisfies the corresponding equality condition.

A4 (primary index, equality on nonkey).– Same as A3, but multiple records may

need to be fetched.

Page 17: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Selections Using Indices (cont.)Selections Using Indices (cont.)

A5 (secondary index, equality).– Retrieve a single record if the equality

condition is on a key.– Retrieve multiple records if the indexing

field is not a key.

Page 18: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Selections Involving ComparisonsSelections Involving Comparisons

Consider the form σA ≥v (r). Can be implemented by linear or binary

search or by using indices: A6 (primary index, comparison).– For A ≥ v, look for first tuple that has the value

of A = v, return all the tuples starting from the tuple up to the end of the file.

– For A < v, file scan from the beginning up to (but not including) the first tuple with attribute A = v.

Page 19: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Selections Involving Comparisons Selections Involving Comparisons (cont.)(cont.)A7 (secondary index, comparison).– The lowest-level blocks are scanned• From the smallest value up to v for < and ≤.• From v up to the maximum value for > and

≥.

Page 20: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Implementation of Complex Implementation of Complex SelectionsSelectionsPreviously only considered simple

selection conditions with an equality or comparison operation.

Now consider more complex selection predicates.

Page 21: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Implementation of Complex Implementation of Complex Selections (cont.)Selections (cont.)Conjunction: σθ1 ∩ θ2 ∩ … ∩ θn (r)

Disjunction: σθ1 ∪ θ2 … ∪ ∪ θn (r)

Negation: σ-θ (r)

Page 22: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Implementation of Complex Implementation of Complex Selections (cont.)Selections (cont.)A8 (conjunctive selection using one

index).– Determine if there is an access path for

an attribute in one of the simple conditions.

– If yes, algorithms A2 through A7 can be used.

– On all records retrieved test if they satisfy the remaining simple conditions.

Page 23: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Implementation of Complex Implementation of Complex Selections (cont.)Selections (cont.)A9 (conjunctive selection using

composite index (multiple attributes).– Search index directly if selection

specifies an equality condition on 2 or move attributes and a composite index exists.

– Type of index determines if A3, A4, or A5 will be used.

Page 24: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Implementation of Complex Implementation of Complex Selections (cont.)Selections (cont.) A10 (conjunction selection by intersection

of identifiers).– Requires indices with record pointers.– Scan each index for pointers that satisfy an

individual condition.– Take intersection of all retrieved pointers to

get set of pointers that satisfy the conjunctive condition.

– Then pointers used to retrieve actual records.– If indices not available on all conditions, then

algorithm tests retrieved records against remaining conditions.

Page 25: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Implementation of Complex Implementation of Complex Selections (cont.)Selections (cont.)A11 (disjunctive selection by union

of identifiers).– Each index is scanned for pointers to

tuples that satisfy the individual condition.

– Union of all the retrieved pointers gives the set of pointers to all tuples that satisfy the disjunctive condition.

– Pointers then used to retrieve actual records.

Page 26: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

SortingSorting

Focus on external sorting where relations that are bigger than memory.

Most common is external merge-sort algorithm.

Page 27: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

Sorting (cont.)Sorting (cont.)

N-way merge– A number of runs

are created and sorted.

– Runs are merged.

Page 28: Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004

ConclusionConclusion

Query processing - translation, transformation, and evaluation.

Query cost measured by number of disk accesses.

Selection operation algorithmsSorting – N-way merge.