Chapter 13: Query Processing. Melissa Jamili, CS 157B, November 11, 2004


  • Chapter 13: Query Processing

    Melissa Jamili, CS 157B, November 11, 2004

  • Outline

    Overview

    Measures of query cost

    Selection operation: basic algorithms, selections using indices, selections involving comparisons, implementation of complex selections

    Sorting

  • Overview

    Query processing: the range of activities involved in extracting data from a database.

    Includes: translation of queries, query-optimizing transformations, evaluation of queries.

  • Overview (cont.)

    Steps involved in processing a query: (1) parsing and translation, (2) optimization, (3) evaluation.

  • Step 1. Parsing and translation

    The system checks the syntax of the query, creates a parse-tree representation of the query, and translates the query into a relational-algebra expression.

  • Step 2. Optimization

    Each relational-algebra operation can be executed by one of several different algorithms. For example, a selection can scan every tuple linearly, or it can use a B+-tree index to locate the matching tuples.

    To choose among these alternatives, a query optimizer must know the cost of each operation.

  • Step 3. Evaluation

    A relational-algebra expression alone is not enough to evaluate a query; evaluation primitives are also needed.

    Evaluation primitive: a relational-algebra operation annotated with instructions specifying how to evaluate it.

  • Step 3. Evaluation (cont.)

    Query-execution plan (or query-evaluation plan): a sequence of primitive operations that can be used to evaluate the expression.

    Example: $\Pi_{balance}(\sigma_{balance < 2500}(account))$
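    To make the example expression concrete, here is a minimal Python sketch of evaluating it over an in-memory relation. The sample tuples, the attribute names other than balance, and the helper names `select`, `project`, and `evaluate_plan` are assumptions made for illustration; a real engine would work on disk blocks rather than Python lists.

```python
# Hypothetical sample data standing in for the account relation.
account = [
    {"account_number": "A-101", "branch_name": "Downtown", "balance": 500},
    {"account_number": "A-102", "branch_name": "Perryridge", "balance": 3000},
    {"account_number": "A-215", "branch_name": "Mianus", "balance": 700},
]

def select(relation, predicate):
    """Selection (sigma): keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def project(relation, attributes):
    """Projection (pi): keep only the named attributes, dropping duplicates."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:
            seen.add(row)
            result.append(row)
    return result

def evaluate_plan():
    """Evaluate pi_balance(sigma_{balance < 2500}(account)), innermost first."""
    low_balance = select(account, lambda t: t["balance"] < 2500)
    return project(low_balance, ["balance"])

print(evaluate_plan())  # [(500,), (700,)]
```

    The plan here fixes one algorithm per operation (a linear scan for the selection); an annotated plan could just as well name an index for that step.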

  • Step 3. Evaluation (cont.)

    The query-execution engine then takes a query-evaluation plan, executes that plan, and returns the answers to the query.

  • Steps in query processing

  • Measures of Query Cost

    Measured in terms of different resources, including disk accesses and CPU execution time.

    Measured as the response time for a query-evaluation plan (the clock time needed to execute the plan).

  • Measures of Query Cost (cont.)

    Disk accesses, counted as the number of block transfers from disk, are usually the most important cost.

    To calculate the cost, take the number of seek operations performed, the number of blocks read, and the number of blocks written. Multiply them by the average seek time, the average transfer time for reading a block, and the average transfer time for writing a block, respectively, and add the results.
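    The calculation above can be sketched as a small Python function. The function name, parameter names, and the timing figures in the usage line are assumptions chosen only to show the arithmetic; they are not measurements from the slides.

```python
def estimated_cost(seeks, blocks_read, blocks_written,
                   avg_seek_time, avg_read_time, avg_write_time):
    """Weighted sum of disk operations; all times must share one unit."""
    return (seeks * avg_seek_time
            + blocks_read * avg_read_time
            + blocks_written * avg_write_time)

# Hypothetical figures in microseconds: 4 seeks at 4000 us each,
# 100 block reads at 100 us each, and no writes.
print(estimated_cost(4, 100, 0, 4000, 100, 100))  # 26000
```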

  • Selection Operation

    The file scan is the lowest-level operator for accessing data.

    File scans are search algorithms that locate and retrieve records that fulfill a selection condition.

    When a relation is stored in a single, dedicated file, a file scan allows the entire relation to be read.

  • Basic Algorithms

    Two basic scan algorithms implement the selection operation:

    A1 (linear search). The system scans each file block and tests all records to see whether they satisfy the selection condition. If the selection is an equality on a key attribute, the system can stop as soon as the required record is found, without looking at the other records.

    A2 (binary search). The system performs a binary search on the blocks of the file. This applies when the file is ordered on an attribute and the selection condition is an equality comparison on that attribute.
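    A minimal Python sketch of A1 and A2 follows. It models a file as a list of non-empty blocks, each block a list of record dicts; that layout and the function names are assumptions for illustration. The binary search further assumes every matching record lies in a single block (as with a key attribute); matches that spill into neighbouring blocks are not handled.

```python
def linear_search(blocks, attr, value, is_key=False):
    """A1: scan every block and test every record."""
    matches = []
    for block in blocks:
        for record in block:
            if record[attr] == value:
                matches.append(record)
                if is_key:          # at most one match, so stop early
                    return matches
    return matches

def binary_search(blocks, attr, value):
    """A2: binary search over blocks of a file sorted on attr."""
    lo, hi = 0, len(blocks) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        block = blocks[mid]
        if value < block[0][attr]:       # value precedes this block
            hi = mid - 1
        elif value > block[-1][attr]:    # value follows this block
            lo = mid + 1
        else:                            # value, if present, is in this block
            return [r for r in block if r[attr] == value]
    return []

# Hypothetical file sorted on "id", two records per block.
blocks = [[{"id": 1}, {"id": 3}], [{"id": 5}, {"id": 7}], [{"id": 9}, {"id": 11}]]
print(binary_search(blocks, "id", 7))               # [{'id': 7}]
print(linear_search(blocks, "id", 9, is_key=True))  # [{'id': 9}]
```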

  • Selections Using Indices

    Index structures are referred to as access paths, since they provide a path through which data can be located and accessed.

    Primary index: allows records to be read in an order that corresponds to the physical order in the file.

    Secondary index: any index that is not a primary index.

  • Selections Using Indices (cont.)

    A3 (primary index, equality on key). Use the index to retrieve a single record that satisfies the corresponding equality condition.

    A4 (primary index, equality on nonkey). Same as A3, but multiple records may need to be fetched.

  • Selections Using Indices (cont.)

    A5 (secondary index, equality). Retrieve a single record if the equality condition is on a key. Retrieve multiple records if the indexing field is not a key.
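    The equality lookups A3 through A5 can be sketched in Python as below. This is an illustration under assumptions: a sorted list of search-key values stands in for the primary index, a dict from value to record positions stands in for the secondary index, and all names and sample records are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict

def primary_index_lookup(sorted_keys, records, value):
    """A3/A4: records are stored in search-key order, so matches are adjacent."""
    i = bisect_left(sorted_keys, value)
    matches = []
    while i < len(records) and sorted_keys[i] == value:
        matches.append(records[i])
        i += 1
    return matches

def build_secondary_index(records, attr):
    """Map each attribute value to the positions of records holding it."""
    index = defaultdict(list)
    for pos, record in enumerate(records):
        index[record[attr]].append(pos)
    return index

def secondary_index_lookup(index, records, value):
    """A5: follow each pointer; the records may be scattered over the file."""
    return [records[pos] for pos in index.get(value, [])]

# Hypothetical file stored in "id" order; "branch" is a nonkey attribute.
records = [
    {"id": 101, "branch": "Downtown"},
    {"id": 102, "branch": "Perryridge"},
    {"id": 215, "branch": "Mianus"},
    {"id": 217, "branch": "Perryridge"},
]
id_keys = [r["id"] for r in records]
branch_index = build_secondary_index(records, "branch")

print(primary_index_lookup(id_keys, records, 215))
print(secondary_index_lookup(branch_index, records, "Perryridge"))
```

    With a secondary index on a nonkey, each pointer may lead to a different block, which is why A5 can cost one block access per retrieved record.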

  • Selections Involving Comparisons

    Consider a selection of the form $\sigma_{A \le v}(r)$. It can be implemented by linear or binary search, or by using indices:

    A6 (primary index, comparison). For $A \ge v$, look up the first tuple that has the value $A = v$ and return all tuples from that tuple to the end of the file. For $A < v$, scan the file from the beginning up to (but not including) the first tuple with $A = v$; the index is not needed.
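    A minimal Python sketch of A6, assuming the file is a list of records sorted on the attribute and that a sorted list of its values stands in for the primary index. The function names and sample balances are hypothetical.

```python
from bisect import bisect_left

def select_ge(sorted_records, attr, v):
    """A >= v: use the index to find the first tuple with A >= v, then read to the end."""
    keys = [r[attr] for r in sorted_records]   # stands in for the primary index
    start = bisect_left(keys, v)
    return sorted_records[start:]

def select_lt(sorted_records, attr, v):
    """A < v: plain scan from the start of the file; no index needed."""
    result = []
    for record in sorted_records:
        if record[attr] >= v:      # first tuple outside the range: stop
            break
        result.append(record)
    return result

# Hypothetical file sorted on balance.
accounts = [{"balance": b} for b in (100, 400, 400, 700, 900)]
print([r["balance"] for r in select_ge(accounts, "balance", 400)])  # [400, 400, 700, 900]
print([r["balance"] for r in select_lt(accounts, "balance", 400)])  # [100]
```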

  • Selections Involving Comparisons (cont.)

    A7 (secondary index, comparison). The lowest-level index blocks are scanned, either from the smallest value up to v (for $<$ and $\le$) or from v up to the maximum value (for $>$ and $\ge$).
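    A7 can be sketched in Python as follows, assuming the lowest index level is modelled as a sorted list of (value, record position) pairs over an unordered file. The names `build_leaf_entries`, `select_le`, and `select_ge` and the sample data are assumptions for illustration.

```python
from bisect import bisect_left

def build_leaf_entries(heap_file, attr):
    """Lowest index level: (value, pointer) pairs in value order."""
    return sorted((record[attr], pos) for pos, record in enumerate(heap_file))

def select_le(leaf_entries, heap_file, v):
    """A <= v: scan leaf entries from the smallest value up to v."""
    result = []
    for value, pos in leaf_entries:
        if value > v:
            break
        result.append(heap_file[pos])
    return result

def select_ge(leaf_entries, heap_file, v):
    """A >= v: scan leaf entries from v up to the maximum value."""
    # (v,) sorts before every (v, pos) pair, so this finds the first entry >= v.
    start = bisect_left(leaf_entries, (v,))
    return [heap_file[pos] for _, pos in leaf_entries[start:]]

# Hypothetical unordered file.
heap_file = [{"balance": 700}, {"balance": 100}, {"balance": 900}, {"balance": 400}]
leaves = build_leaf_entries(heap_file, "balance")
print([r["balance"] for r in select_le(leaves, heap_file, 400)])  # [100, 400]
print([r["balance"] for r in select_ge(leaves, heap_file, 700)])  # [700, 900]
```

    Each pointer may land in a different block, so for large result sets a linear scan of the file can be cheaper than following the secondary index.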

  • Implementation of Complex Selections

    So far we have considered only simple selection conditions with an equality or comparison operation.

    Now consider more complex selection predicates.

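  • Implementation of Complex Selections (cont.)

    Conjunction: $\sigma_{\theta_1 \wedge \theta_2 \wedge \cdots \wedge \theta_n}(r)$

    Disjunction: $\sigma_{\theta_1 \vee \theta_2 \vee \cdots \vee \theta_n}(r)$

    Negation: $\sigma_{\neg\theta}(r)$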

  • Implementation of Complex Selections (cont.)

    A8 (conjunctive selection using one index). Determine whether an access path is available for an attribute in one of the simple conditions. If so, one of the algorithms A2 through A7 can retrieve the records that satisfy that condition. Each retrieved record is then tested against the remaining simple conditions.
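    A minimal Python sketch of A8, assuming a dict-based index on one attribute and Python callables for the remaining simple conditions. All names and the sample relation are hypothetical.

```python
from collections import defaultdict

def build_index(records, attr):
    """Map each attribute value to the positions of records holding it."""
    index = defaultdict(list)
    for pos, record in enumerate(records):
        index[record[attr]].append(pos)
    return index

def conjunctive_select(records, index, value, remaining_conditions):
    """Use the index for one condition, then filter on all the others."""
    candidates = (records[pos] for pos in index.get(value, []))
    return [r for r in candidates
            if all(cond(r) for cond in remaining_conditions)]

# Hypothetical relation; the selection is branch = 'Perryridge' AND balance < 2500.
accounts = [
    {"id": 101, "branch": "Downtown", "balance": 500},
    {"id": 102, "branch": "Perryridge", "balance": 3000},
    {"id": 217, "branch": "Perryridge", "balance": 750},
]
branch_index = build_index(accounts, "branch")
result = conjunctive_select(accounts, branch_index, "Perryridge",
                            [lambda r: r["balance"] < 2500])
print(result)  # [{'id': 217, 'branch': 'Perryridge', 'balance': 750}]
```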

  • Implementation of Complex Selections (cont.)

    A9 (conjunctive selection using a composite index on multiple attributes). Search the index directly if the selection specifies an equality condition on two or more attributes and a composite index exists on those attributes. The type of index determines whether A3, A4, or A5 is used.

  • Implementation of Complex Selections (cont.)

    A10 (conjunctive selection by intersection of identifiers). Requires indices with record pointers. Scan each index for pointers to records that satisfy an individual condition. Take the intersection of all retrieved pointers to get the set of pointers that satisfy the conjunctive condition. The pointers are then used to retrieve the actual records. If indices are not available on all conditions, the algorithm tests the retrieved records against the remaining conditions.
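    A minimal Python sketch of A10, assuming record positions in a list play the role of record pointers and dicts play the role of indices. The function names and the sample relation are hypothetical, and at least one indexed condition is assumed.

```python
from collections import defaultdict

def build_index(records, attr):
    """Map each attribute value to the pointers of records holding it."""
    index = defaultdict(list)
    for pos, record in enumerate(records):
        index[record[attr]].append(pos)
    return index

def select_by_intersection(records, indexed_conditions, remaining_conditions=()):
    """indexed_conditions: (index, value) pairs, one per indexed condition."""
    pointer_sets = [set(index.get(value, ()))
                    for index, value in indexed_conditions]
    pointers = set.intersection(*pointer_sets)
    # Fetch in pointer order, then apply any conditions that have no index.
    fetched = [records[p] for p in sorted(pointers)]
    return [r for r in fetched
            if all(cond(r) for cond in remaining_conditions)]

# Hypothetical relation with indices on branch and kind, but not on balance.
accounts = [
    {"branch": "Perryridge", "kind": "savings", "balance": 500},
    {"branch": "Perryridge", "kind": "checking", "balance": 3000},
    {"branch": "Downtown", "kind": "savings", "balance": 750},
    {"branch": "Perryridge", "kind": "savings", "balance": 4000},
]
branch_index = build_index(accounts, "branch")
kind_index = build_index(accounts, "kind")

result = select_by_intersection(
    accounts,
    [(branch_index, "Perryridge"), (kind_index, "savings")],
    remaining_conditions=[lambda r: r["balance"] < 2500],
)
print(result)  # [{'branch': 'Perryridge', 'kind': 'savings', 'balance': 500}]
```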

  • Implementation of Complex Selections (cont.)

    A11 (disjunctive selection by union of identifiers). Each index is scanned for pointers to tuples that satisfy the individual condition. The union of all the retrieved pointers gives the set of pointers to all tuples that satisfy the disjunctive condition. The pointers are then used to retrieve the actual records.
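    A minimal Python sketch of A11, assuming every condition in the disjunction has an index; if even one lacks an access path, a linear scan of the relation is needed anyway. Record positions stand in for pointers, and all names and sample data are hypothetical.

```python
from collections import defaultdict

def build_index(records, attr):
    """Map each attribute value to the pointers of records holding it."""
    index = defaultdict(list)
    for pos, record in enumerate(records):
        index[record[attr]].append(pos)
    return index

def select_by_union(records, indexed_conditions):
    """indexed_conditions: (index, value) pairs, one per condition."""
    pointers = set()
    for index, value in indexed_conditions:
        pointers.update(index.get(value, ()))
    # The set removes duplicates, so each record is fetched once.
    return [records[p] for p in sorted(pointers)]

# Hypothetical relation; the selection is branch = 'Downtown' OR kind = 'savings'.
accounts = [
    {"branch": "Perryridge", "kind": "savings"},
    {"branch": "Perryridge", "kind": "checking"},
    {"branch": "Downtown", "kind": "savings"},
    {"branch": "Mianus", "kind": "savings"},
]
branch_index = build_index(accounts, "branch")
kind_index = build_index(accounts, "kind")

result = select_by_union(accounts,
                         [(branch_index, "Downtown"), (kind_index, "savings")])
print(len(result))  # 3
```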

  • Sorting

    The focus is on external sorting, which handles relations that are bigger than memory.

    The most common technique is the external merge-sort algorithm.

  • Sorting (cont.)

    N-way merge: the relation is split into a number of runs, each of which is sorted in memory. The sorted runs are then merged.
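    The two phases can be sketched in Python as below. This is a simplified in-memory illustration: runs are kept as lists instead of being written to disk, `memory_size` is a hypothetical limit on how many records fit in memory, and all runs are assumed to be mergeable in a single pass.

```python
import heapq

def create_runs(records, memory_size):
    """Phase 1: read memory_size records at a time and sort each chunk."""
    return [sorted(records[i:i + memory_size])
            for i in range(0, len(records), memory_size)]

def external_merge_sort(records, memory_size):
    """Phase 2: N-way merge of the sorted runs."""
    runs = create_runs(records, memory_size)
    # heapq.merge repeatedly takes the smallest head element among the runs.
    return list(heapq.merge(*runs))

print(create_runs([5, 2, 9, 1, 7, 3, 8], 3))          # [[2, 5, 9], [1, 3, 7], [8]]
print(external_merge_sort([5, 2, 9, 1, 7, 3, 8], 3))  # [1, 2, 3, 5, 7, 8, 9]
```

    When there are more runs than can be merged at once, the merge is repeated over several passes, each pass merging groups of runs into longer runs.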

  • Conclusion

    Query processing: translation, transformation, and evaluation.

    Query cost is measured by the number of disk accesses.

    Selection operation algorithms.

    Sorting: N-way merge.