Query Optimization in Apache Tajo Jihoon Son / Gruter inc.

Upload: jihoon-son

Post on 24-Jan-2017

TRANSCRIPT

Page 1: Query optimization in Apache Tajo

Query Optimization in Apache Tajo

Jihoon Son / Gruter inc.

Page 2: Query optimization in Apache Tajo

About Me

● Jihoon Son (@jihoonson)
  ○ Tajo project co-founder
  ○ Committer and PMC member of Apache Tajo
  ○ Research engineer at Gruter

Page 3: Query optimization in Apache Tajo

Outline

● Introduction to Tajo
● Query processing in Tajo
  ○ Query plans in Tajo
  ○ Query processing example
● Query optimization in Tajo
  ○ Introduction to query optimization
  ○ Query optimization techniques in Tajo

Page 4: Query optimization in Apache Tajo

What is Tajo?

● Apache Top-level Project
  ○ Data warehouse system
    ■ Efficient processing of analytic queries
    ■ ANSI-SQL compliant
  ○ Scalable and rapid query execution with its own engine
    ■ Distributed query processing
    ■ Fault tolerance
  ○ Beyond SQL-on-Hadoop
    ■ Support for various types of storage
      ● HDFS, S3, HBase, RDBMS, ...

Page 5: Query optimization in Apache Tajo

Highlighted Features

● Support long-running batch queries as well as interactive ad-hoc queries
  ○ Fast query processing
    ■ Optimized scan performance
      ● 120 MB/sec per physical disk (SATA)
  ○ Reliability
    ■ Fault tolerance
    ■ No single point of failure with HA support

Page 6: Query optimization in Apache Tajo

Highlighted Features

● Support of various kinds of data sources
  ○ HDFS, Amazon S3, Google Cloud Storage, HBase, RDBMS, ...
● Mature SQL support
  ○ Various kinds of join support
  ○ Window function support
  ○ Cost-based query optimization
● Integration with other systems
  ○ Notebooks like Zeppelin
  ○ BI tools

Page 7: Query optimization in Apache Tajo

Recent Release: 0.11

● Feature highlights
  ○ Query federation
  ○ JDBC-based storage support
  ○ Self-describing data formats support
  ○ Multi-query support
  ○ More stable and efficient join execution
  ○ Index support
  ○ Python UDF/UDAF support

Page 8: Query optimization in Apache Tajo

Architecture Overview

[Figure: architecture diagram]
● Clients: JDBC client, TSQL, Web UI, REST API
● Tajo Master (several instances shown), each with a Catalog Server
  ○ The Catalog Server is drawn next to a DBMS and HCatalog
● Tajo Worker (three shown), each with a Query Master, a Query Executor, and a Storage Service
● Storage
● Arrow labels: "Submit a query", "Manage metadata", "Allocate a query", "Send tasks & monitor"

Page 9: Query optimization in Apache Tajo

Query Execution Steps

● Initializing a query execution
  ① Submit a query
  ② Assign a query
  ③ Build a query execution plan

[Figure: Tajo Client, Tajo Master with Catalog Server, DBMS, and three Tajo Workers, each with a Query Master]

Page 10: Query optimization in Apache Tajo

Query Execution Steps

● Executing a query
  ④ Send tasks & monitor
  ⑤ Read and process data
  ⑥ Send status and progress

[Figure: Tajo Master, Storage, and three Tajo Workers, each with a Query Executor and a Storage Service; one worker also shows the Query Master]

Page 11: Query optimization in Apache Tajo

Query Execution Steps

● Finalizing the query execution
  ⑦ Store the result on storage
  ⑧ Notify that query execution is completed
  ⑨ Send the result location
  ⑩ Read the result

[Figure: Tajo Client, Tajo Master, Storage, and three Tajo Workers, each with a Query Executor and a Storage Service; one worker also shows the Query Master]

Page 12: Query optimization in Apache Tajo

Query Processing in Tajo


Page 13: Query optimization in Apache Tajo

Query Execution Plan

● Given a user query, a query execution plan is an ordered set of steps to execute the query
  ○ Example
    ■ Read data from storage, then join on some join keys, and finally aggregate with some aggregation keys
● In Tajo, there are three kinds of query plans
  ○ The query master generates a logical query plan and a distributed query plan
  ○ The query executor of each Tajo worker generates a local query plan

Page 14: Query optimization in Apache Tajo

Query Planning Steps in Tajo

[Figure: planning pipeline]
● Query Master
  SQL → SQL Analyzer → Algebraic Expression → Logical Planner → Logical Query Plan → Global Planner → Distributed Query Plan
● Distributed to Tajo workers
● Query Executor
  Physical Planner → Local Query Plan

Page 15: Query optimization in Apache Tajo

Logical Query Plan

● A tree of relational algebra operators
● Example

< SQL >
  SELECT item.brand, sum(price)
  FROM sales, item
  WHERE sales.item_key = item.item_key
  GROUP BY item.brand;

< Logical query plan >
  Group by (key: brand, func: sum(price))
    Join (key: item_key)
      Scan on sales
      Scan on item

Page 16: Query optimization in Apache Tajo

Distributed Query Plan

● A plan with additional annotations for distributed execution
  ○ Data exchange (shuffle) keys, methods, ...

< Logical query plan >
  Group by (key: brand, func: sum(price))
    Join (key: item_key)
      Scan on sales
      Scan on item

< Distributed query plan >
  Group by (key: brand, func: sum(price))
    ← Range shuffle with brand
    Join (key: item_key)
      ← Hash shuffle with item_key ← Scan on sales
      ← Hash shuffle with item_key ← Scan on item

Page 17: Query optimization in Apache Tajo

Local Query Plan

● A plan with additional annotations for local execution
  ○ In-memory algorithm, disk-based algorithm, …

< Distributed query plan >
  Group by (key: brand, func: sum(price))
    ← Range shuffle with brand
    Join (key: item_key)
      ← Hash shuffle with item_key ← Scan on sales
      ← Hash shuffle with item_key ← Scan on item

< Local query plan >
  Group by (key: brand, func: sum(price)) [Hash aggregation]
    ← Range shuffle with brand
    Join (key: item_key) [Sort-merge join]
      ← Hash shuffle with item_key ← Scan on sales
      ← Hash shuffle with item_key ← Scan on item

Page 18: Query optimization in Apache Tajo

Query Processing in Tajo

● A query is executed by running multiple stages in sequence
  ○ A stage is the minimum unit of execution and contains at least one operator
● Each stage is processed in parallel by multiple query executors of Tajo workers

[Figure: a join plan (Join with key item_key over Scan on item and Scan on sales) divided into Stage 1 and Stage 2]

Page 19: Query optimization in Apache Tajo

Query Processing Example

● SQL
  SELECT item.brand, sum(price)
  FROM sales, item
  WHERE sales.item_key = item.item_key
  GROUP BY item.brand;

● Logical query plan
  Group by (key: brand, func: sum(price))
    Join (key: item_key)
      Scan on sales
      Scan on item

Page 20: Query optimization in Apache Tajo

Query Processing Example

● Logical query plan
  Group by (key: brand, func: sum(price))
    Join (key: item_key)
      Scan on sales
      Scan on item

● Distributed query plan (divided into Stage 1, Stage 2, and Stage 3)
  Group by (key: brand, func: sum(price))
    ← Range shuffle with brand
    Join (key: item_key)
      ← Hash shuffle with item_key ← Scan on sales
      ← Hash shuffle with item_key ← Scan on item

Page 21: Query optimization in Apache Tajo

Query Processing Example

● Distributed query plan (divided into Stage 1, Stage 2, and Stage 3)
  Group by (key: brand, func: sum(price))
    ← Range shuffle with brand
    Join (key: item_key)
      ← Hash shuffle with item_key ← Scan on sales
      ← Hash shuffle with item_key ← Scan on item

● Distributed processing
  [Figure: five workers each run a Scan task over the inputs item, item, sales, sales, sales]

Page 22: Query optimization in Apache Tajo

Query Processing Example

● Distributed query plan (divided into Stage 1, Stage 2, and Stage 3)
  Group by (key: brand, func: sum(price))
    ← Range shuffle with brand
    Join (key: item_key)
      ← Hash shuffle with item_key ← Scan on sales
      ← Hash shuffle with item_key ← Scan on item

● Distributed processing
  [Figure: five workers each run a Scan task over the inputs item, item, sales, sales, sales; after a shuffle, five workers each run a Join task]

Page 23: Query optimization in Apache Tajo

Query Processing Example

● Distributed query plan (divided into Stage 1, Stage 2, and Stage 3)
  Group by (key: brand, func: sum(price))
    ← Range shuffle with brand
    Join (key: item_key)
      ← Hash shuffle with item_key ← Scan on sales
      ← Hash shuffle with item_key ← Scan on item

● Distributed processing
  [Figure: five workers each run a Scan task over the inputs item, item, sales, sales, sales; after a shuffle, five workers each run a Join task; after a second shuffle, five workers each run a Group by task]

Page 24: Query optimization in Apache Tajo

Query Optimization in Tajo


Page 25: Query optimization in Apache Tajo

Query Optimization

● Mostly, user queries are not optimized for performance
● The query optimizer attempts to determine the most efficient way to execute a user query
  ○ Considering the possible query plans, and choosing the best one

Page 26: Query optimization in Apache Tajo

Extreme Example

● Query
  ○ select * from t where name like 'tajo%' order by id;
● Possible plans
  ● Naive plan: Scan → Sort → Filter
    ○ Filtering out tuples after sort
    ○ Large cost for sort
  ● Better plan: Scan with Filter → Sort
    ○ Filtering out tuples immediately after scan
    ○ Small cost for sort
    ○ Reduced number of operations
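The difference between the two plans can be sketched in a few lines of Python. This is only an illustration of the idea, not Tajo code: the table contents and the helper names (`matches`, `naive_plan`, `better_plan`) are made up, and the number of rows fed to the sort is used as a rough stand-in for sort cost.

```python
# Hypothetical in-memory rows standing in for table `t`; the data is made up.
rows = [
    {"id": 3, "name": "tajo-worker"},
    {"id": 1, "name": "hive"},
    {"id": 2, "name": "tajo-master"},
    {"id": 4, "name": "spark"},
]


def matches(row):
    # Equivalent of the predicate: name LIKE 'tajo%'
    return row["name"].startswith("tajo")


def naive_plan(table):
    # Scan -> Sort -> Filter: every scanned row goes through the sort.
    sorted_rows = sorted(table, key=lambda r: r["id"])
    return [r for r in sorted_rows if matches(r)], len(table)


def better_plan(table):
    # Scan with Filter -> Sort: only matching rows are sorted.
    filtered = [r for r in table if matches(r)]
    return sorted(filtered, key=lambda r: r["id"]), len(filtered)


naive_result, naive_sorted = naive_plan(rows)
better_result, better_sorted = better_plan(rows)
print(naive_sorted, better_sorted)  # rows fed to the sort in each plan
```

Both plans return the same rows; the better plan simply sorts fewer of them.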

Page 27: Query optimization in Apache Tajo

Two Kinds of Query Optimization

● Rule-based optimization
  ○ A set of predefined rules is used to choose a good plan
  ○ Usually, heuristic approaches are used
    ■ Ex) filters should be pushed down to the lower part of the query plan as much as possible
● Cost-based optimization
  ○ Enumerating possible query plans and choosing the one having the lowest cost
  ○ Cost function has an important role
● Tajo utilizes both types of optimization

Page 28: Query optimization in Apache Tajo

Query Optimization in Tajo

● Difference from traditional query optimization
  ○ Unlike in traditional database systems, pre-collected statistics are not so important
    ■ Data may be added or updated by several systems including Flume, Kafka, Tajo, …
    ■ Pre-collected statistics can be useful, but are not fully trustworthy
  ○ It is important to optimize query plans with minimal statistics
    ■ Volume of input relations

Page 29: Query optimization in Apache Tajo

Query Optimization in Tajo

● Tajo has two different approaches for query optimization
  ○ Static optimization
    ■ Traditional approach
    ■ Optimizing the plan during the query planning phase
  ○ Progressive optimization
    ■ Optimizing the plan based on the intermediate statistics while executing the query
      ● A query plan can be optimized without pre-collected statistics
      ● Especially effective for queries which require multiple stage execution

Page 30: Query optimization in Apache Tajo

Logical Query Plan Optimization

● Rule-based optimization
  ○ Access path rewrite rule
    ■ Choosing the access path to data
    ■ Index scan has the highest priority if available
  ○ Distributivity rule
    ■ Reducing filters based on distributivity
  ○ Filter pushdown rule
    ■ Pushing down filters to the lowest part as much as possible
  ○ In-subquery rewrite rule
    ■ Transforming subqueries in 'IN' filters to semi (anti) joins

Page 31: Query optimization in Apache Tajo

Logical Query Plan Optimization

● Rule-based optimization (cont'd)
  ○ Projection pushdown rule
    ■ Pushing down projections to the lowest part as much as possible
● Cost-based optimization
  ○ Join order optimization
    ■ Finding the join order with the lowest cost
    ■ Greedy heuristic: ordering relations from small ones to large ones
      ● Very effective in a single computing environment
      ● Needs improvement for parallel computing environments
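The smallest-first idea behind the greedy heuristic can be sketched as below. This is a simplified illustration under stated assumptions, not Tajo's planner: the function name `greedy_join_order` and the relation sizes are hypothetical, and a real optimizer would also have to account for join conditions and selectivity.

```python
def greedy_join_order(relation_sizes):
    """Return relation names ordered from the smallest to the largest size.

    relation_sizes maps a relation name to its estimated volume
    (hypothetical numbers; any consistent unit works).
    """
    return sorted(relation_sizes, key=lambda name: relation_sizes[name])


# Made-up sizes, for illustration only.
sizes = {"lineitem": 6_000_000, "nation": 25, "region": 5}
order = greedy_join_order(sizes)
print(order)
```

Joining small relations first tends to keep intermediate results small, which is the intuition the slide refers to.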

Page 32: Query optimization in Apache Tajo

Distributed Query Plan Optimization

● Rule-based optimization
  ○ Two-phase execution of operators
    ■ Operators which require data shuffling, like aggregation, join, or sort, are executed in two phases
    ■ The first phase is local computation to reduce the amount of shuffled data
    ■ The second phase produces the result of the operation

Page 33: Query optimization in Apache Tajo

Two-phase Execution Example

● Logical query plan
  Scan → Group by → Sort

● Distributed query plan
  [Figure: the same plan divided into Stage 1, Stage 2, and Stage 3, with a "Local group by" added before the Group by and a "Local sort" added before the Sort]
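A two-phase aggregation can be sketched as follows. This is a minimal illustration, not Tajo's implementation: the function names and data are hypothetical, and a hash shuffle is used here for simplicity even though the earlier example plan shows a range shuffle for the group-by key.

```python
from collections import defaultdict


def local_group_by(rows):
    # Phase 1: each worker computes partial sums over its own input,
    # so at most one record per key leaves the worker.
    partial = defaultdict(int)
    for brand, price in rows:
        partial[brand] += price
    return dict(partial)


def shuffle(partials, num_partitions):
    # Route every (key, partial sum) pair to a partition chosen by the key,
    # so all partial sums of one key meet in the same partition.
    partitions = [[] for _ in range(num_partitions)]
    for partial in partials:
        for brand, subtotal in partial.items():
            partitions[hash(brand) % num_partitions].append((brand, subtotal))
    return partitions


def final_group_by(partition):
    # Phase 2: merge the partial sums into the final result.
    total = defaultdict(int)
    for brand, subtotal in partition:
        total[brand] += subtotal
    return dict(total)


# Made-up input: two workers, three (brand, price) rows each.
worker_inputs = [
    [("acme", 10), ("acme", 5), ("zen", 7)],
    [("acme", 1), ("zen", 3), ("zen", 2)],
]
partials = [local_group_by(rows) for rows in worker_inputs]
partitions = shuffle(partials, num_partitions=2)
result = {}
for partition in partitions:
    result.update(final_group_by(partition))
shuffled_records = sum(len(p) for p in partitions)
print(result, shuffled_records)
```

In this toy input, six rows are reduced to four shuffled records because each worker sends one partial sum per brand instead of every row.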

Page 34: Query optimization in Apache Tajo

Distributed Query Plan Optimization

● Distributed join algorithm selection
  ○ Two representative distributed join algorithms
    ■ Join cannot be performed within a single stage in distributed systems
      ● Tuples of the same join key may be distributed over cluster nodes
    ■ Repartition join
      ● Both input relations are shuffled with the join key columns
    ■ Broadcast join
      ● Small relations are broadcast to every node before join

Page 35: Query optimization in Apache Tajo

Example of Repartition Join

● select … from employee e, department d where e.DeptName = d.DeptName


Page 36: Query optimization in Apache Tajo

Example of Broadcast Join

● select … from employee e, department d where e.DeptName = d.DeptName

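The two join strategies for the employee/department query can be sketched as below. This is an illustration under stated assumptions, not Tajo code: the rows, the split layout, and all function names are hypothetical, and each "node" is simulated by a plain Python list.

```python
from collections import defaultdict


def hash_join(employees, departments):
    # Local hash join on DeptName: build on departments, probe with employees.
    by_dept = defaultdict(list)
    for dept_name, manager in departments:
        by_dept[dept_name].append(manager)
    return [
        (name, dept_name, manager)
        for name, dept_name in employees
        for manager in by_dept.get(dept_name, [])
    ]


def repartition_join(emp_splits, dept_splits, num_nodes):
    # Shuffle BOTH inputs by the join key so matching tuples meet on one node.
    emp_parts = [[] for _ in range(num_nodes)]
    dept_parts = [[] for _ in range(num_nodes)]
    for split in emp_splits:
        for name, dept_name in split:
            emp_parts[hash(dept_name) % num_nodes].append((name, dept_name))
    for split in dept_splits:
        for dept_name, manager in split:
            dept_parts[hash(dept_name) % num_nodes].append((dept_name, manager))
    joined = []
    for emp_part, dept_part in zip(emp_parts, dept_parts):
        joined.extend(hash_join(emp_part, dept_part))
    return joined


def broadcast_join(emp_splits, dept_splits):
    # Copy the whole small relation to every node; the large one stays in place.
    all_departments = [row for split in dept_splits for row in split]
    joined = []
    for split in emp_splits:
        joined.extend(hash_join(split, all_departments))
    return joined


# Made-up data: employee is (name, DeptName), department is (DeptName, manager).
emp_splits = [[("ann", "sales"), ("bob", "eng")], [("cho", "eng"), ("dee", "hr")]]
dept_splits = [[("sales", "kim")], [("eng", "lee")]]

repartitioned = sorted(repartition_join(emp_splits, dept_splits, num_nodes=2))
broadcasted = sorted(broadcast_join(emp_splits, dept_splits))
print(repartitioned == broadcasted)
```

Both strategies produce the same rows; they differ in which data moves across the network, which is why the relation sizes drive the choice.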

Page 37: Query optimization in Apache Tajo

Distributed Join Algorithm Selection

● Repartition join vs. broadcast join
  ○ Given a set of joins, some parts can be executed with broadcast join while the remaining parts are executed with repartition join
● Which parts will be executed with broadcast join?
  ○ Greedy heuristic: broadcast join is used as much as possible
    ■ The size of an input relation should be smaller than a predefined threshold
    ■ The total volume of broadcast relations should not exceed a predefined threshold
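The two threshold checks can be sketched as a small greedy selection. This is a hedged sketch, not Tajo's actual algorithm: the function name, the parameter names, the sizes, and the choice to consider relations from smallest to largest are all assumptions made for illustration.

```python
def choose_broadcast(relation_sizes, per_relation_limit, total_limit):
    """Pick relations to broadcast; the rest would use repartition join.

    Assumption: candidates are visited from smallest to largest so that as
    many relations as possible fit under the total limit.
    """
    broadcast, total = [], 0
    for name, size in sorted(relation_sizes.items(), key=lambda kv: kv[1]):
        # Check 1: the relation itself is smaller than the per-relation threshold.
        # Check 2: the running total of broadcast data stays within the total threshold.
        if size < per_relation_limit and total + size <= total_limit:
            broadcast.append(name)
            total += size
    return broadcast


# Made-up sizes in MB, for illustration only.
sizes_mb = {"lineitem": 700_000, "nation": 1, "region": 1, "supplier": 4, "part": 6}
chosen = choose_broadcast(sizes_mb, per_relation_limit=5, total_limit=5)
print(chosen)
```

Here `supplier` passes the per-relation check but is skipped because adding it would push the total over the limit.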

Page 38: Query optimization in Apache Tajo

Distributed Join Algorithm Selection Example

● select … from lineitem, nation, region …


Page 39: Query optimization in Apache Tajo

Local Query Plan Optimization

● Selecting the best algorithm based on the current resource status
  ○ Aggregation
    ■ Hash aggregation, sort aggregation
  ○ Join
    ■ Hash join, sort-merge join
● For sort, hash sort is basically used, spilling data to disk when it doesn't fit into memory
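One way such a selection could look is sketched below. The slide does not state the actual decision rule, so the rule used here (prefer the hash-based algorithm when the estimated data fits in available memory, otherwise fall back to the sort-based one) is purely an assumption, as are the function and parameter names.

```python
def choose_aggregation(estimated_bytes, available_memory_bytes):
    # Assumed rule: hash aggregation needs its hash table to fit in memory.
    if estimated_bytes <= available_memory_bytes:
        return "hash aggregation"
    return "sort aggregation"


def choose_join(estimated_build_side_bytes, available_memory_bytes):
    # Assumed rule: hash join needs the build side to fit in memory.
    if estimated_build_side_bytes <= available_memory_bytes:
        return "hash join"
    return "sort-merge join"


memory = 512 * 1024 * 1024  # hypothetical 512 MB budget for the operator
print(choose_aggregation(64 * 1024 * 1024, memory))
print(choose_join(4 * 1024 * 1024 * 1024, memory))
```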

Page 40: Query optimization in Apache Tajo

Progressive Optimization

● Data repartition
  ○ Some operators like join or aggregation require shuffling data by keys
  ○ The number of result partitions of a shuffle should be carefully decided
    ■ The number of partitions is related to the number of tasks of the next stage
● At the beginning of each stage, the number of partitions is decided based on the input size

Page 41: Query optimization in Apache Tajo

Progressive Optimization Example

● If the default task size is 1GB,

[Figure: two panels showing the same three-stage plan (Scan on item, Group by, Sort; Stage 1 to Stage 3)]
● Scan on item reads 100GB → # of partitions: 100 → # of tasks: 100
● Group by produces 50GB → # of partitions: 50 → # of tasks: 50
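The arithmetic in this example is simply the input size divided by the task size. A minimal sketch, assuming the result is rounded up and never drops below one partition (both assumptions; the function name is hypothetical):

```python
import math


def num_partitions(input_size_gb, task_size_gb=1):
    # One partition (and hence one task of the next stage) per task-sized chunk.
    return max(1, math.ceil(input_size_gb / task_size_gb))


after_scan = num_partitions(100)     # Scan on item reads 100GB
after_group_by = num_partitions(50)  # Group by produces 50GB
print(after_scan, after_group_by)
```

Because the size of each stage's output is known only at run time, the partition count for the next stage can be decided without pre-collected statistics.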

Page 42: Query optimization in Apache Tajo

Future Work

● Adding more optimization methods
● Improving cost functions for more effective cost-based optimization
● Adding new approaches for progressive optimization
  ○ Runtime query rewriting
  ○ Integrating with genetic algorithm
  ○ …

Page 43: Query optimization in Apache Tajo

Get Involved!

● General
  ○ http://tajo.apache.org
● Getting Started
  ○ http://tajo.apache.org/docs/current/getting_started.html
● Downloads
  ○ http://tajo.apache.org/downloads.html
● Jira – Issue Tracker
  ○ https://issues.apache.org/jira/browse/TAJO
● Join the mailing list
  ○ [email protected]
  ○ [email protected]

Page 44: Query optimization in Apache Tajo


Thanks!