TRANSCRIPT
QUERY OPTIMIZATION IN MICROSOFT SQL SERVER PDW
MICROSOFT CORPORATION
Naveen Baskaran
THE PAPER EXPLAINS
Why do we need massively parallel processing (MPP)?
What is PDW?
Why is query optimization (QO) complex in PDW?
What changes were made to the SQL Server optimizer?
How is cost calculated?
PDW
Composed of Hardware and Software
Shared-nothing (loosely coupled) architecture
SQL Server uses symmetric multiprocessing (SMP): only one server
MPP runs several servers in parallel and independently
Cost effective
Easy to add extra servers and storage
Components (CPUs, memory, storage, etc.) can be easily upgraded or individually replaced
CONTROL NODE & COMPUTE NODE
Control Node:
Distributes queries among the compute nodes
Accepts client connections: ODBC, OLE DB, ADO.NET
Contains additional software to support the distributed architecture of PDW
Manages DMS, the communication layer between the nodes
Contains the shell database
Performs database authentication and authorization
Compute Node:
Hosts a single SQL Server instance
Handles communication and data transfer
Each node stores a portion of the user data (hash-partitioned)
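The hash-partitioned placement of user data can be sketched as follows. This is an illustrative Python toy, not PDW code; the node count, table contents, and function names are made up:

```python
# Sketch (not PDW internals): each compute node owns the rows whose
# distribution-column hash maps to it. NUM_NODES is an assumed appliance size.

NUM_NODES = 4

def node_for(key, num_nodes=NUM_NODES):
    """Pick the compute node responsible for a distribution-column value."""
    return hash(key) % num_nodes

def distribute(rows, key_col, num_nodes=NUM_NODES):
    """Split rows into per-node partitions by hashing the distribution column."""
    partitions = [[] for _ in range(num_nodes)]
    for row in rows:
        partitions[node_for(row[key_col], num_nodes)].append(row)
    return partitions

# A made-up Orders table distributed on o_orderkey:
orders = [{"o_orderkey": i, "o_custkey": i % 10} for i in range(100)]
parts = distribute(orders, "o_orderkey")
assert sum(len(p) for p in parts) == len(orders)  # every row lands on exactly one node
```

The key property this models is that a given distribution-column value always maps to the same node, which is what makes co-located joins possible.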
SHELL DATABASE
Contains metadata of user tables
No user data
Indistinguishable from a database containing the actual data
Used for testing and debugging of compilation issues
Additionally stores users and privileges
Used to check security and access rights
Provides the same security model as SQL Server
Contains global statistics
Local statistics are computed on each node and merged into the global statistics
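The slide notes that statistics are kept globally in the shell database while being calculated locally on each node. A minimal sketch of merging per-node statistics into global ones (an assumption for illustration, not PDW's actual algorithm; the field names are made up):

```python
# Sketch: combine per-node (local) statistics into appliance-wide (global)
# statistics, as the shell database would hold them. Not PDW's real code.

def merge_stats(local_stats):
    """Merge per-node statistics dicts into one global statistics dict."""
    return {
        "row_count": sum(s["row_count"] for s in local_stats),  # counts add up
        "max_value": max(s["max_value"] for s in local_stats),  # max of maxes
    }

global_stats = merge_stats([
    {"row_count": 10, "max_value": 7},   # node 1
    {"row_count": 20, "max_value": 9},   # node 2
])
assert global_stats == {"row_count": 30, "max_value": 9}
```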
DATA MOVEMENT SERVICE
Responsible for moving data between all the nodes of the appliance
In one case, intermediate results move from one compute node to another
In another case, one or more compute nodes move intermediate results to the control node
The control node computes final aggregations and sorting before returning the results
Uses temp tables to store intermediate results
In some cases, queries produce final results directly, without intermediate tables, and these are sent straight back to the client (DMS is not involved)
DSQL PLAN AND ITS EXECUTION
A DSQL plan contains:
SQL operations: SQL statements executed directly in SQL Server on the compute nodes
DMS operations: move data among the nodes
Temp table operations: stage intermediate results
Return operations: push data back to the client
Query plans are executed serially, one step at a time
However, each step runs in parallel across the compute nodes
DSQL PLAN EXAMPLE
Assume the Customer table is distributed on c_custkey and the Orders table on o_orderkey.
SELECT c_custkey, o_orderdate
FROM Orders, Customer
WHERE o_custkey = c_custkey AND o_totalprice > 100
The tables are not partition-compatible for this join: Orders is distributed on o_orderkey, not on the join column o_custkey, so matching rows may reside on different nodes.
DSQL plan:
1. DMS operation: repartition (shuffle) Orders on o_custkey so it is co-located with Customer
2. Return SQL operation: each node joins its local partitions and the final tuples are sent to the client
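The two-step plan can be sketched as a toy simulation. This is an illustrative Python model under simplifying assumptions (two nodes, made-up rows), not how PDW executes the steps:

```python
# Toy model of the example DSQL plan: shuffle Orders on o_custkey so it is
# partition-compatible with Customer, then join locally on each node.

NUM_NODES = 2  # assumed appliance size

def shuffle(rows, key_col, num_nodes=NUM_NODES):
    """DMS shuffle move: re-hash rows on a new distribution column."""
    parts = [[] for _ in range(num_nodes)]
    for row in rows:
        parts[hash(row[key_col]) % num_nodes].append(row)
    return parts

# Customer distributed on c_custkey; Orders arrive distributed on o_orderkey.
customer_parts = shuffle([{"c_custkey": c} for c in range(6)], "c_custkey")
orders = [{"o_orderkey": o, "o_custkey": o % 6, "o_totalprice": 150}
          for o in range(12)]

# Step 1: DMS operation shuffles Orders on o_custkey (into per-node temp tables).
order_parts = shuffle(orders, "o_custkey")

# Step 2: each node joins its co-located partitions; tuples go to the client.
result = [(c["c_custkey"], o["o_orderkey"])
          for cp, op in zip(customer_parts, order_parts)
          for c in cp for o in op
          if c["c_custkey"] == o["o_custkey"] and o["o_totalprice"] > 100]
assert len(result) == 12  # every order matches its customer
```

After the shuffle, both tables hash the same join-key values to the same node, so no further data movement is needed for the join itself.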
DSQL PLAN GENERATION
Input: Physical operator tree
Output: DSQL formatted plan
Framework: QRel Programming
Sends SQL queries to the compute nodes instead of an operator tree (unlike other MPP systems, e.g., Greenplum)
SQL statements are executed on the underlying compute nodes, and DMS operations are used to transfer data
Similar to the Aster Data approach
7 DATA MOVEMENT OPERATIONS:
1. Shuffle Move (many-to-many). Rows are moved from each compute node to target table based on a hash of the value in the specified distribution column.
2. Partition Move (many-to-one). Rows are moved from each compute node to the target table on the target node (typically the control node but this is not a requirement).
3. Control-Node Move (From the control node to the compute nodes). A table in the control node is replicated to all compute nodes.
4. Broadcast Move. Rows are moved from each compute node to the target table on all compute nodes.
5. Trim Move. Initiated against a replicated table on all compute nodes when the destination is a distributed table on the same nodes. Rows are hashed so that each node keeps only the rows it is responsible for.
6. Replicated broadcast. A table that resides on only one compute node is replicated to all compute nodes via a broadcast move.
7. Remote copy to single node. Either a remote copy of a replicated table (from the control node or a compute node) or a remote copy of a distributed table.
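Of these operations, the broadcast move (item 4) is the simplest to model: every node receives a full copy of the rows. A minimal sketch, with made-up table contents and node count (not PDW code):

```python
# Sketch of a broadcast move: every compute node receives a full copy of the
# source rows, so each node can join against them locally without any
# further data movement.

def broadcast(rows, num_nodes):
    """Replicate the same rows to every compute node."""
    return [list(rows) for _ in range(num_nodes)]

dim_rows = [{"d_key": k} for k in range(3)]  # a small made-up table
copies = broadcast(dim_rows, 4)
assert len(copies) == 4 and all(copy == dim_rows for copy in copies)
```

Broadcasting trades network and storage cost (every node stores the whole table) for join locality, which is why it suits small tables.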
COST OF DMS OPERATIONS
Separated into two components: source (sending side) and target (receiving side)
Source and target costs are divided into sub-components
Source: Csource = max(Creader, Csending)
Target: Ctarget = max(Cwriter, Creceiving)
DMS operation cost: CDMS = max(Csource, Ctarget)
Data transmission happens asynchronously
Source and target operations run in parallel on each node
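The max-based cost model follows from this parallelism: within each side the sub-components overlap, and the two sides overlap with each other. A toy evaluation (sub-component names and values are illustrative assumptions, not PDW's calibrated constants):

```python
# Sketch of the DMS cost model: each side costs its slowest sub-component,
# and the two sides run in parallel, so the operation costs the slower side.

def dms_cost(c_reader, c_sending, c_writer, c_receiving):
    """Cost of one DMS operation from per-node sub-component costs."""
    c_source = max(c_reader, c_sending)      # sending side
    c_target = max(c_writer, c_receiving)    # receiving side
    return max(c_source, c_target)           # sides overlap in parallel

# Example: the writer dominates, so it alone determines the cost of the move.
assert dms_cost(c_reader=2.0, c_sending=3.0, c_writer=5.0, c_receiving=3.0) == 5.0
```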