best partitioning
Post on 05-Sep-2015
-
Get the best out of Oracle Partitioning
Yasin Mohammed, Technology Consultant, yasin.mohammed@oracle.com
Nirmal Grewal, Technology Sales Representative, nirmal.grewal@oracle.com
-
Agenda
Partitioning in a nutshell
Getting optimal pruning
Partition exchange loading
Partitioning and unusable indexes
Efficient statistics management
Q&A
-
The Concept of Partitioning
Simple Yet Powerful
Large Table: Difficult to Manage
Partition: Divide and Conquer; Easier to Manage; Improve Performance
Composite Partition: Better Performance; More flexibility to match business needs
Transparent to applications
-
What is Oracle Partitioning?
It is
Powerful functionality to logically partition objects into smaller pieces
Driven only by business requirements
Partitioning for Performance, Manageability, and Availability
It is not
Just a way to physically divide (or clump) any large data set into smaller buckets
An enabling prerequisite to support a specific hardware/software design
(for example, hash distribution is mandatory for shared-nothing systems)
-
Physical versus Logical Partitioning
Shared Nothing Architecture
Physical Partitioning
Fundamental system setup requirement
Node owns piece of DB
Enables parallelism
Number of partitions is equivalent to min. parallelism
Always needs HASH distribution
Equally sized partitions per node required for proper load balancing
[Diagram: three separate DB pieces, one per node]
-
Physical versus Logical Partitioning
Shared Everything Architecture - Oracle
Logical Partitioning
Is not subject to any architectural constraints
SMP, MPP, Cluster, or Grid: it does not matter
Purely based on the business requirement
Availability, Manageability, Performance
Beneficial for every environment
Provides the most comprehensive functionality
[Diagram: a single shared DB]
-
Agenda: Getting optimal pruning
-
Partition Pruning
Q: What was the total sales for the three days May 20 - 22, 2008?
Select sum(sales_amount)
From SALES
Where sales_date >= to_date('05/20/2008','MM/DD/YYYY')
And sales_date < to_date('05/23/2008','MM/DD/YYYY');
Only the 3 relevant partitions are accessed
[Diagram: Sales table with daily partitions May 18th 2008 through May 24th 2008]
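As a rough illustration of the slide above, the following sketch assumes a SALES table that is range-partitioned by day. The table definition, partition names, and column types are assumptions and are not taken from the deck; only the shape of the query mirrors the slide.

```sql
-- Hypothetical daily range-partitioned table; names and types are assumed.
CREATE TABLE sales (
  sales_date   DATE NOT NULL,
  sales_amount NUMBER(10,2)
)
PARTITION BY RANGE (sales_date) (
  PARTITION may_18_2008 VALUES LESS THAN (TO_DATE('05/19/2008','MM/DD/YYYY')),
  PARTITION may_19_2008 VALUES LESS THAN (TO_DATE('05/20/2008','MM/DD/YYYY')),
  PARTITION may_20_2008 VALUES LESS THAN (TO_DATE('05/21/2008','MM/DD/YYYY')),
  PARTITION may_21_2008 VALUES LESS THAN (TO_DATE('05/22/2008','MM/DD/YYYY')),
  PARTITION may_22_2008 VALUES LESS THAN (TO_DATE('05/23/2008','MM/DD/YYYY')),
  PARTITION may_23_2008 VALUES LESS THAN (TO_DATE('05/24/2008','MM/DD/YYYY')),
  PARTITION may_24_2008 VALUES LESS THAN (TO_DATE('05/25/2008','MM/DD/YYYY'))
);

-- Half-open range on the bare partition key: only the partitions for
-- May 20, 21 and 22 can contain matching rows.
SELECT SUM(sales_amount)
FROM   sales
WHERE  sales_date >= TO_DATE('05/20/2008','MM/DD/YYYY')
AND    sales_date <  TO_DATE('05/23/2008','MM/DD/YYYY');
```

Note that BETWEEN is inclusive at both ends, so a predicate ending at midnight on 05/23/2008 would also touch the May 23 partition; the half-open form keeps the access to exactly three partitions.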
-
Partition Pruning
Works for simple and complex SQL statements
Support for every data access
Transparent to any application
No extra coding required
Two flavors of pruning
Static pruning at compile time
Dynamic pruning at runtime
Complementary to Exadata Storage Server
Partitioning prunes logically through partition elimination
Exadata prunes physically through storage indexes
Further data reduction through filtering and projection
-
Static Partition Pruning
Relevant partitions are known at compile time
Look for actual values in the PSTART/PSTOP columns in the plan
Optimizer has the most accurate information for the SQL statement
SELECT sum(amount_sold) FROM sales
WHERE time_id
BETWEEN TO_DATE('01-MAR-2004','DD-MON-YYYY') and TO_DATE('31-MAY-2004','DD-MON-YYYY');
[Diagram: partitions 04-Jan, 04-Feb, 04-Mar, 04-Apr, 04-May, 04-Jun]
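One way to check for static pruning is to explain the statement and look at the Pstart/Pstop columns. The sketch below assumes a hypothetical monthly partitioned table (SALES_MONTHLY and its partition names are illustrative, not from the deck) and uses ANSI date literals so the bounds do not depend on session NLS settings.

```sql
-- Hypothetical monthly range-partitioned table for the first half of 2004.
CREATE TABLE sales_monthly (
  time_id     DATE NOT NULL,
  amount_sold NUMBER(10,2)
)
PARTITION BY RANGE (time_id) (
  PARTITION p_2004_01 VALUES LESS THAN (DATE '2004-02-01'),
  PARTITION p_2004_02 VALUES LESS THAN (DATE '2004-03-01'),
  PARTITION p_2004_03 VALUES LESS THAN (DATE '2004-04-01'),
  PARTITION p_2004_04 VALUES LESS THAN (DATE '2004-05-01'),
  PARTITION p_2004_05 VALUES LESS THAN (DATE '2004-06-01'),
  PARTITION p_2004_06 VALUES LESS THAN (DATE '2004-07-01')
);

-- Literal bounds on the partition key are known at parse time.
EXPLAIN PLAN FOR
SELECT SUM(amount_sold)
FROM   sales_monthly
WHERE  time_id BETWEEN DATE '2004-03-01' AND DATE '2004-05-31';

-- Show the plan including the Pstart/Pstop columns.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, NULL, 'BASIC +PARTITION'));
```

With static pruning, Pstart and Pstop would be expected to show partition positions (here, 3 and 5) rather than the word KEY.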
-
Static Pruning
Sample plan
-
Dynamic Partition Pruning
Advanced pruning mechanism for complex queries
Recursive statement evaluates the relevant partitions at runtime
Look for the word KEY in the PSTART/PSTOP columns in the plan
SELECT sum(amount_sold)
FROM sales s, times t
WHERE t.time_id = s.time_id
AND t.calendar_month_desc IN
('MAR-2004', 'APR-2004', 'MAY-2004');
[Diagram: Sales table partitions 04-Jan through 04-Jun, joined to the Time table]
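A sketch of how dynamic pruning might be observed, assuming the SH sample schema is installed. The month values use the 'YYYY-MM' form, which is an assumption about how CALENDAR_MONTH_DESC is populated in SH.TIMES; the slide's 'MAR-2004' style is illustrative.

```sql
-- The filter is on the dimension table, so the relevant SALES partitions
-- are only known once the matching TIME_ID values are found at runtime.
EXPLAIN PLAN FOR
SELECT SUM(s.amount_sold)
FROM   sh.sales s, sh.times t
WHERE  t.time_id = s.time_id
AND    t.calendar_month_desc IN ('2000-03', '2000-04', '2000-05');

-- Show the plan including the Pstart/Pstop columns.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, NULL, 'BASIC +PARTITION'));
```

Depending on the method the optimizer picks, Pstart/Pstop may show KEY (nested loop), KEY(SQ) (subquery pruning), or a bloom filter name such as :BF0000.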
-
Sample explain plan output
Dynamic Partition Pruning
Nested Loop
Sample plan
-
Sample plan
Dynamic Partition Pruning
Subquery pruning
-
Sample plan
Dynamic Partition Pruning
Bloom filter pruning
-
Enhanced Pruning Capabilities
Oracle Database 11g Release 2
Extended modeling capabilities for better data
placement and pruning
Support for virtual columns as primary and foreign key for
Reference Partitioning
Enhanced optimizer support for Partitioning
AND pruning
Intelligent multi-branch execution plan with unusable index
partitions
-
AND Pruning
All predicates on the partition key will be used for pruning
Dynamic and static predicates are now combined
A.k.a. multi-predicate pruning
Example:
Star transformation with pruning predicate on both the FACT table and a dimension
FROM sales s, times t
WHERE s.time_id = t.time_id ..
AND t.fiscal_year in (2000,1999)   -- dynamic pruning
AND s.time_id
between TO_DATE('01-JAN-1999','DD-MON-YYYY')
and TO_DATE('01-JAN-2000','DD-MON-YYYY')   -- static pruning
-
AND Pruning
Sample plan
-
Ensuring Partition Pruning
Don't use functions on partition key filter predicates
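To make the rule concrete, the sketch below contrasts a predicate that wraps the partition key in a function with an equivalent range predicate on the bare column. It assumes the SH sample schema, where SALES is range-partitioned on TIME_ID; the chosen month is arbitrary.

```sql
-- Function on the partition key: the optimizer generally cannot map
-- TO_CHAR(time_id) back to partition bounds, so all partitions are scanned.
SELECT SUM(amount_sold)
FROM   sh.sales
WHERE  TO_CHAR(time_id, 'YYYY-MM') = '2000-03';

-- Same rows, expressed as a range on the bare partition key:
-- the optimizer can prune to the partition(s) covering March 2000.
SELECT SUM(amount_sold)
FROM   sh.sales
WHERE  time_id >= DATE '2000-03-01'
AND    time_id <  DATE '2000-04-01';
```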
-
Agenda: Partition exchange loading
-
Partition Exchange Loading
DBA steps:
1. Create external table for flat files
2. Use CTAS command to create non-partitioned table TMP_SALES
3. Create indexes on TMP_SALES
4. Alter table Sales exchange partition May_24_2008 with table tmp_sales
5. Collect stats
Sales table now has all the data
[Diagram: Sales table with daily partitions May 18th 2008 through May 24th 2008; TMP_SALES table]
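The five steps could look roughly like the sketch below. Everything not named on the slide is an assumption: the directory object LOAD_DIR, the file name, the column list, the index name, and the premise that SALES already has an empty partition MAY_24_2008 and a local index on SALES_DATE. The exchange also expects both tables to have matching column definitions.

```sql
-- 1. External table over the flat file (directory and file are hypothetical).
CREATE TABLE sales_ext (
  sales_date   DATE,
  sales_amount NUMBER(10,2)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY load_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
    (sales_date CHAR(10) DATE_FORMAT DATE MASK "MM/DD/YYYY",
     sales_amount)
  )
  LOCATION ('sales_may_24_2008.csv')
)
REJECT LIMIT UNLIMITED;

-- 2. CTAS into a non-partitioned staging table.
CREATE TABLE tmp_sales AS
SELECT sales_date, sales_amount FROM sales_ext;

-- 3. Index the staging table to match the local index on SALES.
CREATE INDEX tmp_sales_date_idx ON tmp_sales (sales_date);

-- 4. Swap the staging table with the empty target partition.
--    This is a data dictionary operation; no rows are physically moved.
ALTER TABLE sales
  EXCHANGE PARTITION may_24_2008 WITH TABLE tmp_sales
  INCLUDING INDEXES WITHOUT VALIDATION;

-- 5. Collect statistics for the newly populated partition.
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'SALES', partname => 'MAY_24_2008');
```

WITHOUT VALIDATION skips the check that every staged row belongs in the target partition, so it is only appropriate when the load process already guarantees that.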
-
Agenda: Partitioning and unusable indexes
-
Unusable Indexes
Unusable index partitions are commonly used in environments with fast load requirements
Saves the time for index maintenance at data insertion
Unusable index segments do not consume any space (11.2)
Unusable indexes are ignored by the optimizer
SKIP_UNUSABLE_INDEXES = [TRUE | FALSE]
Partitioned indexes can be used by the optimizer even if some partitions are unusable
Prior to 11.2, this required static pruning and access to usable index partitions only
With 11.2, intelligent rewrite of queries using UNION ALL
Intelligent Multi-Branch Execution
Intelligent UNION ALL expansion in the presence of
partially unusable indexes
Transparent internal rewrite
Usable index partitions will be used
Full partition access for unusable index partitions
-
Multi-Branch Execution
Sample plan
-
Agenda: Efficient statistics management
-
Statistics Gathering
You must gather optimizer statistics
Using dynamic sampling is not an adequate solution
Statistics on global and partition level recommended
Run all queries against empty tables to populate
column usage
This helps identify which columns automatically get
histograms created on them
Optimizer statistics should be gathered after the data
has been loaded but before any indexes are created
Oracle will automatically gather statistics for indexes as they
are being created
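Recorded column usage can be inspected before gathering statistics. The sketch below assumes the SH sample schema and a release that provides DBMS_STATS.REPORT_COL_USAGE (to the best of our knowledge, 11.2.0.2 onward; it is not available in earlier releases).

```sql
-- SQL*Plus setting: allow the full CLOB report to be displayed.
SET LONG 100000

-- Report which SH.SALES columns have appeared in query predicates so far.
SELECT DBMS_STATS.REPORT_COL_USAGE('SH', 'SALES') FROM dual;

-- Gather statistics; with the default METHOD_OPT, histogram
-- candidates are chosen based on the recorded column usage.
EXEC DBMS_STATS.GATHER_TABLE_STATS('SH', 'SALES');
```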
-
Efficient Statistics Management
Use AUTO_SAMPLE_SIZE
The only setting that enables new efficient statistics collection
Hash based algorithm, scanning the whole table
Speed of sampling, accuracy of compute
Enable incremental global statistics collection
Avoids scan of all partitions after changing single partitions
Prior to 11.1, scan of all partitions necessary for global stats
Managed on per table level
Static setting
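As a sketch of the first recommendation, the call below passes AUTO_SAMPLE_SIZE explicitly. It assumes the SH sample schema; since AUTO_SAMPLE_SIZE is the default in 11g, leaving the parameter out should behave the same unless a preference has been changed.

```sql
-- Gather statistics with the automatic sample size and automatic granularity.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'SH',
    tabname          => 'SALES',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    granularity      => 'AUTO');
END;
/

-- Check which sample size preference is in effect for the table.
SELECT DBMS_STATS.GET_PREFS('ESTIMATE_PERCENT', 'SH', 'SALES') AS estimate_percent
FROM   dual;
```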
-
Incremental Global Statistics
1. Partition level stats are gathered & synopsis created
2. Global stats generated by aggregating partition synopsis
[Diagram: Sales table with partitions May 18th 2008 through May 23rd 2008; Sysaux Tablespace]
-
Incremental Global Statistics Contd
3. A new partition (May 24th 2008) is added to the table & data is loaded
4. Gather partition statistics for new partition
5. Retrieve synopsis for each of the other partitions from Sysaux
6. Global stats generated by aggregating the original partition synopsis with the new one
[Diagram: Sales table with partitions May 18th 2008 through May 24th 2008; Sysaux Tablespace]
-
Steps necessary to gather accurate statistics
Turn on the incremental feature for the table
EXEC DBMS_STATS.SET_TABLE_PREFS('SH','SALES','INCREMENTAL','TRUE');
After the load, gather table statistics using GATHER_TABLE_STATS
No need to specify parameters
EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','SALES');
The command will collect statistics for partitions and update the global statistics based on the partition level statistics and synopsis
Possible to set incremental to true for all tables
Only works for already existing tables
EXEC DBMS_STATS.SET_GLOBAL_PREFS('INCREMENTAL','TRUE');
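To confirm the settings took effect, the preferences and the resulting statistics can be queried. This is a sketch assuming the SH sample schema; the note about PUBLISH, ESTIMATE_PERCENT, and GRANULARITY reflects the documented prerequisites for incremental statistics as we understand them.

```sql
-- Incremental maintenance is used when INCREMENTAL and PUBLISH are TRUE and
-- statistics are gathered with AUTO_SAMPLE_SIZE and AUTO granularity.
SELECT DBMS_STATS.GET_PREFS('INCREMENTAL',      'SH', 'SALES') AS incremental,
       DBMS_STATS.GET_PREFS('PUBLISH',          'SH', 'SALES') AS publish,
       DBMS_STATS.GET_PREFS('ESTIMATE_PERCENT', 'SH', 'SALES') AS estimate_percent,
       DBMS_STATS.GET_PREFS('GRANULARITY',      'SH', 'SALES') AS granularity
FROM   dual;

-- Partition-level statistics: when each partition was last analyzed.
SELECT partition_name, last_analyzed
FROM   all_tab_partitions
WHERE  table_owner = 'SH'
AND    table_name  = 'SALES'
ORDER  BY partition_position;

-- Table-level (global) statistics.
SELECT last_analyzed, global_stats
FROM   all_tables
WHERE  owner      = 'SH'
AND    table_name = 'SALES';
```

After a load into a single partition, only that partition's LAST_ANALYZED would be expected to change, together with the table-level value.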
Partition Advisor SQL Access Advisor
-
Summary
Partitioning in a nutshell
Getting optimal pruning
Partition exchange loading
Partitioning and unusable indexes
Efficient statistics management
Demo (Performance & availability)
-
Partitioning Demonstration
Date 16/03/2010
Data Partitioning provides significant Service Benefits
-
Scenario Partitioning for Reliability
Two interactive scenarios will demonstrate:
Query performance of partitioned vs. non-partitioned data.
Query resilience against unanticipated events affecting data availability.
-
Demo Data Overview: Sales Information
The tables below hold the same sales information, with different storage structures
Size: 5,513,058 sales entries
Table: SALES_p1
Partitioning scheme used: Initially partitioned into yearly, half-yearly, and quarterly periods
Further partitioned by country regions
Table: SALES_nop1
For comparison purposes, a similar non-partitioned table is created.
[Chart: axis from 0 to 450,000 in steps of 50,000; series not preserved in this transcript]
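The deck does not show the DDL for SALES_p1. A composite range-list table along the lines described might look like the sketch below; the table name, columns, partition bounds, and region groupings are all assumptions chosen for illustration.

```sql
-- Hypothetical composite partitioning: range by period, then list by region.
CREATE TABLE sales_p1_sketch (
  time_id     DATE         NOT NULL,
  country     VARCHAR2(40) NOT NULL,
  amount_sold NUMBER(10,2)
)
PARTITION BY RANGE (time_id)
SUBPARTITION BY LIST (country)
SUBPARTITION TEMPLATE (
  SUBPARTITION europe   VALUES ('Germany', 'France', 'Spain'),
  SUBPARTITION americas VALUES ('United States of America', 'Canada', 'Brazil'),
  SUBPARTITION other    VALUES (DEFAULT)
)
(
  -- Older data in coarse partitions, recent data in finer ones.
  PARTITION sales_1998    VALUES LESS THAN (DATE '1999-01-01'),
  PARTITION sales_1999_h1 VALUES LESS THAN (DATE '1999-07-01'),
  PARTITION sales_1999_h2 VALUES LESS THAN (DATE '2000-01-01'),
  PARTITION sales_2000_q1 VALUES LESS THAN (DATE '2000-04-01'),
  PARTITION sales_2000_q2 VALUES LESS THAN (DATE '2000-07-01')
);
```

A query filtering on both TIME_ID and COUNTRY can then be pruned at the partition and the subpartition level.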
-
Demo Data Overview : Customer Information
Below tables hold same customer information, with different storage structures
Size: 832,500 customer entries
Table: CUSTOMER_p1
Partitioning Scheme used: Partitioned by country regions
Table: CUSTOMER_nop1
For comparison purposes, a similar table is created as non-partitioned.
-
Demo Distribution of Customers across Countries
[Bar chart: COUNT(C.CUST_ID) per country, axis from 0 to 300,000]
Countries: Brazil, Denmark, Poland, South Africa, China, United Kingdom, New Zealand, Saudi Arabia, United States of America, Germany, Spain, France, Australia, Canada, Singapore, Argentina, Italy, Japan, Turkey
-
Demonstration Infrastructure
Equipment: Amazon cloud-based virtual machine image (AMI)
2 cores, 1.7 GB memory
Oracle Linux OS (OEL 5)
Software configuration of public Amazon AMI: Oracle 11g Enterprise Edition (v11.1.0.7)
One disk, /u02 (dev8-2), dedicated to Oracle storage I/O for benchmark accuracy.
Oracle Sample Data (i.e. SH repository) installed and extended for demo purposes.
-
Scenario 1 Key Performance Benefits
Areas highlighted in red show the overhead of accessing normal (non-partitioned) table structures.
Areas highlighted in green show the reduced overhead of accessing partitions instead of the whole table.
Note: The graph details above were generated by sar from a VMware image installed on a notebook. The demo will use an Amazon AMI.
-
Scenario 2 Key Availability Benefits
Database files that hold data involved in the queries below have been accidentally removed!
Non-Partitioned Table
For a particular date range, i.e. 1999 - 2000, the non-partitioned table is not reachable, resulting in an error.
Partitioned Table
As expected, for the same date range, i.e. 1999 - 2000, data in the partitioned table is not reachable either, resulting in an error.
Note: sales_nop2 data resides in the example tablespace, which in turn references datafile example01.dbf.
-
Scenario 2 Key Availability Benefits
(Cont.)
Partitioned Table
More current information, i.e. 2000 - 2001, is now reachable as a result of using partitions to better isolate against data disruptions.
related to database file: example01_01.dbf
related to missing file: example02_01.dbf
related to database file: example03_01.dbf
Non Partitioned Table
For a more current date range, i.e. 2000 - 2001, data is still not reachable, resulting in an error.
-
Demo Summary
Performance improvement - Scenario 1
Up to 3 times faster than traditional methods of data retrieval.
Availability Enhancement - Scenario 2
Limits detrimental effects of data access failures.
Data Partitioning provides significant Service Benefits
In General
Improves system scalability and manageability.
Adds data-level protection to any High Availability strategy.
-
Next Steps
Upcoming Webinars
More coming soon!!!
http://otn.oracle.com/database
Follow OracleDirect ANZ on Twitter at
http://www.twitter.com/OracleDirectANZ
Or our NEW!!! blog http://blogs.oracle.com/techtalk