Post on 24-May-2020

SAS®, Sun, Oracle: On Mashups, Enterprise 2.0 and Ideation

Charlie Garry, Director, Product Manager, Oracle Corporation

Paul Kent, Vice President, Platform R&D, SAS

Maureen Chew, Principal Software Engineer, Oracle Corporation

Agenda

• Getting to Enterprise 2.0 – The Optimized Information Infrastructure

• Speaking / Tweeting towards All Things In-Database

• SQL optimization (the obvious, not-so-obvious, unobvious)

• In-database – current landscape

• In-database – A Look Ahead

• Apple iPad Drawing

Analytics and Enterprise 2.0

• Market estimated to be $105B with a CAGR of 7%

• Layer Enterprise 2.0 on top

• Even MORE information

• Is the information timely?

• Is the information clean?

• Is the information accurate?

Non-Conventional Thinking

“Almost two out of three data centers (63.0%)

worldwide report a ‘dramatic’ increase in their

storage requirements over the past five years”

(2009/2010 AFCOM Data Center Trends Survey Results & Analysis)

The Result

• MORE SERVERS – Poor Utilization

• MORE STORAGE – Poor Utilization

• MORE COMPLEXITY – Inability to Change or Integrate Quickly/Safely

Simplify/Optimize Your Information Infrastructure

• Reduce Analytic Latency

• Reduce OPEX and CAPEX

• Faster Time-To-Value

• Offload Infrastructure Design and Maintenance

© 2010 Oracle Corporation

SAS and Oracle

On Collaboration ...

“Our customers are very interested in getting maximum value out of our business analytic solutions and that means putting less effort into provisioning and managing infrastructure. Our ability to partner with and leverage the fusion of Sun into Oracle to simplify infrastructure will be a benefit to our mutual customers.”

Keith Collins, SAS Senior Vice President and Chief Technology Officer

Sun & Oracle – A Better Platform for SAS

• SAS uses many Oracle and Sun technologies

• Solaris is a leading UNIX deployment platform for SAS

• Sun HW / Storage

• WebLogic

• Java

• LDAP

• ACCESS / Oracle / Exadata

• MySQL

Oracle & SAS Collaboration

• Partnership & Collaboration

• High end performance testing

» SAS Enterprise BI, Sun Enterprise M9000

» JMP Genomics

» SAS Grid

» Sun Blade 6000

» Sun ZFS Storage 7420

• Broad Engineering collaboration

• http://oracle.com/sas

SQL Optimization – the “Obvious”

• UNION, MINUS, INTERSECT: sort to eliminate duplicate rows; UNION ALL: no sort, includes duplicates

• IN vs EXISTS

• Queries using IN or NOT IN can often be rewritten with EXISTS / NOT EXISTS (or vice versa) - bit.ly/gZvzeM

• Wildcard search against an index

• An index (e.g., on COL) is usable only when the search pattern is anchored at the beginning of the column value

» “COL like 'abc123%'” uses the index, “COL like '%abc123%'” does not

• A function applied to a column cannot use a plain index on that column; create a “functional” (function-based) index instead

• UPPER(COL)='ABC123' → create index idx on tablename(UPPER(COL));
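
The points on this slide can be tried out in a small sketch. It uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption made only so the example runs anywhere), and the tables `a` and `b`, the column `col`, and the index name `idx_upper` are hypothetical. Plan output and optimizer behavior differ between SQLite and Oracle, so treat this as an illustration of the ideas rather than of Oracle's execution plans.

```python
import sqlite3

# SQLite stands in for Oracle (assumption); table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (col TEXT);
    CREATE TABLE b (col TEXT);
    INSERT INTO a VALUES ('abc123'), ('abc123'), ('Abc123'), ('xyz');
    INSERT INTO b VALUES ('abc123'), ('xyz'), ('xyz');
""")

# UNION removes duplicate rows; UNION ALL keeps every row from both inputs.
union_rows = conn.execute(
    "SELECT col FROM a UNION SELECT col FROM b").fetchall()
union_all_rows = conn.execute(
    "SELECT col FROM a UNION ALL SELECT col FROM b").fetchall()

# The same semi-join written with IN and with EXISTS returns the same rows.
in_rows = conn.execute(
    "SELECT col FROM a WHERE col IN (SELECT col FROM b) ORDER BY col"
).fetchall()
exists_rows = conn.execute(
    "SELECT col FROM a WHERE EXISTS (SELECT 1 FROM b WHERE b.col = a.col) "
    "ORDER BY col"
).fetchall()

# An index on the expression UPPER(col) lets the UPPER(col) = ... predicate
# be answered from an index instead of a full scan.
conn.execute("CREATE INDEX idx_upper ON a (UPPER(col))")
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT col FROM a WHERE UPPER(col) = 'ABC123'"
).fetchall()
plan_text = " ".join(row[3] for row in plan)  # 4th column is the plan detail

print(len(union_rows), len(union_all_rows))
print(in_rows == exists_rows)
print(plan_text)
```

Whether IN or EXISTS is faster depends on the optimizer and the data; the sketch only shows that the two forms are interchangeable in result.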

SQL Optimization – the “less Obvious”

• Collect good statistics using DBMS_STATS

• Stale statistics and data skew can cause poor query performance

• Partition large tables

• e.g., partition data by week; a single-week query then retrieves about 1/52 of the table

• CTAS instead of UPDATE/DELETE (DML)

• If deleting a large number of rows, it is often better to run “CREATE TABLE xyz AS SELECT … FROM abc” and keep only the rows you need

• INSERT with the APPEND hint (direct-path insert) bypasses the buffer cache and is typically faster than conventional inserts

• Use parallelism, e.g., for query, DML, data load, replication, ... (bit.ly/eLFRQy)
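
The CTAS tactic can be sketched as follows, again with sqlite3 standing in for Oracle (an assumption). The table names `abc` and `xyz` follow the slide; the columns `id` and `week` and the row counts are hypothetical. In a real Oracle schema, indexes, constraints and grants on the original table would also have to be recreated on the new one, which this sketch does not show.

```python
import sqlite3

# SQLite stands in for Oracle (assumption); columns and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE abc (id INTEGER, week INTEGER)")
conn.executemany(
    "INSERT INTO abc VALUES (?, ?)",
    [(i, i % 52 + 1) for i in range(5200)],  # 100 rows for each of 52 weeks
)
conn.commit()

# Instead of deleting 51/52 of the rows with
#   DELETE FROM abc WHERE week <> 1
# copy the rows to keep into a new table, then swap it in for the original.
conn.execute("CREATE TABLE xyz AS SELECT * FROM abc WHERE week = 1")
conn.execute("DROP TABLE abc")
conn.execute("ALTER TABLE xyz RENAME TO abc")
conn.commit()

remaining = conn.execute("SELECT COUNT(*) FROM abc").fetchone()[0]
weeks = conn.execute("SELECT DISTINCT week FROM abc").fetchall()
print(remaining, weeks)
```

The sample data also mirrors the partition-by-week remark: one week holds 1/52 of the rows.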

SQL Optimization – the “unObvious”

• Maria Colgan – Top Tips for Getting Optimal SQL Execution All the Time

• Cardinality

» How to combat common causes for incorrect cardinality

• Access path

» What causes the wrong access path

• Join type

» Common causes for why the wrong join type was selected

• Join order

» Common causes for why the wrong join order was selected

• References

• Preso above: bit.ly/enQxBK

• Optimizer blog: blogs.oracle.com/optimizer

Tactics for Pushing SQL whitepaper

Tactics ... – TOC excerpts

In-Database Processing – Performance

• libname x oracle insertbuff=1000 ...

• SAS/ACCESS DBSLICE= option: threaded read, DATA step, 116GB file

set GE_Data.OBSERVATION_F ( DBSLICE=("MOD(OBSERVATION_KEY,2)=0" "MOD(OBSERVATION_KEY,2)=1" ));
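
The idea behind the DBSLICE= fragment above can be sketched in Python: each predicate selects one slice of the table, each slice is read by its own thread, and the slices must be disjoint and together cover every row. The in-memory `ROWS` list and the `read_slice` helper are hypothetical stand-ins for the OBSERVATION_F table and a database read thread; this is not how SAS/ACCESS is implemented.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the OBSERVATION_F table: (OBSERVATION_KEY, value).
ROWS = [(key, key * 10) for key in range(1, 11)]

# One predicate per slice, mirroring MOD(OBSERVATION_KEY,2)=0 and =1.
SLICES = [
    lambda key: key % 2 == 0,
    lambda key: key % 2 == 1,
]


def read_slice(predicate):
    """Simulate one read thread fetching only the rows its predicate selects."""
    return [row for row in ROWS if predicate(row[0])]


# Read both slices concurrently, one worker thread per slice.
with ThreadPoolExecutor(max_workers=len(SLICES)) as pool:
    parts = list(pool.map(read_slice, SLICES))

# Every row appears in exactly one slice, so merging restores the full table.
combined = sorted(row for part in parts for row in part)
print(len(parts[0]), len(parts[1]), combined == ROWS)
```

If the predicates overlapped, rows would be read twice; if they left gaps, rows would be lost. This is why a MOD on a key column is a convenient slicing expression.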

Configuration | DATA Step Time | Total Run Time

DATA Step Only | 5 hrs, 4 min, 21 sec | 8 hrs, 26 min, 38 sec

Input from Exadata | 4 hrs, 36 min, 22 sec | 7 hrs, 19 min, 26 sec

Input from Exadata + threaded read | 1 hr, 47 min, 53 sec | 4 hrs, 51 min, 23 sec
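
To put the timings above into relative terms, the short calculation below converts each entry to seconds and divides the baseline by each run. Treating the “DATA Step Only” row as the baseline is an assumption about how the rows are meant to be compared. Under that assumption, the threaded read comes out at roughly 2.8x faster for the DATA step and about 1.7x faster for the total run.

```python
def to_seconds(hours, minutes, seconds):
    """Convert an 'h hrs, m min, s sec' timing into seconds."""
    return hours * 3600 + minutes * 60 + seconds


# (DATA step time, total run time) per row, copied from the timings above.
timings = {
    "DATA Step Only": (to_seconds(5, 4, 21), to_seconds(8, 26, 38)),
    "Input from Exadata": (to_seconds(4, 36, 22), to_seconds(7, 19, 26)),
    "Input from Exadata + threaded read": (
        to_seconds(1, 47, 53),
        to_seconds(4, 51, 23),
    ),
}

# Assumption: the first row is the baseline the other rows are compared to.
base_step, base_total = timings["DATA Step Only"]
speedups = {
    name: (base_step / step, base_total / total)
    for name, (step, total) in timings.items()
}

for name, (step_x, total_x) in speedups.items():
    print(f"{name}: DATA step {step_x:.2f}x, total run {total_x:.2f}x")
```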

In-Database Processing – Current Landscape

• Pass through

• Implicit SQL – SAS code converted to SQL passthrough (Use Tactics for Pushing SQL to the Relational Database)

» 9.2M2: significant improvements (inline views, SQL views, tables, expressions using CALCULATED keyword, SELECT, WHERE, HAVING, ON, GROUP BY, ORDER BY clauses)

• Explicit SQL

• In-database BASE PROCS for Oracle

• Available as of 9.2M3 (http://bit.ly/gMCvbo)

• FREQ, RANK, REPORT, SORT, SUMMARY/MEANS, TABULATE
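
What running a procedure such as FREQ in-database buys can be sketched by comparing two ways of counting: pulling every row to the client and counting there, versus letting the database aggregate and return one row per distinct value. The example uses sqlite3 as a stand-in DBMS, and the `visits` table and `region` column are hypothetical. The SQL that SAS actually generates is not shown in the slides, so the GROUP BY query here is only an illustration of the principle.

```python
import sqlite3
from collections import Counter

# SQLite stands in for the DBMS (assumption); table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (region TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?)",
    [("east",), ("west",), ("east",), ("north",), ("east",)],
)

# Client-side: every row is transferred, then counted locally.
all_rows = conn.execute("SELECT region FROM visits").fetchall()
client_freq = Counter(region for (region,) in all_rows)

# In-database: the DBMS aggregates and returns one row per distinct value.
db_rows = conn.execute(
    "SELECT region, COUNT(*) FROM visits GROUP BY region"
).fetchall()
db_freq = dict(db_rows)

print(len(all_rows), "rows transferred client-side")
print(len(db_rows), "rows transferred in-database")
print(db_freq == dict(client_freq))
```

Both approaches give the same frequencies; the difference is how many rows leave the database.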

The Most Important Acceleration Strategies

Co-location (of data and analytics)

Co-location (of data and analytics)

Co-location (of data and analytics)

Avoid the disk, use memory

Parallelize

But, co-location

has many technological solutions

has to be done right

has to adjust to the complexity of the analytic task

Acceleration Strategies With DBMS

Customers want to improve response times for SAS workloads that access data inside a DBMS

What are the options?

Re-state the work as SQL, let DBMS parallelize (SQL-PassThru)

Extend SQL with UDFs (Inside-DB)

Go beyond the simple (obvious) transforms

Put SAS CPUs closer to DBMS CPUs (Alongside-DB)

Inside-DB

Teradata, Greenplum … but what about Oracle?
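
The “extend SQL with UDFs” option can be illustrated with a minimal sketch: a scoring function is registered with the database so that it can be called from SQL, and the query returns only the scored rows of interest. The example uses sqlite3's `create_function`; the `score` rule, the `customers` table and its columns are all hypothetical. SQLite runs the function in the client process, so this shows only the programming model, not the server-side co-location that the slides are about.

```python
import sqlite3


def score(age, income):
    """Hypothetical scoring rule standing in for a published analytic model."""
    return 0.5 * age + 0.001 * income


# SQLite stands in for the DBMS (assumption); table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.create_function("score", 2, score)  # expose the function to SQL
conn.execute("CREATE TABLE customers (id INTEGER, age REAL, income REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 30, 40000), (2, 50, 90000), (3, 20, 10000)],
)

# The model is evaluated as part of the query; only high scorers come back.
high = conn.execute(
    "SELECT id, score(age, income) FROM customers "
    "WHERE score(age, income) > 50 ORDER BY id"
).fetchall()
print(high)
```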

Progress Report

It can be done!

We have the basic wiring assembled

We need to test the performance

Co-location. Not so much

Stay Tuned

Interested? Paul.Kent@sas.com

Thank You

Charlie Garry, charlie.garry@oracle.com

Paul Kent, paul.kent@sas.com

Maureen Chew, maureen.chew@oracle.com

Appendix

SAS® Grid Computing, Sun Blade 6000, Sun ZFS Storage 7420

Extreme Compute and I/O Performance

Sun Blade 6000; Sun Blade X6270 M2 server modules

Sun Blade 6000 Ethernet Switched Network Express Module (10GbE, 24p)

Sun ZFS Storage 7420

Sun ZFS Storage 7420 Extreme Performance

• SAS Grid workload

• Shared Read/Write filesystem via NFS

• 10 x Sun Blade X6270 M2

• 2.01 GB/sec via 10GbE

• even read/write load

ZFS 7420 I/O analytics

• Grid I/O by client
