Agenda: ODI Performance, ODI Scheduling, ODI Deployment/Release

Post on 30-Mar-2015


AGENDA

• ODI Performance
• ODI Scheduling
• ODI Deployment/Release

BI-Quotient www.bi-q.ie

ULI BETHKE

• Dublin based
• Blog www.bi-q.ie
• ODI 2007
• Reviewer two ODI books
• ODI articles OTN
• Deputy chair OUG BI SIG. Next event 11th June
• ODI advanced trainer

ODI PERFORMANCE

ODI is a metadata-driven (SQL) code generator using code templates (knowledge modules). It uses a Java agent to communicate and send data between source and target systems and the repository over the network.

SQL

- More than 80% of ODI performance issues are SQL issues, so SQL is the main ODI skill
- Perfect your SQL: advanced SQL, analytic functions
- Know your database(s) inside out, in particular the target
- Understand, write, and modify Knowledge Modules

AGENT

- Lightweight Java-based application
- Tied to host OS
- Generates code based on ODI metadata
- Communicates with source, target, repository
- JDBC data transport
- XML
- Jetty
- Interpreters: Jython, JBS, JavaScript, Groovy
- HSQLDB in-memory database
- Scheduler
- Sizing

AGENT

Target
- Least amount of roundtrips. Network (JDBC, XML)
- One target database server only (DW)

Another Server
- ODBC drivers
- JEE agent on WebLogic
- No support for target OS
- Resources on target
- DBA

INTERFACES

- No KMs that use row-by-row processing!
- Use ODI functions rather than DB functions
- Don't overuse CKM (especially for large data volumes)
- Temp indexes (I$)
- Gather statistics (C$, I$, TGT when applicable)
- Rule of thumb: use loader KMs or DB link KMs rather than JDBC KMs
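The warning against row-by-row processing can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in database, and the table names (src, tgt_rbr, tgt_set) are hypothetical. It is not ODI or KM code; it only contrasts one statement per row with a single set-based statement.

```python
import sqlite3


def load_row_by_row(conn):
    """Anti-pattern: read every source row and insert it individually."""
    statements = 0
    for row in conn.execute("SELECT id, amount FROM src").fetchall():
        conn.execute("INSERT INTO tgt_rbr (id, amount) VALUES (?, ?)", row)
        statements += 1
    return statements


def load_set_based(conn):
    """Set-based: a single INSERT ... SELECT moves the whole data set."""
    conn.execute("INSERT INTO tgt_set (id, amount) SELECT id, amount FROM src")
    return 1


# Hypothetical in-memory tables, only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE src (id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE tgt_rbr (id INTEGER, amount REAL);
    CREATE TABLE tgt_set (id INTEGER, amount REAL);
    """
)
conn.executemany("INSERT INTO src VALUES (?, ?)", [(i, i * 1.5) for i in range(1000)])

rbr_statements = load_row_by_row(conn)
set_statements = load_set_based(conn)
rbr_count = conn.execute("SELECT COUNT(*) FROM tgt_rbr").fetchone()[0]
set_count = conn.execute("SELECT COUNT(*) FROM tgt_set").fetchone()[0]
print(rbr_statements, set_statements, rbr_count, set_count)
```

Both loads produce the same rows, but the row-by-row variant issues one statement per row. Against a remote target, each of those statements would typically cost a network roundtrip.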

SOURCE/TARGET

- Schemas on same database server. Physical schema and not data server.
- Have sources physically close to target
- Minimize impact on source
- Chunking
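The slide names chunking without giving details. One common reading, assumed here, is to split a large extract into key ranges so that each query touches only part of the source. The helper name key_ranges and the numeric key are hypothetical.

```python
def key_ranges(min_key, max_key, chunk_size):
    """Yield inclusive (low, high) key ranges that cover min_key..max_key."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    low = min_key
    while low <= max_key:
        high = min(low + chunk_size - 1, max_key)
        yield low, high
        low = high + 1


# Each range could be bound into a filter such as
# "WHERE id BETWEEN :low AND :high" (hypothetical column name).
chunks = list(key_ranges(1, 25, 10))
print(chunks)
```

Smaller chunks keep each source query short, at the cost of more queries. The right size depends on the source system.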

CRITICAL PATH

NETWORK PATHS:    PATH DURATIONS:
B > E > H         6 + 2 + 11 = 19
B > D > F         6 + 4 + 14 = 24
B > D > G         6 + 4 + 10 = 20
A > C > G         9 + 8 + 10 = 27  CRITICAL PATH
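The path durations above can be recomputed with a short sketch. It assumes each number on the slide is the duration of a single activity (A = 9, B = 6, and so on), which is how the sums read but is not stated explicitly.

```python
# Activity durations as read off the sums on the slide (assumed per activity).
durations = {"A": 9, "B": 6, "C": 8, "D": 4, "E": 2, "F": 14, "G": 10, "H": 11}

# The four network paths listed on the slide.
paths = [
    ["B", "E", "H"],
    ["B", "D", "F"],
    ["B", "D", "G"],
    ["A", "C", "G"],
]


def path_duration(path, durations):
    """Total duration of a path is the sum of its activities."""
    return sum(durations[step] for step in path)


totals = {" > ".join(p): path_duration(p, durations) for p in paths}

# The critical path is the longest one: it determines the overall run time.
critical = max(totals, key=totals.get)
print(totals)
print("critical path:", critical, totals[critical])
```

Tuning an activity that is not on the critical path does not shorten the overall run.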

MICRO TUNING

• JDBC drivers
• JVM
• Type 4 or 5 JDBC drivers (Data Direct)
• Array fetch size
• DB packet size
• Network packet size
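The effect of the array fetch size can be sketched with Python's DB-API, where cursor.arraysize plays a similar role to the JDBC fetch size. The example uses an in-memory sqlite3 database, so there is no real network. The number of fetch calls only stands in for the roundtrips a remote driver would make, and the table name src is hypothetical.

```python
import sqlite3


def count_fetch_calls(conn, array_size):
    """Read all rows in batches of array_size; return (batches, rows)."""
    cur = conn.cursor()
    cur.arraysize = array_size  # default batch size used by fetchmany()
    cur.execute("SELECT id FROM src")
    batches = 0
    rows = 0
    while True:
        batch = cur.fetchmany()
        if not batch:
            break
        batches += 1
        rows += len(batch)
    return batches, rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO src VALUES (?)", [(i,) for i in range(1000)])

small = count_fetch_calls(conn, 10)
large = count_fetch_calls(conn, 500)
print(small, large)
```

A larger array size means fewer fetch calls for the same rows, at the cost of more memory per batch.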

PERFORMANCE MONITORING

• ODI Log Data Mart
• Facts
• Dimensions
• Metrics
• Frontend

DBMS_SQLTUNE_UTIL0

• dbms_sqltune_util0.sqltext_to_sqlid
• Link to Data Dictionary Tables

MACIEJ KOCON

• Dublin based
• ODI 2005 (Sunopsis)
• Reviewer two ODI books
• Blog www.bi-q.ie
• maciek@bi-q.ie

ORCHESTRATING DWH PROCESSES

• Orchestration of Data Process Flow
  – Standard DWH Process flow orchestration
  – Packages in Oracle Data Integrator 10g
  – Load Plans in Oracle Data Integrator 11g
• Process Flow use cases - efficiency analysis
• Alternative scheduling
  – benefits

TYPICAL DATA FLOW in DWH

- Step 1, STAGE: data extract, loads data from sources
- Step 2, DIMs: label, provides structured labeling information
- Step 3, FACTS: consists of measurements, metrics or facts
- E-LT: data transport & transform units
- Orchestration: Packages in ODI 10g, Load Plans in ODI 11

ORCHESTRATION – ODI PACKAGES

[Diagram: packages PKG_ABC, PKG_DE and PKG_ABCDE built from the steps INT_A, PRC_B, INT_C, INT_D and INT_E]

- using object directly
- using scenarios – compiled code: SYNCHRONOUS or ASYNCHRONOUS

ODI 10g vs. ODI 11

[Diagram: processes A to G laid out across the STAGE, DIMs and FACTS steps. ODI 10g packages (PKG_ABC, PKG_DE, PKG_FG, PKG_DM) and ODI 11 load plans orchestrate the same steps.]

SAME EFFECT!

PROCESS FLOW EFFICIENCY ANALYSIS

Standard Flow Orchestration: Stage - (stop) - DIMs - (stop) - Facts

Process durations: A 30, B 10, C 10, D 10, E 30, F 10, G 10

- Processes within a step run in parallel; the steps run sequentially
- Total duration: 30 + 30 + 10 = 70

DOWNSIDES:
• Possible inefficiencies (idle resources)

PROCESS FLOW EFFICIENCY ANALYSIS

OPTIMIZATION ATTEMPT

Process durations: A 30, B 10, C 10, D 10, E 30, F 10, G 10

- Processes regrouped across the sequential steps
- Total duration: 10 + 30 + 10 = 50
- 70 / 50 = 1.4 times quicker!

UPSIDE:
• Efficiency improved

ADVANCED DATA FLOW EXAMPLE

ENTERPRISE DWH DATA FLOW EXAMPLE

PROCESS FLOW EFFICIENCY ANALYSIS

OPTIMIZATION ATTEMPT

DOWNSIDES:
• Timings knowledge required
• Overall dependency knowledge required

PROCESS FLOW EFFICIENCY ANALYSIS

OPTIMIZATION ATTEMPT

Total duration: 30 + 30 + 10 = 70

DOWNSIDE:
• Inefficiency exists but can't be resolved
• Consumer waiting & impact

TRADITIONAL SCHEDULING - LIMITATIONS

• Possible inefficiencies (idle resources)
• Timings knowledge required
• Overall dependency knowledge required
• Inefficiency exists but can't be resolved
• Consumer waiting & impact

SCHEDULER

DEPENDENCY DRIVEN SCHEDULING

[Diagram: processes A, B, C, D and E shown in varying execution orders; the second build is labelled PACKAGES & LOAD PLANS]

PROCESS FLOW EFFICIENCY ANALYSIS

Process durations: A 30, B 10, C 10, D 10, E 30, F 10, G 10

- Standard flow orchestration: 30 + 30 + 10 = 70
- Dependency driven scheduling: 30
- 70 / 30 = 2.3 times faster!
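The comparison between 70 and 30 can be reproduced with a small calculation. The durations come from the slides, and the grouping into three steps follows the package names PKG_ABC, PKG_DE and PKG_FG. The dependency graph itself is not recoverable from the transcript, so the upstream mapping below is purely hypothetical, chosen only so that the result matches the slide. The sketch also assumes unlimited parallel capacity.

```python
# Durations as shown on the slides.
durations = {"A": 30, "B": 10, "C": 10, "D": 10, "E": 30, "F": 10, "G": 10}

# Step grouping (STAGE, DIMs, FACTS), read off the package names.
steps = [["A", "B", "C"], ["D", "E"], ["F", "G"]]

# HYPOTHETICAL immediate upstream dependencies; not taken from the slides.
upstream = {
    "A": [], "B": [], "C": [], "E": [],
    "D": ["B"],
    "F": ["C"],
    "G": ["D"],
}


def stage_gated(durations, steps):
    """Each step waits for the slowest process of the previous step."""
    return sum(max(durations[p] for p in step) for step in steps)


def dependency_driven(durations, upstream):
    """Each process starts as soon as its own upstream processes finish."""
    finish = {}

    def finish_time(process):
        if process not in finish:
            start = max((finish_time(u) for u in upstream[process]), default=0)
            finish[process] = start + durations[process]
        return finish[process]

    return max(finish_time(p) for p in durations)


gated = stage_gated(durations, steps)
driven = dependency_driven(durations, upstream)
print(gated, driven, round(gated / driven, 1))
```

With a different dependency graph the dependency driven total would change, but it can never exceed the stage-gated total as long as every dependency points to an earlier step.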

DEPENDENCY DRIVEN SCHEDULING

• Simplifies orchestrating the flow
  – only immediate upstream definition required
  – execution timings not relevant
  – self-adapts in the most effective way
• Improves overall E-LT performance
  – fewer idle resources, better utilization
  – independence
  – unveils its full potential in complex enterprise-class DWHs (Inmon)
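A minimal sketch of the idea, assuming Python 3.9+ for graphlib: each process declares only its immediate upstream processes, and the scheduler releases a process the moment those have finished. The process names and durations follow the earlier slides, but the dependencies are hypothetical, and this is an event simulation rather than the scheduler presented in the talk.

```python
import heapq
from graphlib import TopologicalSorter

durations = {"A": 30, "B": 10, "C": 10, "D": 10, "E": 30, "F": 10, "G": 10}

# HYPOTHETICAL immediate upstream definitions (process -> predecessors).
upstream = {
    "A": [], "B": [], "C": [], "E": [],
    "D": ["B"],
    "F": ["C"],
    "G": ["D"],
}


def simulate(durations, upstream):
    """Start every process as soon as its upstream is done.

    Returns the start time of each process and the overall finish time.
    Assumes unlimited parallel capacity.
    """
    sorter = TopologicalSorter(upstream)
    sorter.prepare()  # raises CycleError if the definitions are circular
    running = []      # min-heap of (finish_time, process)
    start_times = {}
    clock = 0
    while sorter.is_active():
        # Release everything whose upstream processes have all finished.
        for process in sorter.get_ready():
            start_times[process] = clock
            heapq.heappush(running, (clock + durations[process], process))
        # Advance the clock to the next process that finishes.
        clock, finished = heapq.heappop(running)
        sorter.done(finished)
    return start_times, clock


start_times, total = simulate(durations, upstream)
print(start_times, total)
```

No timings are configured anywhere: if a process becomes slower or faster, the start times of its downstream processes adapt on the next run.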

DEPENDENCY DRIVEN SCHEDULING

• Notifications
  – errors (+ auto-restartability)
  – finish summary
  – logging
• Multiple/overlapping E-LT streams
  – load with different frequencies
• Parameterization
  – improved system stress control
  – process prioritization

FIRST RUN
- 10 processes

TODAY
- 584 processes
- 1389 dependencies
- 132 231 scenarios run

TIME
- Load plans: 12h43m
- Scheduler: 4h21m
- 2.9 times faster

ENTERPRISE DWH DATA FLOW

RELEASE 1.0

RELEASE 2.0 TST

TESTING RELEASE 2.0

DEPLOY RELEASE 2.0 PRD

THE HOT FIX SITUATION

RELEASE FREQUENTLY

CI ENVIRONMENT

THE BUILD MASTER

AUTOMATE STUFF

ODI VS. SOURCE CONTROL

ODI STRUCTURE

BEYOND INTRA BUILD DEPENDENCIES