Introduction to OWB (Oracle Warehouse Builder)
2009-04-01
Agenda
• Data Warehouse
  o Data Warehouse Concepts
  o ETL Process
• Oracle Warehouse Builder (OWB)
  o OWB Architecture
  o Data Sources and Data Targets
  o ETL: Mappings
  o ETL: Process Flows
  o Data Quality Management
• Demonstration
  o Extracting Data
  o Data Profiling and Cleansing
  o Transforming Data
Data Warehouse
Oracle Warehouse Builder
“one of the major ETL tools in the market”
• ETL (Extract/Transform/Load)
• Data Quality Control
• Metadata Management
Oracle OLAP / Data Miner
• Find Pattern
• Predict Behaviour or Value (Classification/Regression)
• Generate Report
ETL Process
• Extract: extract data from the sources and put it in a so-called Staging Area (SA), usually with the same structure as the source.
• Transform: join and union tables, filter and sort rows, and perform calculations. In this step, we can also check data quality and cleanse the data if necessary.
• Load: finally, the data is loaded into a central warehouse, usually into fact and dimension tables.
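The three steps can be sketched end to end in a few lines. This is only an illustration using Python's built-in sqlite3 module, not OWB itself; the table names (stg_orders, clean_orders, dim_customer, fact_orders) and the sample rows are made up for the example.

```python
import sqlite3

# Hypothetical source rows: (order_id, customer, order_date, amount).
SOURCE_ORDERS = [
    (1, " alice ", "2009-03-01", 120.0),
    (2, "BOB", "2009-03-02", 80.0),
    (3, "alice", "2009-03-05", 40.0),
    (4, None, "2009-03-06", 15.0),  # missing customer, rejected during Transform
]


def run_etl(conn, source_rows):
    cur = conn.cursor()

    # Extract: copy the source rows unchanged into a staging table
    # that has the same structure as the source.
    cur.execute(
        "CREATE TABLE stg_orders "
        "(order_id INTEGER, customer TEXT, order_date TEXT, amount REAL)"
    )
    cur.executemany("INSERT INTO stg_orders VALUES (?, ?, ?, ?)", source_rows)

    # Transform: filter out rows that fail a quality check and
    # standardize the customer name.
    cur.execute(
        """
        CREATE TABLE clean_orders AS
        SELECT order_id, LOWER(TRIM(customer)) AS customer, order_date, amount
        FROM stg_orders
        WHERE customer IS NOT NULL
        """
    )

    # Load: fill one dimension table and one fact table.
    cur.execute(
        "CREATE TABLE dim_customer "
        "(customer_key INTEGER PRIMARY KEY AUTOINCREMENT, customer TEXT UNIQUE)"
    )
    cur.execute(
        "INSERT INTO dim_customer (customer) "
        "SELECT DISTINCT customer FROM clean_orders ORDER BY customer"
    )
    cur.execute(
        """
        CREATE TABLE fact_orders AS
        SELECT c.order_id, d.customer_key, c.order_date, c.amount
        FROM clean_orders c
        JOIN dim_customer d ON d.customer = c.customer
        """
    )
    conn.commit()


def totals_by_customer(conn):
    # A typical warehouse query: aggregate the fact table by a dimension.
    return conn.execute(
        """
        SELECT d.customer, SUM(f.amount)
        FROM fact_orders f
        JOIN dim_customer d ON d.customer_key = f.customer_key
        GROUP BY d.customer
        ORDER BY d.customer
        """
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    run_etl(connection, SOURCE_ORDERS)
    print(totals_by_customer(connection))
```

Keeping the staging table as an untouched copy of the source means the Transform step can be re-run or changed without going back to the source system.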
OWB Architecture
Design Centre
Data Sources and Data Targets
Sources
• Oracle
  – Tables, Views, MViews, Queues, External Tables, SQL*Loader, Transportable Tablespaces, Data Pump…
• DB2, Sybase, SQL Server, Informix, Mainframes, … (Oracle Transparent Gateways)
• ODBC
• Flat Files
• XML
• Applications
  – Oracle E-Business Suite
  – PeopleSoft
  – SAP
  – Siebel
Targets
• Oracle
• DB2, Sybase, SQL Server, Informix, Mainframes, … (Oracle Transparent Gateways)
• ODBC
• Flat Files
• XML
ETL: Mappings
• Declarative modeling of Data Flows
• Map from Source to Target
• Integrated Data Quality
  – Name and address (N&A) standardization
  – Match/Merge
  – Profiling
• Generates SQL & PL/SQL
  – Merge, transportable tablespaces, Data Pump, SQL*Loader, XML data types, BLOBs/CLOBs, …
• Leverage custom data transformations
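To make "declarative mapping, generated code" concrete, here is a small sketch of the idea. It is not OWB's code generator and does not produce PL/SQL; the Mapping class, generate_sql function, and the table and column names are hypothetical, and the output is a plain INSERT ... SELECT statement.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Mapping:
    """Hypothetical declarative mapping from one source table to one target table."""
    source: str
    target: str
    columns: dict  # target column -> SQL expression over source columns
    where: str = ""  # optional filter applied to the source


def generate_sql(mapping: Mapping) -> str:
    """Turn the declarative description into an executable SQL statement."""
    target_cols = ", ".join(mapping.columns)
    select_exprs = ", ".join(
        f"{expr} AS {col}" for col, expr in mapping.columns.items()
    )
    sql = (
        f"INSERT INTO {mapping.target} ({target_cols}) "
        f"SELECT {select_exprs} FROM {mapping.source}"
    )
    if mapping.where:
        sql += f" WHERE {mapping.where}"
    return sql


# The designer states what maps to what; the SQL is derived from it.
customer_map = Mapping(
    source="src_customers",
    target="tgt_customers",
    columns={
        "id": "cust_id",
        "full_name": "UPPER(first_name || ' ' || last_name)",
    },
    where="cust_id IS NOT NULL",
)

if __name__ == "__main__":
    print(generate_sql(customer_map))
```

The point of the design is that the mapping stays a piece of data: the same description could feed a different generator (for example, one that emits a MERGE statement) without being rewritten.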
ETL: Process Flows
• Declarative modeling of Process/Work Flows
• Co-ordinate execution of Maps and other activities
• Create complex transitions
• Send email, FTP source/target files, call any external process, SQL*Plus, Notifications
• Generates Oracle Workflow, Oracle Scheduler & XPDL
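A process flow is essentially a set of activities connected by transitions that depend on each activity's outcome. The sketch below illustrates that idea only; it is not how Oracle Workflow or XPDL represent a flow, and run_flow, the activity names, and the SUCCESS/FAILURE labels are assumptions made for the example.

```python
def run_flow(activities, transitions, start, max_steps=100):
    """Run activities from `start`, following outcome-based transitions.

    activities:  activity name -> callable returning True on success
    transitions: (activity name, "SUCCESS" | "FAILURE") -> next activity name
    Returns the execution log as a list of (activity name, outcome).
    """
    log = []
    current = start
    # max_steps guards against a flow whose transitions form a cycle.
    while current is not None and len(log) < max_steps:
        try:
            ok = bool(activities[current]())
        except Exception:
            ok = False  # an activity that raises counts as failed
        outcome = "SUCCESS" if ok else "FAILURE"
        log.append((current, outcome))
        # No matching transition means the flow ends here.
        current = transitions.get((current, outcome))
    return log


# Hypothetical flow: two mappings followed by a notification.
activities = {
    "extract_map": lambda: True,
    "load_map": lambda: False,  # simulate a failing mapping
    "notify_ok": lambda: True,
    "notify_failure": lambda: True,
}
transitions = {
    ("extract_map", "SUCCESS"): "load_map",
    ("extract_map", "FAILURE"): "notify_failure",
    ("load_map", "SUCCESS"): "notify_ok",
    ("load_map", "FAILURE"): "notify_failure",
}

if __name__ == "__main__":
    for name, outcome in run_flow(activities, transitions, "extract_map"):
        print(name, outcome)
```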
Data Quality Management
• Data Profiling
• Missing or invalid values
• Distributions of the values in a specific column
• Data Rule for Cleansing
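As a rough illustration of these points, the sketch below profiles a single column, derives a simple domain rule from the value distribution, and applies it as a cleansing step. It is not OWB's profiler; profile_column, derive_domain_rule, cleanse, and the min_share threshold are hypothetical names and choices.

```python
from collections import Counter


def profile_column(values, is_valid=lambda v: True):
    """Count missing and invalid values and compute the value distribution."""
    present = [v for v in values if v is not None and v != ""]
    distribution = Counter(present)
    return {
        "total": len(values),
        "missing": len(values) - len(present),
        "invalid": sum(1 for v in present if not is_valid(v)),
        "distinct": len(distribution),
        "distribution": distribution,
    }


def derive_domain_rule(profile, min_share=0.2):
    """Propose the allowed values: those covering at least `min_share`
    of the non-missing rows. Rare values are treated as suspect."""
    present = profile["total"] - profile["missing"]
    if present == 0:
        return set()
    return {
        value
        for value, count in profile["distribution"].items()
        if count / present >= min_share
    }


def cleanse(values, allowed, default=None):
    """Apply the rule: replace anything outside the allowed domain."""
    return [v if v in allowed else default for v in values]


if __name__ == "__main__":
    gender = ["M", "F", "F", "M", "", None, "X", "F", "M", "F"]
    profile = profile_column(gender)
    rule = derive_domain_rule(profile)
    print(profile["missing"], dict(profile["distribution"]), sorted(rule))
    print(cleanse(gender, rule))
```

A rule derived this way is only a proposal; in practice someone would review it before it is used to cleanse data.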
Metadata Management
• Dependency Management
  – Data Lineage at attribute level
  – Impact Analysis at attribute level
• Metadata Snapshots
• Change Management (diff, merge and reconcile)
• Reporting (browser)
• APIs (Scripting, SQL, PL/SQL)
• Exchange (import/export)
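Lineage and impact analysis can both be seen as traversals of the same dependency graph, in opposite directions. The sketch below assumes the attribute-level dependencies are available as simple (source attribute, target attribute) pairs; the EDGES data and the function names are hypothetical and do not reflect how OWB stores its metadata.

```python
from collections import defaultdict

# Hypothetical attribute-level dependencies: (source attribute, target attribute).
EDGES = [
    ("src.customers.first_name", "stg.customers.first_name"),
    ("src.customers.last_name", "stg.customers.last_name"),
    ("stg.customers.first_name", "dw.dim_customer.full_name"),
    ("stg.customers.last_name", "dw.dim_customer.full_name"),
    ("src.orders.amount", "dw.fact_orders.amount"),
]


def _reachable(start, neighbours):
    """Collect every node reachable from `start` (excluding `start` itself)."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def impact(attribute, edges):
    """Impact analysis: which attributes are affected if `attribute` changes?"""
    downstream = defaultdict(list)
    for src, tgt in edges:
        downstream[src].append(tgt)
    return _reachable(attribute, downstream)


def lineage(attribute, edges):
    """Data lineage: which attributes does `attribute` originate from?"""
    upstream = defaultdict(list)
    for src, tgt in edges:
        upstream[tgt].append(src)
    return _reachable(attribute, upstream)


if __name__ == "__main__":
    print(sorted(impact("src.customers.first_name", EDGES)))
    print(sorted(lineage("dw.dim_customer.full_name", EDGES)))
```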
Demonstration
1. Identify data sources/targets and import metadata
2. Import data, then design and execute mappings (Extract)
3. Profile the data and decide on a data cleansing strategy
4. Design and execute mappings (Merging) and cleansing
5. Design dimension tables
Stages: Define Sources & Targets, Extract, Data Profiling, Transform, Load
Diagram callouts: “Derived Data Rule”, “Generated Code”