
  • 8/8/2019 Business Analytics on IDS-WAIUG

    1/41

    Business Analytics with IDS

    Fred Ho, IDS Development


    2/41

    Disclaimer

    Copyright IBM Corporation 2009. All rights reserved. U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

    THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM'S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, NOR SHALL HAVE THE EFFECT OF, CREATING ANY WARRANTIES OR REPRESENTATIONS FROM IBM (OR ITS SUPPLIERS OR LICENSORS), OR ALTERING THE TERMS AND CONDITIONS OF ANY AGREEMENT OR LICENSE GOVERNING THE USE OF IBM PRODUCTS AND/OR SOFTWARE.

    IBM, the IBM logo, ibm.com, Informix, solid, DataMirror, Optim, Cognos are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries.

    Other company, product, or service names may be trademarks or service marks of others.


    3/41

    Contents

    Definition of BI/DW/BA

    Types of IDS BI Users

    OLTP vs. Data Warehousing

    Informix Warehouse

    IDS Storage Optimization

    Your Feedback and Requirements


    4/41

    Business Intelligence

    "A set of concepts and methodologies to improve decision making in business through use of facts and fact-based systems"
    - Howard Dresner, The Gartner Group

    "The processes, technologies, and tools needed to turn data into information, information into knowledge, and knowledge into plans that drive profitable business actions"
    - David Loshin, Business Intelligence: The Savvy Manager's Guide

    The foundation that enables BI is the enterprise architecture: business, data, and technology. A well-implemented data warehousing program provides much of that foundation.


    5/41

    Data Warehousing

    "A data warehouse is a subject-oriented, integrated, non-volatile, time-variant collection of data organized to support management needs"
    - W. H. Inmon

    "The Data Warehouse is nothing more than the union of all the constituent data marts"
    - Ralph Kimball, et al., The Data Warehouse Life Cycle Toolkit

    The data warehousing process turns raw data into potentially valuable information usable by people and systems. Warehousing enhances the value of data assets by:

    - Applying standards and consistency to the data
    - Organizing the data into subject areas that cross business functional lines
    - Integrating the data
    - Enforcing data consistency over time to provide meaningful history
    - Acting as a stable and reliable source
    - Providing easy access to data


    6/41

    Business Analytics

    The process of using information to enhance knowledge and apply that knowledge to help a business achieve its objectives. Analytic applications provide tools to facilitate the business analytics process.

    - Business Metrics and Business Management
      - Business Process Management
      - Business Performance Management
      - Business Activity Monitoring
      - Customer Relationship Management
      - Supply Chain Management
    - Performance Dashboards for Information Delivery
      - Real-time (or near real-time) monitoring
    - Scorecards for Information Delivery
      - Monitoring history & trends
    - Analytic Applications for Information Delivery
      - Customer Analysis, Marketplace Analysis, Sales Channel Analysis,


    7/41

    Range of Business Analytics

    [Figure: the range of business analytics, plotted against two axes, business value (low to high) and complexity. Source: TDWI]

    - Reporting: using query, reporting and search tools
    - Analysis: using OLAP & visualization tools
    - Monitoring: using dashboards & scorecards
    - Prediction: using predictive analysis tools


    8/41

    IDS in BI/Warehousing

    - Given the IDS characteristics of reliability, high availability, performance, and ease of use, why isn't IDS in this space?
      - IDS has traditionally been viewed as an OLTP solution
    - However, there are a lot more warehousing users on IDS than one realizes!
      - Some customers have implemented IDS warehouses at terabyte levels
      - There are a lot of features already in IDS that make it suitable for BI/warehousing
    - BI tools have become very sophisticated over the years
    - We recognize the need to provide better warehousing capabilities for IDS users


    9/41

    What's Available? IDS Warehousing Features

    Performance & Scalability
    - Inherent SMP multi-threading
    - Parallel Database Query (PDQ)
    - Light scan for fast table scans
    - Online index build
    - Efficient hash joins
    - Auto fragment elimination
    - Memory Grant Manager (MGM)
    - High Performance Loader
    - Optimistic concurrency

    Ease of Management
    - Time-cyclic data management using range partitioning
    - OPTCOMPIND optimization
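The idea behind fragment elimination on a range-partitioned table can be sketched in a few lines of Python. This is only a simplified model, not how the IDS optimizer is implemented: the quarterly fragment names and date ranges are hypothetical, and the function simply skips every fragment whose range cannot contain rows for the requested period.

```python
from datetime import date

# Hypothetical range-partitioned table: one fragment per quarter of 2009.
# Each range is half-open: low <= row_date < high.
FRAGMENTS = {
    "q1": (date(2009, 1, 1), date(2009, 4, 1)),
    "q2": (date(2009, 4, 1), date(2009, 7, 1)),
    "q3": (date(2009, 7, 1), date(2009, 10, 1)),
    "q4": (date(2009, 10, 1), date(2010, 1, 1)),
}

def fragments_to_scan(fragments, start, end):
    """Return the fragments whose range overlaps the query range [start, end)."""
    return [name for name, (low, high) in fragments.items()
            if low < end and start < high]

# A query for February through April only needs the first two fragments.
needed = fragments_to_scan(FRAGMENTS, date(2009, 2, 1), date(2009, 5, 1))
```

The same range layout is what makes time-cyclic management cheap: dropping the oldest period amounts to detaching one fragment rather than deleting rows.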


    10/41

    BI Users Classification

    1. BI on Existing OLTP Schema (Operational BI)
    2. BI on Star Schema (Data Mart)
    3. BI in a Mixed-Workload Environment
    4. Enterprise BI


    11/41

    Type 1: BI/Analytics on OLTP Schema

    - The majority of today's IDS customers need to do BI/analytics on their existing IDS (OLTP) database.
    - They currently use a combination of 4GL programs, Excel, and BI tools (Business Objects, Cognos, Crystal Reports)
      - Custom code and maintenance required by customer
      - Performance may be acceptable even on an OLTP schema
      - Allows for operational BI


    12/41

    OLTP vs. Data Warehousing Workload

    OLTP                                          Data Warehousing
    --------------------------------------------  --------------------------------------------
    Short transactions                            Longer transactions
    Relatively simple SQL                         Complex SQL with analytics
    Random updates                                Sequential updates
    Few rows accessed                             Many rows accessed
    Sub-second response time                      Secs to mins response time
    ER modeling (minimizes redundancy)            Dimensional modeling (OK to have redundancy)
    Normalized data, 5NF (minimizes duplicates)   De-normalized data, 3NF (duplicates are OK)
    Few indexes (avoids index maintenance)        OK to have more indexes (mostly read only)
    Pre-compiled queries (repeated execution)     Ad-hoc queries (unpredictable load)


    13/41

    Type 2. BI/Analytics on IDS on Star Schema

    - Transform OLTP database into Star Schema database
    - Better performance for data warehousing and dimensional queries
    - Star Schema database may be on a separate machine/domain
    - Suitable for customers building a separate data mart
    - Use IDS as is against Star Schema
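As a rough illustration of the transformation described above, the following Python sketch loads a small normalized order table into a star schema and runs a dimensional query against it. It uses the standard-library sqlite3 module purely so the example is self-contained; it stands in for IDS and says nothing about IDS-specific SQL. All table and column names are hypothetical.

```python
import sqlite3

def build_star_schema(conn):
    """Load a tiny OLTP-style table into a fact table plus two dimensions."""
    cur = conn.cursor()
    # Hypothetical OLTP source: one row per order line.
    cur.execute("CREATE TABLE orders (order_id INTEGER, order_date TEXT, "
                "product TEXT, category TEXT, amount REAL)")
    cur.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", [
        (1, "2009-01-15", "widget", "hardware", 10.0),
        (2, "2009-01-20", "gadget", "hardware", 25.0),
        (3, "2009-02-03", "manual", "books", 5.0),
        (4, "2009-02-10", "widget", "hardware", 10.0),
    ])
    # Dimension tables hold each descriptive attribute once.
    cur.execute("CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, "
                "product TEXT UNIQUE, category TEXT)")
    cur.execute("CREATE TABLE dim_month (month_key INTEGER PRIMARY KEY, "
                "month TEXT UNIQUE)")
    cur.execute("INSERT INTO dim_product (product, category) "
                "SELECT DISTINCT product, category FROM orders")
    cur.execute("INSERT INTO dim_month (month) "
                "SELECT DISTINCT substr(order_date, 1, 7) FROM orders")
    # The fact table keeps only dimension keys and measures.
    cur.execute("CREATE TABLE fact_sales (product_key INTEGER, "
                "month_key INTEGER, amount REAL)")
    cur.execute("""
        INSERT INTO fact_sales
        SELECT p.product_key, m.month_key, o.amount
        FROM orders o
        JOIN dim_product p ON p.product = o.product
        JOIN dim_month m ON m.month = substr(o.order_date, 1, 7)
    """)
    conn.commit()

def sales_by_category_and_month(conn):
    """A typical dimensional query: join the fact table to its dimensions."""
    return conn.execute("""
        SELECT p.category, m.month, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_product p ON p.product_key = f.product_key
        JOIN dim_month m ON m.month_key = f.month_key
        GROUP BY p.category, m.month
        ORDER BY p.category, m.month
    """).fetchall()

conn = sqlite3.connect(":memory:")
build_star_schema(conn)
result = sales_by_category_and_month(conn)
```

The fact table stays narrow while the descriptive attributes live in the dimensions, which is the layout dimensional queries and BI tools generally expect.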


    14/41

    What's Available? BI Tools

    The Performance Management Framework: Cognos identifies best-practice decision areas, or information "sweet spots", by business function.

    Cognos 8 provides a comprehensive set of BI tools for:
    - Reporting
    - Analysis
    - Dashboards
    - Scorecards

    Performance Management Framework for:
    - Solutions for different areas of the organization


    15/41

    Cognos Business Intelligence and Performance Management

    One Platform, One Architecture
    - Industry and Functional Solutions
    - Complete Coverage of all capabilities
    - Enterprise-Class SOA Platform


    16/41

    Data Warehouse Architecture


    17/41

    SQL Warehousing Tool Overview

    - Warehousing Process
    - Design Studio
    - Admin Console
    - Summary


    18/41

    SQL Warehousing Tools Overview

    Typical process
    - Identify requirements (Data Architect)
    - Define data transformation (ETL/ELT) process (SQL/ETL developer)
    - Development of SQL/shell scripts (SQL/ETL developer)
    - Deployment in production system (Application Architect, DBA)
    - Reporting (Business user)
    - Refine requirements

    Challenges
    - Dynamic requirements
      - Constant refinement
    - Multiple roles, tools
      - Each has a different perspective
      - Communication cost / information loss
    - Unreadable, hard-to-debug scripts
      - Poor productivity

    SQW Solution
    - Data Modeling
      - Physical Data Model (reverse engineering, new from scratch, generate DDL), compare & sync
    - Data Flows
      - Visual design
      - Optimized SQL code generation
    - Control flow supports programming logic
    - Admin Console
      - Schedule, monitor, parameterized values
    - Eclipse free reporting tool, e.g. BIRT
    - Reusable flows
    - Easy refinement
      - Copy & paste, refactor

    Values
    - Easy to design & reuse
      - Increased productivity
    - Integrated tools
      - Seamless integration inside Eclipse
    - Auto-generated code from visualized flows
      - Optimized SQL code
    - Impact analysis for any data model change


    19/41

    SQW Architecture

    [Figure: SQW architecture in three stages]
    - DESIGN: Design Studio / Design Center (Eclipse), producing data flows + control flows
    - DEPLOY: deployment preparation yields a deployment package (code units, build profile, user scripts), which is deployed through the Admin Console
    - RUNTIME: HTTP service (WAS), SQW Runtime, applications, and other servers (DataStage)
    - Databases: SQW Control DB (IDS), SQW Execution DB (IDS), Warehouse DB (IDS), and data source databases (IDS, DB2, Oracle, SQL Server)


    20/41

    SQW: Design Studio

    - Design Studio
      - Eclipse-based IDE
      - Integrated tools, shell sharing
    - Team development
      - CVS, ClearCase for check-in/check-out of projects, flows
    - Data Warehousing Project
      - Data Models
      - Data Flows
      - Control Flows
      - Warehouse Applications (deployment packages)
      - Subflow & Subprocess (reusable flow module)
      - Variables
    - Data Source Explorer
      - Database connections to multiple vendors, e.g. Informix, DB2 LUW, Oracle, SQL Server, MySQL, DB2 z/OS
    - DataStage Servers
      - Integration with IBM DataStage


    21/41

    SQW: Data Modeling

    - Physical Data Model
    - Visualized data modeling
    - Impact analysis
    - Reverse engineering or new from scratch
    - Compare & sync
    - Generate DDL
    - Overview diagram
    - Shell sharing with Rational Data Architect & other Data Studio products


    22/41

    SQW: Data Flows

    Data Flow Operators:

    -- source & target operators (table, file)

    -- SQL Transformation operators

    -- Warehousing operators

    [Figure: example data flow with file source, table source, table join, aggregation, and table target operators]


    23/41

    SQW: Data Flows

    A simple flow

    Generated SQL code

    -- optimization across SQL statements.

    -- optimized staging strategy

    -- in-database transformation
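To make the idea of an in-database transformation concrete, here is a small Python sketch of a flow shaped like the one on these slides: a file source and a table source feed a join and an aggregation that write to a table target. It is not SQW-generated code. The sqlite3 module stands in for the warehouse database, and the CSV content, table names, and column names are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical file source: CSV text standing in for a flat file.
SALES_CSV = """store_id,amount
1,100.0
1,50.0
2,75.0
"""

def run_flow(conn, csv_text):
    """File source + table source -> join -> aggregation -> table target."""
    cur = conn.cursor()
    # Table source (hypothetical store lookup table).
    cur.execute("CREATE TABLE store (store_id INTEGER PRIMARY KEY, region TEXT)")
    cur.executemany("INSERT INTO store VALUES (?, ?)", [(1, "east"), (2, "west")])
    # Stage the file source inside the database.
    cur.execute("CREATE TABLE stage_sales (store_id INTEGER, amount REAL)")
    rows = [(int(r["store_id"]), float(r["amount"]))
            for r in csv.DictReader(io.StringIO(csv_text))]
    cur.executemany("INSERT INTO stage_sales VALUES (?, ?)", rows)
    # The join and the aggregation run as a single SQL statement inside the
    # database, so no rows leave the engine during the transformation.
    cur.execute("CREATE TABLE region_sales (region TEXT, total REAL)")
    cur.execute("""
        INSERT INTO region_sales
        SELECT s.region, SUM(t.amount)
        FROM stage_sales t JOIN store s ON s.store_id = t.store_id
        GROUP BY s.region
    """)
    conn.commit()
    return cur.execute(
        "SELECT region, total FROM region_sales ORDER BY region").fetchall()

conn = sqlite3.connect(":memory:")
totals = run_flow(conn, SALES_CSV)
```

Folding the join and the aggregation into one INSERT ... SELECT is one simple instance of optimizing across SQL statements: the intermediate join result never has to be staged in its own table.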


    24/41

    SQW: Control Flows

    Control flow

    - Common utility operators
    - Control logic, parallel execution, loop iteration
    - Error handling


    25/41

    SQW Overview

    [Figure: SQW overview]
    - Design Studio: Eclipse-based design environment
      - Creates an application package (zip file) containing the generated code and a deployment profile (database connections, machine resources, variable definitions, DDL files, etc.)
    - Admin Console: production environment in WebSphere
      - Deploys the application package
      - Manages warehouse applications: schedule, monitor, manage


    26/41

    Admin Console

    - Flex RIA-based Warehouse Admin Console
    - Admin Console manages common resources (e.g. database connections, FTP servers, DataStage servers)
    - Schedule & monitor warehouse processes


    27/41

    XPS Customers Looking to Migrate to IDS

    - External Tables
      - XPS-style loader for easy migration
    - Partitioning Strategies
      - Auto fragmentation
      - Fragment Advisor
      - Fragment stats update
      - Truncate fragments
    - Primary Storage Manager (PSM)
      - For simpler, easier management of backups (replacing ISM)
    - Merge
      - UpSert capabilities

    * Features to be included in the next release(s)


    28/41

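    Using MACH 11 for OLTP/Warehousing in IDS

    [Figure: two deployment options]
    - Use separate boxes: OLTP apps run against an OLTP database, and SQW (ETL) loads a separate data warehouse database
    - Use MACH 11: on a blade server with a shared disk, OLTP apps and SQW connect through the Connection Manager to a primary and several SDS nodes, divided into an OLTP node group and an SQW node group; users see a single database view (user transparency)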


    29/41

    IDS Storage Optimization

    Now available as of 11.50.xC4

    Deep Compression + Storage Optimization


    30/41

    Row Compression Concepts

    - Compression looks for repeating patterns across the entire table
      - When a pattern is found, the string is replaced by a 12-bit symbol
      - Symbols are stored in a dictionary for fast lookup
    - Data resides compressed on pages (both on disk and in the buffer pool)
      - Significant I/O bandwidth savings: better performance
      - Significant memory savings: more efficient memory utilization
      - Some CPU overhead costs
    - Rows must be uncompressed before being processed for evaluation


    31/41

    Row Compression Using a Compression Dictionary

    - Dictionary contains repeated information from the rows in the table
    - Compression candidates can be across column boundaries or within columns

    PartCode  SPart  Quantity  LotNum  BinLoc  Aisle
    ANCPRPLT  220J   200       Z165-3  NE132   6157
    SNCPRPLT  580T   132       Z165-3  NE132   6157

    Dictionary
    01  NCPRPLT
    02  Z165-3 NE132 6157

    Uncompressed rows:
    ANCPRPLT 220J 200 Z165-3 NE132 6157
    SNCPRPLT 580T 132 Z165-3 NE132 6157

    Compressed rows:
    A (01) 220J 200 (02)
    S (01) 580T 132 (02)
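The substitution shown on this slide can be imitated with a short Python sketch. It is a toy model only: IDS stores 12-bit symbols, whereas this example uses readable text markers such as `(01)`, and the dictionary here is fixed by hand from the slide's two rows rather than built by scanning a table. It would also misbehave if the raw data itself contained a marker string.

```python
# Hypothetical dictionary taken from the slide's example: symbol -> pattern.
DICTIONARY = {
    "(01)": "NCPRPLT",
    "(02)": "Z165-3 NE132 6157",
}

def compress_row(row, dictionary):
    """Replace each dictionary pattern found in the row with its symbol."""
    # Longest patterns first so a long match is not broken up by a short one.
    for symbol, pattern in sorted(dictionary.items(),
                                  key=lambda item: len(item[1]), reverse=True):
        row = row.replace(pattern, symbol)
    return row

def uncompress_row(row, dictionary):
    """Expand every symbol back to its pattern."""
    for symbol, pattern in dictionary.items():
        row = row.replace(symbol, pattern)
    return row

rows = [
    "ANCPRPLT 220J 200 Z165-3 NE132 6157",
    "SNCPRPLT 580T 132 Z165-3 NE132 6157",
]
compressed = [compress_row(r, DICTIONARY) for r in rows]
restored = [uncompress_row(r, DICTIONARY) for r in compressed]
```

Note that the second pattern spans three columns (LotNum, BinLoc, Aisle), which is the "across column boundaries" case, while the first sits inside a single column.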


    32/41

    Storage savings

    - Tables will often compress in the range of 60% - 80%
    - Overall database storage savings will be between 40% and 50%
    - That's up to 50% less disk space needed to support an IDS 11 database!

    [Figure: Sales table 81% smaller; Product table 78% smaller]


    33/41

    Performance Benefit

    - Performance can be improved using compression
      - Many queries will benefit from compression with fewer I/Os
      - Consumes more CPU, but most customers are not 100% CPU bound
    - Lab tests show I/O-bound workloads improve by 30-40% (chart label: 40% faster)
    - Many utilities (backup and recovery, for example) will be faster
      - 2x as fast in some cases, as the database may now be half the size


    34/41

    IDS 11 Compression Operations

    - estimate_compression
      - Estimates the compression ratio on a table
    - create_dictionary
      - Creates a compression dictionary for a table
    - compress
      - Does an implicit create_dictionary and compresses all existing data
    - uncompress
      - Uncompresses the table and deactivates compression
    - uncompress_offline
      - XLOCKs the table and uncompresses it; also deactivates compression
    - purge_dictionary
      - Deletes old inactive dictionaries


    35/41

    Storage Optimization Operations

    - repack
      - Moves rows within a table or fragment to consolidate free space
    - repack_offline
      - XLOCKs the table and moves rows within a table or fragment to consolidate free space
    - shrink
      - Returns free space at the end of a table or fragment to the dbspace
      - Normally done after a repack
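The relationship between repack and shrink can be pictured with a toy Python model in which a table is a list of pages and each page is a list of rows. This is only an illustration of the bookkeeping, not of IDS internals, and the assumption that compression exactly doubles the number of rows per page is hypothetical.

```python
def repack(pages, rows_per_page):
    """Move rows toward the front so free space collects at the end."""
    rows = [row for page in pages for row in page]
    packed = [rows[i:i + rows_per_page]
              for i in range(0, len(rows), rows_per_page)]
    # Keep the page count unchanged: freed pages stay allocated but empty.
    packed += [[] for _ in range(len(pages) - len(packed))]
    return packed

def shrink(pages):
    """Drop trailing empty pages, as if handing them back to the dbspace."""
    end = len(pages)
    while end > 0 and not pages[end - 1]:
        end -= 1
    return pages[:end]

# Four pages that each hold two uncompressed rows. Assume (hypothetically)
# that compression halves the row size, so a page can now hold four rows.
table = [["r1", "r2"], ["r3", "r4"], ["r5", "r6"], ["r7", "r8"]]
repacked = repack(table, rows_per_page=4)
shrunk = shrink(repacked)
```

In this model shrink on its own would free nothing, because every page still holds rows until repack has moved them forward, which matches the note that shrink is normally done after a repack.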


    36/41

    Compression On Data Page With Multiple Rows

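    [Figure (animated slide): a data page holding multiple uncompressed rows is compressed using the dictionary; repack then consolidates the compressed rows from multiple pages into fewer pages; shrink releases the resulting empty data pages]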


    37/41

    Admin API Interface

    All compression and storage optimization operations are invoked via the IDS Admin API built-in UDRs:

    execute function task();
    execute function admin();

    Example:

    execute function task
    (
        'table compress repack shrink',
        'table_name', 'database_name', 'owner_name'
    );
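The call shape in the example above (a command string followed by the table, database, and owner names) can be assembled programmatically. The Python helper below is a hypothetical sketch that only builds the statement text; it does not connect to a server. It limits itself to the three operations combined in the slide's example, since other operations may take different arguments, and the table, database, and owner values in the usage line are placeholders.

```python
# Operations combined in the slide's example; others are out of scope here.
ALLOWED = ("compress", "repack", "shrink")

def quote(value):
    """Wrap a value in single quotes, doubling any embedded single quote."""
    return "'" + value.replace("'", "''") + "'"

def build_task_call(operations, table, database, owner):
    """Assemble the text of an EXECUTE FUNCTION task(...) statement."""
    for op in operations:
        if op not in ALLOWED:
            raise ValueError(f"unsupported operation in this sketch: {op}")
    command = "table " + " ".join(operations)
    args = ", ".join(quote(a) for a in (command, table, database, owner))
    return f"EXECUTE FUNCTION task({args});"

# Placeholder names, used only for illustration.
statement = build_task_call(["compress", "repack", "shrink"],
                            "customer", "stores_demo", "informix")
```

Generating the text from a validated list makes it easy to apply the same operation sequence to many tables from a script.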


    38/41

    Features That Cannot Be Compressed

    - Out-of-row data (e.g. blobs)
    - Indexes
    - Temp tables
    - Catalog tables (data dictionary)
    - Partition tables (tablespace tablespace)
    - Dictionary partitions
    - Tables in the following databases:
      - sysmaster
      - sysutils
      - sysuser
      - syscdr
      - syscdcv1


    39/41

    HDR, ER, CDC (DataMirror) and Compression

    All are supported on compressed tables

    - HDR
      - Tables will be compressed on the secondary if and only if they are compressed on the primary
    - ER
      - Compression status of tables is independent between source and target, specified by the user
    - CDC
      - Compression of targets is a function of what the target database supports and what the user specifies


    40/41

    Summary

    - Storage optimization through IDS 11 compression can save 40-50% of your database storage requirements
    - For I/O-bound workloads, compression can also improve performance
    - You not only see your online database shrink but, often more importantly, your backup storage and disaster recovery storage is cut in half as well
    - In real customer examples, storage savings are realized and performance benefits are apparent
    - Add in the time savings with utilities processing (in particular, database backup and recovery time is cut in half) and you can see the benefits of IDS 11 compression
