PowerCenter 9.x Level I Developer Lab Guide

Lab 7: Heterogeneous Join, Aggregator, & Sorter

Lab at a Glance
    Objectives
    Summary
    Duration
Exercises
    Exercise 1: Create the Mapping
    Exercise 2: Create and Run the Workflow
Reference

Upload: dulce-vallejo

Post on 08-Sep-2015

  • Lab 7. Heterogeneous Join, Aggregator, & Sorter

    7 - 2 PowerCenter 9.x Level I Developer Lab Guide

    Lab at a Glance

    The exercises in this lab are designed to walk the student through the process of using the Joiner transformation to join data from heterogeneous sources. The student will also learn how to use the Aggregator and Sorter transformations.

    Objectives

    After completing the lab, the student will be able to:

    Perform a heterogeneous join using the Joiner transformation.

    Use the Sorter transformation (comparable to a SQL ORDER BY clause).

    Aggregate data using the Aggregator transformation (comparable to a SQL GROUP BY clause).

    Use the sorted input property of the Aggregator transformation.

    Summary

    The purpose of this lab is to populate an ODS table by loading data from a flat file and a relational table. The flat file contains information about orders placed with a vendor for products and supplies.

    An example of data from the flat file follows:

    The product table contains product data, such as the make and model name, vendor ID, and cost. An example of data from the product table follows:

    The goal of this lab is to load an ODS table with the costs summarized by date, product, and vendor. This will be the raw data used to populate the fact table.

    Assuming a large volume of data that falls into several groups, the data will flow through the mapping much faster if it is sorted before it is aggregated. Because the data does not come only from a relational source, grouping it in the Source Qualifier would serve very little purpose.

    SOURCES: PRODUCT, ORDER flat file

    TARGET: ODS_ORDER_AMOUNT

    The completed mapping should look as follows:

    Duration

    This lab should take approximately 45 minutes.

    Exercises

    Exercise 1: Create the Mapping

    Step 1. Import the sources.

    Clear the Source Analyzer workspace (right-click anywhere in the workspace and select Clear All).

    Continue to work in the assigned student folder and import the tab-delimited flat file, ORDER.txt.

    In step 1 of the wizard, check the Import Field Names From First Line checkbox because the first row contains the field names.

    In step 2 of the wizard, in the Delimiters area, check Tab and uncheck Comma.

    Import the relational table, PRODUCT, from the SDBU database schema.

    The sources should look as follows:

    Save your work.
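
    The two wizard settings in Step 1 (field names taken from the first line, Tab as the delimiter) describe how ORDER.txt is laid out. As a rough illustration outside PowerCenter, the Python sketch below reads a file with that layout. The sample rows are invented for illustration, and the real file may contain columns beyond the three that the mapping uses later.

```python
import csv
import io

# Hypothetical stand-in for ORDER.txt; these values are made up for illustration.
SAMPLE = (
    "ORDER_DATE\tPRODUCT\tQUANTITY\n"
    "08-SEP-2015\tP100\t3\n"
    "09-SEP-2015\tP200\t5\n"
)


def read_orders(stream):
    """Read tab-delimited rows, taking the field names from the first line."""
    reader = csv.DictReader(stream, delimiter="\t")
    return list(reader)


orders = read_orders(io.StringIO(SAMPLE))
print(orders[0]["PRODUCT"])
```

    Every value is read as a string here, which is why the mapping later needs an expression to convert ORDER_DATE to a date.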

    Step 2. Import the target.

    Switch to the Warehouse Designer tool.

    Clear the workspace (right-click anywhere in the workspace and select Clear All).

    Import the relational database target, ODS_ORDER_AMOUNT, from the TDBUxx database schema.

    Save your work.

    Step 3. Create a mapping.

    Create a mapping called m_ODS_ORDER_AMOUNT_xx.

    Step 4. Add sources and target.

    Remember that each source must have its own Source Qualifier. If a source does not have one, it will have to be created manually.

    Note that as more objects are added to the mapping, the Navigator and Output windows can be toggled off, providing more room in the workspace:

    Add the ORDER and PRODUCT source definitions with their respective Source Qualifiers to the mapping:

    Add the target definition, ODS_ORDER_AMOUNT:

    Save the mapping.

    Step 5. Create a joiner transformation.

    The Joiner transformation can join data from two related heterogeneous sources that reside in different locations or file systems.

    One of the two sources in each Joiner must be deemed the Master, and the other will become the Detail by default.

    Refer to the Reference section at the end of this lab for more information about selecting Master and Detail sources.

    Create a Joiner transformation and name it jnr_ODS_ORDER_AMOUNT.

    Copy/link the following ports to jnr_ODS_ORDER_AMOUNT:

    From sq_ORDER, add ORDER_DATE, PRODUCT and QUANTITY.

    From sq_PRODUCT, add PRODUCT_CODE, VENDOR_ID, PRICE and COST.

    Edit jnr_ODS_ORDER_AMOUNT. On the Ports tab, select the ports from sq_PRODUCT as the master ports by checking the M boxes.

    On the Ports tab, increase the precision of the PRODUCT port to 10.

    On the Condition tab, click the Add a New Condition button and add the condition PRODUCT_CODE = PRODUCT.

    On the Properties tab, confirm the Join Type is Normal Join.

    The Joiner transformation should appear as follows:

    Save the mapping.
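
    To make the roles in Step 5 concrete, here is a conceptual Python sketch of a Normal Join with PRODUCT as the master and ORDER as the detail. It is an analogy only, not a description of how PowerCenter implements the Joiner; the function name normal_join and all sample values are hypothetical.

```python
def normal_join(master_rows, detail_rows, master_key, detail_key):
    """Keep only detail rows that have a matching master row (Normal Join)."""
    # Hold the master flow in memory, keyed by its join port.
    cache = {}
    for m in master_rows:
        cache.setdefault(m[master_key], []).append(m)

    # Stream the detail flow; rows with no master match are discarded.
    joined = []
    for d in detail_rows:
        for m in cache.get(d[detail_key], []):
            joined.append({**d, **m})
    return joined


# Invented sample data using the port names from the lab.
products = [
    {"PRODUCT_CODE": "P100", "VENDOR_ID": 1, "PRICE": 20.0, "COST": 12.5},
    {"PRODUCT_CODE": "P200", "VENDOR_ID": 2, "PRICE": 9.0, "COST": 4.0},
]
orders = [
    {"ORDER_DATE": "08-SEP-2015", "PRODUCT": "P100", "QUANTITY": 3},
    {"ORDER_DATE": "08-SEP-2015", "PRODUCT": "P999", "QUANTITY": 1},
]

# Join condition: PRODUCT_CODE = PRODUCT
joined = normal_join(products, orders, "PRODUCT_CODE", "PRODUCT")
print(joined)
```

    The order for P999 has no matching product, so it does not appear in the result. That is the behavior the Normal Join setting asks for.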

    Step 6. Create an expression transformation.

    Create an Expression transformation called exp_ORDER_DATE.

    Copy/link the ORDER_DATE port from jnr_ODS_ORDER_AMOUNT to exp_ORDER_DATE.

    Rename the ORDER_DATE port to ORDER_DATE_in and make it input only.

    Add a new port called ORDER_DATE_out with the datatype date/time.

    Make ORDER_DATE_out an output only port.

    Create an expression for the ORDER_DATE_out port as follows:

    TO_DATE(ORDER_DATE_in, 'DD-MON-YYYY')
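
    For readers more familiar with general-purpose languages, the conversion performed by the TO_DATE expression can be approximated in Python as follows. This sketch assumes that the incoming strings look like 08-SEP-2015 and that month abbreviations are in English; the helper name to_date_dd_mon_yyyy is hypothetical.

```python
from datetime import datetime


def to_date_dd_mon_yyyy(text):
    """Parse a string such as '08-SEP-2015' into a datetime value."""
    # %d = day of month, %b = abbreviated month name, %Y = four-digit year
    return datetime.strptime(text, "%d-%b-%Y")


parsed = to_date_dd_mon_yyyy("08-SEP-2015")
print(parsed.date().isoformat())
```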

    The Expression transformation should look as follows:

    Step 7. Create a Sorter transformation.

    Create a Sorter transformation and name it srt_ODS_ORDER_AMOUNT.

    Copy/link the ORDER_DATE_out port from exp_ORDER_DATE.

    Copy/link the QUANTITY, PRODUCT_CODE, VENDOR_ID, PRICE and COST ports from jnr_ODS_ORDER_AMOUNT.

    Rename ORDER_DATE_out to ORDER_DATE.

    Check the Key checkbox for the ORDER_DATE, PRODUCT_CODE and VENDOR_ID ports. Be certain the ports are in that order.
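
    The order of the key ports matters because rows are compared on the first key, then the second, then the third. A small Python sketch of the same composite-key sort is shown here, assuming ascending order on every key; the sample rows are invented.

```python
from datetime import date
from operator import itemgetter

# Key ports, in the order they must appear in the Sorter.
SORT_KEYS = ("ORDER_DATE", "PRODUCT_CODE", "VENDOR_ID")


def sort_rows(rows):
    """Sort by ORDER_DATE, then PRODUCT_CODE, then VENDOR_ID (ascending)."""
    return sorted(rows, key=itemgetter(*SORT_KEYS))


rows = [
    {"ORDER_DATE": date(2015, 9, 9), "PRODUCT_CODE": "P100", "VENDOR_ID": 1, "QUANTITY": 2},
    {"ORDER_DATE": date(2015, 9, 8), "PRODUCT_CODE": "P200", "VENDOR_ID": 2, "QUANTITY": 5},
    {"ORDER_DATE": date(2015, 9, 8), "PRODUCT_CODE": "P100", "VENDOR_ID": 1, "QUANTITY": 3},
]

sorted_rows = sort_rows(rows)
for row in sorted_rows:
    print(row["ORDER_DATE"], row["PRODUCT_CODE"], row["VENDOR_ID"], row["QUANTITY"])
```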

    The mapping should look as follows:

    Click OK.

    Save the repository.

    Step 8. Create an Aggregator transformation.

    Create an Aggregator transformation and name it agg_ODS_ORDER_AMOUNT.

    Copy/link all ports from srt_ODS_ORDER_AMOUNT to agg_ODS_ORDER_AMOUNT.

    Edit agg_ODS_ORDER_AMOUNT.

    It is important to specify how to group the data when doing aggregate calculations.

    The group by ports can be input, output or variable ports.

    The order of the ports from top to bottom determines the group by order.

    On the Ports tab, group the data by:

    ORDER_DATE

    PRODUCT_CODE

    VENDOR_ID

    On the Ports tab, append _in to the name of the QUANTITY port. The end result will be QUANTITY_in.

    Position the QUANTITY_in port after the ORDER_DATE port.

    Make this an input-only port by clearing the Output port checkbox.

    On the Ports tab, create an output port (after QUANTITY_in) called QUANTITY_out with a datatype of integer and a precision of 10.

    Add the following expression to the QUANTITY_out port:

    SUM(QUANTITY_in)

    Under the Properties tab, check the Sorted Input box.

    Click OK.
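
    The Sorted Input setting relies on the rows arriving already ordered by the group by ports, which is what the upstream Sorter provides. The Python sketch below is an analogy for why that helps: when the input is sorted, a group is finished as soon as the key changes, so only one running total has to be held at a time. It is not PowerCenter's implementation, it handles only the QUANTITY sum, and the sample rows are invented.

```python
from itertools import groupby
from operator import itemgetter

# Group by ports, in top-to-bottom order.
GROUP_BY = ("ORDER_DATE", "PRODUCT_CODE", "VENDOR_ID")


def aggregate_sorted(rows):
    """Yield SUM(QUANTITY) per group; rows must already be sorted by GROUP_BY."""
    for group_key, group in groupby(rows, key=itemgetter(*GROUP_BY)):
        # groupby closes a group when the key changes, so no lookup table
        # of all groups is needed.
        total = sum(row["QUANTITY"] for row in group)
        yield dict(zip(GROUP_BY, group_key), QUANTITY=total)


rows = [
    {"ORDER_DATE": "2015-09-08", "PRODUCT_CODE": "P100", "VENDOR_ID": 1, "QUANTITY": 3},
    {"ORDER_DATE": "2015-09-08", "PRODUCT_CODE": "P100", "VENDOR_ID": 1, "QUANTITY": 4},
    {"ORDER_DATE": "2015-09-08", "PRODUCT_CODE": "P200", "VENDOR_ID": 2, "QUANTITY": 5},
]

totals = list(aggregate_sorted(rows))
for row in totals:
    print(row)
```

    If the rows were not sorted, the same key could reappear later in the stream and this approach would emit it as a separate group, which is why Sorted Input should only be checked when the data really is sorted on the group by ports.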

    The mapping should look as follows:

    Save the mapping.

    Step 9. Link the target definition.

    Use the Autolink feature to link from agg_ODS_ORDER_AMOUNT to ODS_ORDER_AMOUNT.

    In the Autolink dialog box, click the More>> button.

    All of the port names are the same as the target except QUANTITY_out. Autolink by name, using _out in the From Transformation suffix field:

    Click the OK button.
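
    The idea behind autolinking by name with a From Transformation suffix can be sketched in Python as below. This is only an illustration of the matching rule as the step describes it; the exact matching logic inside the Designer is not covered by this lab, the function name autolink is hypothetical, and the target column list is assumed from the statement that all names match except QUANTITY_out.

```python
def autolink(from_ports, to_ports, from_suffix=""):
    """Pair ports whose names match once the suffix is removed from the source side."""
    targets = set(to_ports)
    links = {}
    for port in from_ports:
        if from_suffix and port.endswith(from_suffix):
            base = port[: -len(from_suffix)]
        else:
            base = port
        if base in targets:
            links[port] = base
    return links


# Output ports of the Aggregator (QUANTITY_in is input-only, so it is omitted).
agg_ports = ["ORDER_DATE", "QUANTITY_out", "PRODUCT_CODE", "VENDOR_ID", "PRICE", "COST"]
# Assumed target columns.
target_ports = ["ORDER_DATE", "QUANTITY", "PRODUCT_CODE", "VENDOR_ID", "PRICE", "COST"]

links = autolink(agg_ports, target_ports, from_suffix="_out")
print(links)
```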

    The linking should look as follows:

    Step 10. Save and validate the mapping.

    Use the Arrange All Iconic feature. The completed mapping should look as follows:

    Save the mapping and check for validation information on the Save tab in the Output window:

    Exercise 2: Create and Run the Workflow

    Step 1. Create a workflow.

    Create a new workflow and name it wf_ODS_ORDER_AMOUNT_xx.

    Create a session task called s_m_ODS_ORDER_AMOUNT_xx.

    Edit the session and select the Mapping tab.

    In the Navigation box, select the source SQ_ORDER.

    Under Properties for ORDER, select File Reader, and for Source filename enter ORDER.txt.

    In the Navigation box, select the source SQ_PRODUCT.

    Under Connections, click the down arrow, select native_source and click OK.

    In the Navigation box, select the target ODS_ORDER_AMOUNT.

    Under Connections, click the down arrow, select native_target_xx and click OK.

    Under Properties, the Target load type should default to Normal. Scroll down and select the Truncate target table option.

    Click OK to close the Edit Tasks dialog box.

    Save the repository.

    Link Start to s_m_ODS_ORDER_AMOUNT_xx.

    Save, validate and start wf_ODS_ORDER_AMOUNT_xx.

    Monitor and review the results for s_m_ODS_ORDER_AMOUNT_xx in the Workflow Monitor.

    Step 2. Verify the results: session properties.

    Step 3. Verify the results: session transformation statistics.

    Step 4. Verify the results: preview data (in Designer).

    Note that only the first few rows are shown here.

    Reference

    Only the sorted Joiner can link two flows arising from the same Source Qualifier.

    To join more than two flows, multiple Joiners must be nested: the results from one Joiner must be passed on to the next, and so forth.

    The Joiner needs a minimum of two ports, one from each input flow, to create the join condition. Those two ports must have compatible datatypes and precisions for the join condition to be valid.

    In a Joiner, one input flow must be designated as the master and the other as the detail. Master input ports are cached in memory, so choose the flow with the fewest duplicate rows as the master. Specify the master input ports by checking the M attribute (unchecked input ports will be detail).
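
    The note above about nesting Joiners can be pictured with a short Python sketch in which the output of one two-way join becomes the detail input of the next. The third flow (vendors, with VENDOR_KEY and VENDOR_NAME) is purely hypothetical and is not part of this lab; the join helper is an analogy, not PowerCenter's implementation.

```python
def join(master_rows, detail_rows, master_key, detail_key):
    """Two-way inner join: cache the master, then stream the detail."""
    cache = {}
    for m in master_rows:
        cache.setdefault(m[master_key], []).append(m)
    return [
        {**d, **m}
        for d in detail_rows
        for m in cache.get(d[detail_key], [])
    ]


# Invented sample rows.
orders = [{"PRODUCT": "P100", "QUANTITY": 3}]
products = [{"PRODUCT_CODE": "P100", "VENDOR_ID": 1}]
vendors = [{"VENDOR_KEY": 1, "VENDOR_NAME": "Acme"}]  # hypothetical third flow

# First Joiner: ORDER (detail) with PRODUCT (master).
first = join(products, orders, "PRODUCT_CODE", "PRODUCT")
# Second Joiner: the result of the first join (detail) with vendors (master).
second = join(vendors, first, "VENDOR_KEY", "VENDOR_ID")
print(second)
```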