PowerCenter 9.x Level I Developer Lab Guide 7 - 1
Lab 7
Heterogeneous Join,
Aggregator, & Sorter
Lab at a Glance
  Objectives
  Summary
  Duration
Exercises
  Exercise 1: Create the Mapping
  Exercise 2: Create and Run the Workflow
Reference
Lab at a Glance
The exercises in this lab are designed to walk the student
through the process of using the Joiner transformation to join data from heterogeneous sources. The student will also learn how to use the Aggregator & Sorter transformations.
Objectives
After completing the lab, the student will be able to:
Perform a heterogeneous join using the Joiner transformation.
Use the Sorter transformation (comparable to a SQL ORDER BY clause).
Aggregate data using the Aggregator transformation (comparable to a SQL GROUP BY).
Use the Sorted Input property of the Aggregator transformation.
Summary
The purpose of this lab is to populate an ODS table by loading
data from a flat file and a relational table. The flat file contains information about orders placed with a vendor for products and supplies.
An example of data from the flat file follows:
The product table contains product data, such as the make and
model name, vendor ID, and cost. An example of data from the product table follows:
The goal of this lab is to load an ODS table with the costs
summarized by date, product and vendor. This will be the raw data that will be used to populate the fact table.
Assuming there is a large amount of data broken down into several groups, the data will flow through the mapping much faster if it is sorted. Because the data does not come only from a relational source, grouping it in the Source Qualifier would serve very little purpose.
SOURCES: PRODUCT (relational table), ORDER (flat file)
TARGET: ODS_ORDER_AMOUNT
The completed mapping should look as follows:
Duration
This lab should take approximately 45 minutes.
Exercises
Exercise 1: Create the Mapping
Step 1. Import the sources.
Clear the Source Analyzer workspace (right-click anywhere in the workspace and select Clear All).
Continue to work in the assigned student folder and import the tab delimited flat file, ORDER.txt.
In the wizard step 1, check the Import Field Names From First Line checkbox because the first row includes the
field names.
In the wizard step 2, Delimiters area, check Tab and
uncheck Comma.
Import the relational table, PRODUCT from the SDBU
database schema.
The sources should look as follows:
Save your work.
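For readers who want to see what the two wizard settings mean outside of Designer, the following is a minimal Python sketch of reading a tab-delimited file whose first line holds the field names. The sample rows are made up for illustration, and only the three ORDER fields used later in this lab are assumed; the real ORDER.txt may contain other columns and values.

```python
import csv
import io

# Made-up sample standing in for ORDER.txt (hypothetical rows, not the lab data).
SAMPLE = (
    "ORDER_DATE\tPRODUCT\tQUANTITY\n"
    "01-JAN-2003\tP100\t5\n"
    "02-JAN-2003\tP200\t3\n"
)

def read_tab_delimited(text):
    """Parse tab-delimited text whose first line holds the field names."""
    # delimiter="\t" mirrors checking Tab and unchecking Comma;
    # DictReader takes the field names from the first line.
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)

rows = read_tab_delimited(SAMPLE)
print(rows[0]["PRODUCT"])
```

Every value is read as a string here; datatypes would still have to be assigned afterwards, much as the flat file definition assigns them in the Source Analyzer.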
Step 2. Import the target.
Switch to the Warehouse Designer tool.
Clear the workspace. (Right-click anywhere in the
workspace and select Clear All).
Import the relational database target,
ODS_ORDER_AMOUNT, from the TDBUxx database schema.
Save your work.
Step 3. Create a mapping.
Create a mapping called m_ODS_ORDER_AMOUNT_xx.
Step 4. Add sources and target.
Remember that each source must have its own Source Qualifier. If they do not, then they will have to be created manually.
Note that as more objects are added to the mapping, the Navigator and Output windows can be toggled off, providing more room in the workspace:
Add the ORDER and PRODUCT source definitions with their
respective Source Qualifiers to the mapping:
Add the target definition, ODS_ORDER_AMOUNT:
Save the mapping.
Step 5. Create a joiner transformation.
The Joiner transformation can join data from two related heterogeneous sources that reside in different locations or file systems.
One of the two sources in each Joiner must be deemed the Master and the other will become the Detail by default.
Refer to the Reference section at the end of this Lab for more information about selecting Master and Detail sources.
Create a Joiner transformation and name it
jnr_ODS_ORDER_AMOUNT.
Copy/link the following ports to jnr_ODS_ORDER_AMOUNT:
From sq_ORDER, add ORDER_DATE, PRODUCT and
QUANTITY.
From sq_PRODUCT, add PRODUCT_CODE,
VENDOR_ID, PRICE and COST.
Edit jnr_ODS_ORDER_AMOUNT. On the Ports tab, select
the ports from sq_PRODUCT as the master ports by checking the M boxes.
On the Ports tab, increase size of the PRODUCT port to a
precision of 10.
On the Condition tab, add a condition where
PRODUCT_CODE = PRODUCT by clicking on the Add a
New Condition button.
On the Properties tab, confirm the Join Type is Normal Join.
The Joiner transformation should appear as follows:
Save the mapping.
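The master/detail idea behind this step can be sketched in Python as a conceptual illustration only; it is not how PowerCenter implements the Joiner. The function name normal_join and all sample rows are hypothetical. The master flow (PRODUCT) is cached by the join column, the detail flow (ORDER) is streamed against it, and rows without a match are dropped, as in a Normal Join.

```python
def normal_join(master_rows, detail_rows, master_key, detail_key):
    """Join detail rows against a cached master; unmatched rows are dropped."""
    # Cache the master flow, keyed by the join column.
    cache = {}
    for m in master_rows:
        cache.setdefault(m[master_key], []).append(m)
    # Stream the detail flow one row at a time against the cache.
    for d in detail_rows:
        for m in cache.get(d[detail_key], []):
            yield {**d, **m}

# Made-up rows for illustration only.
product = [
    {"PRODUCT_CODE": "P100", "VENDOR_ID": "V1", "PRICE": 20.0, "COST": 12.0},
    {"PRODUCT_CODE": "P200", "VENDOR_ID": "V2", "PRICE": 9.0, "COST": 5.0},
]
order = [
    {"ORDER_DATE": "01-JAN-2003", "PRODUCT": "P100", "QUANTITY": 5},
    {"ORDER_DATE": "01-JAN-2003", "PRODUCT": "P999", "QUANTITY": 1},  # no matching product
]

# Equivalent of the condition PRODUCT_CODE = PRODUCT.
joined = list(normal_join(product, order, "PRODUCT_CODE", "PRODUCT"))
print(len(joined))
```

Only the master side is held in memory in this sketch, which is why the choice of master matters when the two flows differ in size.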
Step 6. Create an expression transformation.
Create an Expression transformation called
exp_ORDER_DATE.
Copy/link the ORDER_DATE port from the
jnr_ODS_ORDER_AMOUNT to exp_ORDER_DATE.
Rename the ORDER_DATE port to ORDER_DATE_in and make it input only.
Add a new port called ORDER_DATE_out with the datatype date/time.
Make ORDER_DATE_out an output only port.
Create an expression for the ORDER_DATE_out port as follows:
TO_DATE(ORDER_DATE_in, 'DD-MON-YYYY')
The Expression transformation should look as follows:
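As a point of comparison outside of Designer, the string-to-date conversion in this step resembles the following Python sketch. It assumes the incoming strings look like 15-JAN-2003 (a made-up value) and that English month abbreviations are in effect; the helper name to_date_dd_mon_yyyy is hypothetical.

```python
from datetime import datetime

def to_date_dd_mon_yyyy(text):
    """Parse strings such as '15-JAN-2003' into a datetime."""
    # %d = day of month, %b = abbreviated month name, %Y = four-digit year.
    # Month-name matching depends on the active locale (English assumed here).
    return datetime.strptime(text, "%d-%b-%Y")

order_date_out = to_date_dd_mon_yyyy("15-JAN-2003")
print(order_date_out.date())
```

A string that does not fit the format raises ValueError in this sketch, so malformed dates surface immediately instead of passing through as text.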
Step 7. Create a Sorter transformation.
Create a Sorter transformation and name it
srt_ODS_ORDER_AMOUNT.
Copy/link the ORDER_DATE_out port from
exp_ORDER_DATE.
Copy/link the QUANTITY, PRODUCT_CODE, VENDOR_ID,
PRICE and COST ports from jnr_ODS_ORDER_AMOUNT.
Rename ORDER_DATE_out to ORDER_DATE.
Check the key checkbox for the ORDER_DATE,
PRODUCT_CODE and VENDOR_ID ports. Be certain the ports are in that order.
The mapping should look as follows:
Click OK.
Save the repository.
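The effect of checking three key ports in a fixed order can be sketched in Python as a multi-column sort. The rows below are made up, and sort_rows is a hypothetical helper; the point is only that ORDER_DATE is compared first, then PRODUCT_CODE, then VENDOR_ID.

```python
from datetime import date

# Key order matters: ORDER_DATE first, then PRODUCT_CODE, then VENDOR_ID.
SORT_KEYS = ("ORDER_DATE", "PRODUCT_CODE", "VENDOR_ID")

def sort_rows(rows, keys=SORT_KEYS):
    """Return the rows sorted ascending by the given keys, in key order."""
    return sorted(rows, key=lambda r: tuple(r[k] for k in keys))

# Made-up rows for illustration only.
rows = [
    {"ORDER_DATE": date(2003, 1, 2), "PRODUCT_CODE": "P100", "VENDOR_ID": "V1", "QUANTITY": 4},
    {"ORDER_DATE": date(2003, 1, 1), "PRODUCT_CODE": "P200", "VENDOR_ID": "V2", "QUANTITY": 3},
    {"ORDER_DATE": date(2003, 1, 1), "PRODUCT_CODE": "P100", "VENDOR_ID": "V1", "QUANTITY": 5},
    {"ORDER_DATE": date(2003, 1, 1), "PRODUCT_CODE": "P100", "VENDOR_ID": "V1", "QUANTITY": 2},
]

sorted_rows = sort_rows(rows)
print([r["QUANTITY"] for r in sorted_rows])
```

After sorting, rows that share the same three key values sit next to each other, which is what the Aggregator in the next step relies on.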
Step 8. Create an Aggregator transformation.
Create an Aggregator transformation and name it agg_ODS_ORDER_AMOUNT.
Copy/link all ports from srt_ODS_ORDER_AMOUNT to agg_ODS_ORDER_AMOUNT.
Edit agg_ODS_ORDER_AMOUNT.
It is important to specify how to group the data when doing aggregate calculations.
The group by ports can be input, output or variable ports.
The order of the ports from top to bottom determines the group by order.
On the Ports tab, group the data by:
ORDER_DATE
PRODUCT_CODE
VENDOR_ID
On the Ports tab, append _in to the QUANTITY port.
The end result will be QUANTITY_in.
Position the QUANTITY_in port after the ORDER_DATE
port.
Make this an input only port by turning off the Output
port checkbox.
On the Ports tab, create an output port (after
QUANTITY_in) called QUANTITY_out with a data type of integer and precision 10.
Add the following expression to the QUANTITY_out port:
SUM(QUANTITY_in)
Under the Properties tab, check the Sorted Input box.
Click OK.
The mapping should look as follows:
Save the mapping.
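Why sorted input helps can be illustrated with a small Python sketch; it is a conceptual analogy, not PowerCenter's implementation. When rows arrive already ordered by the group by ports, each group is contiguous, so a total can be emitted as soon as the group ends and only one group is held at a time. The rows are made up, and aggregate_sorted is a hypothetical name.

```python
from itertools import groupby
from operator import itemgetter

GROUP_BY = ("ORDER_DATE", "PRODUCT_CODE", "VENDOR_ID")

def aggregate_sorted(rows, group_by=GROUP_BY):
    """Sum QUANTITY_in per group; rows must already be sorted by the group by ports."""
    key = itemgetter(*group_by)
    # groupby only merges adjacent rows, which is why the input must be sorted.
    for group_key, group in groupby(rows, key=key):
        out = dict(zip(group_by, group_key))
        out["QUANTITY_out"] = sum(r["QUANTITY_in"] for r in group)  # SUM(QUANTITY_in)
        yield out

# Made-up rows, already sorted by ORDER_DATE, PRODUCT_CODE, VENDOR_ID.
sorted_rows = [
    {"ORDER_DATE": "2003-01-01", "PRODUCT_CODE": "P100", "VENDOR_ID": "V1", "QUANTITY_in": 5},
    {"ORDER_DATE": "2003-01-01", "PRODUCT_CODE": "P100", "VENDOR_ID": "V1", "QUANTITY_in": 2},
    {"ORDER_DATE": "2003-01-01", "PRODUCT_CODE": "P200", "VENDOR_ID": "V2", "QUANTITY_in": 3},
]

totals = list(aggregate_sorted(sorted_rows))
print([t["QUANTITY_out"] for t in totals])
```

If the same function were fed unsorted rows, a group could appear more than once in the output, which is the kind of wrong result that sorting upstream is meant to prevent.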
Step 9. Link the target definition.
Use the Autolink feature to link from
agg_ODS_ORDER_AMOUNT to ODS_ORDER_AMOUNT.
In the Autolink dialog box, click the More>> button.
All of the port names are the same as the target except
QUANTITY_out. Autolink by name, using _out in the From Transformation suffix field:
Click the OK button.
The linking should look as follows:
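As an aside on the suffix option used in this step, name-based linking with a From Transformation suffix can be thought of as stripping the suffix before comparing names. The Python sketch below is an assumption-laden illustration: autolink_by_name is a hypothetical helper, exact name matching is assumed, and the target column list is inferred from the ports used in this lab, not taken from the actual table definition.

```python
def autolink_by_name(from_ports, to_ports, from_suffix=""):
    """Pair source ports with target ports, ignoring a suffix on source names."""
    targets = set(to_ports)
    links = {}
    for port in from_ports:
        base = port
        # Strip the suffix (for example "_out") before looking for a match.
        if from_suffix and port.endswith(from_suffix):
            base = port[: -len(from_suffix)]
        if base in targets:
            links[port] = base
    return links

# Output ports of the Aggregator as built in this lab.
agg_ports = ["ORDER_DATE", "QUANTITY_out", "PRODUCT_CODE", "VENDOR_ID", "PRICE", "COST"]
# Assumed target columns (hypothetical list for illustration).
target_ports = ["ORDER_DATE", "QUANTITY", "PRODUCT_CODE", "VENDOR_ID", "PRICE", "COST"]

links = autolink_by_name(agg_ports, target_ports, from_suffix="_out")
print(links["QUANTITY_out"])
```

Without the suffix, QUANTITY_out would find no partner in this sketch, while the other five ports would still link by their unchanged names.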
Step 10. Save and validate the mapping.
Use the Arrange All Iconic feature. The completed mapping should look as follows:
Save the mapping and check for validation information
on the Save tab in the Output window:
Exercise 2: Create and Run the Workflow
Step 1. Create a workflow.
Create a new workflow and name it
wf_ODS_ORDER_AMOUNT_xx.
Create a session task called s_m_ODS_ORDER_AMOUNT_xx.
Edit the session and select the Mapping tab.
In the Navigation box, select the source SQ_ORDER.
Under Properties for ORDER: select File Reader; for
Source filename enter ORDER.txt.
In the Navigation box, select the source SQ_PRODUCT.
Under Connections, click on the down arrow, select
native_source and click OK.
In the Navigation box, select the target
ODS_ORDER_AMOUNT.
Under Connections, click on the down arrow, select
native_target_xx and click OK.
Under Properties, the Target load type should be
defaulted to Normal. Scroll down to select the Truncate target table option.
Click OK to close the Edit Tasks dialog box.
Save the repository.
Link Start to s_m_ODS_ORDER_AMOUNT_xx.
Save, validate and start wf_ODS_ORDER_AMOUNT_xx.
Monitor and review the results for
s_m_ODS_ORDER_AMOUNT_xx in the Workflow Monitor.
Step 2. Verify the results: session properties.
Step 3. Verify the results: session transformation statistics.
Step 4. Verify the results: preview data (in Designer).
Note that only the first few rows are shown here.
Reference
Only the sorted Joiner can link two flows arising from the same Source Qualifier.
To join more than two flows, multiple Joiners must be nested: the results from one Joiner must be passed on to the next, and so forth.
The Joiner needs a minimum of two ports, one from each input flow, to create the join condition. Those two ports must have compatible data types and precisions for the join condition to be valid.
In a Joiner, one input flow must be designated as the master and the other as the detail. Master input ports are cached in memory, so choose the flow with the fewest duplicate rows as the master. Specify the master input ports by checking the M attribute (unchecked input ports will be detail).