Data Quality Management (DQM) Oracle TCA

Ensure and Maintain Data Quality in TCA Registry

White Paper June 2004

Ramakrishna Goud, Accenture, IDC [email protected]

Data Quality Management (DQM) Oracle Trading Community Architecture

Executive Overview

Oracle Trading Community Architecture (TCA) is a data model that holds information about parties and customers. The TCA registry is a single repository shared across the Oracle ERP and CRM applications, and it is the sum of all entities that are related to each other. The key entity is Party, which can be of type "Person", "Organization", "Group" or "Relationship". A party may carry information such as "Address", "Location" and "Contact Points".

An Oracle Trading Community Architecture implementation allows you to maintain the relationships between different entities. It is a global data schema shared by the Oracle ERP and CRM applications. Because TCA is a single repository accessed by multiple applications, the accuracy and quality of its data play an important role in transactions. Inconsistent and redundant data in the TCA registry reduces performance and efficiency when party information is processed.

This document gives structural guidance on implementing the Oracle TCA Data Quality Management functionality, which prevents and eliminates duplicate parties in the TCA registry.

This document is primarily targeted at functional consultants who work with TCA as part of an Oracle E-Business Suite implementation.

Introduction

Trading Community Model

The diagram below depicts the relationships among the entities of the Trading Community. Refer to the Oracle Trading Community Architecture Administration Guide for complete information.

[Figure: Entity Relationship Diagram. Entities shown: Party (Person, Organization, Group); Party Relationship; Relationship Type; Relationship Type Group; Contact Preferences; Party Site; Party Site Use; Physical Contact Point / Location; Electronic Contact Point (Email, Phone, URL); Classification; Classification Code; Code Assignment; Sub Code; Internal Organization; Account; Party Account Role; Account Relationship; Account Site.]

The Need

Party information in the TCA registry can be duplicated for many reasons, such as incomplete data, typographical errors, spelling mistakes, or conversion from external systems. Duplicate party information may reduce the performance and efficiency of transaction processing and can lead to business-critical issues.

The Solution

To prevent and eliminate duplicate party information in the TCA registry, the Data Quality Management functionality needs to be implemented. DQM is part of Oracle Trading Community Architecture and keeps party and customer information free of duplicates. It also supports powerful searches on parties. After identifying duplicates, DQM passes them to the merge program, which actually eliminates the duplicates from the TCA registry.

Data Quality Management (DQM) Process

The Data Quality Management functionality allows duplicate data to be easily identified and passed to the merge program.

How DQM Works

The diagram below illustrates how the different features of the DQM functionality work together.

The TCA registry contains the party information, some of which may be duplicated. When you run the DQM staging program, it transforms attribute values such as Party Name and Party Number and stores them in a staged schema. Each attribute represents a table column in the TCA registry. The staged schema also stores the attributes and the transformation functions used with them. It is separate from the original registry and contains the attribute values (for example, Party Name = "XYZ Corporation").

When you then run the duplicate identification program or search for a party, the attribute values of the newly entered record are transformed using the transformation functions specified in a match rule and compared against the attribute values in the staged schema. Duplicates are identified based on the match rule, the attribute matches and the resulting score.
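The staging-then-compare flow described above can be sketched in a few lines of Python. This is only a conceptual illustration, not Oracle's implementation: the transformation names (EXACT, CLEANSE), the functions `stage` and `find_matches`, and the sample party IDs are all assumptions made for the example.

```python
import re

def exact(value):
    """Hypothetical transformation: trim and uppercase."""
    return value.strip().upper()

def cleanse(value):
    """Hypothetical transformation: uppercase, drop punctuation, collapse spaces."""
    text = re.sub(r"[^A-Z0-9 ]", "", value.upper())
    return " ".join(text.split())

# Transformation functions available to the staging step (names are made up).
TRANSFORMATIONS = {"EXACT": exact, "CLEANSE": cleanse}

def stage(registry):
    """Build a staged copy, separate from the registry:
    party_id -> {(attribute, transformation): transformed value}."""
    staged = {}
    for party_id, attributes in registry.items():
        staged[party_id] = {
            (name, t_name): func(value)
            for name, value in attributes.items()
            for t_name, func in TRANSFORMATIONS.items()
        }
    return staged

def find_matches(staged, record, attribute, transformation):
    """Transform the new record's value the same way, then compare it
    against the staged values instead of the raw registry values."""
    wanted = TRANSFORMATIONS[transformation](record[attribute])
    key = (attribute, transformation)
    return [pid for pid, values in staged.items() if values.get(key) == wanted]

# Toy registry standing in for party records.
registry = {
    1001: {"PARTY_NAME": "XYZ Corporation"},
    1002: {"PARTY_NAME": "X.Y.Z. Corporation"},
    1003: {"PARTY_NAME": "ABC Limited"},
}
staged = stage(registry)
new_record = {"PARTY_NAME": "xyz corporation "}

cleanse_matches = find_matches(staged, new_record, "PARTY_NAME", "CLEANSE")
exact_matches = find_matches(staged, new_record, "PARTY_NAME", "EXACT")
```

The point of the design is that comparisons run against pre-transformed values, so a looser transformation (CLEANSE here) catches variants that a stricter one (EXACT) misses, without touching the original records.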

DQM Setup Steps

The diagram below illustrates the setup steps to be performed when implementing the DQM functionality.

Each step in the above diagram is briefly explained below. Refer to the Oracle Trading Community Architecture Data Quality Management Guide for complete information on the setup.

Define attributes and transformation functions

Navigation: Trading Community Manager > Data Quality Management > Setup > Attributes and Transformation Functions

DQM uses attributes that are part of the TCA registry. The attribute categories are Party, Address, Contact and Contact Points. Each attribute represents a table column in the TCA registry. In this setup you define the attributes that will be used when searching for duplicates. For example, "Date of Birth" can be defined as an attribute so that duplicate parties with the same "Date of Birth" can be identified. You can define custom attributes if the seeded attributes do not satisfy the business needs when searching for duplicate parties.

Define word replacements (Optional Setup)

Navigation: Trading Community Manager > Data Quality Management > Setup > Word Replacements

To reduce inconsistency within the party information, you can define word replacements and use them in transformation functions. Word replacements primarily help to identify words that act as synonyms, so that they are treated as the same word during matching. Oracle provides seeded word replacement lists as part of DQM, which can be used in transformation functions.
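As a rough illustration of how a word replacement list can feed a transformation function, the Python sketch below maps synonymous words to one canonical form before comparison. The replacement pairs and the function name `apply_word_replacements` are hypothetical and are not taken from Oracle's seeded lists.

```python
# Hypothetical replacement list: original word -> canonical replacement.
WORD_REPLACEMENTS = {
    "CORPORATION": "CORP",
    "INCORPORATED": "INC",
    "LIMITED": "LTD",
    "STREET": "ST",
}

def apply_word_replacements(value, replacements=WORD_REPLACEMENTS):
    """Uppercase the value and swap each word that has a replacement,
    leaving all other words unchanged."""
    words = value.upper().split()
    return " ".join(replacements.get(word, word) for word in words)

long_form = apply_word_replacements("XYZ Corporation")
short_form = apply_word_replacements("xyz corp")
```

Both spellings end up as the same string, so a later comparison treats them as one party name even though the raw values differ.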

Define and compile match rules

Navigation: Trading Community Manager > Data Quality Management > Setup > Match Rules

In this setup you define a match rule, which determines whether a particular party is a duplicate or a potential duplicate. The Duplicate Identification Program uses the match rule internally to identify duplicates. A match rule works by score matching, which is divided into two parts:

1. Acquisition (provides the initial criteria to filter the records)
2. Score (assigns a score to each attribute)

While defining a match rule, you have to identify the key attributes and assign scores accordingly. A record is identified as a duplicate when its attribute score is greater than or equal to the threshold value. After a match rule is defined, it must be compiled. The "DQM Match Rule for Duplicate Identification" profile option is used to set the match rule that will be used when searching for duplicates.
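The two-part evaluation can be sketched as follows. This Python example is a simplified model, not the DQM engine: the `MatchRule` class, the attribute scores, the threshold of 80, and the assumption that a candidate is acquired when any one acquisition attribute matches are all illustrative choices.

```python
from dataclasses import dataclass

@dataclass
class MatchRule:
    acquisition: list   # attributes used to pick candidate records
    scores: dict        # attribute -> score awarded when it matches
    threshold: int      # minimum total score for a duplicate

def normalize(value):
    """Stand-in for a transformation function: uppercase, collapse spaces."""
    return " ".join(str(value).upper().split())

def attribute_matches(candidate, record, attribute):
    """True when both records carry the attribute and the normalized values agree."""
    if attribute not in candidate or attribute not in record:
        return False
    return normalize(candidate[attribute]) == normalize(record[attribute])

def find_duplicates(rule, registry, record):
    """Acquisition filters the candidates; scoring ranks the survivors."""
    results = []
    for party_id, candidate in registry.items():
        # Part 1: acquisition. Skip records that match no acquisition attribute.
        if not any(attribute_matches(candidate, record, a) for a in rule.acquisition):
            continue
        # Part 2: score. Add up the scores of all matching attributes.
        score = sum(points for attr, points in rule.scores.items()
                    if attribute_matches(candidate, record, attr))
        if score >= rule.threshold:
            results.append((party_id, score))
    return sorted(results, key=lambda item: item[1], reverse=True)

rule = MatchRule(
    acquisition=["PARTY_NAME"],
    scores={"PARTY_NAME": 60, "PARTY_TYPE": 20, "CITY": 20},
    threshold=80,
)
registry = {
    1001: {"PARTY_NAME": "XYZ Corporation", "PARTY_TYPE": "ORGANIZATION", "CITY": "Boston"},
    1002: {"PARTY_NAME": "XYZ Corporation", "PARTY_TYPE": "ORGANIZATION", "CITY": "Denver"},
    1003: {"PARTY_NAME": "ABC Limited", "PARTY_TYPE": "ORGANIZATION", "CITY": "Boston"},
}
record = {"PARTY_NAME": "xyz corporation", "PARTY_TYPE": "ORGANIZATION", "CITY": "Boston"}

duplicates = find_duplicates(rule, registry, record)
```

With these sample values, party 1003 is never scored because it fails acquisition, while 1001 and 1002 both reach the threshold, with 1001 ranked first because its city also matches.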

Run the DQM staging program to create the staged schema and interMedia indexes

Run the DQM staging program to create or update the staged schema, which consists of attribute values and transformation functions. The interMedia index on the staged schema enables fast searches for duplicates.

Run the DQM index optimisation program (schedule it to run periodically)

Run the DQM index optimisation program to improve the performance of the interMedia index.

Run the DQM synchronization program (schedule it to run periodically)

Run the DQM synchronization program to synchronize the data between the staged schema, the interMedia index and the TCA registry. Running this program frequently ensures that changes and updates to the TCA registry are reflected in the staged schema and the interMedia index.

Note: The Oracle Workflow listener can be used to synchronize the data between the staged schema and the TCA registry. Run the "Workflow Agent Listener Concurrent Program" periodically to automate this synchronization.

Define custom attributes, and then rerun the DQM staging program (Optional Setup)

Navigation: Trading Community Manager > Data Quality Management > Setup > Attributes and Transformation Functions

You can use the "Attributes and Transformation Functions" screen to define custom attributes. Custom attributes are used when the seeded attributes provided by Oracle do not meet the business needs. Follow the steps below to create a custom attribute:

1. Write a PL/SQL function with the following signature. The p_record_id parameter is the primary key of the table from which the attribute value is picked, for example PARTY_ID or PARTY_SITE_ID.

   FUNCTION <custom_attribute_proc> (
     p_record_id   IN NUMBER,
     p_entity_name IN VARCHAR2,
     p_attrib_name IN VARCHAR2)
   RETURN VARCHAR2;

2. In the Attributes and Transformation Functions window, select one of the CUSTOM ATTRIBUTE entries and enter the PL/SQL function name against it in the "Custom Procedure" field.

3. Run the "DQM Staging Program" again after the custom attributes are defined.

Define and submit duplicate identification batch

Navigation: Trading Community Manager > Data Quality Management > Duplicate Identification > Batch Definition

A duplicate identification batch is defined to identify the duplicates in the TCA registry. Define it only after the transformation functions, match rules and staged schema are in place. After submission, the batch program identifies the duplicate parties, which can be viewed from the Duplicate Identification: Batch Review screen.

Review the DQM batch duplicate identification program

Navigation: Trading Community Manager > Data Quality Management > Duplicate Identification > Batch Definition

The review batch window is used to create merge batches. This window displays the potential duplicate records identified by the DQM Duplicate Identification Program.

Create a merge batch for merging

Navigation: Trading Community Manager > Data Quality Management > Duplicate Identification > Batch Definition

After the merge batches are created, they can be submitted for merging. In TCA, the merge feature performs the actual merging of the parties. After merging, the merged party may no longer exist in the TCA registry.

Using DQM API in Custom Programs

DQM can also be used to prevent duplicate data from being entered into the TCA registry. This section is aimed at technical consultants who want to search for or identify duplicates from custom programs by calling the internal API HZ_PARTY_SEARCH. The example below shows how to search for duplicate parties by evaluating the PARTY_NAME and PARTY_TYPE attributes. It can be extended to search addresses, contacts and contact points with different attributes. Note that HZ_PARTY_SEARCH is not a public API, so its use is not recommended.

Follow the steps below to search for or prevent duplicates:

1. Identify the logical entities and attributes of the input record. For example:
   Logical Entity = PARTY
   Attribute1 = PARTY_NAME = 'XYZ CORPORATION'
   Attribute2 = PARTY_TYPE = 'ORGANIZATION'

2. Declare a party record variable:
   l_party_rec HZ_PARTY_SEARCH.party_search_rec_type;

3. Assign attribute values to the party record variable:
   l_party_rec.party_type := 'ORGANIZATION';
   l_party_rec.party_name := 'XYZ CORPORATION';

4. Call HZ_PARTY_SEARCH.find_parties, passing the party record variable as a parameter.

5. Check the value of the out parameter num_matches.

6. If the number of matches is greater than zero, retrieve the duplicate party IDs from the temporary table HZ_MATCHED_PARTIES_GT.

Conclusion

As the volume of party and customer data in the Oracle TCA registry increases, the registry becomes prone to incomplete and duplicate information. Implementing DQM helps the business run smoothly by keeping the data in the registry correct, accurate and free of duplicates.