dq 901hf2 repositorymigrationguide en

Upload: evrim-ay

Post on 07-Jul-2018


  • 8/18/2019 DQ 901HF2 RepositoryMigrationGuide En


    Informatica Data Quality (Version 9.0.1 HotFix 2)

    Repository Migration Guide


    Informatica Data Quality Repository Migration Guide

    Version 9.0.1 HotFix 2

    January 2015

    Copyright (c) 2010-2015 Informatica Corporation. All rights reserved.

    This software and documentation contain proprietary information of Informatica Corporation and are provided under a license agreement containing restrictions on use and disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica Corporation. This Software may be protected by U.S. and/or international Patents and other Patents Pending.

    Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software license agreement and as provided in DFARS 227.7202-1(a) and 227.7702-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR 12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14 (ALT III), as applicable.

    The information in this product or documentation is subject to change without notice. If you find any problems in this product or documentation, please report them to us in writing.

    Informatica, Informatica Platform, Informatica Data Services, PowerCenter, PowerCenterRT, PowerCenter Connect, PowerCenter Data Analyzer, PowerExchange, PowerMart, Metadata Manager, Informatica Data Quality, Informatica Data Explorer, Informatica B2B Data Transformation, Informatica B2B Data Exchange, Informatica On Demand, Informatica Identity Resolution, Informatica Application Information Lifecycle Management, Informatica Complex Event Processing, Ultra Messaging and Informatica Master Data Management are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners.

    Portions of this software and/or documentation are subject to copyright held by third parties, including without limitation: Copyright DataDirect Technologies. All rights reserved. Copyright © Sun Microsystems. All rights reserved. Copyright © RSA Security Inc. All Rights Reserved. Copyright © Ordinal Technology Corp. All rights reserved. Copyright © Aandacht c.v. All rights reserved. Copyright Genivia, Inc. All rights reserved. Copyright Isomorphic Software. All rights reserved. Copyright © Meta Integration Technology, Inc. All rights reserved. Copyright © Intalio. All rights reserved. Copyright © Oracle. All rights reserved. Copyright © Adobe Systems Incorporated. All rights reserved. Copyright © DataArt, Inc. All rights reserved. Copyright © ComponentSource. All rights reserved. Copyright © Microsoft Corporation. All rights reserved. Copyright © Rogue Wave Software, Inc. All rights reserved. Copyright © Teradata Corporation. All rights reserved. Copyright © Yahoo! Inc. All rights reserved. Copyright © Glyph & Cog, LLC. All rights reserved. Copyright © Thinkmap, Inc. All rights reserved. Copyright © Clearpace Software Limited. All rights reserved. Copyright © Information Builders, Inc. All rights reserved. Copyright © OSS Nokalva, Inc. All rights reserved. Copyright Edifecs, Inc. All rights reserved. Copyright Cleo Communications, Inc. All rights reserved. Copyright © International Organization for Standardization 1986. All rights reserved. Copyright © ej-technologies GmbH. All rights reserved. Copyright © Jaspersoft Corporation. All rights reserved. Copyright © International Business Machines Corporation. All rights reserved. Copyright © yWorks GmbH. All rights reserved. Copyright © Lucent Technologies. All rights reserved. Copyright (c) University of Toronto. All rights reserved. Copyright © Daniel Veillard. All rights reserved. Copyright © Unicode, Inc. Copyright IBM Corp. All rights reserved. Copyright © MicroQuill Software Publishing, Inc. All rights reserved. Copyright © PassMark Software Pty Ltd. All rights reserved. Copyright © LogiXML, Inc. All rights reserved. Copyright © 2003-2010 Lorenzi Davide, All rights reserved. Copyright © Red Hat, Inc. All rights reserved. Copyright © The Board of Trustees of the Leland Stanford Junior University. All rights reserved. Copyright © EMC Corporation. All rights reserved. Copyright © Flexera Software. All rights reserved. Copyright © Jinfonet Software. All rights reserved. Copyright © Apple Inc. All rights reserved. Copyright © Telerik Inc. All rights reserved. Copyright © BEA Systems. All rights reserved. Copyright © PDFlib GmbH. All rights reserved. Copyright © Orientation in Objects GmbH. All rights reserved. Copyright © Tanuki Software, Ltd. All rights reserved. Copyright © Ricebridge. All rights reserved. Copyright © Sencha, Inc. All rights reserved. Copyright © Scalable Systems, Inc. All rights reserved. Copyright © jQWidgets. All rights reserved.

    This product includes software developed by the Apache Software Foundation (http://www.apache.org/), and/or other software which is licensed under various versions of the Apache License (the "License"). You may obtain a copy of these Licenses at http://www.apache.org/licenses/. Unless required by applicable law or agreed to in writing, software distributed under these Licenses is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the Licenses for the specific language governing permissions and limitations under the Licenses.

    This product includes software which was developed by Mozilla (http://www.mozilla.org/), software copyright The JBoss Group, LLC, all rights reserved; software copyright © 1999-2006 by Bruno Lowagie and Paulo Soares and other software which is licensed under various versions of the GNU Lesser General Public License Agreement, which may be found at http://www.gnu.org/licenses/lgpl.html. The materials are provided free of charge by Informatica, "as-is", without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose.

    The product includes ACE(TM) and TAO(TM) software copyrighted by Douglas C. Schmidt and his research group at Washington University, University of California, Irvine, and Vanderbilt University, Copyright (©) 1993-2006, all rights reserved.

    This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit (copyright The OpenSSL Project. All Rights Reserved) and redistribution of this software is subject to terms available at http://www.openssl.org and http://www.openssl.org/source/license.html.

    This product includes Curl software which is Copyright 1996-2013, Daniel Stenberg, . All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://curl.haxx.se/docs/copyright.html. Permission to use, copy, modify, and distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.

    The product includes software copyright 2001-2005 (©) MetaStuff, Ltd. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://www.dom4j.org/license.html.

    The product includes software copyright © 2004-2007, The Dojo Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://dojotoolkit.org/license.

    This product includes ICU software which is copyright International Business Machines Corporation and others. All rights reserved. Permissions and limitations regarding this software are subject to terms available at http://source.icu-project.org/repos/icu/icu/trunk/license.html.

    This product includes software copyright © 1996-2006 Per Bothner. All rights reserved. Your right to use such materials is set forth in the license which may be found at http://www.gnu.org/software/kawa/Software-License.html.

    This product includes OSSP UUID software which is Copyright © 2002 Ralf S. Engelschall, Copyright © 2002 The OSSP Project Copyright © 2002 Cable & Wireless Deutschland. Permissions and limitations regarding this software are subject to terms available at http://www.opensource.org/licenses/mit-license.php.

    This product includes software developed by Boost (http://www.boost.org/) or under the Boost software license. Permissions and limitations regarding this software are subject to terms available at http://www.boost.org/LICENSE_1_0.txt.

    This product includes software copyright © 1997-2007 University of Cambridge. Permissions and limitations regarding this software are subject to terms available at http://www.pcre.org/license.txt.

    This product includes software copyright © 2007 The Eclipse Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://www.eclipse.org/org/documents/epl-v10.php and at http://www.eclipse.org/org/documents/edl-v10.php.

    This product includes software licensed under the terms at http://www.tcl.tk/software/tcltk/license.html, http://www.bosrup.com/web/overlib/?License, http://www.stlport.org/doc/license.html, http://asm.ow2.org/license.html, http://www.cryptix.org/LICENSE.TXT, http://hsqldb.org/web/hsqlLicense.html, http://httpunit.sourceforge.net/doc/license.html, http://jung.sourceforge.net/license.txt, http://www.gzip.org/zlib/zlib_license.html, http://www.openldap.org/software/release/license.html, http://www.libssh2.org, http://slf4j.org/license.html, http://www.sente.ch/software/OpenSourceLicense.html, http://fusesource.com/downloads/license-agreements/fuse-message-broker-v-5-3-license-agreement; http://antlr.org/license.html; http://aopalliance.sourceforge.net/; http://www.bouncycastle.org/licence.html; http://www.jgraph.com/jgraphdownload.html; http://www.jcraft.com/jsch/LICENSE.txt; http://jotm.objectweb.org/bsd_license.html; http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231; http://www.slf4j.org/license.html; http://nanoxml.sourceforge.net/orig/copyright.html; http://www.json.org/license.html; http://forge.ow2.org/projects/javaservice/, http://www.postgresql.org/about/licence.html, http://www.sqlite.org/copyright.html, http://www.tcl.tk/software/tcltk/license.html, http://www.jaxen.org/faq.html, http://www.jdom.org/docs/faq.html, http://www.slf4j.org/license.html; http://www.iodbc.org/dataspace/iodbc/wiki/iODBC/License; http://www.keplerproject.org/md5/license.html; http://www.toedter.com/en/jcalendar/license.html; http://www.edankert.com/bounce/index.html; http://www.net-snmp.org/about/license.html; http://www.openmdx.org/#FAQ; http://www.php.net/license/3_01.txt; http://srp.stanford.edu/license.txt; http://www.schneier.com/blowfish.html; http://www.jmock.org/license.html; http://xsom.java.net; http://benalman.com/about/license/; https://github.com/CreateJS/EaselJS/blob/master/src/easeljs/display/Bitmap.js; http://www.h2database.com/html/license.html#summary; http://jsoncpp.sourceforge.net/LICENSE; http://jdbc.postgresql.org/license.html; http://protobuf.googlecode.com/svn/trunk/src/google/protobuf/descriptor.proto; https://github.com/rantav/hector/blob/master/LICENSE; http://web.mit.edu/Kerberos/krb5-current/doc/mitK5license.html; http://jibx.sourceforge.net/jibx-license.html; https://github.com/lyokato/libgeohash/blob/master/LICENSE; https://github.com/hjiang/jsonxx/blob/master/LICENSE; and https://code.google.com/p/lz4/.

    This product includes software licensed under the Academic Free License (http://www.opensource.org/licenses/afl-3.0.php), the Common Development and Distribution License (http://www.opensource.org/licenses/cddl1.php), the Common Public License (http://www.opensource.org/licenses/cpl1.0.php), the Sun Binary Code License Agreement Supplemental License Terms, the BSD License (http://www.opensource.org/licenses/bsd-license.php), the new BSD License (http://opensource.org/licenses/BSD-3-Clause), the MIT License (http://www.opensource.org/licenses/mit-license.php), the Artistic License (http://www.opensource.org/licenses/artistic-license-1.0) and the Initial Developer’s Public License Version 1.0 (http://www.firebirdsql.org/en/initial-developer-s-public-license-version-1-0/).

    This product includes software copyright © 2003-2006 Joe Walnes, 2006-2007 XStream Committers. All rights reserved. Permissions and limitations regarding this software are subject to terms available at http://xstream.codehaus.org/license.html. This product includes software developed by the Indiana University Extreme! Lab. For further information please visit http://www.extreme.indiana.edu/.

    This product includes software Copyright (c) 2013 Frank Balluffi and Markus Moeller. All rights reserved. Permissions and limitations regarding this software are subject to terms of the MIT license.

    This Software is protected by U.S. Patent Numbers 5,794,246; 6,014,670; 6,016,501; 6,029,178; 6,032,158; 6,035,307; 6,044,374; 6,092,086; 6,208,990; 6,339,775; 6,640,226; 6,789,096; 6,823,373; 6,850,947; 6,895,471; 7,117,215; 7,162,643; 7,243,110; 7,254,590; 7,281,001; 7,421,458; 7,496,588; 7,523,121; 7,584,422; 7,676,516; 7,720,842; 7,721,270; 7,774,791; 8,065,266; 8,150,803; 8,166,048; 8,166,071; 8,200,622; 8,224,873; 8,271,477; 8,327,419; 8,386,435; 8,392,460; 8,453,159; 8,458,230; 8,707,336; 8,886,617 and RE44,478, International Patents and other Patents Pending.

    DISCLAIMER: Informatica Corporation provides this documentation "as is" without warranty of any kind, either express or implied, including, but not limited to, the implied warranties of noninfringement, merchantability, or use for a particular purpose. Informatica Corporation does not warrant that this software or documentation is error free. The information provided in this software or documentation may include technical inaccuracies or typographical errors. The information in this software and documentation is subject to change at any time without notice.

    NOTICES

    This Informatica product (the "Software") includes certain drivers (the "DataDirect Drivers") from DataDirect Technologies, an operating company of Progress Software Corporation ("DataDirect") which are subject to the following terms and conditions:

    1. THE DATADIRECT DRIVERS ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.

    2. IN NO EVENT WILL DATADIRECT OR ITS THIRD PARTY SUPPLIERS BE LIABLE TO THE END-USER CUSTOMER FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, CONSEQUENTIAL OR OTHER DAMAGES ARISING OUT OF THE USE OF THE ODBC DRIVERS, WHETHER OR NOT INFORMED OF THE POSSIBILITIES OF DAMAGES IN ADVANCE. THESE LIMITATIONS APPLY TO ALL CAUSES OF ACTION, INCLUDING, WITHOUT LIMITATION, BREACH OF CONTRACT, BREACH OF WARRANTY, NEGLIGENCE, STRICT LIABILITY, MISREPRESENTATION AND OTHER TORTS.

    Part Number: DQ-MIG-90100-HF2-0003


    Table of Contents

    Preface

        Informatica Resources

            Informatica My Support Portal
            Informatica Documentation
            Informatica Product Availability Matrixes
            Informatica Web Site
            Informatica How-To Library
            Informatica Knowledge Base
            Informatica Support YouTube Channel
            Informatica Marketplace
            Informatica Velocity
            Informatica Global Customer Support

    Chapter 1: Introduction to Data Quality Repository Migration

        Overview of Repository Migration
        Informatica Data Quality 8.6.2 Repository Features
        Data Quality Plan and Mapping Comparisons
        Changes to Data Quality Transformations
        Changes to Data Quality Sources and Targets
        Component Comparison Checklist
        Changes to Reference Data
        Migration and Data Profiling

    Chapter 2: Migrating Repository and Reference Data

        Overview of Migration Process
        Migration Report Files
            PackageReport Status Information
        Migration Prerequisites
            Migration.Properties File Settings
            Finding the EDR.Port Value
            Database Considerations
            Reference Table Considerations
        Exporting Data from the Data Quality 8.6.2 Repository
        Exporting Data from the 8.6.2 Workbench Machine
        Exporting Data from the 8.6.2 Server Machine
        Importing Data to Informatica Data Quality 9.0.1

    Chapter 3: Troubleshooting Migration of Data Quality Objects

        Overview
        Token Labeling and Token Parsing
        Rule-Based Analyzer Components Without Input Fields
        Text Qualifier Support for Sources and Targets
        Empty Reference Tables
        Data Quality 8.6.2 Settings Not Available in Data Quality 9.0.1
            Changes in Address Validator Functionality
            Identity Match Settings
            Partial Support for the Context Parser Merge Option
            Substring Dictionary Labeling


    Preface

    The Informatica Data Quality Repository Migration Guide is written for data quality developers. This guide assumes that you have an understanding of data quality concepts, flat file and relational database concepts, and the database engines in your environment. This guide also assumes that you are familiar with the concepts presented in the Informatica Developer User Guide.

    Informatica Resources

    Informatica My Support Portal

    As an Informatica customer, you can access the Informatica My Support Portal at http://mysupport.informatica.com.

    The site contains product information, user group information, newsletters, access to the Informatica customer support case management system (ATLAS), the Informatica How-To Library, the Informatica Knowledge Base, Informatica Product Documentation, and access to the Informatica user community.

    Informatica Documentation

    The Informatica Documentation team makes every effort to create accurate, usable documentation. If you have questions, comments, or ideas about this documentation, contact the Informatica Documentation team through email at [email protected]. We will use your feedback to improve our documentation. Let us know if we can contact you regarding your comments.

    The Documentation team updates documentation as needed. To get the latest documentation for your product, navigate to Product Documentation from http://mysupport.informatica.com.

    Informatica Product Availability Matrixes

    Product Availability Matrixes (PAMs) indicate the versions of operating systems, databases, and other types of data sources and targets that a product release supports. You can access the PAMs on the Informatica My Support Portal at https://mysupport.informatica.com/community/my-support/product-availability-matrices.

    Informatica Web Site

    You can access the Informatica corporate web site at http://www.informatica.com. The site contains information about Informatica, its background, upcoming events, and sales offices. You will also find product and partner information. The services area of the site includes important information about technical support, training and education, and implementation services.


    Informatica How-To Library

    As an Informatica customer, you can access the Informatica How-To Library at http://mysupport.informatica.com. The How-To Library is a collection of resources to help you learn more about Informatica products and features. It includes articles and interactive demonstrations that provide solutions to common problems, compare features and behaviors, and guide you through performing specific real-world tasks.

    Informatica Knowledge Base

    As an Informatica customer, you can access the Informatica Knowledge Base at http://mysupport.informatica.com. Use the Knowledge Base to search for documented solutions to known technical issues about Informatica products. You can also find answers to frequently asked questions, technical white papers, and technical tips. If you have questions, comments, or ideas about the Knowledge Base, contact the Informatica Knowledge Base team through email at [email protected].

    Informatica Support YouTube Channel

    You can access the Informatica Support YouTube channel at http://www.youtube.com/user/INFASupport. The Informatica Support YouTube channel includes videos about solutions that guide you through performing specific tasks. If you have questions, comments, or ideas about the Informatica Support YouTube channel, contact the Support YouTube team through email at [email protected] or send a tweet to @INFASupport.

    Informatica Marketplace

    The Informatica Marketplace is a forum where developers and partners can share solutions that augment, extend, or enhance data integration implementations. By leveraging any of the hundreds of solutions available on the Marketplace, you can improve your productivity and speed up time to implementation on your projects. You can access Informatica Marketplace at http://www.informaticamarketplace.com.

    Informatica Velocity

    You can access Informatica Velocity at http://mysupport.informatica.com. Developed from the real-world experience of hundreds of data management projects, Informatica Velocity represents the collective knowledge of our consultants who have worked with organizations from around the world to plan, develop, deploy, and maintain successful data management solutions. If you have questions, comments, or ideas about Informatica Velocity, contact Informatica Professional Services at [email protected].

    Informatica Global Customer Support

    You can contact a Customer Support Center by telephone or through the Online Support.

    Online Support requires a user name and password. You can request a user name and password at http://mysupport.informatica.com.

    The telephone numbers for Informatica Global Customer Support are available from the Informatica web site at http://www.informatica.com/us/services-and-training/support-services/global-support-centers/.


    CHAPTER 1

    Introduction to Data Quality Repository Migration

    This chapter includes the following topics:

    • Overview of Repository Migration

    • Informatica Data Quality 8.6.2 Repository Features

    • Data Quality Plan and Mapping Comparisons

    • Changes to Data Quality Transformations

    • Changes to Data Quality Sources and Targets

    • Component Comparison Checklist

    • Changes to Reference Data

    • Migration and Data Profiling

    Overview of Repository Migration

    Informatica provides batch files that you can use to export the contents of an Informatica Data Quality 8.6.2 repository to a 9.0.1 Model repository.

    The batch files perform the following tasks:

    • Export Data Quality 8.6.2 repository objects and reference data to the file system in XML format.

    • Convert the 8.6.2 repository objects to 9.0.1 format.

    • Import the reference data as reference tables to the 9.0.1 Model repository and staging area.

    You complete the migration in the Developer tool by importing the XML package containing the transformation, mapplet, and mapping XML to the Model repository.

    Note: If a Data Quality 8.6.2 object reads a database source, the migration process preserves the database connection information. You do not need to re-create the database connection in Data Quality 9.0.1.

    To migrate from Data Quality 8.6.2 to Data Quality 9.0.1 HotFix 1, run the migration files associated with Data Quality 9.0.1 HotFix 1.

    To migrate from Data Quality 8.6.2 to Data Quality 9.0.1, run the migration files associated with Data Quality 9.0.1.


    Informatica Data Quality 8.6.2 Repository Features

    The Informatica Data Quality 8.6.2 repository shows the following similarities and differences when compared with the 9.0.1 Model repository:

    • The Informatica Data Quality 8.6.2 repository contains two types of object: projects and plans. The 8.6.2 repository does not store transformation or data source definitions as separate objects. The 8.6.2 repository stores all metadata as XML.

    • An 8.6.2 repository project is similar to a 9.0.1 Model repository project. Both display user-defined folders in the repository structure.

    • An 8.6.2 plan equates to a mapping in the 9.0.1 Model repository. A plan contains a data source and data target connected by zero or more transformations. It runs in the same manner as a mapping.

    • The Informatica Data Quality 8.6.2 user creates and runs plans in a client application called Data Quality Workbench. The application installs with a local repository. Informatica Data Quality 8.6.2 enables remote clients to connect to an 8.6.2 repository in a client-server manner, but all Informatica Data Quality 8.6.2 repositories are identical.
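
    The relationships in the list above can be pictured as a small data model. The following is only an illustrative sketch: the names `Plan`, `Project`, and `describe_as_mapping` are hypothetical, and the classes do not reflect the actual 8.6.2 XML metadata format or any Informatica API.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical model of the 8.6.2 repository concepts described above.
# None of these names come from an Informatica API.


@dataclass
class Plan:
    """An 8.6.2 plan: a source and a target joined by zero or more transformations."""
    name: str
    source: str
    target: str
    transformations: List[str] = field(default_factory=list)


@dataclass
class Project:
    """An 8.6.2 project: a container for plans."""
    name: str
    plans: List[Plan] = field(default_factory=list)


def describe_as_mapping(plan: Plan) -> str:
    """Render the source-to-target chain that the equivalent mapping would carry."""
    return " -> ".join([plan.source, *plan.transformations, plan.target])
```

    For example, a plan with a source named "CSV Source", a single Merge transformation, and a target named "CSV Target" (illustrative values) would be described as a three-step chain, while a plan with no transformations reduces to source and target only.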

    Data Quality Plan and Mapping Comparisons

    The migration process converts all Informatica Data Quality 8.6.2 plans to 9.0.1 mappings. Each mapping has a folder in the Model repository.

    Some sources, targets, and transformations in the migrated plans convert directly to sources, targets, and transformations in the Model repository. Some sources, targets, and transformations convert to multiple objects or to mapplets in the Model repository.

    Changes to Data Quality Transformations

    Some transformations are functionally identical across the product versions, while others convert to different transformations. Some transformations do not migrate.

    The following types of transformation change can occur:

    • The 8.6.2 transformation has a direct counterpart in 9.0.1. Informatica Data Quality 9.0.1 includes transformations that are effectively copies of 8.6.2 transformations. For example, the Merge, ToUpper, and Rule-Based Analyzer transformations in Informatica Data Quality 8.6.2 become Merge, Case, and Decision transformations in Informatica Data Quality 9.0.1.

    • 9.0.1 transformations provide equivalent functionality to or have evolved from 8.6.2 transformations. For example, the 9.0.1 Comparison transformation combines the functionality of the Bigram, Jaro, Hamming Distance, and Edit Distance transformations. These 8.6.2 transformations convert seamlessly to a Comparison transformation.

    • The 8.6.2 transformation does not have a direct counterpart in 9.0.1 but the transformation functionality is maintained in other transformations. In such cases, the 8.6.2 transformation metadata transfers to other transformations. For example, the Word Manager transformation does not migrate to 9.0.1, but its metadata transfers to the Standardizer transformation, which enables the same functionality.

    • The 8.6.2 transformation is not supported in 9.0.1 and the transformation functionality does not transfer to

    other transformations. In such cases, the 8.6.2 transformation input and output metadata is applied to

    another transformation, for example an Expression transformation.

    Changes to Data Quality Sources and Targets

    Some Informatica Data Quality 8.6.2 sources and targets are fully compatible with 9.0.1 data source and

    target definitions. For example, a CSV Source from Informatica Data Quality 8.6.2 migrates to a file-based

    data source object in 9.0.1. These sources and targets convert seamlessly to 9.0.1 source and target

    definitions.

    Some 8.6.2 sources and targets incorporate transformation functionality and do not have a one-to-one

    correspondence with source and target definitions in 9.0.1. They convert to 9.0.1 sources and targets and

    also generate 9.0.1 transformations that perform the operations configured in 8.6.2.

    The following types of source and target do not correspond one-to-one with source and target definitions in

    9.0.1:

    • Sources and targets used in grouping data records before duplicate analysis.

    • Sources and targets used in field matching procedures.

    • Sources and targets used in identity matching procedures.

    Component Comparison Checklist

    The following table lists the source, target, and transformation components available in Data Quality

    Workbench and describes how they convert to objects in the 9.0.1 Model repository:

    8.6.2 Component                     9.0.1 Component
    Aggregation                         Aggregator transformation
    Association [for PowerCenter]       Association transformation
    Bigram                              Comparison transformation
    Character Labeler                   Labeler transformations
    Consolidation [for PowerCenter]     Consolidation transformation
    Context Parser                      Labeler and Parser transformations. The Parser transformation is set to
                                        pattern-based parsing mode.
    Count                               Mapplet containing Aggregator, Union, Expression, Joiner, Sorter, and
                                        Filter transformations
    CSV Dual Match Source               Two file-based data sources and a Match transformation
    CSV Identity Group Source           File-based data source and Match transformation. May convert to a mapplet.
    CSV Match Target                    File-based data target
    CSV Match Source                    File-based data target and Match transformation
    CSV Merge Target                    File-based data target
    CSV Target                          File-based data target
    CSV Source                          File-based data source
    DB Identity Group Source            Relational data source and Match transformation. May convert to a mapplet.
    DB Match Source                     Relational data source and Match transformation
    DB Report Target                    File-based data target
    DB Target                           SQL transformation and relational data target
    DB Source                           Relational data source
    Dual Group Source                   Multiple file-based data sources and Union transformation if required.
                                        May convert to a mapplet.
    Edit Distance                       Comparison transformation
    Fixed Width Target                  File-based data target
    Fixed Width Source                  File-based data source
    Global AV [Address Doctor engine]   Address Validator transformation. This transformation needs additional
                                        configuration following import to the 9.0.1 Model repository.
    Global AV [Melissa Data engine]     Address Validator transformation. This transformation needs additional
                                        configuration following import to the 9.0.1 Model repository.
    Global AV [QAS engine]              Address Validator transformation. This transformation needs additional
                                        configuration following import to the 9.0.1 Model repository.
    Global AV [SDK]                     Not supported
    Group Target                        Flat-file data target and Sorter and Expression transformations
    Group Source                        Multiple file-based data sources and Union transformation if required.
                                        May convert to a mapplet.
    Hamming Distance                    Comparison transformation
    Identity Group Target               Mapplet output transformation
    Identity Match                      Match transformation
    Jaro Distance                       Comparison transformation
    Match Key Target                    Relational data target
    Merge                               Merge transformation
    MinAvgMax                           Mapplet containing Aggregator, Union, Expression, Joiner, and Router
                                        transformations
    Missing Values                      Mapplet containing Aggregator, Expression, Joiner transformations
    Mixed Field Matcher                 Not supported
    Normalization [SDK]                 Not supported
    NYSIIS                              Key Generator transformation
    Parsing [SDK]                       Not supported
    Profile Standardizer                Parser transformation set to pattern-based parsing mode
    Range Counter                       Linear Range: Aggregator, Expression, Joiner, and Sorter transformations
                                        Variable Range: Aggregator, Expression, Union transformations
    Realtime Target                     Mapplet containing data target
    Realtime Source                     Mapplet containing data source
    Report Target                       Flat-file data target
    Rule Based Analyzer                 Decision transformation
    SAP Target                          Not supported
    SAP Source                          Not supported
    Scripting                           Not supported
    Search Replace                      Standardizer transformation
    Similarity [SDK]                    Not supported
    Soundex                             Key Generator transformation
    Splitter                            Labeler, Parser, and Expression transformations. The Parser is set to
                                        pattern-based parsing mode.
    Sum                                 Mapplet containing Aggregator, Expression, Joiner, Sorter, and Union
                                        transformations
    To Upper                            Case Converter transformation
    Token Labeler                       Labeler transformation
    Token Parser                        Parser transformation
    Weight Based Analyzer               Weighted Average transformation
    Word Manager                        Standardizer transformation
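
    When you audit a large set of plans before migration, it can help to hold the checklist in a lookup structure. The following Python sketch is an illustration only and is not part of the Informatica migration tooling. It copies a few rows from the table above into a dictionary and sorts a list of component names into converted, unsupported, and unrecognized groups. The names COMPONENT_MAP and review_plan are hypothetical, and the sample plan contents are invented for the example.

```python
# Partial copy of the component checklist above. Extend it with the rows you need.
# A value of None marks a component that the checklist lists as "Not supported".
COMPONENT_MAP = {
    "Bigram": "Comparison transformation",
    "Edit Distance": "Comparison transformation",
    "Hamming Distance": "Comparison transformation",
    "Jaro Distance": "Comparison transformation",
    "Merge": "Merge transformation",
    "Rule Based Analyzer": "Decision transformation",
    "Search Replace": "Standardizer transformation",
    "Word Manager": "Standardizer transformation",
    "To Upper": "Case Converter transformation",
    "Scripting": None,
    "Mixed Field Matcher": None,
    "SAP Source": None,
    "SAP Target": None,
}


def review_plan(components):
    """Split component names into converted, unsupported, and unknown groups."""
    converted, unsupported, unknown = {}, [], []
    for name in components:
        if name not in COMPONENT_MAP:
            unknown.append(name)          # not in this partial copy of the table
        elif COMPONENT_MAP[name] is None:
            unsupported.append(name)      # "Not supported" in 9.0.1
        else:
            converted[name] = COMPONENT_MAP[name]
    return converted, unsupported, unknown


if __name__ == "__main__":
    # Hypothetical plan contents, for illustration only.
    converted, unsupported, unknown = review_plan(["Merge", "Scripting", "Bigram"])
    print("Converted:", converted)
    print("Unsupported:", unsupported)
    print("Unknown:", unknown)
```

    Components that land in the unsupported group are the ones to redesign by hand after import.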

    Changes to Reference Data

    Informatica Data Quality 8.6.2 can read reference data from dictionary files and database tables. The

    migration process can export this data for use in Data Quality 9.0.1.

    The migration process exports the following types of reference data:

    • Reference data that you created in file or database form. If you created database dictionaries in Data

    Quality 8.6.2, the export process converts these to file. The import process reads reference data files into

    the 9.0.1 Model repository and staging area.

    • Informatica dictionary files that the process does not recognize as part of the Data Quality 9.0.1 Content

    Installer file set. The process exports Country Pack and Region Pack files.

    The migration process does not export the following types of reference data:

    •  Address reference data

    • Identity population data

    • Informatica reference data shipped by default with the Data Quality 9.0.1 Content Installer 

    Note: Each version of Informatica 9.0.1 performs reference data migration in a different way. You must run

    migration files that are compatible with your version of Informatica 9.0.1.

    The following table describes the differences between each release:

    Informatica Release           Dictionary File Treatment             Cross-Version Compatibility
    Data Quality 9.0.1            Does not copy Informatica             Not compatible with other versions
                                  reference dictionary files.           of Data Quality 9.0.1.
    Data Quality 9.0.1 HotFix 1   Does not copy Informatica             Compatible with Data Quality 9.0.1
                                  reference dictionary files.           HotFix 2.
    Data Quality 9.0.1 HotFix 2   Does not copy a dictionary file if    Compatible with Data Quality 9.0.1
                                  the Content Installer contains an     HotFix 1.
                                  updated version of the file.
                                  Copies all other dictionary files.

    Run the Data Quality Content Installer to install all Informatica reference data.

    The migration process can recognize that a plan reads reference data when it exports the plan from the 8.6.2

    repository. In such cases, the migration process retains the link between the plan and the reference data in

    the exported XML. When you import the project and the plan metadata to the 9.0.1 Model repository, the

    reference data is copied into reference tables in the Model repository and staging area. The mapping created

    for the plan reads the reference data from these tables. You do not need to reconnect the mapping to the

    reference tables.

    The migration process recognizes Informatica reference data even if the reference data file name has

    changed between versions 8.6.2 and 9.0.1. If an 8.6.2 plan reads a reference data file that is represented by

    a reference table in 9.0.1, the migration process updates the imported mapping to read the new reference

    table.

    The migration process requires that Data Quality 8.6.2 dictionaries use UTF-8 encoding. If your Data Quality

    8.6.2 dictionaries use encodings other than UTF-8, convert the dictionaries to UTF-8 before migration.
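
    The UTF-8 requirement can be checked before you run the export. The following Python sketch is one possible approach and is not part of the migration package. It reports files whose bytes do not decode as UTF-8 and rewrites a file as UTF-8 from an encoding that you supply. The function names, the file pattern, and the choice of source encoding are all assumptions: you must know the real encoding of each dictionary, because it cannot be detected reliably from the bytes alone.

```python
from pathlib import Path


def is_utf8(path):
    """Return True if the file's bytes decode cleanly as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def find_non_utf8(directory, pattern="*"):
    """List files under directory (searched recursively) that are not valid UTF-8."""
    return sorted(
        p for p in Path(directory).rglob(pattern)
        if p.is_file() and not is_utf8(p)
    )


def convert_to_utf8(path, source_encoding):
    """Rewrite the file in place as UTF-8, decoding it with source_encoding.

    source_encoding is supplied by the caller; this sketch does not guess it.
    """
    p = Path(path)
    text = p.read_bytes().decode(source_encoding)
    p.write_bytes(text.encode("utf-8"))
```

    Work on a copy of the Dictionaries directory first. A file that contains only ASCII characters already passes the check, and a file in another encoding can occasionally pass by coincidence, so spot-check the converted data.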

    Migration and Data Profiling

    Informatica Data Quality 8.6.2 performs profiling differently from Informatica Data Quality 9.0.1.

    Informatica Data Quality 8.6.2 users create plans to profile data sources and write the results to data targets.

    The migration process preserves the logic of these plans, so that the files or database tables written by the

    mapping in 9.0.1 contain data that corresponds to profile results.

    CHAPTER 2

    Migrating Repository and Reference Data

    This chapter includes the following topics:

    • Overview of Migration Process

    • Migration Report Files

    • Migration Prerequisites

    • Exporting Data from the Data Quality 8.6.2 Repository

    • Exporting Data from the 8.6.2 Workbench Machine

    • Exporting Data from the 8.6.2 Server Machine

    • Importing Data to Informatica Data Quality 9.0.1

    Overview of Migration Process

    To migrate repository and reference data, you must run batch files provided by Informatica. Informatica

    provides the files in the IDQMigration.zip file.

    IDQMigration.zip contains the following files:

    • ClientPackage. Exports the 8.6.2 repository contents and copies reference dictionary data to the file

    system. The batch process compresses the files and saves them in a format that the ServerImport batch

    file can read.

    You can append parameters to the ClientPackage batch file to read plan metadata from the file system

    and not from the 8.6.2 repository. You must use these parameters when migrating metadata from a Data

    Quality Server repository.

    • ServerImport. Extracts and writes reference metadata to the 9.0.1 Model repository. Extracts and writes

    reference data to the 9.0.1 staging database. The file also saves plan metadata in a format that the

    9.0.1 Model repository can read. It does not write the plan metadata to the Model repository.

    Note: You must manually import the plan metadata to the 9.0.1 Model repository.

    Migration Report Files

    The migration process creates HTML report files when you run the ClientPackage and ServerImport batch

    files.

    The ClientPackage report files describe the status of the objects you export from Data Quality 8.6.2. The

    ServerImport report files describe the status of the objects you import to Data Quality 9.0.1.

    Review the reports as you run each batch file to verify the success of each stage in the process. Take note of

    any objects that do not export or import as expected or that require user tuning for use in Data Quality 9.0.1.

    ClientPackage Report Files

    ClientPackage.bat creates a single report file named PackageReport.html. It writes this file to the Package

    directory.

    You find the Package directory in the same directory as ClientPackage.bat.

    ServerImport Report Files

    ServerImport.bat creates a report file for each plan imported to the 9.0.1 Model repository. It also creates a

    summary file named ServerMigrationReport.html.

    ServerImport.bat writes the report files to the migration_reports directory. You find this directory in the same

    directory as ServerImport.bat.

    Note: ClientPackage and ServerImport also create log files in their respective directories. These files provide

    additional information on the success of export and import operations.

    PackageReport Status Information

    The ClientPackage process generates a report named PackageReport.html. Review this report file before you

    run the server import process. If you find issues in PackageReport.html, you can address them before you

    import items to Data Quality 9.0.1.

    Pay attention to the following items:

    Warnings and Errors

    If the report includes warnings or errors, you must manually edit the items affected. You can edit them

    following import to Data Quality 9.0.1, or you can edit the plans or files in your 8.6.2 environment before

    proceeding. If you make any changes in Data Quality 8.6.2, rerun the ClientPackage batch file.

    Unused Dictionaries

     An unused dictionary is not used by any plan packaged for migration. If you have many unused

    dictionaries, you may want to exclude them from the 9.0.1 import. Use the RTM.ImportSet property in the

    migration.properties file to control the import of dictionary files.

    Missing Dictionaries

     A missing dictionary is one that a plan is configured to read but that is absent from the expected location

    on the Data Quality 8.6.2 machine. If you continue with the migration, you must edit any mapping

    configured to read such a dictionary.

    If you change your dictionary location or replace missing dictionaries in Data Quality 8.6.2, rerun the

    ClientPackage batch file.

    Migration Prerequisites

    You must verify that the client batch file can access all Informatica Data Quality 8.6.2 objects and data. You

    must also understand the changes that migrated objects can undergo during the migration process.

    Before you begin the migration process, answer the following questions:

    Do the plans read reference data provided by Informatica?

    Informatica Data Quality 8.6.2 uses dictionary files as reference data. If you migrate plans that read

    dictionary files, you must verify that the dictionaries are accessible on the Data Quality 8.6.2 machine.

    The migration process reads the location of the dictionary files from the Data Quality config.xml file.

    Default location for configuration file: [install_dir]\config.xml

    Example: C:\Program Files\Informatica Data Quality\config.xml

    Default location for dictionaries: [install_dir]\Dictionaries

    Example: C:\Program Files\Informatica Data Quality\Dictionaries

    Note: The migration process ignores most Informatica dictionary files when it exports items from Data

    Quality 8.6.2. Use the Data Quality Content Installer to add Informatica reference data to Informatica

    Data Quality 9.0.1. Ensure that you include Country Pack dictionaries and premium address reference

    data files read by the 8.6.2 plans when you run the Content Installer.

    Run the Server and Client Content Installers before you perform any migration tasks on an Informatica

    Data Quality 9.0.1 machine.

    Do the plans read from or write to database tables?

    If the plans read from or write to a database, take note of the database connection details. Verify that the

    9.0.1 Data Integration Service can access the database host machines.

    If the plans read from or write to files, copy these files to a location accessible to the 9.0.1 Data

    Integration Service. You can set the location of source and target files in the migration.properties file.

    Is Informatica Data Quality 9.0.1 installed, and are the required services running?

    The following 9.0.1 services must be running before you import migrated files:

    • Model Repository Service

    • Data Integration Service

    •  Analyst Service

    Have you created a project in the Model repository for the data you want to import?

    Create this project before you import migrated files. Create a folder in the project to store the reference

    tables created from the 8.6.2 dictionary files.

    Have you reviewed the migration.properties file?

    Before you run the ServerImport process on the Data Quality 9.0.1 system, you must review the

    migration.properties file and verify that the property settings are correct for your environment and the

    migration objects.

    Migration.Properties File Settings

    Before you migrate repository objects or files from Data Quality 8.6.2, review the settings in the

    migration.properties file. Verify that the settings are correct for your Data Quality environment and the items

    that you migrate. You find this file in the Config directory of the IDQMigration.zip package.

    The ServerImport process reads all properties in this file. The ServerImport and ClientPackage processes

    read the Migration.Formatter, Migration.LogLevel, Report.Format, and Report.Generate properties.

    The following table describes the most frequently used options in migration.properties:

    Property                    Description
    DSO.DefaultSourceFolder     The path to the folder that you want to contain flat file data sources in
                                Data Quality 9.0.1. Set this property if you want all flat file data objects
                                to read data from a single location. This location must be accessible in the
                                Data Quality 9.0.1 server environment.
    DSO.DefaultTargetFolder     The path to the folder that you want to contain flat file data targets in
                                Data Quality 9.0.1. The ServerImport process reads this property. Set this
                                property if you want all flat file data objects to write data to a single
                                location. This location must be accessible in the Data Quality 9.0.1 server
                                environment.
    EDR.Host                    The Data Integration Service host machine.
    EDR.Port                    The port number that the ServerImport process uses to communicate with
                                Informatica 9.0.1 services. This port number must match the Service Manager
                                port used during the Data Quality 9.0.1 installation process. Default is 6006.
    Locale.Client               The locale used by Informatica Developer. An incorrect locale may result in
                                incorrect settings on metadata items such as match thresholds. Default is en.
    Migration.Formatter         The quantity of additional information that ClientPackage and ServerImport
                                can write to log messages. Enter Default to enable ClientPackage and
                                ServerImport to add a single line of information to log messages, in addition
                                to the information defined by Migration.LogLevel. Enter Custom to enable
                                ClientPackage and ServerImport to add multiple lines. Default is Custom.
    Migration.LogLevel          The level of logging performed during the ClientPackage and ServerImport
                                processes. Default is Normal. If you see issues during the ServerImport
                                process, set this property to Debug and rerun the process.
    Report.Format               The file format of report files. Set the property to HTML or XML. Default is
                                HTML.
    Report.Generate             Determines if the ClientPackage and ServerImport processes generate report
                                files. Ensure that this property is set to Yes. You must review the report
                                files to verify the results of each process. Default is Yes.
    RTM.AtService               The Analyst Service name in Data Quality 9.0.1.
    RTM.ContentProject          The Model repository project that contains reference data installed by the
                                Content Installer. If the plans you export from Data Quality 8.6.2 read
                                Informatica dictionaries, the migration process can link the imported
                                transformations to the Informatica 9.0.1 reference data.
                                Set RTM.MapRTM to Yes to enable imported objects to read Informatica
                                reference data.
    RTM.ContentRootDirectory    The folder within RTM.ContentProject that contains reference data read by
                                the imported transformations.
    RTM.Host                    The Model repository host machine name in Data Quality 9.0.1.
    RTM.ImportSet               Determines the dictionary files that are written as reference tables during
                                the import process. If the ClientPackage process identifies a large quantity
                                of unused dictionaries, set this to UsedOnly. Default is All.
    RTM.MapRTM                  Determines if imported objects read Informatica reference data installed to
                                Data Quality 9.0.1 in place of the dictionary files configured in Data
                                Quality 8.6.2. The recommended setting is Yes. If set to No, ServerImport
                                creates empty reference tables, and imported transformations reference the
                                empty tables. Default is No.
    RTM.Repository              The Model repository name in Data Quality 9.0.1.
    RTM.UserProject             The Model repository project that reference data and mappings import to.
                                Create the project before you run ServerImport.
    RTM.UserRootDirectory       The folder within RTM.UserProject to contain the reference data read by the
                                imported objects. Create the folder before you run ServerImport.
    Server.Password             Password for Data Quality 9.0.1. You must have read and write permissions on
                                the project folders that you import to.
    Server.UserName             User name for Data Quality 9.0.1. You must have read and write permissions
                                on the project folders that you import to.
    Stage.Oracle                The staging database type. If you have configured a staging database and
    Stage.SqlServer             schema of a particular type, update the property with the name of the
    Stage.ODBC                  connection that uses the database and schema. Default for each property is
    Stage.MySQL                 blank.
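
    A quick check of migration.properties before you run ServerImport can catch typing mistakes early. The following Python sketch is an illustration, not an Informatica utility. It assumes that the file uses plain key=value lines with # or ! comments, which this guide does not confirm, and that property values are case-sensitive. The list of keys that must be set is taken from the import procedure in this guide, and the allowed values come from the table above. The function and constant names are hypothetical.

```python
# Keys that the import procedure asks you to fill in (an assumption of this sketch).
EXPECTED_KEYS = [
    "RTM.Host", "RTM.Repository", "RTM.AtService",
    "RTM.UserProject", "RTM.UserRootDirectory",
]

# Values named in the property descriptions above.
ALLOWED_VALUES = {
    "RTM.ImportSet": {"All", "UsedOnly", "None"},
    "RTM.MapRTM": {"Yes", "No"},
    "Report.Format": {"HTML", "XML"},
    "Report.Generate": {"Yes", "No"},
    "Migration.Formatter": {"Default", "Custom"},
}


def parse_properties(text):
    """Parse key=value lines, skipping blank lines and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_properties(props):
    """Return a list of human-readable problems; an empty list means no findings."""
    problems = []
    for key in EXPECTED_KEYS:
        if not props.get(key):
            problems.append(f"{key} is not set")
    for key, allowed in ALLOWED_VALUES.items():
        value = props.get(key)
        if value and value not in allowed:
            problems.append(f"{key}={value} is not one of {sorted(allowed)}")
    port = props.get("EDR.Port")
    if port and not port.isdigit():
        problems.append("EDR.Port must be a number")
    # When RTM.MapRTM is Yes, the content project and folder should be named too.
    if props.get("RTM.MapRTM") == "Yes":
        for key in ("RTM.ContentProject", "RTM.ContentRootDirectory"):
            if not props.get(key):
                problems.append(f"{key} is not set but RTM.MapRTM is Yes")
    return problems
```

    To use the sketch, read the file text, pass it to parse_properties, and print each entry that check_properties returns. An empty result only means that this sketch found nothing; it does not replace a review of the file against your environment.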

    Finding the EDR.Port Value

    If you know the Service Manager port number that was defined during the Data Quality 9.0.1 installation

    process, use it as the EDR.Port value in migration.properties. The default Service Manager port number is

    6006.

    If you do not know the Service Manager port number, you can find it in nodemeta.xml in the Data Quality

    9.0.1 installation.

    1. Find nodemeta.xml.

    The default location of this file is

    /isp/config

    2. Copy the file to another location.

    3. Open the file in a text editor.

    4. Search the file for a string in this format:

    UsedOnly

    Imports reference data from the migration package if the data is used by an imported mapping object.

    You may want to prevent the import of unused reference data if your Data Quality 8.6.2 installation

    contains many unused dictionaries.

    None

    Does not import any dictionaries.
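
    The effect of the RTM.ImportSet options can be pictured as a simple set operation. The following Python sketch only models the behavior described in this guide and is not how ServerImport is implemented. It assumes that All imports every dictionary in the migration package, which is inferred from the option name and its role as the default. The function name and the dictionary file names are hypothetical.

```python
def dictionaries_to_import(packaged, used_by_plans, import_set="All"):
    """Model which packaged dictionaries become reference tables.

    packaged:      dictionary names found in the migration package
    used_by_plans: dictionary names that at least one packaged plan reads
    import_set:    value of RTM.ImportSet (All, UsedOnly, or None)
    """
    if import_set == "All":
        # Assumption: All means every packaged dictionary is imported.
        return sorted(packaged)
    if import_set == "UsedOnly":
        # Only dictionaries that an imported mapping object uses.
        return sorted(set(packaged) & set(used_by_plans))
    if import_set == "None":
        return []
    raise ValueError(f"Unexpected RTM.ImportSet value: {import_set}")


if __name__ == "__main__":
    packaged = ["titles.dic", "cities.dic", "legacy_codes.dic"]  # invented names
    used = ["titles.dic", "cities.dic"]
    print(dictionaries_to_import(packaged, used, "UsedOnly"))
```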

    Reference Table Creation

    Use the RTM.MapRTM property to determine if the mappings you import to Data Quality 9.0.1 read reference

    tables in the Model repository.

    Set the property to one of the following options:

     Yes

    If ServerImport finds a dictionary file link during the conversion of a plan to a mapping, it searches for a

    corresponding reference table in the Model repository and links the new mapping to that reference table.

    No

    If ServerImport finds a dictionary file link during the conversion of a plan to a mapping, it creates a

    dummy reference table for the dictionary and adds a warning to the report files for the affected plan.

    Informatica Reference Tables

    The Data Quality Content Installer provides a default set of reference tables for Data Quality 9.0.1. These

    reference tables include updated versions of the standard dictionary files issued for Data Quality 8.6.2.

    If your plans use Informatica dictionaries in Data Quality 8.6.2, install the Informatica reference tables to Data

    Quality 9.0.1 before proceeding with the migration. Select the Informatica reference data file when you run

    the Content Installer. Update the RTM.ContentProject and RTM.ContentRootDirectory properties in the

    migration.properties file with the project and directory locations of the reference data.

    Note: If you download and install an accelerator pack for Data Quality 9.0.1, you must also install the default

    Content Installer reference data when you install the accelerator reference data. Complete this task even if

    you have previously installed the default Content Installer data. This step maintains the links between the

    Informatica reference tables in the Model repository and the mappings that read them.

    Exporting Data from the Data Quality 8.6.2 Repository

    Run ClientPackage.bat to create a compressed file that contains repository metadata and reference data.

    The export procedures differ for Data Quality Workbench and Data Quality Server installations.

    Workbench installations

    Run ClientPackage.bat on the Workbench machine to export Workbench repository contents and

    reference data to a compressed migration file.

    Server installations

    Use Workbench to export plan metadata to the file system on a Server repository machine. Run

    ClientPackage.bat on the Server repository machine to create a compressed migration file that contains

    the plan metadata and the Server reference data.

    Exporting Data from the 8.6.2 Workbench Machine

    Run ClientPackage.bat on the Workbench machine to export the local repository contents and reference

    data.

    ClientPackage.bat creates a compressed file that contains these items. The default name for the file is

    MigrationPackage.zip.

    1. Copy the IDQMigration.zip file to the Workbench host machine.

    2. Extract the IDQMigration.zip file.

    3. Run ClientPackage.bat. You can apply the following optional parameters:

    Option    Description
    -d        Path to the directory where the batch file creates MigrationPackage.zip.
    -f        Path to a folder that contains plans already exported from the Data Quality repository.
              Use this parameter if you have used Workbench to export repository contents to file.
              Do not use with the -r parameter.
    -o        Alternative name for MigrationPackage.zip.
    -r        Server repository export only. Specifies that ClientPackage.bat will run on a remote
              Data Quality repository and extract plan and reference data to the Workbench file system.
    -s        Staging directory for temporary files.

    The batch process creates the compressed migration file that contains the exported repository and

    reference data files.

    4. Review the PackageReport.html file created by the export process.

    This report lists the plans, reference files, and database connection information copied to the migration

    file during the export process.
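
    If you script the export, the option rules in step 3 can be enforced before the batch file runs. The following Python sketch builds an argument list for ClientPackage.bat and rejects the combination of -f and -r. It is an illustration under assumptions: this guide does not show the exact command-line syntax, so the sketch assumes that each option is followed by its value as a separate argument and that -r takes no value. The function name and the example paths are hypothetical.

```python
def client_package_args(batch_file="ClientPackage.bat", output_dir=None,
                        plans_dir=None, package_name=None,
                        remote_repository=False, staging_dir=None):
    """Build an argument list for the export batch file.

    Assumed syntax: "-d value", "-f value", "-o value", "-s value", and a bare "-r".
    """
    if plans_dir is not None and remote_repository:
        # The option table states that -f must not be used with -r.
        raise ValueError("Do not combine -f with -r")
    args = [batch_file]
    if output_dir:
        args += ["-d", output_dir]
    if plans_dir:
        args += ["-f", plans_dir]
    if package_name:
        args += ["-o", package_name]
    if remote_repository:
        args.append("-r")
    if staging_dir:
        args += ["-s", staging_dir]
    return args


if __name__ == "__main__":
    # Example paths are invented for illustration.
    print(client_package_args(output_dir=r"C:\migration", package_name="MyPackage.zip"))
```

    You could pass the resulting list to subprocess.run on the Workbench machine, but confirm the real option syntax against the batch file before relying on it.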

    Exporting Data from the 8.6.2 Server Machine

    Use Data Quality Workbench to export the Server repository contents to the Server file system. Run

    ClientPackage.bat on the Server machine to read the exported repository data and copy reference data from

    the Server machine.

    ClientPackage.bat creates a compressed file that contains these items. The default name for the file is

    MigrationPackage.zip.

    1. Use Data Quality Workbench to export plans from the Data Quality Server repository.

    Create the exported XML files on the Server repository machine.

    2. Copy the IDQMigration.zip file to the Server repository host machine.

    3. Extract the IDQMigration.zip file.

    4. Run ClientPackage.bat. You can apply the following parameters:

    Option    Description
    -d        Optional. Path to the directory where the batch file creates MigrationPackage.zip.
    -f        Path to the directory that contains plans exported from the Data Quality Server repository.
              Do not use with the -r parameter.
    -o        Optional. Alternative name for MigrationPackage.zip.
    -r        Optional. Specifies that ClientPackage.bat will run on a remote repository and extract plan
              metadata to the local file system.
              You can run ClientPackage.bat -r in place of step 1 if you do not have access permissions
              on the Server repository machine.
    -s        Optional. Staging directory for temporary files.

    The batch process creates a MigrationPackage.zip file that contains the exported repository and data files.

    Importing Data to Informatica Data Quality 9.0.1

    You import data in two steps. First, you run the ServerImport batch process to create one or more XML files

    that are compatible with Data Quality 9.0.1. Then, you import the XML files to the 9.0.1 Model repository and

    staging area.

    1. Copy the compressed file that contains the exported repository objects and reference data to the Model

    repository host machine.

    The default name of the compressed repository objects file is MigrationPackage.zip.

    2. Extract the contents of the file.

    3. Open the migration.properties file from the extracted files. Update migration.properties with the following

    information:

    • Informatica 9.0.1 Model repository host machine name.

    • Model repository name.

    •  Analyst Service name.

    • Name of the project and folder to contain the user-defined reference data.

    • Name of the project and folder that contains the Informatica reference data.

    • The locale setting on the Data Quality Workbench that last edited the plans. If required, use the

    Locale.Client property to set the locale.

    4. Set the RTM.MapRTM property to Yes.

    5. Save and close migration.properties.

    6. Run ServerImport.bat or ServerImport.sh. Use the following parameters:

    Option    Description
    -f        Required. Path to the folder that contains the compressed repository objects file.
    -d        Optional. Specify an alternative Output folder for the mapping XML file.
    -o        Optional. Specify a new name for the exported objects XML file.
    -p        Optional. Specify an alternative properties file to migration.properties.
    -s        Optional. Specify an alternative temporary working folder.

    The ServerImport batch process creates the XML that you import to the 9.0.1 Model repository and

    staging area. The process creates the XML in a subfolder named Output in the folder that contains the

    ServerImport batch file.

    7. Review the ServerMigrationReport.html file and any other report files created by the ServerImport

    process. Address any issues that arose during the process.

    8. Copy the XML file or files to the Developer tool machine.

    9. Open the Developer tool and import the mapping XML to the Model repository.

    If the number of 8.6.2 plans is too large for a single XML file, ServerImport creates multiple files. In this

    case, you must import the files in numerical order by file name, starting with the lowest-numbered file.

    When the plans are imported, they appear as mappings in a folder in the Model repository. Any 8.6.2

    transformations that convert to mapplets are also saved to a separate folder.
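
    When ServerImport writes several XML files, an alphabetical sort can place file 10 before file 2. The

    following Python sketch orders files by the last number in each file name. The file names shown are

    hypothetical; the guide does not state the naming pattern that ServerImport uses.

```python
import re
from pathlib import Path


def numeric_key(path):
    """Sort key: last run of digits in the file stem, then the file name.

    Files without any digits sort first (key -1). This rule is an assumption
    for illustration, not documented ServerImport behavior.
    """
    numbers = re.findall(r"\d+", path.stem)
    return (int(numbers[-1]) if numbers else -1, path.name)


def import_order(paths):
    """Return the paths in the order in which to import them."""
    return sorted((Path(p) for p in paths), key=numeric_key)


# Hypothetical file names, used only to show the ordering.
files = ["mappings_10.xml", "mappings_2.xml", "mappings_1.xml"]
for xml_file in import_order(files):
    print(xml_file.name)
```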

    Note: The migration process creates connection objects for the databases that are used by the migrated

    plans. Update the JDBC string information in the database connection objects.
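
    Steps 3 through 6 above can be scripted. The following Python sketch sets a property in the text of

    migration.properties and assembles a ServerImport command line. It assumes that the properties file uses

    key=value lines and that each option is followed by its value as a separate argument; the guide confirms

    neither detail. The function names and example paths are hypothetical.

```python
def set_property(text, key, value):
    """Return properties text with key set to value.

    Assumes simple key=value lines (an assumption about the file format).
    Comment lines that start with # or ! are left untouched. If the key
    is absent, it is appended at the end.
    """
    lines = text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith(("#", "!")) or "=" not in stripped:
            continue
        name = stripped.split("=", 1)[0].strip()
        if name == key:
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


def build_server_import_command(script, package_folder, output_folder=None,
                                xml_name=None, properties_file=None,
                                temp_folder=None):
    """Build an argument list for ServerImport.bat or ServerImport.sh.

    Option letters come from the table in step 6. Passing each value as
    the argument after its option is an assumption.
    """
    command = [script, "-f", package_folder]
    optional = (("-d", output_folder), ("-o", xml_name),
                ("-p", properties_file), ("-s", temp_folder))
    for flag, value in optional:
        if value is not None:
            command += [flag, value]
    return command


# Step 4: set RTM.MapRTM to Yes in the properties text.
print(set_property("RTM.MapRTM=No\n", "RTM.MapRTM", "Yes"), end="")

# Step 6: hypothetical package folder; only the required option is passed.
print(build_server_import_command("ServerImport.sh", "/tmp/migration"))
```

    The command is returned as a list so that it can be passed to subprocess.run without shell quoting.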

    CHAPTER 3

    Troubleshooting Migration of Data Quality Objects

    This chapter includes the following topics:

    • Overview

    • Token Labeling and Token Parsing

    • Rule-Based Analyzer Components Without Input Fields

    • Text Qualifier Support for Sources and Targets

    • Empty Reference Tables

    • Data Quality 8.6.2 Settings Not Available in Data Quality 9.0.1

    Overview

    Because some transformations function differently in Data Quality 9.0.1 than in Data Quality 8.6.2, you may observe differences in mapping configuration and data output following migration. For example, the 9.0.1

    Labeler transformation outputs some tokens differently than the 8.6.2 Token Labeler component.

    Review the ServerMigrationReport.html file and associated report files to troubleshoot the effects of

    migration on Data Quality 8.6.2 plans.

    Token Labeling and Token Parsing

    Labeling and parsing functionality differs from Data Quality 8.6.2 to Data Quality 9.0.1.

    Labeler Transformation Changes

    In Data Quality 8.6.2, you use the Token Labeler or Character Labeler components to label strings. Data

    Quality 9.0.1 combines token labeling and character labeling functionality in the Labeler transformation.

    Review the following changes in labeling behavior:

    • In Data Quality 9.0.1, some token label names have changed. The following table lists the changes:

    Data Quality 8.6.2    Data Quality 9.0.1
    codesymbol            code
    num                   number
    text                  word

    • A Data Quality 9.0.1 mapping does not preserve delimiters between tokens in the token stream. The

    migration process replaces all delimiters with a space character.

    • Data Quality 9.0.1 does not support case-sensitive token labels. The following table shows how different

    versions of Data Quality label the string seattle Seattle SEATTLE:

    Data Quality 8.6.2    Data Quality 9.0.1
    word Word WORD        word word word
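
    When you compare 8.6.2 plan output with 9.0.1 mapping output, it can help to rewrite the expected 8.6.2

    label strings in the 9.0.1 form. The following Python sketch applies the three changes listed above:

    renamed labels, delimiters replaced with a space, and lowercase labels. It is an illustration, not the

    logic of the migration process. The function name and the default delimiter set are hypothetical, and the

    rename table copies the rows above as they appear in this guide.

```python
import re

# Rename table copied from the rows above (8.6.2 label -> 9.0.1 label).
LABEL_RENAMES = {
    "codesymbol": "code",
    "num": "number",
    "text": "word",
}


def to_901_labels(labels_862, delimiters=",;|"):
    """Rewrite an 8.6.2 label string in the form described for 9.0.1.

    delimiters is a hypothetical set of token delimiters. Every delimiter
    and every run of whitespace becomes a single space in the result.
    """
    pattern = "[" + re.escape(delimiters) + r"\s]+"
    tokens = [t for t in re.split(pattern, labels_862) if t]
    # 9.0.1 labels are not case-sensitive, so lowercase before renaming.
    converted = [LABEL_RENAMES.get(t.lower(), t.lower()) for t in tokens]
    return " ".join(converted)


print(to_901_labels("word Word WORD"))
print(to_901_labels("num,text;WORD"))
```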

    Labeler and Parser Transformation Changes

    In Data Quality 8.6.2, you use the Token Parser, Context Parser, or Profile Standardizer to parse strings by

    the type of information they contain. Data Quality 9.0.1 combines parsing functionality in the Parser

    transformation. In Data Quality 8.6.2, parsing components can read outputs from a Token Labeler

    component, and in Data Quality 9.0.1, the Parser transformation can read output from a Labeler

    transformation.

    Review the following change in parsing behavior, and note that the Labeler transformation is affected:

    • The migration process converts Token Labeler word outputs and Token Parser text outputs into word

    tokens in the 9.0.1 Labeler and Parser transformations. However, the definition of a 9.0.1 word token is

    more restrictive than 8.6.2 text output settings and word tokens. A mapping that uses the 9.0.1 word token

    may have different output than the original plan.

    Rule-Based Analyzer Components Without Input Fields

    In Data Quality 8.6.2, valid plans can contain Rule-Based Analyzer components with no input ports. When

    you migrate these plans to 9.0.1 mappings, the migration process creates Decision transformations with no input fields.

    After migration, you need to connect a field from an upstream component in the mapping to the Decision

    transformation.

    Text Qualifier Support for Sources and Targets

    In Data Quality 9.0.1, you can use single or double quotes as text qualifiers for sources and targets. In Data

    Quality 8.6.2, you can use other symbols as text qualifiers for sources and targets.

    When you migrate 8.6.2 sources or targets that use text qualifiers other than quotation marks and apostrophes, the migration process sets the text qualifier to None.

    If a Data Quality 8.6.2 plan uses text qualifiers not supported in Data Quality 9.0.1, migration may produce

    mappings that process data differently than the 8.6.2 plan. For example, consider that an 8.6.2 plan contains

    a semicolon as a text qualifier and that you process the following data stream in both the 8.6.2 plan and the

    9.0.1 mapping:

    data1,;data2, data3;,data4

    In Data Quality 8.6.2, the plan parses the data into three data columns:

    data1 | data2, data3 | data4

    In Data Quality 9.0.1, the mapping parses the data into four columns:

    data1 | ;data2 | data3; | data4

    The differences that you experience in data processing depend on the unique combination of input data and

    text qualifiers that you use.
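
    The difference in the example above can be reproduced outside Data Quality. The following Python sketch

    uses the standard csv module to parse the same data stream once with a semicolon as the text qualifier and

    once with no text qualifier. It only imitates the behavior described here and is not the parser that either

    product uses. The function name is hypothetical, and trimming the space around each field is an assumption

    made to match the columns shown above.

```python
import csv


def parse_line(line, text_qualifier=None):
    """Split a comma-delimited line, optionally honoring a text qualifier."""
    if text_qualifier is None:
        # No text qualifier: every comma separates two columns.
        reader = csv.reader([line], delimiter=",", quoting=csv.QUOTE_NONE)
    else:
        # Commas between a pair of qualifier characters stay inside the column.
        reader = csv.reader([line], delimiter=",", quotechar=text_qualifier)
    return [field.strip() for field in next(reader)]


line = "data1,;data2, data3;,data4"

# Semicolon as text qualifier, as in the 8.6.2 plan: three columns.
print(" | ".join(parse_line(line, text_qualifier=";")))

# Text qualifier set to None, as in the migrated 9.0.1 mapping: four columns.
print(" | ".join(parse_line(line)))
```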

    Empty Reference Tables

    If you find a reference table within a mapping folder, the reference table is empty.

    Empty reference tables can occur for the following reasons:

    • Informatica reference data was not installed to the Data Quality 9.0.1 system.

    The ServerImport process assumes that you have run the Content Installer to add Informatica reference

    data to Data Quality 9.0.1. The ClientPackage process does not migrate an Informatica dictionary if the

    dictionary data is present in the Content Installer file set or in an accelerator pack.

    Run the Content Installer on the Data Integration Service machine, or on a machine that the Data

    Integration Service can access, to install the reference data you need. Then run the ServerImport process.

    • A customized dictionary file was not present on the Data Quality 8.6.2 machine when the ClientPackage

    process ran.

    The ClientPackage process migrates any dictionary file that is not included in an Informatica reference

    data set or accelerator pack. In this case, find the missing dictionary, add it to the dictionary folder on the

    Data Quality 8.6.2 machine, and rerun the ClientPackage process.

    Data Quality 8.6.2 Settings Not Available in Data Quality 9.0.1

    Some Data Quality 8.6.2 settings for Token Labeler, Identity Match, Character Labeler, and Context Parser

    components are not available in Data Quality 9.0.1 transformations.

    Changes in Address Validator Functionality

    Some Address Validator transformation ports may be unconnected in an imported mapping. This can arise if

    the functionality of a port has changed between 8.6.2 and 9.0.1.

    You can resolve this issue in the following ways:

    • Edit the Address Validator transformation and delete the bad links.

    • Edit the Address Validator transformation to use different ports and recreate the port links.

    • Create another Address Validator transformation in Data Quality 9.0.1, configure it in parse-only mode,

    and connect the required data ports to the new transformation.

    The following table lists the affected ports in Data Quality 8.6.2 and alternative ports you can use on a

    parse-only Address Validator transformation:

    Data Quality 8.6.2 Port Name     Data Quality 9.0.1 Port Name
    GlobalAV_ParsedSuiteName         SubBuildingNumber1
    GlobalAV_ParsedSuiteRange        SubBuildingName1
    GlobalAV_ParsedPre_Direction     StreetPreDirectional1
    GlobalAV_ParsedSuffix            StreetPostDescriptor1
    GlobalAV_ParsedPost_Direction    StreetPostDirectional1
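
    If many imported mappings contain unconnected Address Validator ports, a lookup built from the table above

    can speed up the review. The following Python sketch returns the suggested 9.0.1 port for each 8.6.2 port

    name, or None when the table has no entry. The function name and the extra port name in the example are

    hypothetical; the port pairs are copied from the table as listed.

```python
# Port pairs copied from the table above (8.6.2 port -> 9.0.1 parse-only port).
PARSE_ONLY_PORTS = {
    "GlobalAV_ParsedSuiteName": "SubBuildingNumber1",
    "GlobalAV_ParsedSuiteRange": "SubBuildingName1",
    "GlobalAV_ParsedPre_Direction": "StreetPreDirectional1",
    "GlobalAV_ParsedSuffix": "StreetPostDescriptor1",
    "GlobalAV_ParsedPost_Direction": "StreetPostDirectional1",
}


def suggest_ports(unconnected_ports):
    """Map each 8.6.2 port name to its listed 9.0.1 alternative, or None."""
    return {port: PARSE_ONLY_PORTS.get(port) for port in unconnected_ports}


# "SomeOtherPort" is a made-up name that shows the None result.
for old, new in suggest_ports(["GlobalAV_ParsedSuffix", "SomeOtherPort"]).items():
    print(old, "->", new)
```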

    Identity Match Settings

    Some Identity Match component settings in Data Quality 8.6.2 are not available in Data Quality 9.0.1.

    The following Identity Match component settings are not available in Data Quality 9.0.1:

    Stop on Error 

    When selected, this option stops identity matching plans if the plan generates an error.

    Identity Match Decision Port

    The output from this port indicates the status of identity match decisions.

    Partial Support for the Context Parser Merge Option

    Data Quality 9.0.1 provides partial support for the functionality enabled by the Merge option in the 8.6.2

    Context Parser component.

    The Data Quality 8.6.2 Context Parser component could merge many repeated tokens into a single block of

    data. The migration process uses Context Parser components to create Labeler and Pattern-Based Parser

    transformations. Although these transformations can merge repeated tokens, input that produces more than

    five repeated tokens may result in output data that differs from Data Quality 8.6.2 plan output.

    Substring Dictionary Labeling

    In the Data Quality 8.6.2 Character Labeler component, you can set a start and length value to limit the amount of text processed when using a dictionary. You cannot use these settings in Data Quality 9.0.1.

    In Data Quality 9.0.1, the Labeler transformation applies dictionary labeling across all of the data in the input

    port. The migration operation adds a warning to the migration report file if substring dictionary labeling is

    present in a migrated plan.