Post on 29-Sep-2020

  • Oracle® Endeca Information Discovery

    Integrator Components Guide

    Version 2.3.0 • June 2012 • Revision A

  • Copyright and disclaimer

    Copyright © 2003, 2012, Oracle and/or its affiliates. All rights reserved.

    Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. UNIX is a registered trademark of The Open Group.

    This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited.

    The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.

    If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, the following notice is applicable:

    U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs. No other rights are granted to the U.S. Government.

    This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous applications.

    This software or hardware and documentation may provide access to or information on content, products and services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content, products, or services.

    Rosette® Linguistics Platform Copyright © 2000-2011 Basis Technology Corp. All rights reserved.

    Teragram Language Identification Software Copyright © 1997-2005 Teragram Corporation. All rights reserved.

  • Table of Contents

    Copyright and disclaimer

    Preface
        About this guide
        Who should use this guide
        Conventions used in this guide
        Contacting Oracle Customer Support

    Chapter 1: Integrator Overview
        Integrator UI
        List of Information Discovery connectors
        Integrator Server

    Chapter 2: Before You Begin
        Data loading strategies and concepts
            Which updates to run
            When to use outer transactions
        Recommended order of loading data
        Supported data types
        Default values for new attributes
        Additional documentation

    Chapter 3: Integrator Configuration
        Endeca-specific parameters in workspace.prm
        Creating mdexType Custom properties
        Setting a default time zone for incoming data
        Verifying installed Web service versions
        Specifying multiple record delimiters
        Configuring SSL

    Chapter 4: Working with Outer Transaction Graphs
        About outer transactions
        Requirements for running graphs within a transaction
        Wrapping existing graphs in an outer transaction
        Creating a Transaction RunGraph graph
            Format of the steps input file
            Adding components to the transaction graph
            Configuring the Reader for the transaction input file
            Configuring the Edge for Reader component
            Configuring the Transaction RunGraph connector
            Running the transaction graph
        Committing or rolling back an outer transaction
        Performance impact of transactions

    Chapter 5: Full Initial Load of Records
        Overview of the full initial load
        Creating a project
        Source data format
            Adding the source data to the project
        Creating a graph
        Adding components to the graph
        Configuring the components
            Configuring the Reader component
                Configuring metadata for the Reader Edge
            Configuring the Reformat component
                Configuring metadata for the Reformat Edge
            Configuring the Bulk Add/Replace Records connector
        Running the graph to load records

    Chapter 6: Incremental Updates
        Overview of incremental updates
        Adding components to the incremental updates graph
        Configuring the Reader and the Edge for incremental updates
        Configuring the Add/Update Records connector
        Running the incremental updates graph

    Chapter 7: Loading the Attribute Schema
        About attribute schema files
        Loading the standard attribute schema
            Format of the PDR input file
            Adding components to the standard attributes schema graph
            Configuring the Reader for the PDR input file
                Configuring the Reader Edge
            Configuring the Reformat component for standard attributes
                Configuring the Reformat Edge
            Configuring the Denormalizer component
                Configuring the Denormalizer Edge
            Configuring the WebServiceClient component for standard attributes
        Loading the managed attribute schema
            Format of the DDR input file
