Endeca Content Acquisition System - Oracle, 2012. 7. 19.


  • Endeca Content Acquisition System

    Developer's Guide

    Version 3.0.2 • March 2012

  • Contents

    Preface
      About this guide
      Who should use this guide
      Conventions used in this guide
      Contacting Oracle Support

    Part I: Introduction to CAS and Crawling Data Sources

    Chapter 1: Introduction
      Overview of the Endeca Content Acquisition System
      About the Endeca CAS Service
      About the CAS Server
      About the Component Instance Manager
      About the Record Store
      About the Dimension Value Id Manager
      Security requirements

    Chapter 2: Creating and configuring a crawl
      About creating and configuring crawls
      Configuring a crawl to write to a Record Store instance
      Configuring a crawl to write to an MDEX compatible format
      Configuring a crawl to write to an output file
      Setting document conversion options
      About filters

    Chapter 3: Configuring a Record Store instance
      Configuring a Record Store instance
      Configuration properties for a Record Store instance
      Change properties and new Record Store instances
      Disabling automatic management of a Record Store instance

    Chapter 4: Running a crawl
      Running a crawl
      Order of execution in a crawl configuration
      Full and incremental crawling modes
      Crawls and archive files
      About writing records to a Record Store instance
      About the record output file

    Chapter 5: Running the CAS sample applications
      About the sample CAS applications

    Part II: Loading data into an MDEX Engine

    Chapter 6: Creating a Forge pipeline to read from or write to a Record Store
      Overview of a Forge pipeline
      Creating a Forge pipeline

    Chapter 7: Creating a CAS crawl to write MDEX compatible output
      Overview of a CAS crawl that produces MDEX compatible output
      Loading dimension values into Record Store instances
      Loading data records into Record Store instances

      Creating and configuring a crawl to write MDEX compatible output

    Part III: CAS Command Line Utilities

    Chapter 8: CAS Server Command-line Utility
      Overview of the CAS Server Command-line Utility
      About CAS capabilities
      Saving passwords in a crawl configuration file
      Inspecting installed modules
      Managing crawls
      Managing dimension value Ids
      Viewing crawl status and results

    Chapter 9: Component Instance Manager Command-line Utility
      Overview of the CIM Command-line Utility
      Creating a Record Store
      Deleting a Record Store
      Listing components
      Listing types

    Chapter 10: Record Store Command-line Utility
      Overview of the Record Store Command-line Utility
      Writing tasks
      Reading tasks
      Utility tasks

    Part IV: Administering CAS

    Chapter 11: Running CAS components
      About running CAS components
      Running the Endeca CAS Service from the Windows Services console
      Starting the Endeca CAS Service from a command prompt
      Stopping the Endeca CAS Service from a command prompt

    Chapter 12: Backing up and restoring CAS
      Coordinating backups and restore operations
      Online backup and restore operations
      Offline backup and restore operations

    Chapter 13: Configuring SSL
      About configuring SSL in the Content Acquisition System
      Enabling SSL for the Endeca CAS Service
      Enabling SSL for CAS Console
