

Crown Software Buldoser

Buldoser User Guide Version 3.3 September, 2006

© Copyright 2003 - 2006 - Crown Partners, LLC


Table of Contents

Introduction
    Purpose of this Manual
    Intended Audience
    Document Conventions
Getting Started
    Starting Buldoser
    Using the Docbase Browser
        Opening cabinets and folders
        Getting an object’s Object ID
        Unlocking objects
        Deleting objects
        Refreshing the Browser
        Showing and Hiding User Cabinets
        Logging into another Docbase
        Exiting the Docbase Browser
        Viewing the current version of Buldoser
Docbase to Docbase Overview
    The Buldoser Methodology
    Supported Object Types
    ETL Process
        Check in content objects
        Mandate a content freeze
        Move supporting objects
        Create Batch Folder
        Extract Content
        Finish Extract
        Switch Docbase
        Transform
        Load Content
        Finish Load
        Resolve Errors
        Reprocess Errors
        Load Relationships
        Finish Relationship Load
        Resolve Relationship Errors
        Reprocess Relationship Errors
        Test Content
        Undo Load
Extracting Content from a Docbase
    Starting an Extract
    Finishing a Stopped Extract
Mapping Data Values
    Mapping Attributes
    Creating ACLs and Accessing Groups
    Creating Folders
    Preview Folder Creation
Loading Content into a Docbase
    Starting a New Load
    Finishing a Stopped Load
    Reprocessing Load Errors
    Starting a Relationship Load
    Finishing a Stopped Relationship Load
    Reprocessing Relationship Load Errors
    Undoing a Load
Scheduling an ETL Operation
    Scheduling Overview
        Extract Jobs
        Load Jobs
    Scheduling an Extract Job
    Scheduling a Load Job
    Editing an Existing Extract or Load Job
    Deleting an Extract or Load Job
Database to Docbase Overview
    Connecting to a Data Source
    Mapping a Data Source to Documentum’s Data Model
        The Object View
        Supporting Views
        Object Configurations
        Inline Data Transformation
        Attribute Configuration
        Content Configuration
        Folder Configuration
        Security Configuration
        Versioning Configuration
        Pre- and Post-Processing
        Multi-threaded Loading Algorithm
Loading Content from a Database
    Creating a New Configuration or Configuring an Existing Configuration
    Finishing a Stopped Database Load
    Undoing a Database Load
Appendix – EDMS98 Operations
    Content View
    Defining Object View


Introduction

This section describes the purpose of this manual and its intended audience.

Purpose of this Manual

The Buldoser User Guide provides instructions for performing Extract, transform and load (ETL) operations in a Documentum environment using the Crown Partners Buldoser products. It covers both Docbase to Docbase operations and Database to Docbase operations. For more general information on ETL operations, read the last chapter, entitled “ETL Overview.”

Intended Audience

Movement of content between Docbases requires knowledge of the Documentum repository and of the specific content being moved. This manual is for administrators of Docbases and assumes the user is familiar with basic Documentum skills, including:

Documentum Query Language (DQL) (for more information, see the Documentum DQL Reference)

Documentum object model (for more information, see the Documentum Object Reference)

Documentum security structures such as Users, Groups, and ACLs

Using Documentum Administrator

Documentum Cabinets and Folders

Document Conventions

Table 1-2. Conventions used in this Guide

Convention | Where used
This type is used | For emphasis, for support documentation titles, and for text found in tables.
This type is used | To indicate keyboard keys, button names, or menu items that need to be pressed, clicked, or selected.
“This type is used” | To indicate text to be entered in a field. The quotes are not typed; only the information within the quotes is typed.
<<This type is used>> | To indicate a variable. The double less than / greater than signs and the placeholder text are not typed; the specific information represented by the variable is typed in their place.


Getting Started

This section provides instructions for getting Buldoser running the first time. It gives an overview of logging in and operating the Docbase Browser.

Starting Buldoser

Buldoser can be run from the installed shortcuts (Windows only) or the supplied batch files (LaunchBuldoser.bat or LaunchBuldoser.sh for Windows or UNIX, respectively). The first time Buldoser is run, it will ask for a license key. See Figure 1.

Figure 1: Buldoser license key challenge dialog

License keys can be obtained from Crown Partners. Keys are issued per installation machine to customers who have purchased a license. To purchase a license or obtain a license key, contact [email protected]. The license key is stored in a file named “BuldoserLicense.txt” in the installation directory. If an upgraded license is purchased, the new key must be entered in this file. After the license key has been entered successfully, the End User License Agreement is presented. Select the Accept option and then click the “Ok” button to proceed. The login dialog will then be displayed. See Figure 2.

Figure 2: Buldoser user login dialog


If no Docbase is currently available, Buldoser will present an error message. Check that the dmcl.ini file contains the correct connection information and that the Docbases are currently available. To start Buldoser, enter a valid username and password, and then click Login. A Documentum superuser account is recommended for performing ETL operations. The Buldoser Docbase Browser will open. See Figure 3.

Figure 3: Buldoser Docbase Browser
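If the login dialog reports that no Docbase is available, the connection information in dmcl.ini is the first thing to inspect. The following is a minimal sketch, not part of Buldoser, that reads the connection broker entry from a dmcl.ini file. It assumes the common layout in which the file has a [DOCBROKER_PRIMARY] section with host and port keys and in which 1489 is the default port; the function name docbroker_settings is hypothetical.

```python
import configparser


def docbroker_settings(dmcl_path):
    """Return (host, port) from the [DOCBROKER_PRIMARY] section of dmcl.ini.

    Assumption: the file uses the common layout
        [DOCBROKER_PRIMARY]
        host = <hostname>
        port = <port>
    """
    parser = configparser.ConfigParser()
    if not parser.read(dmcl_path):
        raise FileNotFoundError(f"cannot read {dmcl_path}")
    if not parser.has_section("DOCBROKER_PRIMARY"):
        raise ValueError("dmcl.ini has no [DOCBROKER_PRIMARY] section")
    section = parser["DOCBROKER_PRIMARY"]
    host = section.get("host", "").strip()
    # Assumption: fall back to 1489 when no port is listed.
    port = section.getint("port", fallback=1489)
    if not host:
        raise ValueError("no connection broker host is configured")
    return host, port
```

A missing section or an empty host raises an error, which points at the configuration file rather than at the Docbase itself.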

Using the Docbase Browser

The Docbase Browser provides an easy way to select items for Extract, as well as an interface to view the results of a load. Each feature and its use are described in the following paragraphs.


Opening cabinets and folders

To open a cabinet or folder, double-click the cabinet or folder from either the tree view on the left, or the table on the right. The cabinet or folder may also be single-clicked, and then opened or closed using the +/- sign in the tree view. If there are a large number of items in a cabinet or folder, Buldoser will warn that opening may take a long time.

Getting an object’s Object ID

To copy an item’s Object ID into the Clipboard, select the item, then right-click and select Get Object ID. This can be useful for pasting into IAPI, IDQL, or another administrator tool for doing further research on an object. The Get Object ID menu item may also be reached from the File menu.

Unlocking objects

To unlock checked out objects, select the items, then right-click and select Unlock. This can be useful for unlocking objects so that they may be deleted. Multi-select is enabled, so items may be selected using Ctrl + click for individual items or Shift + click for a range of items. The Unlock menu item may also be reached from the File menu.

Deleting objects

To delete objects, select the items, then right-click and select Delete Current Version. This will delete only the current version of the objects. To delete all versions of the selected objects, select Delete All Versions. Multi-select is enabled, so items may be selected using Ctrl + click for individual items or Shift + click for a range of items. The Delete menu items may also be reached from the File menu.

Refreshing the Browser

To refresh the current view, select Refresh from the File menu. This is useful for viewing new cabinets that are created during a load. Cabinets or folders may also be clicked again in the tree view to refresh their contents.


Showing and Hiding User Cabinets

By default, Buldoser shows only the current user’s personal cabinet and all non-personal cabinets. To show all users’ cabinets, select Show User Cabinets from the File menu. The menu item will then toggle to Hide User Cabinets, which can be selected to hide other users’ personal cabinets.

Logging into another Docbase

To log into a different Docbase, select Switch Docbase from the File menu. The Login dialog will appear and a different Docbase may be selected.

Exiting the Docbase Browser

To exit the Docbase Browser and Buldoser, select Exit from the File menu.

Viewing the current version of Buldoser

To view the current version of Buldoser, select About Buldoser… from the Help menu.


Docbase to Docbase Overview

This section gives an overview of the features and high-level steps for performing ETL operations between two Documentum Docbases using Buldoser.

The Buldoser Methodology

When content is moved from one Docbase to another, that content relies on supporting and related objects to exist in the target Docbase for the operation to be successful. Figure 4 below illustrates these relationships.

Figure 4: Content to Supporting Object Relationship

When Buldoser moves Content Objects, it assumes all supporting and related objects already exist in the target Docbase. They must be moved separately before the Content Objects are moved. This approach is different from other tools such as Dump & Load and DocApps, which will move any supporting or related objects in addition to the Content Objects to ensure success.


The following reasons support the Buldoser approach:

Moving these supporting and related objects adds much more time to the Extract and load process, which is not practical for very large volumes.

These objects usually already exist in a controlled implementation, making it unnecessary to move them.

In some scenarios, the administrator does not wish to use the same supporting objects in the target that are used in the source. For instance, the user who owns a particular document in the source Docbase may need to be changed in the target Docbase.

With other tools, these supporting and related objects are frequently duplicated, causing clutter and inefficiency in the target Docbase.

Other tools tend not to provide an exhaustive list of what was automatically moved, making it difficult for administrators to determine what was moved in the batch.

The Buldoser methodology for ensuring success is to identify for the administrator which objects must exist in the target Docbase, and to allow the administrator to map supporting objects from the source to the target. Figure 5 illustrates the mapping concept.

Figure 5: Buldoser Mapping


The idea of mapping supporting and related objects is very powerful. In addition to making ETL operations much more efficient, it also gives the administrator complete control over what is moved, as well as guaranteeing there is no duplication of supporting objects. The mapping feature also allows the administrator to “clean up” the data as it’s moved by streamlining the object model, security model, folder structure, etc. during ETL operations.
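The mapping concept can be illustrated with a short sketch. This is not Buldoser's implementation or file format; it assumes, for illustration only, that an extracted object is a dictionary of attribute values and that mappings are stored per attribute as source-to-target lookups. The function name apply_mappings and the sample user and ACL values are hypothetical; owner_name and acl_name are standard Documentum attribute names.

```python
def apply_mappings(attributes, mappings):
    """Return a copy of an object's attributes with mapped values replaced.

    attributes: {attribute_name: value or list of values}
    mappings:   {attribute_name: {source_value: target_value}}
    Values without a mapping are carried over unchanged.
    """
    transformed = {}
    for name, value in attributes.items():
        value_map = mappings.get(name, {})
        if isinstance(value, list):
            # Repeating attribute: map each value individually.
            transformed[name] = [value_map.get(v, v) for v in value]
        else:
            transformed[name] = value_map.get(value, value)
    return transformed


# Hypothetical sample data: the owner and ACL are renamed during the move.
source_doc = {
    "object_name": "spec.doc",
    "owner_name": "jsmith",
    "acl_name": "legacy_acl",
}
mappings = {
    "owner_name": {"jsmith": "john.smith"},
    "acl_name": {"legacy_acl": "engineering_acl"},
}
print(apply_mappings(source_doc, mappings))
```

Because every object that references jsmith is rewritten through the same lookup, the target Docbase receives one consistent owner instead of a duplicated supporting object.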

Supported Object Types

Buldoser was developed to handle the task of moving large volumes of content objects. In the current version, Buldoser only moves objects of type dm_document and its subtypes. To move other types such as lifecycles, object types, workflows, alias sets, etc., it is recommended to use DocApps with Documentum’s Application Builder and Application Installer.

ETL Process

Figure 6 below describes the overall process for moving content from Docbase to Docbase using Buldoser.


Figure 6: Docbase to Docbase Process


Check in content objects

To ensure all content is moved, any outstanding updates should be checked into the Docbase using the appropriate client application.

Mandate a content freeze

To keep updates from being lost during the process, all users should be notified that an ETL operation is scheduled to take place, and that updates should be postponed until after the operation is complete.

Move supporting objects

Any configuration objects that exist in the source Docbase should be moved to the destination Docbase before the content movement begins, as content depends on these objects existing in the target. Supporting objects include, but are not limited to:

Object Types
Users
ACLs
Alias Sets
Lifecycles
Folders
XML Applications
Formats
Storage Locations

Create Batch Folder

A batch folder is simply the location to which content objects will be Extracted. Buldoser remembers this location as the batch name, so every effort should be made to name batches uniquely. Before the Extract step, calculate the total content size to ensure that enough space exists in the batch folder.
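The space check can be scripted before an Extract is started. The sketch below is an illustration only: it assumes the total content size in bytes has already been estimated separately (for example, from a query against the source Docbase), and the function name and the 20% safety margin are hypothetical choices rather than Buldoser settings.

```python
import shutil


def has_enough_space(batch_folder, content_bytes, safety_margin=0.2):
    """Compare free disk space in the batch folder with the estimated need.

    content_bytes is the estimated size of the content to Extract.
    safety_margin adds headroom for attribute files and logs (assumed value).
    Returns (ok, required_bytes, free_bytes).
    """
    free = shutil.disk_usage(batch_folder).free
    required = int(content_bytes * (1 + safety_margin))
    return free >= required, required, free


ok, required, free = has_enough_space(".", 5 * 1024**3)
print(f"required={required} free={free} ok={ok}")
```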

Extract Content

Using Buldoser, Extract the objects to be moved to the batch folder. Buldoser will also Extract a list of the supporting objects that must exist in the target Docbase. See the section, Extracting Content from a Docbase for more information. Buldoser allows Extract operations to be stopped and restarted from the stopping point at a later time. If the Extract is stopped, proceed to the Finish Extract step. Once the Extract is complete, proceed to the Switch Docbase step.


Finish Extract

If the Extract was stopped, Buldoser can restart the Extract from the stopping point. See the section, Extracting Content from a Docbase for more information. Once the Extract is complete, proceed to the Switch Docbase step.

Switch Docbase

After the Extract is finished, log in to the target Docbase to map the supporting objects. See the section Using the Docbase Browser for more information.

Transform

Once logged into the target Docbase, the administrator performs the Transform step by mapping data values. Mapping data values identifies any potential issues due to supporting objects not existing in the target Docbase, and allows the administrator to resolve these issues before the load. This step also provides the ability to create custom Folders, along with any ACLs and groups that correspond to that folder structure. This concept is core to Buldoser’s ETL process; it is described in more detail in the section, The Buldoser Methodology. For more information on performing data mapping, see the section, Mapping Data Values.
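The check performed during the Transform step can be thought of as a comparison between the supporting objects the batch requires and those present in the target. The following sketch shows that idea under stated assumptions: the data structures, the function name find_unresolved, and the sample values are all hypothetical, and Buldoser's own dependency file format is not shown in this guide.

```python
def find_unresolved(required, existing_in_target, mappings):
    """List required supporting objects that would not resolve in the target.

    required:           {kind: set of names referenced by the batch}
    existing_in_target: {kind: set of names present in the target Docbase}
    mappings:           {kind: {source_name: target_name}}
    A name resolves if its mapped name (or the name itself when unmapped)
    exists in the target.
    """
    unresolved = {}
    for kind, names in required.items():
        targets = existing_in_target.get(kind, set())
        kind_map = mappings.get(kind, {})
        missing = sorted(n for n in names if kind_map.get(n, n) not in targets)
        if missing:
            unresolved[kind] = missing
    return unresolved


# Hypothetical sample data.
required = {"user": {"jsmith", "mjones"}, "acl": {"legacy_acl"}}
existing = {"user": {"john.smith"}, "acl": {"engineering_acl"}}
mappings = {"user": {"jsmith": "john.smith"}, "acl": {"legacy_acl": "engineering_acl"}}
print(find_unresolved(required, existing, mappings))  # {'user': ['mjones']}
```

Anything left in the result needs either a mapping or a new supporting object in the target before the load is started.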

Load Content

Using Buldoser, load the Extracted objects from the batch folder. Buldoser will use the mappings stored in the batch folder to transform the content objects during the load. See the section, Loading Content into a Docbase for more information. Buldoser allows loads to be stopped and restarted from the stopping point at a later time. If the load is stopped, proceed to the Finish Load step. If the load completes but has errors, proceed to the Resolve Errors step. If the load is not stopped and has no errors, Buldoser will proceed immediately to the Load Relationships step.

Finish Load

If the load was stopped, Buldoser can restart the load from the stopping point. See the section, Loading Content into a Docbase for more information. If the load completes but has errors, proceed to the Resolve Errors step. If the load is not stopped and has no errors, Buldoser will proceed immediately to the Load Relationships step.
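The branching described for the Load Content and Finish Load steps reduces to a small decision rule. The sketch below only restates that rule in code; the function name and its arguments are hypothetical and do not correspond to any Buldoser API.

```python
def next_step_after_load(stopped, error_count):
    """Return the name of the ETL step that follows a content load."""
    if stopped:
        # A stopped load is resumed from its stopping point.
        return "Finish Load"
    if error_count > 0:
        # A completed load with errors must be fixed and reprocessed first.
        return "Resolve Errors"
    # A clean load moves straight on to relationships.
    return "Load Relationships"


print(next_step_after_load(stopped=False, error_count=3))  # Resolve Errors
```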


Resolve Errors

Should errors exist in the load, Buldoser will set aside the objects in an error log to be reprocessed once the issue is fixed. Usually errors are due to Docbases being stopped or to incorrect or incomplete mappings. Review the log file for the load to determine the problem. Once the issue is resolved, proceed to the Reprocess Errors step.

Reprocess Errors

Buldoser allows for only the errors in a load to be reprocessed. This way the administrator does not have to remove and reload successful objects to try the load again. Buldoser attempts to always move forward during the load process to be as efficient as possible. See the section, Loading Content into a Docbase for more information on reprocessing load errors. Once all errors are reprocessed, proceed to the Load Relationships step.
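The idea of reprocessing only the failed objects can be sketched as a retry pass over an error log. This is an illustration under assumptions, not Buldoser's code: the error log is taken to be a list of object IDs, and load_one stands in for whatever operation loads a single object.

```python
def reprocess_errors(error_log, load_one):
    """Retry only the objects recorded in error_log.

    error_log: iterable of object IDs that failed in the previous pass
    load_one:  callable that loads one object and raises on failure
    Returns the (object_id, message) pairs that still fail, which form
    the error log for the next pass. Objects loaded earlier are not touched.
    """
    still_failing = []
    for object_id in error_log:
        try:
            load_one(object_id)
        except Exception as exc:
            still_failing.append((object_id, str(exc)))
    return still_failing
```

Each pass can only shrink the error log, which is what lets the load keep moving forward without removing and reloading successful objects.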


Load Relationships

This step refers to the process of loading dm_relations and virtual document links, generically referred to as “relationships.” Relationships are loaded in a separate phase from the core Content Objects for two reasons:

Both the parent and child objects in a relationship must exist before the relationship can be created. If relationships were created at the same time as the content objects, it would force the administrator to load the content objects in separate batches and in a particular order.

If any errors exist in the initial phase, they can be resolved before relationships are created. If relationships were loaded immediately after, errors would be duplicated in both phases creating twice the number of issues for the administrator to resolve.

If there are no errors during the initial load, the relationship load will start automatically, immediately after the first phase. If relationships exist but were not created immediately after the first phase, use Buldoser to create the relationships. See the section, Loading Content into a Docbase for more information. Buldoser allows relationship loads to be stopped and restarted from the stopping point at a later time. If the relationship load is stopped, proceed to the Finish Relationship Load step. If the load completes but has errors, proceed to the Resolve Relationship Errors step. If the load is not stopped and has no errors, proceed to the Test Content step.
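The reason for a separate relationship phase is easiest to see in a small model. The sketch below is a simplified illustration with hypothetical names, not Buldoser's loader: objects are loaded first in any order, and a relationship is created only when both its parent and child are present.

```python
def load_in_two_phases(object_ids, relationships):
    """Load objects first, then create relationships between them.

    object_ids:    iterable of object IDs in the batch, in any order
    relationships: iterable of (parent_id, child_id) pairs
    Returns (created, errors); errors holds the relationships whose
    parent or child is missing after the first phase.
    """
    # Phase 1: load every content object; order does not matter.
    loaded = set(object_ids)

    # Phase 2: a relationship needs both ends to exist already.
    created, errors = [], []
    for parent_id, child_id in relationships:
        if parent_id in loaded and child_id in loaded:
            created.append((parent_id, child_id))
        else:
            errors.append((parent_id, child_id))
    return created, errors


created, errors = load_in_two_phases(
    ["child_a", "parent_1"],  # the child is listed before its parent
    [("parent_1", "child_a"), ("parent_1", "child_b")],
)
print(created)  # [('parent_1', 'child_a')]
print(errors)   # [('parent_1', 'child_b')]
```

The second relationship fails because child_b was never loaded, which mirrors the most common relationship error described in the Resolve Relationship Errors step.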

Finish Relationship Load

If the relationship load was stopped, Buldoser may restart the load from the stopping point. See the section, Loading Content into a Docbase for more information. If the load completes but has errors, proceed to the Resolve Relationship Errors step. If the load is not stopped and has no errors, proceed to the Test Content step.


Resolve Relationship Errors

Should errors exist in the relationship load, Buldoser will set aside the objects in an error log to be reprocessed once the issue is fixed. Usually errors are due to child objects not existing in the target Docbase. To resolve them, locate the child objects in the source Docbase and move them using Buldoser. Once the child objects are moved, proceed to the Reprocess Relationship Errors step.

Reprocess Relationship Errors

Buldoser allows for only the errors in a relationship load to be reprocessed. This way the administrator does not have to remove and reload successful objects to try the load again. See the section, Loading Content into a Docbase for more information on reprocessing relationship load errors. Once all errors are reprocessed, proceed to the Test Content step.

Test Content

After any load operation, the content should be tested in the target Docbase using the appropriate client application. Usually it is sufficient to test the first 100 objects or so, then randomly test 5-10% of the remaining population. If testing fails, re-examine the Extract and load logs for any errors, incorrect mappings, or incorrectly handled objects. If the batch was executed incorrectly, proceed to the Undo Load step. If the testing succeeds, the process is complete.
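The sampling guideline above can be turned into a repeatable selection. This is a minimal sketch with hypothetical names; it assumes the loaded object IDs are available as a list in load order and uses 5% as the random share.

```python
import random


def pick_test_sample(object_ids, head=100, percent=5, seed=None):
    """Select the first `head` objects plus a random share of the rest.

    percent is the share of the remaining objects to test (5 to 10 is the
    range suggested in this guide); the count is rounded up.
    """
    head_part = list(object_ids[:head])
    rest = list(object_ids[head:])
    count = (len(rest) * percent + 99) // 100  # integer ceiling
    rng = random.Random(seed)
    return head_part + rng.sample(rest, count)


ids = [f"obj_{n:04d}" for n in range(1000)]
sample = pick_test_sample(ids, seed=1)
print(len(sample))  # 145: the first 100 plus 5% of the remaining 900
```

Passing a fixed seed makes the selection reproducible, so a failed test can be rechecked against the same objects after an Undo Load and reload.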

Undo Load

Buldoser provides the capability to remove any objects that are loaded. See the section, Loading Content into a Docbase for more information.


Extracting Content from a Docbase

This section provides step-by-step instructions for Extracting content from a Docbase using Buldoser. If this is the first time moving content using Buldoser, see the section, Docbase to Docbase Overview for an important description of how Buldoser moves content objects.

Starting an Extract

This section describes starting a new Extract. Extracts may be stopped and completed at a later time from the stopping point. To complete a stopped Extract, see the section, Finishing a Stopped Extract. To start Extracting content from a Docbase, follow these steps:

1. Create a location on the file system to contain the Extracted content and attributes. Make sure the location has enough space.

2. Select New Extract from the Docbase Extract menu. The Extract dialog will appear.


3. Enter the location into the Extract Directory text box. Select the […] button to browse for the location.

4. Create a DQL statement to identify the objects to Extract. The DQL must be of the form dm_document where…

Note that the DQL statement may not contain the (all) keyword. Buldoser can also automatically create a DQL statement based on a cabinet, folder, or document that is selected from the Docbase Browser. If documents are selected, the DQL statement will “collect” object IDs as the Docbase is browsed. To close the Extract dialog but save the values in the DQL, click Save Settings.

5. Select Threads to indicate the number of threads that will be used for this Extract operation.


6. Select Extract Content to Extract all renditions along with the attributes.

a. Select Local Extract to indicate that the metadata and any content and renditions will be stored locally in the Extract Directory indicated in Step 3.

b. Select Links from File Store to designate that content and renditions will not be Extracted; instead, a link relative to the content storage directory of the filestore will be saved in the XML metadata. This is referred to as Contentless Migration and can reduce the time of the overall operation if the renditions are large.

c. Choose the Metadata only option if contentless objects are desired. No content will be Extracted as a result of this option.

7. Select Extract Renditions to indicate whether the content has more than just page=0 renditions.

a. Select Page 0 Renditions if all of the objects in the batch have associated dmr_content objects with page=0.

b. Select All Renditions if any renditions in the batch have dmr_content objects associated with page=1. This will also preserve content metadata. Note that this imposes an additional load on the Docbase during both the Extract and load operations. It is recommended to verify that there are dmr_content objects with page=1 in the batch prior to selecting this option. For more information on content metadata, refer to the Content Server Fundamentals Guide.

8. Select Extract Relationships to Extract any dm_relation objects that refer to the objects being Extracted as the parent in the relationship.

9. Select Extract Lifecycle Setting to Extract what Lifecycle is attached to the objects. Note: This does not Extract the Lifecycle itself. Lifecycle movement should be performed using Documentum’s Application Builder.

10. Select Extract Virtual Docs to Extract virtual document relationships that refer to the objects being extracted as the parent in the relationship.

11. Select Extract All Versions to extract the entire version tree for each object. If this option is not selected, only the current version will be Extracted.

12. Select Extract Audit Trail to Extract any dm_audittrail objects associated with the content within this batch. Note that moved audit trail entries will be associated with the same user as in the existing system.

13. Click Save Settings to save the settings but not perform the Extract immediately.

14. Click Extract Dependencies to create a dependency mapping file only.

15. Click Cancel to close the dialog.

16. Click the Extract button to start Extracting immediately. First the Folders, Groups and ACLs for the batch will be Extracted. Upon completion the following status dialog will appear.


The dialog is explained below:

Item: Description

Title Bar: Indicates the operation being performed.

Location: Current batch location.

Total Objects: Total number of objects anticipated to be processed by this operation. Note that this number counts each individual object across a version tree. Additionally, the progress bar is only updated after an entire version tree has been processed.

Total Threads: Number of threads being used. Each thread has a Docbase session and a separate pool of resources used for processing objects.

Max Objects/Thread: Indicates the number of objects that can be allocated to a single thread for processing at a given time.

Progress: Number of objects that remain to be processed for the entire operation.

Messages: Status messages for the operation.

Total Throughput: Number of objects per second being processed collectively by all of the worker threads.

View Stats for Thread Number: Using the “<” and “>” buttons, each individual thread's statistics can be analyzed.

1. Number Waiting – Number of objects that are waiting to be processed by this thread. Note that for a multiple version load, one object here accounts for an entire version tree.

2. Number Processed – Number of objects processed by this thread. This includes multiple versions of objects.

3. Number Failed – Number of objects that failed to be processed.

4. Average Processing Time – Average processing time for this thread.

View Log: Opens the log file associated with the currently selected thread.

Stop: Stops the dealing of objects to each of the threads. Note that this will not stop the operation immediately; each thread still needs to process its queue of objects. The operation is in a completed state only when the “Stopped” message appears.

Close: Closes the HUD dialog.

Update Stats: Forces the recalculation of the statistics for the currently displayed thread.
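The relationship between the per-thread statistics and the Total Throughput figure can be illustrated with a short sketch. This is not Buldoser code: the ThreadStats class, its field names, and the sample numbers are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ThreadStats:
    """Hypothetical per-thread counters mirroring the dialog fields."""
    waiting: int    # Number Waiting
    processed: int  # Number Processed (counts every version)
    failed: int     # Number Failed


def total_throughput(threads, elapsed_seconds):
    """Objects per second processed collectively by all worker threads."""
    if elapsed_seconds <= 0:
        return 0.0
    return sum(t.processed for t in threads) / elapsed_seconds


def remaining(threads):
    """Objects still queued across all threads."""
    return sum(t.waiting for t in threads)


# Made-up sample values for two worker threads after 60 seconds.
stats = [
    ThreadStats(waiting=40, processed=120, failed=2),
    ThreadStats(waiting=35, processed=180, failed=0),
]
print(total_throughput(stats, 60.0))  # 5.0
print(remaining(stats))               # 75
```

Because a thread may still hold queued objects after Stop is clicked, a non-zero remaining count is expected until the “Stopped” message appears.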


Finishing a Stopped Extract

This section describes finishing a stopped Extract. Extracts may be stopped and completed at a later time from the stopping point. This operation can only be performed after the Folders, ACLs, and Groups have been Extracted from the docbase. To start a new Extract, see the section, Starting an Extract.

To finish extracting content from a Docbase, follow these steps:

1. Identify the location on the file system that contains the previously Extracted content and attributes.

2. Select Finish Extract from the Docbase Extract menu. The Finish Extract dialog will appear.

3. Enter the location of the Extract into the Extract Directory text box. The location may be browsed by clicking the […] button.

4. To close the dialog and save the entries, click Save Settings.

5. To close the dialog, click Cancel.

6. To resume the Extract, click Finish Extract.


Mapping Data Values

This section provides step-by-step instructions for mapping data values from a source Docbase to a target Docbase. If this is the first time moving content using Buldoser, see the section, Docbase to Docbase Overview for an important description of the theory behind mapping data values.

When Buldoser executes an Extract, a dependency mapping (“buldoser.dep”) file is generated in the Extract location. This file identifies all the supporting objects such as ACLs, Object Types, and Folders that must exist in the target Docbase for a load operation to be successful. Mapping Data Values is the action of mapping these supporting objects to values in the target Docbase. Buldoser provides an interface for mapping these values while connected to the target Docbase.

To Transform from the source to the target Docbase, follow these steps:

1. Login or switch to the target Docbase. For more information on connecting to a Docbase with Buldoser or switching Docbases, see the section, Getting Started.

2. Select Transform from the Docbase Load menu. The Transform Dialog will appear.


3. Select Open from the File menu. A file browser will appear. Navigate to the Extract directory, select the dependency mapping file, and click Open.

If an Extract was just performed, Buldoser will present the option to open the dependency mapping file from the previous Extract.

4. After the file is opened, the Transform dialog will evaluate each dependency and identify whether the supporting object exists in the target Docbase. For non-existent mappings, the value will display a red exclamation point, and the dependency type will display two exclamation points to make it easy to identify any potential errors during the load. Initially, the Transform dialog will show Object Type mappings.


5. To resolve a mapping for a missing dependency or to map a value from the source Docbase to a different value in the target Docbase, select the row or rows to be mapped.

6. Select a value from the Target Options drop-down.

7. Click the Map button to map the selected rows to the selected target option.

8. Click the Map All button to map all rows to the selected target option.

9. Click the Clear button to remove mappings from selected rows.

10. Click the Clear All button to remove mappings from all rows.

11. To change to another Dependency Type, select the type from the drop-down. The screen will update to show dependencies of the selected type.

12. After all mappings are completed, the changes can be saved by selecting Save from the File menu. If changes are not saved and the dialog is closed, Buldoser will warn the user before closing the file.


Mapping Attributes

Each object type included in the batch can have the value of the target attribute mapped from a different source prior to loading. This feature is especially useful when consolidating multiple custom types into a single custom type. To map attributes, first map the object types to their desired target values. For more information, see the steps outlined in the section Transform. Once the object types are mapped, click the button labeled “Map Attributes.”

In Buldoser version 3.3 and above, dm_owner and acl_domains are mapped using the dm_dbo alias if the owner_name attribute is the docbase owner. This makes the dependency file more portable across docbases by saving the step of mapping to the docbase owner.

Creating ACLs and Accessing Groups

It is important to know the basic theory of ACLs in order to understand how they are moved using Buldoser. An ACL can grant permission to either:

• Groups (dm_group objects)

• Users (dm_user objects)

• Aliases (dm_alias_set objects)

It is important to know that because Buldoser does not import users, only the accessor groups are recreated. The following should be taken into consideration when creating ACLs:

• Upon creating ACLs, only dm_group accessors are granted permission to the ACL. Any User accessors extracted from a repository will not be recreated.

• Upon recreating a group any sub groups in the source repository will also be recreated in the target.

• If the User or Alias of the ACL does not exist, the dbo is used as the value for the owner_name (i.e., the acl_domain).

To create an ACL open the Transform dialog and select the “ACL” dependency type. Refer to the Transform section for more information. At this point any number of ACLs can be selected from the table. Upon clicking the “Create Selected ACLs” button these ACLs will be created in the target docbase.
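The accessor rules listed above can be summarized in a small sketch. It is only an illustration of the stated rules, not Buldoser's implementation: the dictionary layout, the plan_acl function, and the sample names are all hypothetical.

```python
def plan_acl(acl, existing_users, dbo):
    """Return the ACL as it would be recreated under the rules above.

    acl is a dict with 'owner_name' and 'accessors', where accessors is a
    list of (name, kind, permit) tuples and kind is 'group', 'user' or
    'alias'. This structure is hypothetical, not Buldoser's file format.
    """
    # Only group accessors are recreated; user accessors are dropped.
    kept = [a for a in acl["accessors"] if a[1] == "group"]
    # Fall back to the docbase owner when the ACL owner does not exist.
    owner = acl["owner_name"] if acl["owner_name"] in existing_users else dbo
    return {"owner_name": owner, "accessors": kept}


# Made-up source ACL: one group accessor and one user accessor.
source_acl = {
    "owner_name": "jsmith",
    "accessors": [("editors", "group", 6), ("jsmith", "user", 7)],
}
plan = plan_acl(source_acl, existing_users={"dmadmin"}, dbo="dmadmin")
print(plan)
```

In this example the user accessor is dropped and the owner falls back to the docbase owner, because jsmith does not exist in the target.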


Creating Folders

The “Transform” screen is also used to recreate an existing folder structure in the target repository. It is important to know that this function will recreate the entire security structure of a repository. Specifically, ACLs to which a folder refers may be created if they do not already exist in the repository. Additionally, any groups those ACLs reference will be created if they do not already exist in the docbase. The ACLs of this folder structure should be mapped as desired prior to performing this operation. For more information on how to map ACLs, refer to the section Transform.

Upon an Extract operation, all of the folder objects for a given batch are Extracted to the specified load directory. If a custom folder type is used (i.e., dm_folder is the supertype), the custom metadata associated with that folder will additionally be Extracted.

Some things to keep in mind while Extracting folders:

• Map any desired folder types in the Transform window prior to creating folders.

• Map any desired folder object type and attribute mapping prior to creating folders.

• Map any desired folder ACL mapping prior to creating folders. It is important to know that any ACL mapping will supersede any existing ACL value of a folder from the source repository.

To create a Folder, open the Transform dialog and select the “Folder” dependency type. Refer to the Transform section for more information. At this point any number of Folders can be selected from the table. Upon clicking the “Create Selected Folders” button, these folders will be created in the target docbase.


Select the desired folder(s) and click the “Create Selected Folders” button. Note: If a large number of folders were selected it may take some time to process. All of the resultant created folders for this batch will be displayed in the table.


Preview Folder Creation

An additional option exists to see which Folders, ACLs, and groups will be created prior to creating them. To preview folder creation, first open the Transform dialog and select the “Folder” dependency type. Refer to the Transform section for more information. At this point Folders can be selected from the table. Upon clicking the “Show Folder Dependencies” button, all of the folder dependencies will be displayed.


Loading Content into a Docbase

This section provides step-by-step instructions for loading content that was Extracted from a source Docbase into a target Docbase using Buldoser. Buldoser also provides the capability to load content from a database into Documentum. See the section, Database to Docbase Overview for more information on the feature. If this is the first time moving content using Buldoser, see the section, Docbase to Docbase Overview for an important description of how Buldoser moves content objects.

Starting a New Load

This section describes starting a new load. Loads may be stopped and completed at a later time from the stopping point. To complete a stopped load, see the section, Finishing a Stopped Load.

The user who logs into the target Docbase must be a super user. It is recommended that the user is the Docbase Owner (dbo). Before loading content, the Transform step should be performed. For more information on this step, see the section, Mapping Data Values.

To load content into a target Docbase, follow these steps:

1. Select New Load from the Docbase Load menu. The load dialog will appear:


2. Enter the location of the Extract into the Load Directory text box, or browse using the […] button. If a contentless Extract was specified, the option to specify a Base Path will be presented. First map a network drive to the existing filestore of the source docbase. An example of this location is: C:\Documentum\data\sbdev\content_storage_01\00002417. Note that in this case the source docbase is sbdev and the docbase ID is 2417 (hex). Using this functionality will save time only if the renditions are large.

3. Optionally, a Processing Class can be specified. Implementing the interface IBuldoserTransform gives the full power of Java and the DFC classes to implement any custom functionality that might be required.

4. Select the Synchronization Setting to tell Buldoser how to react if the CURRENT version of an object being loaded matches a CURRENT object that already exists in the target system.


• First, the r_object_id is compared against the r_object_id of the object in the batch. This case would only occur if the extract and load occurred within the same docbase.

• Second, the buldoser_audit_trail object is checked to determine if the source r_object_id of this object has been loaded. If so, this object is determined to be loaded.

• Lastly, the object_name and folder_path attributes are used to determine if this object has already been loaded.

The Synchronization Setting is particularly useful for running scheduled jobs against a production docbase. On a nightly basis this job can be used to synchronize good content to a development system.

a. Select ALWAYS Create Version Tree to create objects regardless of existing objects (default). No check for an existing object is made.

b. Select CREATE If Previous Object Does Not Exist to only load this version tree if the CURRENT version of the batch is not found.

c. Select REPLACE If Previous Object Exists to delete the existing version tree if the CURRENT version of the object is found.

d. Select CURRENT versions only. If Previous Object Exists Append Version Tree to append the existing version tree with only the CURRENT version of the object. In other cases only the CURRENT version of the object will get loaded.

5. Select Lifecycle Promotion setting to indicate how many lifecycle promotions need to occur after the lifecycle is attached to this object.

a. Select No Lifecycle Promotion to only attach the lifecycle and do no promotion.

b. Select Promote Desired Number of Cycles to promote all content loaded with lifecycles a set number of times.

i. Select the number of times to promote any objects that are loaded with Lifecycles in the Promote Cycles drop-down.

c. Select Promote Content to Previous State to run the promote command on the content the same number of times as the value of the r_current_state attribute. Note: Not all lifecycles have an r_current_state that starts at zero. Check the i_state_no and the state_name attributes of the dm_policy objects in the source and target docbase prior to running this operation.

d. Select Set Previous Lifecycle State to set the r_current_state attribute of the content after attaching the lifecycle. This is useful in Web Content Management Systems where the promotion through lifecycle states takes a great deal of time. In this case, simply run the Site Publishing Job after the load to take full advantage of this feature.

6. Select the number of threads to use for the load from the Threads drop-down. Selecting multiple threads helps the content to load faster, but the benefit is dependent on the hardware. Each thread will create an additional Docbase session, which will impose an additional load on the Content Server. It is best to start with one or two threads for initial loads, then increase the number of threads for subsequent loads after monitoring network, memory, and processor resources. Additionally, the default number of sessions a client can obtain is 10. To modify this setting, set the MAX_SESSION_COUNT attribute in the dmcl.ini file of the client running Buldoser.

7. Check the Verbose box to create a very detailed log file for the load. This option can negatively impact performance for very large loads. If this option is not selected, only errors will be written to the log file.

8. Check the Trace box to have Buldoser write a DMCL-level trace for every 1000th object that is loaded. This option may be used for diagnostic purposes.

9. Check the Chart Performance box to create a tab-delimited data file containing relevant performance information on the load in milliseconds. A file is created for each thread and is named “chart<thread number>.txt.” By opening a chart file using Microsoft Excel, a line graph may be created to display performance trends over the life of the load. This option is typically used during a small trial load for diagnostic purposes. For large loads this option may degrade performance.

10. Check the Auto Create Formats box to automatically create formats if they don’t exist in the destination Docbase. Buldoser will create a format with the correct name only – an administrator must fill in the remaining attributes once the load is complete. It is highly recommended to move formats with a Documentum-provided tool or script outside of Buldoser before performing ETL operations. This option is provided as a fallback mechanism only.

11. Check the Auto Set Owner box to automatically set the owner_name attribute of each object to the Docbase owner of the target Docbase. Having the Docbase owner own content is a standard convention. This allows administrators to quickly map this attribute without using the Transform dialog.
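The Synchronization Setting described in step 4 can be read as a small decision table. The sketch below only restates that description; it is not Buldoser code. The function names, the shortened mode labels, and the sets used to stand in for the target Docbase are hypothetical, and the behavior of REPLACE when no previous object exists is an assumption, since the text does not state it.

```python
def already_loaded(obj, target_ids, audited_source_ids, target_names):
    """Apply the three checks from step 4, in the order given."""
    if obj["r_object_id"] in target_ids:          # same-docbase case
        return True
    if obj["r_object_id"] in audited_source_ids:  # buldoser_audit_trail check
        return True
    # Last resort: match on object_name and folder_path.
    return (obj["object_name"], obj["folder_path"]) in target_names


def action(mode, exists):
    """Map a setting (labels shortened here) to what happens on load."""
    if mode == "ALWAYS":
        return "create version tree"
    if mode == "CREATE":
        return "skip" if exists else "create version tree"
    if mode == "REPLACE":
        # Assumption: the tree is simply loaded when nothing exists yet.
        if exists:
            return "delete existing tree, then create version tree"
        return "create version tree"
    if mode == "CURRENT":
        return "append CURRENT version" if exists else "load CURRENT version only"
    raise ValueError(f"unknown mode: {mode}")


# Made-up object and target state for demonstration.
doc = {"r_object_id": "0900241780000101",
       "object_name": "report.doc", "folder_path": "/Demo"}
found = already_loaded(doc, set(), set(), {("report.doc", "/Demo")})
print(found, action("CREATE", found))
```

Here the object is matched only by the third check (name and folder path), so the CREATE setting skips it.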

Click Save Settings to save the settings but not perform the load immediately. Click Cancel to close the dialog. Click the Load button to start loading immediately. An instance of the following status dialog will appear.


The above dialog is explained below:

Item: Description

Title Bar: Indicates the operation being performed.

Location: Current batch location.

Total Objects: Total number of objects anticipated to be processed by this operation. Note that this number counts each individual object across a version tree. Additionally, the progress bar is only updated after an entire version tree has been processed.

Total Threads: Number of threads being used. Each thread has a Docbase session and a separate pool of resources used for processing objects.

Max Objects/Thread: Indicates the number of objects that can be allocated to a single thread for processing at a given time.

Progress: Number of objects that remain to be processed for the entire operation.

Messages: Status messages for the operation.

Total Throughput: Number of objects per second being processed collectively by all of the worker threads.

View Stats for Thread Number: Using the “<” and “>” buttons, each individual thread's statistics can be analyzed.

1. Number Waiting – Number of objects that are waiting to be processed by this thread. Note that for a multiple version load, one object here accounts for an entire version tree.

2. Number Processed – Number of objects processed by this thread. This includes multiple versions of objects.

3. Number Failed – Number of objects that failed to be processed.

4. Average Processing Time – Average processing time for this thread.

View Log: Opens the log file associated with the currently selected thread.

Stop: Stops the dealing of objects to each of the threads. Note that this will not stop the operation immediately; each thread still needs to process its queue of objects. The operation is in a completed state only when the “Stopped” message appears.

Close: Closes the HUD dialog.

Update Stats: Forces the recalculation of the statistics for the currently displayed thread.


Finishing a Stopped Load

This section describes finishing a stopped load. Loads may be stopped and completed at a later time from the stopping point. To start a new load, see the section, Starting a New Load.

The same number of threads is launched in the finishing load as in the original load. Each thread will have its own stopping point and will continue from there.

To finish loading content into a Docbase, follow these steps:

1. Identify the location on the file system that contains the previously loaded content and attributes.

2. Select Finish Stopped Load from the Docbase Load menu. The Finish Load from Docbase Extract dialog will appear.

3. Enter the location of the load into the Load Directory text box. The location may be browsed by clicking the […] button.

4. To close the dialog and save the entries, click Save Settings.

5. To close the dialog, click Cancel.

6. To resume the load, click Finish Load.

7. To modify the values for promote cycles, verbose logging, tracing, auto ACL creation, auto Format creation, auto Folder creation, or auto set owner, click the Modify Load Settings button. The dialog will change to display the above options.


Reprocessing Load Errors

This section describes reprocessing the errors from a completed load. Reprocessing can only occur after the initial load is completed. To start a new load or finish a stopped load, see their respective sections, Starting a New Load and Finishing a Stopped Load.

When Buldoser executes a load, it will write any failed objects to an error file. Reprocessing uses this error file as its input to retry these failed objects. As these retries succeed, the objects are removed from the error log, gradually reducing the size of the error file. After all objects are loaded, the error file will eventually be empty. The same number of threads is executed when reprocessing errors as in the original load. Each thread has its own error file named “load_errors_T<thread number>.txt.”

To reprocess load errors, follow these steps:

1. Identify the location on the file system that contains the previously loaded content and attributes.

2. Select Reprocess Load Errors from the Docbase Load menu. The Reprocess Load Errors from Docbase Extract dialog will appear.

3. Enter the location of the load into the Load Directory text box. The location may be browsed by clicking the […] button.

4. To close the dialog and save the entries, click Save Settings.

5. To close the dialog, click Cancel.

6. To resume the load, click Reprocess.


7. To modify the values for promote cycles, verbose logging, tracing, auto ACL creation, auto Format creation, auto Folder creation, or auto set owner, click the Modify Load Settings button. The dialog will change to display the above options.
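The way the error file shrinks as retries succeed can be sketched as follows. This is an illustration only, not Buldoser code. The one-entry-per-line file layout, the reprocess function, the load_one callback, and the sample object IDs are all assumptions; only the file name pattern comes from the text above.

```python
import os
import tempfile


def error_file_name(thread_number):
    """File name pattern given in the text: load_errors_T<thread number>.txt"""
    return f"load_errors_T{thread_number}.txt"


def reprocess(path, load_one):
    """Retry each entry and rewrite the file with the entries that still fail.

    Assumes one entry per line (hypothetical layout). load_one is a
    caller-supplied function returning True when the retry succeeds.
    """
    with open(path, encoding="utf-8") as fh:
        entries = [line.strip() for line in fh if line.strip()]
    still_failing = [e for e in entries if not load_one(e)]
    with open(path, "w", encoding="utf-8") as fh:
        fh.writelines(e + "\n" for e in still_failing)
    return len(entries) - len(still_failing)


# Demonstration with a throwaway directory and a fake loader.
with tempfile.TemporaryDirectory() as batch_dir:
    path = os.path.join(batch_dir, error_file_name(0))
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("0900241780000101\n0900241780000102\n")  # made-up IDs
    fixed = reprocess(path, lambda object_id: object_id.endswith("1"))
    with open(path, encoding="utf-8") as fh:
        left = fh.read().split()
print(fixed, left)
```

Running the same function repeatedly until it leaves an empty file corresponds to the state described above, in which all objects have been loaded.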

Starting a Relationship Load

This section describes starting a new relationship load. Relationship loads are the second phase of the load process and load any dm_relations and virtual document links that may exist. If the first phase completes successfully on the first try, the relationship phase will begin automatically. If the initial phase is stopped or has errors, the relationship phase must be started manually. Relationship loads may also be stopped and completed at a later time from the stopping point. To complete a stopped relationship load, see the section, Finishing a Stopped Relationship Load.

To load relationships into a target Docbase, follow these steps:

1. Select Create Relationships from the Docbase Load menu. The Create Relationships from Docbase Extract dialog will appear:

2. Enter the location of the load into the Load Directory text box. The location may be browsed by clicking the […] button.

3. To close the dialog and save the entries, click Save Settings.

4. To close the dialog, click Cancel.

5. To start the relationship load, click Create Relationships. Relationship loads are single-threaded only, so only one instance of the following status dialog will appear.


The dialog is explained below:

Item: Description

Title Bar: Indicates the percent completion and thread number (which is always zero for Relationship loads).

Operation Type: Indicates the operation type. For a Relationship Load this is always BuldoserRelationshipLoad.

Load Location: Indicates the Batch location where this operation is being performed.

Thread Number: Indicates the thread number (always zero for relationship loads).

Progress: Indicates the number completed.

Messages: Describes the current operation.

Successful Relationships: Indicates the number of successful relationships.

Failed Relationships: Indicates the number of failed relationships.

Begin Time: Timestamp for when the load began.

Average Load Time: Indicates the running average time per relationship in milliseconds.

Projected Completion Time: Indicates when the relationship load should complete, calculated by extrapolation.

View Log: Opens the log file.

Stop: Stops the load. To restart the load, see the section, Finishing a Stopped Relationship Load.

Close: Only available after the load is complete or has been stopped.
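One plausible way to derive a projected completion time by extrapolation from the begin time and the running average is sketched below. The exact formula Buldoser uses is not documented here, so the function, its parameters, and the sample values are assumptions for illustration.

```python
from datetime import datetime, timedelta


def projected_completion(begin, completed, total, now):
    """Extrapolate the finish time from the running average per relationship."""
    if completed <= 0:
        return None  # nothing to extrapolate from yet
    average = (now - begin) / completed  # average time per relationship
    return now + average * (total - completed)


# Made-up figures: 100 of 400 relationships done after 50 seconds.
begin = datetime(2005, 4, 21, 4, 0, 0)
now = begin + timedelta(seconds=50)
eta = projected_completion(begin, completed=100, total=400, now=now)
print(eta)  # 2005-04-21 04:03:20
```

With an average of half a second per relationship, the remaining 300 relationships need another 150 seconds, which gives the projected time shown.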

Finishing a Stopped Relationship Load

This section describes finishing a stopped relationship load. Relationship loads may be stopped and completed at a later time from the stopping point. To start a new relationship load, see the section, Starting a Relationship Load.

To finish loading relationships into a Docbase, follow these steps:

1. Identify the location on the file system that contains the previously loaded relationships.

2. Select Finish Stopped Relationship Load from the Docbase Load menu. The Finish Relationships from Docbase Extract dialog will appear.

3. Enter the location of the load into the Load Directory text box. The location may be browsed by clicking the […] button.

4. To close the dialog and save the entries, click Save Settings.

5. To close the dialog, click Cancel.

6. To resume the load, click Finish Relationships.


Reprocessing Relationship Load Errors

This section describes reprocessing the errors from a completed relationship load. Reprocessing can only occur after the relationship load is completed. To start a new relationship load or finish a stopped relationship load, see their respective sections, Starting a Relationship Load and Finishing a Stopped Relationship Load.

When Buldoser executes a relationship load, it will write any failed relationship to an error file. Reprocessing uses this error file as its input to retry these failed objects. As these retries succeed, the relationships are removed from the error log, gradually reducing the size of the error file. After all relationships are loaded, the error file will eventually be empty. The error file is named “relationship_errors.txt.”

To reprocess relationship errors, follow these steps:

1. Identify the location on the file system that contains the previously loaded relationships.

2. Select Reprocess Relationship Errors from the Docbase Load menu. The Reprocess Relationship Errors from Docbase Extract dialog will appear.

3. Enter the location of the load into the Load Directory text box. The location may be browsed by clicking the […] button.

4. To close the dialog and save the entries, click Save Settings.

5. To close the dialog, click Cancel.

6. To resume the load, click Reprocess Relationship Errors.

Undoing a Load

Undo will completely eradicate the previous objects, relationships and the audit records that track the loading process. Undo will additionally remove any automatically created Folders, ACLs, and Groups. Undo will not remove any automatically created formats.


To undo a load, follow these steps:

1. Identify the location on the file system that contained the loaded objects. Buldoser uses this location in its audit trail to identify objects loaded from the batch.

2. Select Undo Load from the Docbase Load menu. The Undo dialog will appear.

3. Enter the location of the load into the Load Directory text box. The location may be browsed by clicking the […] button.

4. To close the dialog and save the entries, click Save Settings.

5. To close the dialog, click Cancel.

6. To undo the load, click Undo. An instance of the following status dialog will appear. Undo operations run single-threaded only.

The dialog is explained below:


Item: Description

Title Bar: Indicates the percent completion and thread number (which is always zero for Undo).

Operation Type: Indicates that this is a BuldoserUndo.

Load Location: Indicates the Batch location where this operation is being performed.

Thread Number: Indicates the thread number (always zero for Undo).

Progress: Indicates the number completed.

Messages: Describes the current operation.

Stop: Stops the undo.

Close: Only available after the undo is complete or has been stopped.


Scheduling an ETL Operation This section provides an overview of Buldoser’s scheduling capabilities as well as step-by-step instructions for scheduling an Extract or Load between Docbases. Buldoser’s scheduler functionality is based on Docbase to Docbase operations; users should be familiar with its capabilities before working with scheduler. For more information on Docbase to Docbase operations, see the section, Docbase to Docbase Overview.

Scheduling Overview

Buldoser Scheduler provides the capability to schedule Docbase Extracts and Docbase Loads. Scheduler has many applications, including:

Scheduling Extracts or Loads to be performed during off hours when users are not accessing the source or target Docbase;

Regularly monitoring a location for content that is generated by another system for import into a Docbase;

Creating a regular backup of selected files while the Docbase is running; and,

Replicating content from a source to a target Docbase.

Scheduler is implemented as a Documentum job, and requires that Buldoser be installed on the Content Server. Scheduler uses the same installation procedure as normal Buldoser operation. For more information on installation, see the Buldoser Installation Guide and Release Notes. The operation of Buldoser Extract and Load Jobs is described in further detail in the following paragraphs.

Extract Jobs

When Extract Jobs are scheduled, they are configured exactly the same as a normal Buldoser Docbase Extract, with the addition of a job name, run frequency, and run mode (e.g., minutes, hours, or days). Normally Extracts will run the DQL statement and Extract the objects directly to the extract location. When Documentum runs an Extract Job, Buldoser will modify the Extract by:

Creating a sub folder underneath the Extract location for the Extract. The name of the folder will follow the convention Backup yyyy-mm-dd hh-mm-ss. An example of a sub folder is, “Backup 2005-04-21 04-00-00.” Objects will be Extracted to this location instead of the configured location.

Modifying the DQL statement to only Extract objects that have changed since the last execution of the job. An example of a modified DQL statement is, "select distinct r_object_id from dm_document where folder('/Demo',descend) and r_modify_date>Date('04/21/2005')".

Creating an audit trail entry that records the date and time of the execution. This audit entry is used for modifying the DQL above.

These changes allow Buldoser to run incremental Extracts of objects that are identified by the DQL statement for the job.
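The two modifications above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not Buldoser's implementation: the function names are hypothetical, the sub folder name follows the stated Backup yyyy-mm-dd hh-mm-ss convention, and the sketch assumes the configured DQL already contains a where clause, as in the guide's example.

```python
from datetime import datetime
from typing import Optional


def batch_folder_name(run_time: datetime) -> str:
    """Name the per-run sub folder using the 'Backup yyyy-mm-dd hh-mm-ss' convention."""
    return run_time.strftime("Backup %Y-%m-%d %H-%M-%S")


def incremental_dql(base_dql: str, last_run: Optional[datetime]) -> str:
    """Restrict the job's DQL to objects modified since the last recorded run."""
    if last_run is None:
        # Assumption: no audit entry means a first run, so everything is extracted.
        return base_dql
    since = last_run.strftime("%m/%d/%Y")
    # Assumption: base_dql already has a where clause, so the filter is appended with "and".
    return base_dql + " and r_modify_date>Date('" + since + "')"


base = "select distinct r_object_id from dm_document where folder('/Demo',descend)"
folder = batch_folder_name(datetime(2005, 4, 21, 4, 0, 0))
dql = incremental_dql(base, datetime(2005, 4, 21))
```

Here the last run date would come from the audit trail entry that the job records on each execution.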

Load Jobs

When Load Jobs are scheduled, they are configured exactly the same as a normal Buldoser Docbase Load, with the addition of a job name, run frequency, and run mode (e.g., minutes, hours, or days). Normally Loads will load objects that are found in the configured load location. When Documentum runs a Load Job, Buldoser will modify the load by:

Looking for any new batches that have been created in the job’s load location since the last time the job ran. Batches are sub folders with the naming convention, Backup yyyy-mm-dd hh-mm-ss. An example of a sub folder is, “Backup 2005-04-21 04-00-00.”

Copying a master mapping dependency file to the sub folder if it exists.

Creating an audit trail entry that records the name of the batch that was executed.

These changes allow Buldoser to poll a location for generated or Extracted content, as well as to keep from duplicating loaded batches.
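The polling behavior can be pictured with a small Python sketch. It assumes, hypothetically, that the names of already-loaded batches are available from the audit trail as a plain list; the function name and the regular expression are illustrative and are not part of Buldoser.

```python
import re

# Matches sub folders named "Backup yyyy-mm-dd hh-mm-ss".
BATCH_NAME = re.compile(r"^Backup \d{4}-\d{2}-\d{2} \d{2}-\d{2}-\d{2}$")


def pending_batches(sub_folders, loaded_batches):
    """Return batch folders that have not been loaded yet, oldest first."""
    loaded = set(loaded_batches)
    # Zero-padded timestamps sort chronologically when sorted as text.
    return sorted(
        name for name in sub_folders
        if BATCH_NAME.match(name) and name not in loaded
    )


found = ["Backup 2005-04-22 04-00-00", "notes", "Backup 2005-04-21 04-00-00"]
todo = pending_batches(found, ["Backup 2005-04-21 04-00-00"])
```

Skipping names that are already recorded is what keeps a batch from being loaded twice.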

Scheduling an Extract Job

This section describes creating a new Extract job. If this is the first time creating an Extract job, refer to the section, Scheduling Overview for general information on Scheduler applications and capabilities. To create a new Extract Job, follow these steps:

1. Create a location on the Content Server’s file system or a file system that is accessible from the Content Server to contain the Extracted content and attributes. Make sure the location has enough space.

2. Select Extract Job from the Schedule menu. The Create Extract Job dialog will appear:

3. For Extract Directory through Extract Audit Trail, configure the dialog in the same way as a normal Docbase Extract. For more information on how to configure a Docbase Extract, see the section, Starting an Extract.

4. Enter a name for the job in the Job Name text box.

5. Enter how often the job should run in the Run Frequency text box. The value should be an integer.

6. Select the units for the run frequency in the Run Mode drop-down.

7. Click the Cancel button to close the dialog.

8. Click the Create Extract Job button to create the job in the Docbase. The job will be grouped under the Buldoser category.

Scheduling a Load Job

This section describes creating a new load job. If this is the first time creating a load job, refer to the section, Scheduling Overview for general information on Scheduler applications and capabilities. To create a new Load Job, follow these steps:

1. Identify the location on the Content Server’s file system or a file system that is accessible from the Content Server that contains the generated or Extracted content and attributes. The location should contain a sub folder with a name that follows the convention, Backup yyyy-mm-dd hh-mm-ss.

2. Select Load Job from the Schedule menu. The Create Load Job dialog will appear:

3. For Load Directory through Auto Set Owner, configure the dialog in the same way as a normal Docbase Load. For more information on how to configure a Docbase Load, see the section, Starting a New Load.

4. Enter a name for the job in the Job Name text box.

5. Enter how often the job should run in the Run Frequency text box. The value should be an integer.

6. Select the units for the run frequency in the Run Mode drop-down.

7. Click the Cancel button to close the dialog.

8. Click the Create Load Job button to create the job in the Docbase. The job will be grouped under the Buldoser category.

Editing an Existing Extract or Load Job

Existing Extract and Load Jobs should be configured only with the Buldoser client. Using Documentum Administrator can cause unpredictable results. To edit an existing job, follow these steps:

1. Select Manage Jobs from the Schedule menu. The following dialog will appear.

2. Select the job to be edited from the list.

3. Click the Edit button.

4. Follow the same instructions as creating an Extract or Load Job within the sections Scheduling an Extract Job and Scheduling a Load Job, respectively.

Deleting an Extract or Load Job

Jobs may be deleted through Documentum Administrator, or by using the Buldoser client. To delete an existing job, follow these steps:

1. Select Manage Jobs from the Schedule menu. The following dialog will appear.

2. Select the job to be deleted from the list.

3. Click the Delete button.

Database to Docbase Overview

This section gives an overview of the features and high-level steps to perform a load from a Database or 3rd Party System to a Documentum Docbase using Buldoser.

Connecting to a Data Source

Buldoser uses Java Database Connectivity (JDBC) to connect to a database to retrieve the metadata, security, logical folder location, and content file locations for the objects to be loaded into a Documentum Docbase. JDBC provides the capability to connect to virtually any RDBMS or ODBC-supported data source, including:

Microsoft Excel spreadsheets, Text files (.csv, .txt), Sybase, Microsoft SQL Server, Oracle, and DB2.

JDBC uses drivers to connect to the above data sources. For products like Sybase, SQL Server, Oracle, and DB2, JDBC drivers are usually provided with the software. For file-based data sources, ODBC is usually used to provide a connection. Before starting an ETL operation, make sure to acquire the correct and most current driver and driver documentation for the data source. For ODBC connections, Windows provides drivers out-of-the-box for most data source types, and Java provides a JDBC driver that bridges a connection to these ODBC drivers. See Windows reference documentation for how to create an ODBC data source. After acquiring the appropriate driver, locate information on the fully qualified class name of the driver and the correct format for connection URLs and SQL queries.
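As a rough illustration of the driver class name and connection URL pieces mentioned above, the following Python sketch only assembles the strings. The ODBC URL uses the jdbc:odbc:<name> form that the Database Load Wizard expects; the class names and the Oracle thin URL layout are the commonly documented ones for drivers of that era and should be confirmed against the driver's own documentation. The data source name and host values are hypothetical.

```python
# Commonly documented JDBC driver class names (verify against the driver documentation).
DRIVER_CLASSES = {
    "odbc": "sun.jdbc.odbc.JdbcOdbcDriver",       # JDBC-ODBC bridge bundled with older JDKs
    "oracle": "oracle.jdbc.driver.OracleDriver",  # Oracle thin driver
}


def odbc_url(dsn_name):
    """Build a JDBC-ODBC bridge URL for a named ODBC data source."""
    return "jdbc:odbc:" + dsn_name


def oracle_thin_url(host, port, sid):
    """Build an Oracle thin-driver URL in host:port:SID form."""
    return "jdbc:oracle:thin:@{}:{}:{}".format(host, port, sid)


excel_source = odbc_url("LegacyDocs")                 # hypothetical ODBC data source name
oracle_source = oracle_thin_url("dbhost", 1521, "ORCL")  # hypothetical host and SID
```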

Mapping a Data Source to Documentum’s Data Model

When importing from a data source, a mapping must be created from the non-Documentum data model to Documentum’s data model. This mapping tells Buldoser how to turn the tables and columns from the database into Documentum objects. Buldoser approaches mapping in three phases:

1. Identify source tables, columns, and rows;

2. Map source data to Documentum attributes, folders, content, and security; and,

3. Test the mapping to see if it’s correct.

The following paragraphs describe the process of creating a mapping.

The Object View

After connecting to the data source, the first step in creating a load is to identify the objects to be moved. Buldoser accomplishes this via an SQL statement that queries the primary table(s) containing the objects. This query statement is known as the Object View. The Object View should return a list of objects, one row per object, in versioning order. Versioning order runs from the oldest version to the latest version; usually this is the creation order of the objects. For non-versioned objects, the order is not relevant. Buldoser requires the order to be correct so that version trees are correctly re-created during the load. When creating an SQL statement, be sure to use the format supported by the selected driver. If the data source contains multiple tables, a primary key is required in the Object View. This allows any other tables that are registered with Buldoser to link to the Object View.
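To make the "one row per object, in versioning order" requirement concrete, here is a small self-contained sketch that uses Python's built-in sqlite3 module in place of a real source database. The table and column names are hypothetical, and ordering by creation date is just one way to obtain versioning order.

```python
import sqlite3

# An in-memory database stands in for the legacy data source.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (
        doc_id      INTEGER PRIMARY KEY,  -- primary key of the Object View
        title       TEXT,
        prev_doc_id INTEGER,              -- previous version, NULL for the first version
        created     TEXT
    );
    INSERT INTO documents VALUES (3, 'Spec v3', 2,    '2005-03-01');
    INSERT INTO documents VALUES (1, 'Spec v1', NULL, '2005-01-01');
    INSERT INTO documents VALUES (2, 'Spec v2', 1,    '2005-02-01');
""")

# Candidate Object View: one row per object, oldest version first.
OBJECT_VIEW_SQL = (
    "SELECT doc_id, title, prev_doc_id FROM documents ORDER BY created, doc_id"
)
rows = conn.execute(OBJECT_VIEW_SQL).fetchall()
conn.close()
```

Without the ORDER BY clause, the database is free to return 'Spec v3' before the versions it depends on.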

Supporting Views

Supporting Views are other tables or views within the data source that contain information that is necessary to build Documentum objects. These views are usually related to the Object View in a 1:M relationship. Oftentimes these tables contain all the renditions of the object, repeating attribute values, or folder links. If the data source only has a single table, there will be no supporting views. These views must have a foreign key that establishes a relationship to the Object View. If the foreign key exists within the Object View, a view must be created within the database that joins the two tables together. See the diagram below for an example of a conceptual data model.

In the above scenario, a view could also be created for the Object View that joins data from the Customer table since the relationship is 1:1. Usually understanding which table represents the Object View and which tables represent Supporting Views requires some knowledge of the tables and relationships in the data source. Reference should be made to the design documentation or product literature to determine which tables to use.
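The 1:M link between the Object View and a Supporting View can be sketched as a simple grouping by the linking column. The row layout (dictionaries) and all names below are hypothetical; the sketch only illustrates how several supporting rows, such as renditions, end up attached to a single object.

```python
from collections import defaultdict


def attach_supporting_rows(object_rows, supporting_rows, primary_key, linking_column):
    """Group supporting rows under the Object View row they point to."""
    by_key = defaultdict(list)
    for row in supporting_rows:
        by_key[row[linking_column]].append(row)
    # Objects with no supporting rows get an empty list.
    return {obj[primary_key]: by_key.get(obj[primary_key], []) for obj in object_rows}


objects = [{"doc_id": 1}, {"doc_id": 2}]
renditions = [
    {"doc_id": 1, "path": "a.doc"},
    {"doc_id": 1, "path": "a.pdf"},
]
linked = attach_supporting_rows(objects, renditions, "doc_id", "doc_id")
```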

Object Configurations

Once the source data model is understood and registered with Buldoser, the tables and columns can be mapped to Documentum object attributes, security, folders, and content renditions. These mappings are created by object type in Object Configurations. Buldoser allows for more than one Object Configuration for a particular load; the configuration to apply will be selected on an object-by-object basis by evaluating its “Applies When” criteria. Since configurations may be similar and may take some effort to create, Buldoser allows for the copying of a configuration to speed the mapping process and reduce errors. Object Configurations consist of Attribute Configurations, Content Configuration, Folder Configuration, Security Configuration, Version Configuration, and Pre- and Post-Processing Configuration.

Each topic is described in more detail below.
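Selecting a configuration per object can be pictured as follows. This sketch makes assumptions the guide does not state: that all "Applies When" rows must match, that the first matching configuration wins, and that only the two comparison operators shown exist. The data layout and names are hypothetical.

```python
import operator

# Hypothetical operator set; the real Is drop-down may offer others.
COMPARISONS = {"=": operator.eq, "<>": operator.ne}


def applies(criteria, row):
    """True when every (column, operator, value) criterion matches the row."""
    return all(COMPARISONS[op](row.get(column), value) for column, op, value in criteria)


def select_configuration(configurations, row):
    """Return the first configuration whose criteria match, or None."""
    for config in configurations:
        if applies(config["applies_when"], row):
            return config
    return None


configs = [
    {"name": "Contracts", "applies_when": [("DOC_TYPE", "=", "contract")]},
    {"name": "Default", "applies_when": []},  # no criteria: matches every row
]
chosen = select_configuration(configs, {"DOC_TYPE": "contract"})
fallback = select_configuration(configs, {"DOC_TYPE": "memo"})
```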

Inline Data Transformation

In addition to loading data from a database into Documentum, Buldoser also provides the ability to transform data during the load. For instance, suppose a column contained a status of a particular document in the legacy system. When that column is moved to an attribute, the user wishes to change the value of the status attribute to a new value that reflects a difference in business rules. This example is illustrated in the table below.

Status Column (old value) | Status Attribute (new value)

Work in Process | WIP

Staged | Staging

Approved | Active

Archived | Expired

Buldoser calls this process Inline Data Transformation. It is offered in Attribute Configuration, Content Configuration (for formats), Folder Configuration, and Security Configuration. Inline Data Transformation is also useful for deriving configurations other than attribute data. For instance, suppose a data source doesn’t have the concept of an Access Control List (ACL), but the administrator wishes to determine the ACL for a particular object based on a column called Department. By mapping the Department column to the ACL Name in Security Configuration, the administrator can use the Department to drive the ACL to be used. This example is illustrated in the table below.

Department Column (old value) | ACL (new value)

HR | HR ACL

Legal | Legal ACL

Research and Development | R&D ACL

Sales | Sales ACL
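Both tables amount to a lookup from source values to target values. A minimal Python sketch of that idea is shown below; passing unmapped values through unchanged is an assumption of the sketch, not documented Buldoser behavior.

```python
# Value mappings taken from the two example tables.
STATUS_MAP = {
    "Work in Process": "WIP",
    "Staged": "Staging",
    "Approved": "Active",
    "Archived": "Expired",
}
ACL_MAP = {
    "HR": "HR ACL",
    "Legal": "Legal ACL",
    "Research and Development": "R&D ACL",
    "Sales": "Sales ACL",
}


def transform(value, mapping):
    """Return the mapped value, or the original value when no mapping exists (assumption)."""
    return mapping.get(value, value)


status = transform("Approved", STATUS_MAP)
acl = transform("Research and Development", ACL_MAP)
```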

Attribute Configuration

Attribute Configuration tells Buldoser how to assign values to the attributes of an Object Type for a specific Object Configuration. There are several options depending upon whether the attribute to be assigned is single-valued or repeating. For single-valued attributes, the options are:

No Configuration – no values will be assigned to the object.

Static Value – a literal value will be assigned to all objects to which the current configuration is applied.

Map Column – a column is mapped to the attribute. Inline Data Transformation can be applied to change the value from the database before assignment to the attribute.

For repeating-valued attributes, the same options are available as single-valued attributes, plus:

Map Multiple Columns – multiple columns are mapped to the attribute. The first value from each column will be assigned to the attribute. This option is usually used for single-table data sources where the administrator has created multiple columns to contain repeating attribute values.

Map Column with Delimiter – a single column is mapped to the attribute with the addition of a delimiter. The delimiter is used to parse out multiple values that are stored in the column. This option is usually used for single-table data sources where the administrator has created a single column to contain repeating attribute values and has separated multiple values with a delimiter.

For Date type attributes, values from the source must be in mm/dd/yyyy hh:mm:ss AM format. If the data type of the source column is Date, then the date will be formatted automatically.
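The repeating-attribute options and the expected date layout can be illustrated with a short Python sketch. The function names are hypothetical, and trimming whitespace and skipping empty values are assumptions made for the example.

```python
from datetime import datetime


def map_multiple_columns(row, columns):
    """Collect one value from each listed column, skipping empty ones (assumption)."""
    return [row[c] for c in columns if row.get(c) not in (None, "")]


def map_column_with_delimiter(value, delimiter):
    """Split a single column into repeating values on the delimiter."""
    return [part.strip() for part in value.split(delimiter) if part.strip()]


def format_date(value):
    """Render a datetime as mm/dd/yyyy hh:mm:ss AM."""
    return value.strftime("%m/%d/%Y %I:%M:%S %p")


keywords = map_multiple_columns({"kw1": "a", "kw2": None, "kw3": "c"}, ["kw1", "kw2", "kw3"])
authors = map_column_with_delimiter("alpha; beta;gamma", ";")
stamp = format_date(datetime(2005, 4, 21, 4, 0, 0))
```

The AM/PM text produced by `%p` depends on the process locale; the default C locale yields AM and PM.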

Content Configuration

Content Configuration tells Buldoser how to add content renditions to an object. To add a content rendition, Buldoser needs the full physical location of the file and the Documentum format. The physical location can be broken into a base file path and a relative file path, since most Content Management Systems – whether homegrown or purchased – store content relative to a base location. Buldoser provides two options for configuring physical location and format:

Identify one-to-many columns that contain file locations and automatically determine format – This option is usually used for single-table data sources where the administrator has created multiple columns to contain file locations. Buldoser will pull the file extension from the file and look up the Documentum format. For formats with the same file extension, Buldoser will use the last file extension. If a specific file extension is desired, Buldoser allows the administrator to configure which format to use.

Identify a column for location and a column for format and map formats - This option is usually used for multi-table data sources where a separate view contains the physical location and format for the rendition. Buldoser provides Inline Data Transformation to map formats from a column to Documentum formats.
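A sketch of the first option, combining a base path with a relative path and deriving the format from the file extension, might look like this in Python. The extension table is illustrative only (a real Docbase defines its own formats), and the override argument stands in for the administrator's choice when several formats share an extension.

```python
from pathlib import PureWindowsPath

# Illustrative extension-to-format table; actual formats come from the target Docbase.
FORMATS_BY_EXTENSION = {"pdf": "pdf", "doc": "msw8", "txt": "crtext"}


def content_location(base_path, relative_path):
    """Join the base content path and the relative path stored in the data source."""
    return str(PureWindowsPath(base_path) / relative_path)


def lookup_format(location, overrides=None):
    """Pick a format from the file extension, honoring any configured override."""
    extension = PureWindowsPath(location).suffix.lstrip(".").lower()
    if overrides and extension in overrides:
        return overrides[extension]
    return FORMATS_BY_EXTENSION.get(extension)


location = content_location(r"D:\content", "2005/spec.pdf")
fmt = lookup_format(location)
```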

Folder Configuration

Folder Configuration tells Buldoser which folders will contain an object. Buldoser provides two options for configuring folders:

Select a fixed folder for all objects – The simple option is to put all objects in the same folder location. A folder may be selected or typed in. If the folder doesn’t exist, Buldoser provides the option to create it on-the-fly.

Identify a column that indicates folder location – In this case, a column from the data source either contains a folder path or indicates a folder path. Buldoser provides Inline Data Transformation to map column values to a particular Documentum folder path if there isn’t a direct mapping.

If folders are not configured, all objects will be located in the personal cabinet of the currently logged-in user.

Security Configuration

Security Configuration tells Buldoser which ACL will be applied to an object. Buldoser provides two options for configuring security:

Select a fixed ACL for all objects – The simple option is to assign the same ACL to all objects.

Identify a column that indicates the ACL – In this case, a column from the data source either contains an ACL or indicates an ACL. Buldoser provides Inline Data Transformation to map column values to a particular Documentum ACL if there isn’t a direct mapping.

If Security is not configured, ACLs will be applied with the default configuration of the Content Server. See the Documentum Content Server Administrator’s Manual for more information on Content Server configuration.

Versioning Configuration

Versioning configuration is a required step for data sources that implement versioning; otherwise, it is optional. There are three items to configure for versioning:

Previous version column – The column in the Object View that contains the value of the Primary key for the previous version must be identified.

Version Label column – A column that contains version labels may be optionally identified.

Base of the version tree – A method must be selected that tells Buldoser how to identify the base of a version tree. Configuring this item allows Buldoser to implement multi-threading. Three options are available:

o Previous version column = Primary Key – With this option, any time the Previous Version column’s value equals the Primary Key, Buldoser recognizes the base of a version tree.

o Previous version column = Null – With this option, any time the Previous Version column’s value is Null, Buldoser recognizes the base of a version tree.

o Previous version column = Literal Value – With this option, any time the Previous Version column’s value equals a literal string entered by the administrator, Buldoser recognizes the base of a version tree.
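The three base-detection rules can be expressed compactly. The Python sketch below is illustrative: the rule names and row layout are hypothetical, and the grouping function assumes the rows arrive in versioning order, as the Object View requires.

```python
def is_tree_base(row, rule, literal=None):
    """Apply one of the three documented rules to decide whether a row starts a version tree."""
    previous = row["prev_id"]
    if rule == "primary_key":
        return previous == row["id"]
    if rule == "null":
        return previous is None
    if rule == "literal":
        return previous == literal
    raise ValueError("unknown rule: " + rule)


def build_version_trees(rows, rule, literal=None):
    """Group rows (already in versioning order) into trees keyed by their base object."""
    trees = {}    # base id -> ids in load order
    tree_of = {}  # object id -> base id of its tree
    for row in rows:
        if is_tree_base(row, rule, literal):
            base = row["id"]
            trees[base] = []
        else:
            base = tree_of[row["prev_id"]]
        tree_of[row["id"]] = base
        trees[base].append(row["id"])
    return trees


rows = [
    {"id": 1, "prev_id": None},
    {"id": 2, "prev_id": 1},
    {"id": 3, "prev_id": None},
    {"id": 4, "prev_id": 2},
]
trees = build_version_trees(rows, "null")
```

Knowing where each tree begins is what would let separate trees be handled independently of one another.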

Pre- and Post-Processing

For those scenarios that require special processing or validation, Buldoser provides the capability to execute external Java processing methods before and after each object is loaded. The custom class file must be identified within the class path so that Buldoser can use it at run-time. These methods are executed through a provided Java interface named IBuldoserTransform. A set of JavaDocs is provided for use when implementing the interface, and can be found at <Buldoser Install Location>\docs\apidocs\index.html. Full access is given to the object in memory before the load, and a handle to the IDfSysObject interface is provided after the object is loaded. When Buldoser executes a method, it checks for a returned value. If one is found, Buldoser fails the object and writes the returned message to the log file.

Multi-threaded Loading Algorithm

Buldoser incorporates a multi-threaded load algorithm that is modeled after a card dealer dealing cards to players. A controlling (dealer) thread first makes the one and only connection to the source database. Worker threads (players) are then launched, each making its own connection to the target Docbase. Each player monitors its own queue of objects to load. The dealer iterates through the Object View, gathers all the data required to load a particular object, and pushes it onto the queue of the player with the fewest objects. When all the queues are full, the dealer waits until there is availability. When all the queues are empty, the players wait on the dealer, although this rarely happens.
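The dealer and player arrangement can be sketched with Python's standard queue and threading modules. This is a simplified illustration rather than Buldoser's code: the queue size, the player count, and the use of a None sentinel to stop the players are assumptions, and appending to a list stands in for loading an object into the Docbase.

```python
import queue
import threading


def run_load(object_rows, player_count=3, queue_size=2):
    """Deal rows to the player with the shortest queue; return the rows that were loaded."""
    queues = [queue.Queue(maxsize=queue_size) for _ in range(player_count)]
    loaded = []
    loaded_lock = threading.Lock()

    def player(own_queue):
        while True:
            item = own_queue.get()  # waits on the dealer when the queue is empty
            if item is None:        # sentinel: the dealer has finished
                return
            with loaded_lock:
                loaded.append(item)  # stand-in for loading the object

    players = [threading.Thread(target=player, args=(q,)) for q in queues]
    for thread in players:
        thread.start()

    # Dealer: the only reader of the source; picks the player with the fewest queued objects.
    for row in object_rows:
        target = min(queues, key=lambda q: q.qsize())
        target.put(row)  # blocks while even the shortest queue is full

    for q in queues:
        q.put(None)
    for thread in players:
        thread.join()
    return loaded


result = run_load(list(range(20)))
```

Because the players run concurrently, the order of the returned list can differ from the input order from run to run.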

Loading Content from a Database

This section provides step-by-step instructions for loading content from a database into a target Docbase using Buldoser. Buldoser also provides the capability to load content from a Buldoser Extract into a Docbase. See the section, Docbase to Docbase Overview for more information on that feature. If this is the first time moving content using Buldoser, see the section, Database to Docbase Overview for an important description of how Buldoser moves content objects.

Creating a New Configuration or Configuring an Existing Configuration

This section describes configuring a new or existing load. Loads may be stopped and completed at a later time from the stopping point. To complete a stopped load, see the section, Finishing a Stopped Load. To load content into a target Docbase, follow these steps:

1. Create a location on the file system to contain the configuration and log files.

2. Download and install the driver that will be used to connect to the data source.

3. Implement the IBuldoserTransform class for Pre- and Post-Processing.

4. Validate that the class path contains both the location of the processing class and the database driver.

5. Start Buldoser. For more information on this step, see the section Getting Started.

6. Select New Load from the Database Load menu. Step 1 of the Database Load Wizard will appear.

7. Enter the log path location into the Log Path text box, or browse for the location by clicking the […] button.

8. Click Cancel to close the Wizard and stop configuration without saving.

9. Click Next to continue. Step 2 of the Wizard will appear. If a configuration already exists at this location, Buldoser will update the Wizard with the existing settings.

10. Enter or select the class name of the driver in the Driver drop-down. The values for the ODBC driver and the Oracle driver are provided.

11. Enter the connection URL for the data source in the Data Source text box. For ODBC connections, the value is formatted as jdbc:odbc:<name of the ODBC connection>.

12. Enter a user name for the connection in the User Name text box, if applicable.

13. Enter a password for the connection in the Password text box, if applicable. Buldoser will not store the password in the configuration file, so it must be entered every time.

14. Click the Test button to test the connection information. If Buldoser is successful connecting, the Next button will be enabled. If not, Buldoser will return an error. Continue modifying the connection information until the test is successful.

15. Click Cancel to stop configuring and close the Wizard without saving.

16. Click Previous to go back to Step 1.

17. Click Next to proceed to Step 3.

18. Enter a name for the Object View in the Object View Name text box. This value is user-defined and can be anything that represents the data set.

19. Enter the SQL statement that identifies the objects to be moved into the SQL text box. The SQL must be formatted correctly for the driver entered in Step 2.

20. Click the Test SQL button to validate the SQL statement. If Buldoser is successful executing the SQL, the Next button and Primary Key drop-down will be enabled. If not, Buldoser will return an error. Continue modifying the SQL statement until the test is successful.

21. Select the Primary Key column for the Object View in the Primary Key drop-down.

22. Click Cancel to stop configuring and close the Wizard without saving.

23. Click Previous to go back to Step 2.

24. Click Next to proceed to Step 4.

25. Step 4 only applies to multi-table data sources. For single-table data sources such as Excel spreadsheets, click Next to proceed to Step 5.

26. For each Supporting View, select a View from the View Name drop-down and the column that links the view to the Object View from the Linking Column drop-down, then click Add.

27. To remove a Supporting View that has been added, select the rows to remove and click Remove.

28. Click Cancel to stop configuring and close the Wizard without saving.

29. Click Previous to go back to Step 3.

30. Click Next to proceed to Step 5.

31. Click Add to create a new Object Configuration.

32. Click Edit to edit a selected Object Configuration.

33. Click Copy to copy a selected Object Configuration.

34. Click Remove to remove a selected Object Configuration.

35. If Add or Edit was clicked, the Object Configurator will appear.

36. Create filters in the Configuration Applies When table to make the configuration apply to only certain rows in the Object View.

a. Click Add to create a new row.

b. Click Remove to remove a selected row.

c. For each row, select an Object View column from the Source Column drop-down, select a comparison operator from the Is drop-down, and enter a value for the Source Column in the Value text box.

37. Select an Object Type from the Object Type drop-down. If the Object Type for the configuration is changed after attributes have been configured, the attribute configuration will be lost.

38. Click Configure Attributes. The Attribute Configuration dialog will appear.

39. Select the attribute to be configured, and then select a Configuration Type from the Configuration Type drop-down.

40. To map a single column to an attribute, select an attribute, then “Map Column” from the Configuration Type drop-down.

41. Select a view and column from the View and Column drop-downs, respectively.

42. To map values from the source column to attribute values, click Map Values. The Attribute Value Mapping dialog will appear.

43. Enter or select a value in the Maps To column for each Column Value from the data source. If the attribute has value assistance, the values will appear in a drop-down in the Maps To column. Note: SQL Server 2000 performs a case-insensitive distinct query, so source values that differ only by case will appear as a single entry.

44. Click OK to save the mappings or Cancel to close the dialog without saving.

45. The Attribute Configuration dialog will reappear. Click Save to save the mappings. If another attribute is selected before clicking Save, the configuration will be lost.

46. To enter a static value for a single-valued attribute, select a single-valued attribute, then “Enter Static Value” from the Configuration Type drop-down.

47. Enter a value in the Value text box. Click Save to save the configuration.

48. To enter multiple static values for a repeating-valued attribute, select a repeating-valued attribute, then “Enter Static Values” from the Configuration Type drop-down.

49. Enter a value in the Value text box, and then click Add to add it to the list. Click Remove to remove a value from the list. Click Save to save the configuration.

50. To map multiple columns to a repeating-valued attribute, select a repeating-valued attribute, then “Map Multiple Columns” from the Configuration Type drop-down.

51. Select a view and column from the View and Column drop-downs, and then click Add to add it to the list. Click Remove to remove an entry from the list. Click Save to save the configuration.

52. To map a column with a delimiter to a repeating-valued attribute, select a repeating-valued attribute, then “Map Column with Delimiter” from the Configuration Type drop-down.

53. Select a view and column from the View and Column drop-downs and enter a delimiter in the Delimiter text box. Click Save to save the configuration.

54. Once all attributes are configured, click OK to save the configuration and return to the Object Configurator. Click Cancel to cancel all changes since the dialog was opened.

55. Click Configure Security. The Security Configuration dialog will appear.

56. To use a fixed ACL for all objects, select the Use a Fixed ACL and ACL Domain radio button. Next, select an ACL and ACL Domain from the drop-down. Entries are formatted as “ACL Domain.ACL Name.”

57. To use a column from the data source to drive the ACL that is used, select the Use an ACL from a Column radio button. Select a view, ACL column, and ACL Domain column from the drop-downs. To map values from the selected columns to ACLs in the current Docbase, click the Map ACL Values button. The ACL Mapping dialog will appear.

58. For each row, select the ACL in the Maps To drop-down that should be selected based on the value from the source column. ACLs are formatted as “ACL Domain.ACL Name.”

59. Click OK to save the mappings and close the dialog. Click Cancel to close without saving. The Security configuration dialog will reappear.

60. Click OK to save the configuration and close the dialog. Click Cancel to close without saving. The Object Configurator will reappear.

61. Click Configure Content. The Content Configuration dialog will appear.

62. Enter the base path for all files in the Base Content Path text box, or click […] to browse for the location.

63. Select the radio button for the Configuration Type to use.

64. For content in a column or columns, select each view and column and click Add. Click Remove to remove a view and column.

65. To identify which format will be used when multiple formats exist in the target Docbase for the same file extension, click Map Duplicate Formats. The Format Mapping dialog will appear.

66. For each file extension on the left, select the format to be used from the Maps To drop-down on the right.

67. Click OK to close the dialog and save the mappings. To close without saving, click Cancel. The Content Configuration dialog will reappear.
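
Steps 65 to 67 resolve the case where one file extension matches several formats in the target Docbase. The Python sketch below shows the underlying idea: find the ambiguous extensions, then apply an explicit choice for each. It is an illustration rather than Buldoser's own logic; the function names, the sample format list, and the selections are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Sample (format name, file extension) pairs; the values are illustrative only.
TARGET_FORMATS: List[Tuple[str, str]] = [
    ("msw6", "doc"),
    ("msw8", "doc"),
    ("pdf", "pdf"),
    ("crtext", "txt"),
    ("text", "txt"),
]


def duplicate_extensions(formats: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Return only the extensions that are claimed by more than one format."""
    by_ext: Dict[str, List[str]] = defaultdict(list)
    for name, ext in formats:
        by_ext[ext].append(name)
    return {ext: names for ext, names in by_ext.items() if len(names) > 1}


def pick_format(ext: str, choices: Dict[str, str], formats: List[Tuple[str, str]]) -> str:
    """Use the explicit choice for an ambiguous extension, else the only match."""
    if ext in choices:
        return choices[ext]
    matches = [name for name, e in formats if e == ext]
    if len(matches) != 1:
        raise ValueError(f"extension {ext!r} needs an explicit mapping: {matches}")
    return matches[0]


choices = {"doc": "msw8", "txt": "crtext"}  # hypothetical Maps To selections
print(duplicate_extensions(TARGET_FORMATS))
print(pick_format("doc", choices, TARGET_FORMATS), pick_format("pdf", choices, TARGET_FORMATS))
```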

68. For content in a separate view, select the view, physical location, and format columns. If this is a Documentum Docbase and the data_ticket attribute is used for physical location, check the Documentum Ticket checkbox to have Buldoser calculate the location.

69. If the formats identified in the Format column are not Documentum formats, click the Map Formats button. The Format Mapping dialog will appear.

70. For each source format on the left, select a Documentum format to use instead in the Maps To drop-down.

71. Click OK to save the mappings and close the dialog. Click Cancel to close without saving. The Content Configuration dialog will reappear.

72. Click OK to save the configuration and close the dialog. Click Cancel to close without saving. The Object Configurator will reappear.
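
The Documentum Ticket option in the Content Configuration dialog has Buldoser calculate a file location from the data_ticket attribute, but this guide does not describe the calculation. The Python sketch below shows the convention commonly described for Documentum file stores, in which the ticket is read as an unsigned 32-bit number and its eight hexadecimal digits become nested directory and file names. Treat it as an illustration under that assumption: the function name, the sample ticket, and the Docbase ID are hypothetical, and Buldoser's own logic may differ (older stores may also omit the file extension).

```python
def ticket_to_relative_path(data_ticket: int, docbase_id: int, extension: str) -> str:
    """Build a filestore-relative path from a data_ticket (assumed convention)."""
    # Tickets are typically stored as negative signed 32-bit integers;
    # masking reinterprets the value as unsigned.
    unsigned = data_ticket & 0xFFFFFFFF
    digits = f"{unsigned:08x}"
    # Split the eight hex digits into four two-digit pieces: three directory
    # levels plus the file name.
    pieces = [digits[i:i + 2] for i in range(0, 8, 2)]
    docbase_dir = f"{docbase_id:08x}"
    return "/".join([docbase_dir] + pieces) + "." + extension


# Hypothetical ticket and Docbase ID, for illustration only.
print(ticket_to_relative_path(-2147474649, 123456, "pdf"))
```

The result is relative to the filestore root, so it would be appended to whatever is entered as the Base Content Path.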

73. Click the Configure Folders button. The Folder Configuration dialog will appear.

74. To use a fixed folder path for all objects, select the Use a Fixed Folder Path radio button. Next, enter a folder path in the drop-down. To list folders in the target Docbase in the drop-down, click the List Folders button.

75. To use a column from the data source to drive the folder that is used, select the Use Folder Path from a Column radio button. Select a view and folder column from the drop-downs. Optionally enter a folder path prefix and suffix. To map values from the selected column to folders in the current Docbase, click the Map Folder Paths button. The Path Mapping dialog will appear.

76. For each row, select the folder in the Maps To drop-down that should be selected based on the value from the source column.

77. Click OK to save the mappings and close the dialog. Click Cancel to close without saving. The Folder Configuration dialog will reappear.

78. Click OK to save the configuration and close the dialog. Click Cancel to close without saving. The Object Configurator will reappear.
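
The column-driven folder option in steps 75 to 77 combines a source value with an optional prefix, an optional suffix, and an optional mapping to existing folders. The guide does not say how these interact, so the Python sketch below makes an explicit assumption: a mapped value wins, and unmapped values are wrapped in the prefix and suffix. The function name and all sample paths are hypothetical.

```python
from typing import Dict, Optional


def build_folder_path(
    column_value: str,
    prefix: str = "",
    suffix: str = "",
    path_map: Optional[Dict[str, str]] = None,
) -> str:
    """Turn one source column value into a target folder path."""
    # An explicit mapping takes priority (an assumption of this sketch).
    if path_map and column_value in path_map:
        return path_map[column_value]
    # Otherwise wrap the source value in the optional prefix and suffix.
    return f"{prefix}{column_value}{suffix}"


mapping = {"HR": "/Human Resources/Imported"}  # hypothetical Maps To selection
print(build_folder_path("Finance/2005", prefix="/Migrated/", suffix="/Archive"))
print(build_folder_path("HR", prefix="/Migrated/", path_map=mapping))
```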

79. Click the Configure Versioning button. The Version Configuration dialog will appear.

80. Select the column that contains the primary key value of the previous version in the Previous Version Column drop-down.

81. Select the view and column for the version labels from the Version Label View and Version Label Column drop-downs, respectively.

82. Select the method for determining the base of a version tree from the three radio buttons. This must be configured correctly for multi-threading to function properly.

83. Click OK to save the configuration and close the dialog. Click Cancel to close without saving. The Object Configurator will reappear.
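
The Previous Version Column in step 80 links each row to the row it supersedes, and a row with no previous version is the base of a version tree. The Python sketch below groups rows into trees on that basis, listing each tree from the base outward. It illustrates the data relationship only; the function name and sample keys are hypothetical. One plausible reading of the multi-threading note in step 82 is that each tree has to be processed in order, but that reading is an assumption, not a statement from this guide.

```python
from typing import Dict, List, Optional


def build_version_trees(prev_of: Dict[str, Optional[str]]) -> List[List[str]]:
    """Group primary keys into version trees, base first.

    prev_of maps each row's primary key to the key of its previous version,
    or None when the row is the base of a version tree.
    """
    # Invert the relationship: previous key -> keys that follow it.
    children: Dict[str, List[str]] = {}
    for key, prev in prev_of.items():
        if prev is not None:
            children.setdefault(prev, []).append(key)

    trees: List[List[str]] = []
    for key, prev in prev_of.items():
        if prev is None:  # base of a tree
            tree: List[str] = []
            pending = [key]
            while pending:  # breadth-first walk, so branches are included
                current = pending.pop(0)
                tree.append(current)
                pending.extend(children.get(current, []))
            trees.append(tree)
    return trees


rows = {"101": None, "102": "101", "103": "102", "200": None, "201": "200"}
print(build_version_trees(rows))
```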

84. Click the Configure Pre- and Post-Processing button. The Processing Configuration dialog will appear.

85. Enter the name of the class that implements IBuldoserTransform into the Processing Class text box. The class must exist within the classpath or Buldoser will return an error. Leave the box blank for no processing.

86. Click OK to save the configuration and close the dialog. Click Cancel to close without saving. The Object Configurator will reappear.

87. Click OK to save the Object Configuration and close the dialog. Click Cancel to close without saving.

88. After all configuration is complete, Step 5 will reappear.

89. Click Cancel to stop configuring and close the Wizard without saving.

90. Click Previous to go back to Step 4.

91. Click Next to proceed to Step 6.

92. Step 6 provides the ability to preview objects as they will be built using the configurations created in Step 5. Click Next Object to advance through the data set. Each object is displayed in sections, one per configuration, showing values for attributes, security, folders, content, versioning, and processing class. When the end of the data set is reached, Buldoser will start over from the beginning.

93. Click Previous to go back to Step 5 and correct any errors in configuration.

94. Click Next to proceed to Step 7.

95. Select the number of threads to use for the load from the Threads drop-down. Using multiple threads can make the content load faster, but the benefit depends on the hardware that is being used. It is best to start with one or two threads for initial loads, then increase the number of threads after monitoring network, memory, and processor resources.

96. Check the Verbose box to create a very detailed log file for the load. This option can negatively impact performance for very large loads. If this option is not selected, only errors will be written to the log file.

97. Check the Trace box to have Buldoser write a DMCL-level trace for every 1000th object that is loaded. This option may be used for diagnostic purposes.

98. Check the Chart Performance box to create a tab-delimited data file containing relevant performance information on the load in milliseconds. A file is created for each thread and is named “chart<thread number>.txt.” Opening a chart file in Microsoft Excel allows a line graph to be created that displays performance trends over the life of the load. This option is typically used during a small trial load for diagnostic purposes. For large loads this option may degrade performance.

99. Check the Auto Create ACLs box to automatically create ACLs if they don’t exist in the destination Docbase. Buldoser will create a System ACL with the correct name only – an administrator must fill in the correct Users, Groups, and permission levels once the load is complete. It is highly recommended to move ACLs with a Documentum-provided tool or script outside of Buldoser before performing ETL operations. This option is provided as a fallback mechanism only.

100. Check the Auto Create Formats box to automatically create formats if they don’t exist in the destination Docbase. Buldoser will create a format with the correct name only – an administrator must fill in the remaining attributes once the load is complete. It is highly recommended to move formats with a Documentum-provided tool or script outside of Buldoser before performing ETL operations. This option is provided as a fallback mechanism only.

101. Check the Auto Create Folders box to automatically create folders if they don’t exist in the destination Docbase. Buldoser will create the entire folder path with the correct name only – an administrator must fill in the remaining attributes once the load is complete. It is highly recommended to move folders with a Documentum-provided tool or script outside of Buldoser before performing ETL operations. This option is provided as a fallback mechanism only.

102. Check the Auto Set Owner to DBO box to automatically set the owner_name attribute of each object to the Docbase owner of the target Docbase. Having the Docbase owner own content is a standard convention.

103. Check the Dealer-Side Database Load box to have the heavy Docbase operations executed on the Dealer side instead of at the thread level. This option determines whether the bulk of the database operations occurs on the controlling (Dealer) thread or on each of the worker threads. It is best to check this option for loads with content and to leave it unchecked for loads without content.

104. Click Previous to go back to Step 6.

105. Click Next to proceed to Step 8.
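
The tab-delimited chart files produced by the Chart Performance option in step 98 can also be examined with a script instead of a spreadsheet. The Python sketch below smooths a column of load times with a rolling average to expose the trend. This guide does not document the column layout of the chart files, so the header row, the column names, and the sample values are hypothetical stand-ins; inspect a real chart file before adapting it.

```python
import csv
import io
from statistics import mean
from typing import List

# Stand-in for the contents of a chart file such as chart0.txt. The header row
# and the column names are hypothetical; check a real file before relying on them.
SAMPLE = "object\tload_ms\n1\t120\n2\t80\n3\t100\n4\t300\n"


def load_times(text: str, column: str = "load_ms") -> List[float]:
    """Read one numeric column from tab-delimited text."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [float(row[column]) for row in reader]


def rolling_average(values: List[float], window: int) -> List[float]:
    """Average each value with up to window-1 values before it."""
    return [mean(values[max(0, i - window + 1): i + 1]) for i in range(len(values))]


times = load_times(SAMPLE)
print(rolling_average(times, window=2))
```

A rising rolling average over the life of a trial load would suggest that throughput is degrading, which is a reason to revisit the thread count before a full load.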

106. Configuration is now complete and Buldoser is ready to load. Click Save and Close to save the configuration and close without initiating the load.

107. Click Previous to return to Step 7.

108. Click Cancel to close the Wizard without saving.

109. Click Load Now to start the load. The Dealer Heads-Up Display (HUD) will appear.

The dialog is explained below:

Item: Definition

Location: Indicates the Log Path for the batch.

Total Objects: Indicates the total number of objects for the batch.

Total Threads: Indicates the number of threads that was selected. The number may be increased or decreased during the load by clicking the + and – buttons.

Max Objects/Thread: Indicates the number of objects that each worker thread will cache. This number can be increased or decreased during the load by clicking the + and – buttons.

Dealing Progress: Indicates how many objects have been dealt by the Dealer thread.

Messages: Provides messages that indicate the current operation of the load.

Dealing Speed: Indicates the average speed for dealing an object.

Total Throughput: Indicates how fast on average objects are being loaded across all threads.

View Stats: Indicates the thread, by number, for which statistics are being displayed. Use the < and > buttons to cycle through the threads to view each thread’s statistics.

Number Waiting: Indicates how many objects are in the selected thread’s queue.

Number Processed: Indicates the number of objects that have been processed by the selected thread.

Number Failed: Indicates the number of objects that have failed for the selected thread.

Average Load Time: Indicates the average load speed for the selected thread.

View Log: Opens the load log for the selected thread.

Stop: Allows the dealing to be stopped. The load may be restarted from the stopping point by selecting Finish Load from the Database Load menu. Note that each worker thread will finish the objects that are currently in its waiting queue.

Close: Once the load has been stopped or has completed, the Close button allows the dialog to be closed.

Update Stats: Updates the per-thread statistics for the currently selected thread.

Finishing a Stopped Database Load

This section describes finishing a stopped database load. Database loads may be stopped and completed at a later time from the stopping point. To finish loading from a database into a Docbase, follow these steps:

1. Identify the location on the file system that contains the database configuration.

2. Select Finish Load from the Database Load menu. The Finish Load from Database dialog will appear.

3. Enter the location of the database configuration into the Load Directory text box. The location may be browsed by clicking the […] button.

4. Enter the password to connect to the database in the Password text box.

5. To close the dialog and save the entries, click Save Settings.

6. To close the dialog, click Cancel.

7. To resume the load, click Finish Load.

Undoing a Database Load

Undo completely removes the objects loaded from the batch, along with the audit records that track the loading process. Undo will not remove any automatically created Folders, ACLs, or Formats. To undo a database load, follow these steps:

1. Identify the location on the file system that contained the database configuration. Buldoser uses this location in its audit trail to identify objects loaded from the batch.

2. Select Undo Load from the Database Load menu. The Undo dialog will appear.

3. Enter the location of the load into the Load Directory text box. The location may be browsed by clicking the […] button.

4. To close the dialog and save the entries, click Save Settings.

5. To close the dialog, click Cancel.

6. To undo the load, click Undo. An instance of the following status dialog will appear. Undo operations run single-threaded only.

The dialog is explained below:

Item: Description

Title Bar: Indicates the percent completion and thread number (which is always zero for Undo).

Operation Type: Indicates that this is a BuldoserDbUndo.

Load Location: Indicates the Batch location where this operation is being performed.

Thread Number: Indicates the thread number (always zero for Undo).

Progress: Indicates the number completed.

Messages: Describes the current operation.

Stop: Stops the undo.

Close: Only available after the undo is complete or has been stopped.

Appendix – EDMS98 Operations

Buldoser currently supports EDMS98 operations through the Database to Docbase functionality. This requires a JDBC driver for the underlying database on which the Docbase runs to be installed.

Content View

1. Create a view in the database that hosts the Docbase using the following query:

    create view content_view as
    select a.data_ticket, full_format, parent_id_i
    from dmr_content_s a, dmr_content_r b
    where a.r_object_id_i = b.r_object_id_i

2. Map a network drive to the filestore directory on the source Docbase. In Windows this directory is typically c:\Documentum\Data\<Docbase>
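
The join behind content_view can be tried out safely before it is created in a real database. The Python sketch below runs the view against an in-memory SQLite database with stand-in tables. It qualifies data_ticket with alias a, one of the two aliases the query defines. The stand-in tables hold only the columns the view needs, the choice of which table owns which column is an assumption, and the sample row is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Stand-in tables holding only the columns the view needs. Which table
    -- owns which column is an assumption of this sketch.
    create table dmr_content_s (r_object_id_i integer, data_ticket integer, full_format text);
    create table dmr_content_r (r_object_id_i integer, parent_id_i integer);

    -- One invented content row and its parent link.
    insert into dmr_content_s values (1, -2147474649, 'pdf');
    insert into dmr_content_r values (1, 9001);

    create view content_view as
    select a.data_ticket, full_format, parent_id_i
    from dmr_content_s a, dmr_content_r b
    where a.r_object_id_i = b.r_object_id_i;
    """
)

rows = conn.execute("select * from content_view").fetchall()
print(rows)
```

Each row of the view pairs a content file's ticket and format with the object it belongs to, which is what lets the view act as the content source for a load.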

Defining Object View

The following query can be used to extract dm_document objects from the Docbase. For subtypes, simply swap out dm_document for the type name.

Next, link this query (in Step 3 of 8) with the Supporting View defined as “content_view.”

The Content and Versioning buttons are particularly important for this operation. Set them according to the screenshots below. *Note that the <Mapped Filestore Path> (within the “Base Content Path:” field) should be the drive and path where the filestore is mapped on the machine running Buldoser.
