sap conversion agent by itemfield · conversion agent studio is the design and configuration...

281
SAP Conversion Agent by Itemfield Conversion Agent Studio User's Guide Version 4

Upload: others

Post on 12-Mar-2020

65 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

SAP Conversion Agent byItemfield

Conversion Agent StudioUser's Guide

Version 4

Page 2: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Legal Notice

Conversion Agent Studio User's Guide

Copyright © 2004-2005 Itemfield Inc. All rights reserved.

Itemfield may have patents, patent applications, trademarks, copyrights, or other intellectual propertyrights covering subject matter in this document. Except as expressly provided in any written licenseagreement from Itemfield, the furnishing of this document does not give you any license to thesepatents, trademarks, copyrights, or other intellectual property.

The information in this document is subject to change without notice. Complying with all applicablecopyright laws is the responsibility of the user. No part of this document may be reproduced ortransmitted in any form or by any means, electronic or mechanical, for any purpose, without theexpress written permission of Itemfield Inc.

SAP AGhttp://www.sap.com

Publication Information:

Version: 4Date: September 2006

Page 3: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide Contents

i

Contents

1. Designing Conversion Agent Data Transformations .....................1Before You Continue .............................................................................................................................1Overview of Transformation Architecture .............................................................................................. 1

Conversion Agent Components ...................................................................................................... 2Data Holders ................................................................................................................................... 3Documents ...................................................................................................................................... 4Conversion Agent Services ............................................................................................................. 4

Project Architecture ............................................................................................................................... 5Required Project Files ..................................................................................................................... 5Additional Project Files.................................................................................................................... 5

Workflow for Designing Transformations............................................................................................... 6Analyzing the Inputs and Outputs ................................................................................................... 6Creating and Configuring a Project ................................................................................................. 7Deploying the Project as a Conversion Agent Service ....................................................................8

How to Learn More ................................................................................................................................ 8Online Samples ............................................................................................................................... 9Contacting Product Support ............................................................................................................ 9

2. Using Conversion Agent Studio ....................................................10Conversion Agent Studio for Eclipse................................................................................................... 10For Detailed Information ......................................................................................................................11

3. Parsers ............................................................................................12Creating a Parser ................................................................................................................................ 12

Using the New Parser Wizard ....................................................................................................... 12Creating a Parser by Editing the IntelliScript................................................................................. 15

Running a Parser.................................................................................................................................15Platform-Independent Parsers.............................................................................................................16Parser Component Reference .............................................................................................................17

Parser ............................................................................................................................................17Input Port Component Reference........................................................................................................20

DocList .......................................................................................................................................... 21FileSearch .....................................................................................................................................21LocalFile ........................................................................................................................................21Text ...............................................................................................................................................22URL ...............................................................................................................................................22

Page 4: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide Contents

ii

4. Document Processors....................................................................24Installation ...........................................................................................................................................24Defining Document Processors ...........................................................................................................24

Display of Document Processor Output ........................................................................................ 25Document Processor Quick Reference ............................................................................................... 25Document Processor Component Reference......................................................................................26

AFPToXML....................................................................................................................................26ExcelToDataXml............................................................................................................................ 27ExcelToHtml ..................................................................................................................................27ExcelToTextML .............................................................................................................................27ExcelToTxt ....................................................................................................................................27ExcelToXml ................................................................................................................................... 28ExpandFrameSet ..........................................................................................................................29ExternalCOMPreProcessor ...........................................................................................................29ExternalJavaPreProcessor ............................................................................................................29ExternalPreProcessor ...................................................................................................................31PdfToTxt_3_00..............................................................................................................................32PowerpointToHtml.........................................................................................................................33PowerpointToTextML ....................................................................................................................33ProcessByTransformers ................................................................................................................33ProcessorPipeline .........................................................................................................................34RtfToTextML..................................................................................................................................34WordPerfectToTextML ..................................................................................................................34WordToHtml ..................................................................................................................................34WordToRtf .....................................................................................................................................34WordToTextML..............................................................................................................................35WordToTxt.....................................................................................................................................35WordToXml ................................................................................................................................... 35XmlToExcel ................................................................................................................................... 35

TextML XML Schema ..........................................................................................................................36

5. Formats ...........................................................................................37Defining the Document Format............................................................................................................37Standard Properties of Formats .......................................................................................................... 38Format Component Reference ............................................................................................................39

BinaryFormat.................................................................................................................................39CustomFormat............................................................................................................................... 40HtmlFormat....................................................................................................................................40RtfFormat ...................................................................................................................................... 41TextFormat ....................................................................................................................................41XmlFormat .....................................................................................................................................42

Delimiters Component Reference ....................................................................................................... 42CommaDelimited ...........................................................................................................................43DelimiterHierarchy.........................................................................................................................44HL7................................................................................................................................................45Positional....................................................................................................................................... 45PostScript ...................................................................................................................................... 46RTF ...............................................................................................................................................46SGML ............................................................................................................................................46SpaceDelimited .............................................................................................................................46

Page 5: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide Contents

iii

TabDelimited .................................................................................................................................47Delimiter Subcomponent Reference....................................................................................................48

Delimiter ........................................................................................................................................48EnclosingDelimiters .......................................................................................................................49

Format Preprocessor Component Reference......................................................................................49HtmlProcessor ............................................................................................................................... 50RtfProcessor..................................................................................................................................50

6. Data Holders ...................................................................................51About XSD.....................................................................................................................................51Where To Learn XSD ....................................................................................................................53How to Create XSD Schemas ....................................................................................................... 53Encoding of the XSD Schema....................................................................................................... 53Included XSD Files ........................................................................................................................ 54Namespaces .................................................................................................................................54Mixed Content ............................................................................................................................... 54Unsupported XSD Features .......................................................................................................... 54Precision of Numerical Data.......................................................................................................... 55

Adding XSD Schemas to a Conversion Agent Project ........................................................................55Adding an Existing Schema .......................................................................................................... 56Creating a New Schema ...............................................................................................................56Editing a Schema ..........................................................................................................................56Reloading a Schema after Editing .................................................................................................56

Viewing a Schema ............................................................................................................................... 57Schema View ................................................................................................................................ 57Displaying an XML Sample of a Schema ......................................................................................58

Using a Schema to Map Anchors ........................................................................................................59IntelliScript Representation of Data Holders ................................................................................. 59Mapping Mixed Content ................................................................................................................59

Generating Valid XML .........................................................................................................................61Role of XSD in Parsing..................................................................................................................61Role of XSD in Serialization .......................................................................................................... 62

Variables.............................................................................................................................................. 62User-Defined Variables ................................................................................................................. 63System Variables ..........................................................................................................................63Mapping Anchors to Variables ...................................................................................................... 64Using Variables in Actions.............................................................................................................65Initializing Variables at Runtime ....................................................................................................65

Variable Component Reference .......................................................................................................... 65Variable .........................................................................................................................................65

Multiple-Occurrence Data Holders ...................................................................................................... 66

7. Anchors...........................................................................................69Marker and Content Anchors...............................................................................................................69Other Anchor Types ............................................................................................................................ 69How Anchors and Delimiters Work Together ......................................................................................69Mapping Content Anchors to Data Holders.........................................................................................70

Mapping to Variables..................................................................................................................... 71

Page 6: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide Contents

iv

Mapping to Multiple-Occurrence Data Holders .............................................................................71Mapping to Mixed-Content Elements............................................................................................ 72

Defining Anchors .................................................................................................................................72Where to Define Anchors .............................................................................................................. 72Sequence of Anchors ....................................................................................................................73Select-and-Click Procedure for Marker and Content Anchors ......................................................74Drag-and-Drop Procedure for Content Anchors ............................................................................74Using the IntelliScript to Define Anchors .......................................................................................74

Standard Anchor Properties ................................................................................................................75How a Parser Searches for Anchors ................................................................................................... 76

Search Phases ..............................................................................................................................76Search Scope and Search Criteria................................................................................................ 77Adjusting the Search Phase.......................................................................................................... 78Adjusting the Search Scope.......................................................................................................... 78Adjusting the Search Criteria.........................................................................................................81Using XSD Data Types to Narrow the Search Criteria ..................................................................81Anchors that Contain Nested Anchors ..........................................................................................83

Anchor Quick Reference ..................................................................................................................... 84Anchor Component Reference ............................................................................................................85

Alternatives....................................................................................................................................85Content .......................................................................................................................................... 87DelimitedSections..........................................................................................................................90EmbeddedParser ..........................................................................................................................93EnclosedGroup..............................................................................................................................94FindReplaceAnchor.......................................................................................................................95Group ............................................................................................................................................97HtmlForm....................................................................................................................................... 99Marker ......................................................................................................................................... 101RepeatingGroup.......................................................................................................................... 103

Searcher Component Reference ....................................................................................................... 107AttributeSearch............................................................................................................................ 107LearnByExample .........................................................................................................................108NewlineSearch ............................................................................................................................ 109OffsetSearch ............................................................................................................................... 109PatternSearch ............................................................................................................................. 109SegmentSearch........................................................................................................................... 110TextSearch .................................................................................................................................. 110TypeSearch................................................................................................................................. 112

Anchor Subcomponent Reference .................................................................................................... 112AddField ...................................................................................................................................... 112Connect ....................................................................................................................................... 113ImageClick................................................................................................................................... 114ModifyField .................................................................................................................................. 114RemoveField ............................................................................................................................... 115SegmentIndex ............................................................................................................................. 115SegmentSize ............................................................................................................................... 115SubmitAll ..................................................................................................................................... 115SubmitClick ................................................................................................................................. 116

8. Transformers ................................................................................117Defining Transformers....................................................................................................................... 117

Page 7: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide Contents

v

Using Transformers in Anchors ................................................................................................... 117Sequences of Transformers........................................................................................................ 118Default Transformers................................................................................................................... 118Using Transformers as Document Processors............................................................................ 119Using Transformers in Serialization Anchors .............................................................................. 119Using Transformers in Actions .................................................................................................... 119Using Transformers as Runnable Components .......................................................................... 119

Standard Transformer Properties ...................................................................................................... 120Transformer Quick Reference ...........................................................................................................120Transformer Component Reference.................................................................................................. 123

AbsURL ....................................................................................................................................... 124AddEmptyTagsTransformer ........................................................................................................ 124AddString..................................................................................................................................... 124Base64Decode ............................................................................................................................ 125Base64Encode ............................................................................................................................ 125BidiConvert.................................................................................................................................. 126BigEndianUniToUni ..................................................................................................................... 126CDATADecode ............................................................................................................................ 126CDATAEncode ............................................................................................................................ 127ChangeCase ............................................................................................................................... 127CreateGuid .................................................................................................................................. 128CreateUUID................................................................................................................................. 128DateFormat ................................................................................................................................. 128Dos96HebToAscii........................................................................................................................ 130EbcdicToAscii.............................................................................................................................. 131EncodeAsUrl ............................................................................................................................... 131Encoder ....................................................................................................................................... 131ExternalTransformer.................................................................................................................... 132FormatNumber ............................................................................................................................ 133FromFloat .................................................................................................................................... 134FromInteger................................................................................................................................. 134FromPackDecimal ....................................................................................................................... 135FromSignedDecimal.................................................................................................................... 135hebrewBidi................................................................................................................................... 135HebrewDosToWindowsTransformer ........................................................................................... 136HebrewEBCDICOldCodeToWindows .........................................................................................136hebUniToAscii ............................................................................................................................. 136hebUtf8ToAscii ............................................................................................................................ 136HtmlEntitiesToASCII.................................................................................................................... 136HtmlProcessor ............................................................................................................................. 137InjectFP ....................................................................................................................................... 137InjectString .................................................................................................................................. 137JavaTransformer .........................................................................................................................138LookupTransformer ..................................................................................................................... 139NormalizeClosingTags ................................................................................................................ 140ODBCLookup .............................................................................................................................. 140RegularExpression ...................................................................................................................... 141RemoveMarginSpace ..................................................................................................................142RemoveRtfFormatting ................................................................................................................. 142RemoveTags ............................................................................................................................... 143Replace ....................................................................................................................................... 143Resize ......................................................................................................................................... 144ReverseTransformer ................................................................................................................... 144

Page 8: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide Contents

vi

RtfProcessor................................................................................................................................ 144RtfToASCII .................................................................................................................................. 145SubString..................................................................................................................................... 145ToFloat ........................................................................................................................................ 145ToInteger..................................................................................................................................... 146ToPackDecimal ........................................................................................................................... 146ToSignedDecimal ........................................................................................................................ 147TransformationStartTime............................................................................................................. 147TransformByParser ..................................................................................................................... 148TransformByService.................................................................................................................... 149TransformerPipeline .................................................................................................................... 149WestEuroUniToAscii ................................................................................................................... 150XSLTTransformer........................................................................................................................ 150

Transformer Subcomponent Reference ............................................................................................ 150InlineTable ................................................................................................................................... 151ODBC_Text_Connection............................................................................................................. 151XMLLookupTable ........................................................................................................................ 152

9. Actions ..........................................................................................153How Actions Work ............................................................................................................................. 153

Comparison between Actions and Transformers ........................................................................ 154Defining Actions................................................................................................................................. 154Standard Action Properties................................................................................................................ 154Action Quick Reference..................................................................................................................... 155Action Component Reference ...........................................................................................................156

AddEventAction........................................................................................................................... 157AppendListItems.......................................................................................................................... 157AppendValues ............................................................................................................................. 158CalculateValue ............................................................................................................................ 159CombineValues ........................................................................................................................... 160CreateList .................................................................................................................................... 161DateAdd ...................................................................................................................................... 162DateDiff ....................................................................................................................................... 163DownloadFile............................................................................................................................... 163DownloadFileToDataHolder ........................................................................................................ 164DumpValues ................................................................................................................................ 165EnsureCondition.......................................................................................................................... 165ExcludeItems............................................................................................................................... 167ExternalCOMAction ..................................................................................................................... 167JavaScriptFunction ...................................................................................................................... 171Map ............................................................................................................................................. 172ODBCAction ................................................................................................................................ 172ResetVisitedPages ...................................................................................................................... 174RunMapper.................................................................................................................................. 175RunParser ................................................................................................................................... 175RunSerializer ............................................................................................................................... 177SetValue...................................................................................................................................... 178Sort .............................................................................................................................................. 179SubmitForm................................................................................................................................. 180SubmitFormGet ........................................................................................................................... 181WriteValue ................................................................................................................................... 182

Page 9: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide Contents

vii

XSLTMap .................................................................................................................................... 183Action Subcomponent Reference...................................................................................................... 184

COMClass ................................................................................................................................... 184MSMQOutput .............................................................................................................................. 185ODBC_XML_Connection ............................................................................................................ 185OpenURL .................................................................................................................................... 186OutputCOM ................................................................................................................................. 186OutputDataHolder ....................................................................................................................... 188OutputFile .................................................................................................................................... 188ResultFile .................................................................................................................................... 189

10. Serializers....................................................................................190Creating a Serializer from a Parser ................................................................................................... 190

Controlling How the Create Serializer Command Works ............................................................ 191Troubleshooting an Auto-Generated Serializer ...........................................................................193

Creating a Serializer by Using the New Serializer Wizard................................................................. 195Creating a Serializer by Editing the IntelliScript................................................................................. 196Creating a Serializer within a RunSerializer Action ...........................................................................196Running a Serializer .......................................................................................................................... 196Serialization Anchors .........................................................................................................................197

Example of Serialization Anchors................................................................................................ 197Defining Serialization Anchors .................................................................................................... 198Sequence of Serialization Anchors ............................................................................................. 199

Standard Serializer Properties...........................................................................................................199Serializer Quick Reference................................................................................................................ 200Serializer Component Reference ...................................................................................................... 200

Serializer ..................................................................................................................................... 200Serialization Anchor Component Reference ..................................................................................... 201

AlternativeSerializers................................................................................................................... 201ContentSerializer .........................................................................................................................202DelimitedSectionsSerializer......................................................................................................... 203EmbeddedSerializer .................................................................................................................... 206GroupSerializer ........................................................................................................................... 207RepeatingGroupSerializer ...........................................................................................................207

11. Mappers....................................................................................... 210Creating a Mapper ............................................................................................................................. 210Creating a Mapper within a RunMapper Action................................................................................. 211Components Nested within a Mapper ............................................................................................... 211Mapper Example ............................................................................................................................... 211Running a Mapper ............................................................................................................................. 212Standard Mapper Properties ............................................................................................................. 213Mapper Quick Reference................................................................................................................... 213Mapper Component Reference ......................................................................................................... 214

Mapper ........................................................................................................................................ 214Mapper Anchor Component Reference ............................................................................................. 215

AlternativeMappings.................................................................................................................... 215

Page 10: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide Contents

viii

EmbeddedMapper....................................................................................................................... 216GroupMapping............................................................................................................................. 217RepeatingGroupMapping ............................................................................................................ 218

12. Locators, Keys, and Indexing ....................................................219Example of Locators.......................................................................................................................... 219Example of Indexing by Key .............................................................................................................. 222Source and Target Properties ...........................................................................................................225

Source Property .......................................................................................................................... 226Target Property ........................................................................................................................... 231

Standard Locator and Key Properties ............................................................................................... 233Locator and Key Component Quick Reference ................................................................................. 233Locator and Key Component Reference ........................................................................................... 233

Key .............................................................................................................................................. 234Locator ........................................................................................................................................ 237LocatorByKey .............................................................................................................................. 237LocatorByOccurrence..................................................................................................................238

13. Project Properties.......................................................................240Properties versus Preferences .......................................................................................................... 240Setting the Project Properties............................................................................................................ 240Info Page ...........................................................................................................................................241Authentication Page .......................................................................................................................... 241Encoding Page .................................................................................................................................. 241External Tools Builders Page ............................................................................................................ 245Namespaces Page ............................................................................................................................ 245Output Control Page.......................................................................................................................... 245Project References Page................................................................................................................... 246XML Generation Page....................................................................................................................... 246

14. Running and Testing Projects ................................................... 249Color-Coding the Example Source .................................................................................................... 249Running in Conversion Agent Studio................................................................................................. 250

If the Output File is not Displayed ............................................................................................... 251Running on Additional Source Documents..................................................................................252

Viewing the Event Log....................................................................................................................... 252Event-Log Properties................................................................................................................... 252Event Display Preferences .......................................................................................................... 252Understanding the Event Log...................................................................................................... 253Using Named Components ......................................................................................................... 254Cross-Identifying Events ............................................................................................................. 255Effect of Failure Events ............................................................................................................... 255Opening a Conversion Agent Engine Event Log......................................................................... 256

Disabling a Component ..................................................................................................................... 257

Page 11: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide Contents

ix

15. Deploying Conversion Agent Services .....................................258Preparing a Project for Deployment .................................................................................................. 258

Setting the Startup Component ................................................................................................... 258Reviewing Development Configurations ..................................................................................... 259

Conversion Agent Repository............................................................................................................ 259Deploying a Service in a Development Environment ........................................................................ 260

Updating a Deployed Service...................................................................................................... 261Removing a Deployed Service .................................................................................................... 261

Deploying a Service to a Production Server ...................................................................................... 261Running a Service ............................................................................................................................. 262

Index.................................................................................................. 263

Page 12: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 1. Designing Conversion Agent Data Transformations

1

Designing Conversion Agent DataTransformations

Conversion Agent Studio is the design and configuration environment of the SAPConversion Agent system. Using Conversion Agent Studio, you can design andimplement transformations that operate on any kind of data.

This book, the Conversion Agent Studio User's Guide, is a complete learning andreference manual for designing Conversion Agent transformations. The bookcontains:

Instructions for using the Conversion Agent Studio tools and windows

Explanations of the Conversion Agent concepts

Details on how to use all the Conversion Agent components, such as parsers,serializers, transformers, mappers, anchors, and actions

Examples and tips on how to design transformations that work with manydifferent kinds of input and output

Instructions for deploying transformations that you have designed inConversion Agent Studio to the Conversion Agent Engine runtimeenvironment

Before You Continue

Before you continue in this book, we recommend that you read or skim the bookGetting Started with Conversion Agent. The Getting Started book introduces theConversion Agent concepts and working methods.

Conversion Agent Studio is hosted in the Eclipse development environment. Youshould examine the book Conversion Agent Studio in Eclipse, to learn about thewindows, menus, toolbars, etc., that are available in Eclipse.

Overview of Transformation Architecture

When you construct a Conversion Agent transformation, you build it in modularfashion from components of the Conversion Agent system. The components are

1

Page 13: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 1. Designing Conversion Agent Data Transformations

2

arranged in a hierarchy or tree, which you can view in the IntelliScript editor ofConversion Agent Studio.

The components work with input and output documents, and with the data holdersthat store Conversion Agent data.

This section provides a brief overview of the components and terminology that areused in this architecture. For detailed information about each component type, seethe following chapters of this book.

Conversion Agent Components

Top-Level Components

At the top level of the hierarchy, a Conversion Agent transformation can run aparser, serializer, mapper, or transformer. These components are defined asfollows:

ParserA components that converts source documents, which can be in any format, toXML.

SerializerA component that converts XML documents to output documents, which canbe in any format.

TransformerA component that modifies data. The input and output can be in any format.

MapperA component that converts XML documents to a different XML structure orschema.

Of these four component types, parsers and serializers are the most powerful andgenerally useful. By running a parser and serializer in sequence, for example, youcan convert any format to any other format. Using these components, you canperform conversions of unlimited complexity.

As top-level components, transformers are useful for relatively simple dataconversions, such as replacing predefined strings. Usually, the input and outputdocuments have the same, non-XML format. Because of this limitation,transformers are more often used as nested components, and not as top-levelcomponents (see below).

Please notice the distinction between transformation, which is the generic term for theoperations that Conversion Agent performs on data, and transformer, which is a specifictype of Conversion Agent component.

Nested Components

Within a parser, serializer, or mapper component, you can nest components suchas:

Page 14: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 1. Designing Conversion Agent Data Transformations

3

FormatsDefine the overall format of documents, such as the delimiters that ConversionAgent should use to interpret the documents.

Document processorsOperate on a document as a whole, performing preliminary conversionsbefore parsing, or final operations after serializing.

AnchorsDefine the data in a source document, which a parser should process andextract. The anchors specify how a parser should search for the data, andwhere it should store the data that it extracts.

Serialization anchorsDefine how a serializer should write XML data to an output document.Serialization anchors are the inverse of anchors; an anchor writes data from asource document to XML, whereas a serialization anchor writes data fromXML to an output document.

Mapper anchorsDefine how a mapper should write XML data to another XML structure orschema. The anchors specify where to find the data in the source XML, andwhere to write the data in the output XML.

ActionsPerform operations on data in the scope of a data transformation, for example,concatenating strings that a parser has extracted from a source document,summing numbers that a serializer finds in an XML input document, orquerying a database for additional data.

TransformersIn addition to their use as top-level components, you can nest transformerswithin a parser or a serializer. For example, within a parser, you can nest atransformer that modifies the output of the anchors.

As a nested component, a transformer operates on a portion of a document,and not on the complete document.

Indirectly, you can also nest parsers and serializers within each other. For example,within a parser, you can nest an action that runs another parser on a portion of thesame document or on a second document.

SubcomponentsIn addition to the main components that are described above, Conversion Agenthas a large number of subcomponents , which are used for special purposes withinthe main components. It is also possible to develop custom components, such ascustom document processors or custom transformers, to serve special needs.

Data Holders

Data holders are the XML elements, XML attributes, and variables that yourtransformations use for data storage.

Page 15: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 1. Designing Conversion Agent Data Transformations

4

The elements and attributes are defined in XSD schemas, which are standard XMLschema definitions. Conversion Agent uses XSD to define data holders, to help itprocess XML input, and to help it construct valid XML output.

The variables are defined in the Conversion Agent configuration, using XSD datatypes.

Data holders and XSD schemas are described in Chapter 6 of this book, DataHolders.

Documents

The input and output of a Conversion Agent transformation are called documents.

A document can have any size. It can contain any text or binary data. It can bestored or accessed in a file, buffer, stream, database, messaging system, or anyother location.

The input and output of a data transformation are called the source document andthe output document.

For a parser, the source document can have any format. A special source documentis the example source, which is the document that you use to teach the parser how toprocess other source documents. The output document of a parser has an XMLformat.

The source document of a serializer is XML, and the output document can haveany format. For a mapper, both the source and the output are XML. For atransformer, the source and output can have any format.

XML is the common language, which connects Conversion Agent transformationstogether. For example, you can run a parser that converts a source document fromany format to XML, and a serializer that converts the XML to any output format.By chaining the parser and serializer together, you can convert any input format toany output format.

Document formats are described in Chapter 5 of this book, Formats.

Conversion Agent Services

To configure a Conversion Agent transformation, you use Conversion AgentStudio. Then you should deploy the transformation as a Conversion Agent service.This lets Conversion Agent Engine execute the transformation.

A Conversion Agent service can run a parser, serializer, mapper, or a transformeras its top-level component.

For more information, see Chapter 15, Deploying Conversion Agent Services.

Page 16: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 1. Designing Conversion Agent Data Transformations

5

Project Architecture

A Conversion Agent data transformation is stored in a project. Each project has aproject folder, which contains the project files.

Required Project Files

There are two required types of files in a project:

CMW fileThe CMW file (which has a *.cmw filename) is the main project configurationfile. Every project contains exactly one CMW file.

TGP filesA TGP file (which has a *.tgp filename) contains a script, which defines theparsers, serializers, etc. that perform a data transformation. A project cancontain one or more TGP files.

The scope of a TGP file is the entire project. This means, for example, that aparser in one TGP file can use a variable that is defined in another TGP file.

You can import or copy the TGP files between projects. You can organize theproject components in different TGP files.

Additional Project Files

In addition to the CMW and TGP files, a project can contain many other files andfolders, for example:

XSD schemasMost Conversion Agent transformations require an XSD schema, whichdefines the XML elements and attributes that the transformation can use.Parsers, serializer, and mappers use XSD schemas to define the structure oftheir XML output or input.

The main schema is usually stored in the top-level project folder. If the mainschema includes other schema files, they are stored in a subfolder.

Results folderAs you develop and test a project, Conversion Agent Studio creates asubfolder, within the project folder, where it stores the parser or serializeroutput. By default, the name of the subfolder is Results.

When you deploy a service and run it in Conversion Agent Engine, you cancontinue to use the Results folder, or you can instruct the service to store itsoutput in other locations.

Example source documentTo design a parser, you typically select and configure the text in an examplesource document, which is a sample of the input that you want the parser toprocess.

Page 17: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 1. Designing Conversion Agent Data Transformations

6

You can provide the example source as a file, a text string, or any otherdocument type that Conversion Agent can process (see Documents above). Ifthe example source is a file, we recommend that you store it in the projectfolder. This helps keep the example source together with the project.

Test documentsIn addition to the example source, you can store other input documents thatyou use to test a parser or serializer.

Any other desired filesThe project folder is a convenient location to store any other files or foldersthat are associated with a project, for example, documentation or readme files.

Workflow for Designing Transformations

The following procedure is a summary of a typical workflow for using ConversionAgent Studio and developing a transformation. The example is a workflow thatyou might use to configure a new parser or serializer.

The main workflow steps are:

1. Analyze the transformation requirements, such as the required inputs andoutputs.

2. Create and configure a Conversion Agent project that implements thetransformation.

3. Deploy the transformation as a Conversion Agent service, which runs inConversion Agent Engine.

The following paragraphs provide more information on the steps.

Analyzing the Inputs and Outputs

The first step in any design or development project is to examine the inputs andoutputs carefully, and determine how they are related.

For a parser, some of the analysis steps are to:

Examine whether the document structure is amenable to parsing. Sometimes,a simple step such as converting the document to an alternative format (whichyou can do by applying a Conversion Agent document processor) makes thedocument much easier to parse.

Plan which data you need to extract from the source document, and whereyou will insert the extracted data in the XML. In Conversion Agent, youimplement the data extraction by using the Content anchor.

Analyze the structure of the source document and identify the features youcan use to locate the data fields. In Conversion Agent, these features translateto Marker anchors, or to various other types of anchors.

Page 18: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 1. Designing Conversion Agent Data Transformations

7

Find repetitive or structured features of the documents that may help extractthe data. You can implement such features using anchors such as Group,EnclosedGroup, RepeatingGroup, or DelimitedSections.

Decide whether you need to transform any of the data during or after theextraction process. The Conversion Agent components that operate on theextracted data are transformers and actions.

Determine whether there are any additional data sources, such as a linkeddocument or a database, that you need to access to prepare the output. Youcan access such data by using certain anchors, transformers, or actions.

For a serializer or a mapper, you can invert the steps: plan the data that you needto extract from the source XML, and where to insert it in the output document. It isusually easier to design a serializer or a mapper than a parser because the inputdata is fully structured XML.

Creating and Configuring a Project

After you analyze the inputs and outputs, it is time to implement the processingsteps in Conversion Agent. A typical procedure is as follows:

1. Create a new Conversion Agent project.

2. Add one or more XSD schemas to the project. The schemas must define theXML elements and attributes with which you will work.

3. Create a parser, serializer, etc., and define its properties.

For a parser, you must define an example source document and a formatcomponent. The format component can contain features such as a delimitersdefinition and transformers.

For a serializer, there are no required properties that you must set, but thereare some optional properties such as the extension of the output filename. Youcan also create a serializer automatically, by asking Conversion Agent Studioto invert a parser that you have already created.

The parser or serializer is displayed in the IntelliScript editor of ConversionAgent Studio. The example source is displayed in the example pane of theIntelliScript editor.

4. Configure the subcomponents of the parser or serializer.

For a parser, the main components are anchors. You can define the anchors byselect-and-click or drag-and-drop procedures, or you can edit the IntelliScript.Conversion Agent Studio helps you do this by highlighting and color-codingthe anchors in the example source.

For a serializer, the main components are serialization anchors, which you cancreate in the IntelliScript.

5. Use the Conversion Agent Studio tools to test and execute the datatransformation. Correct any configuration errors that you detect during thetesting.

Page 19: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 1. Designing Convers ion Agent Data Transformations

8

Deploying the Project as a Conversion Agent Service

When the parser or serializer is operating correctly, use the Deploy command ofConversion Agent Studio to deploy it as a Conversion Agent service, which canrun in Conversion Agent Engine.

How to Learn More

As you continue in this book, you will learn the details of all the Conversion Agentconcepts and procedures. The chapters are organized more or less according to theworkflow steps, which are outlined above.

Chapter Explains how to:

2. Using Conversion AgentStudio

Use Conversion Agent Studio in the Eclipse developmentenvironment

3. Parsers Create a parserConfigure parser features such as the example source

4.Document Processors Configure components that pre-process a document beforeparsing, or post-process a document after serializing

5. Formats Configure the format component of a parserDefine format components such as delimiters and defaulttransformers

6. Data Holders Use XSD schemas to define element and attribute data holdersDefine variables

7. Anchors Define the anchors that a parser uses to extract data from asource document

8. Transformers Use transformers to modify the data

9. Actions Use actions to operate on the data

10. Serializers Create a serializerDefine the serialization anchors that write the serialized data to anoutput document

11. Mappers Create a mapperDefine the mapper anchors that transfer data between the inputand output XML

12. Locators, Keys, andIndexing

Construct data transformations that retrieve, match, and storerecords by sequential, non-sequential, or keyed criteria

13. Project Properties Configure the properties of a project, such as the input and outputencoding and the XML validation features

Page 20: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 1. Designing Conversion Agent Data Transformations

9

Chapter Explains how to:

14. Running and TestingProjects

Test and troubleshoot a data transformation

15. Deploying ConversionAgent Services

Deploy a data transformation as a service that runs in ConversionAgent engine

Online Samples

As you use this book, you can view online samples that illustrate many ConversionAgent features. The samples are Conversion Agent projects, which are located inthe Samples folder under your main Conversion Agent installation folder (bydefault, c:\Program Files\SAP\ConversionAgent\samples). If you do not have aSamples folder, run the Conversion Agent setup again and select the option toinstall the samples.

To view the samples, you should import the projects to Conversion Agent Studio.For instructions on how to do this, see the book Conversion Agent Studio in Eclipse.

In addition to the project samples, you can find sample program code for featuressuch as custom processors and transformers.

Contacting Product Support

If you experience any problems or have questions about Conversion Agent, youshould first check the documentation for information.

If you do not find the information that you need, please contact SAP support. Ifyou have a question about a particular project, please send a zip file containing:

The project folder

Any additional required files, such as the example source document

The execution log file

Page 21: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 2. Using Conversion Agent Studio

10

Using Conversion Agent Studio

Conversion Agent Studio is the design and configuration environment ofConversion Agent. You use it to develop and edit Conversion Agent projects.

If you have done the exercises in the book Getting Started with Conversion Agent,you already have experience using Conversion Agent Studio. The exercises teachmany aspects of the Conversion Agent Studio operation.

Conversion Agent Studio for Eclipse

Conversion Agent Studio runs as a plug-in module, which is hosted in the Eclipsedevelopment environment.

Eclipse is a versatile platform, which is designed to support both Java developmentand plug-in development tools. Conversion Agent Studio works seamlessly withinEclipse, letting you develop Conversion Agent projects for all purposes.

The Eclipse platform is supplied at no additional cost with the Conversion Agentsoftware. You do not need any previous experience with Eclipse in order to useConversion Agent Studio.

Conversion Agent Studio for Eclipse

2

Page 22: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 2. Using Conversion Agent Studio

11

For Detailed Information

For detailed information and complete instructions on using Conversion AgentStudio within the Eclipse environment, please see the manual Conversion AgentStudio in Eclipse. That manual explains how to all the Conversion Agent Studiofeatures, such as the views, editors, menus, and toolbars.

The Conversion Agent Studio in Eclipse manual does not discuss the functionalcomponents that you can insert in Conversion Agent projects, such as parsers,serializers, data holders, anchors, transformers, etc. For that information, pleaseread the following chapters of this book.

Page 23: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 3. Parsers

12

Parsers

Parsers are the Conversion Agent components that convert a source document toXML.

The output of a parser is always XML. The input can have any format, for example,text, HTML, Word, PDF, or HL7. The input can even be an XML document, whichthe parser processes as string data (this is different from a serializer or a mapper,which treats its input as an XML object).

This chapter provides basic instructions on how to create and run a parser. Thedetails, such as how to support specific document formats, how to define theanchors that process the text of a source document, etc., are in the succeedingchapters.

For tutorial exercises on how to define and run a parser, see the book GettingStarted with Conversion Agent.

Creating a Parser

You can create a parser by either of the following methods:

By using the New Parser wizard, or

By editing the IntelliScript and inserting a Parser component

Using the New Parser Wizard

The easiest way to create a parser is by using the New Parser wizard. The wizardlets you set up the basic parser configuration, with typical options.

After you finish the wizard, you can edit the parser properties as required. Nestedwithin the parser, you can insert components such as document processors,anchors, transformers, and actions, which perform the parsing operations. Thefollowing chapters of this book explain how to do this.

Opening the Wizard

To create a new project that contains a parser, choose File > New > Project on theConversion Agent Studio menu. In the left pane of the New Project window, selectConversion Agent. In the right pane, select Parser Project.

3

Page 24: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 3. Parsers

13

To create a new parser in an existing project, choose File > New > Parser on themenu.

On each page of the wizard, enter the desired options. Click the Next button toadvance to the following page. At any stage, you may click Finish to skip thesubsequent pages and accept the defaults for those pages.

Wizard OptionsThe New Parser wizard prompts you for options such as the following:

Project nameAn identifier for the project.

Project contentsThe storage location of the project folder. The default is the Eclipse workspacefolder.

Parser nameA name for the parser.

Script nameA name for a TGP script file, where the wizard stores the parser definition.

Schema file path(Optional) The name of an XSD schema, which defines the data holders wherethe parser will store its output.

If you omit this step, you can add an XSD schema to the project afterwards, byright-clicking in the Conversion Agent Explorer view.

Source type, source pathSelect the example source document, which you will use to configure theparser.

The example source should illustrate all the features (or as many features aspossible) that you expect the parser to process in production documents.

Choose the example source carefully. After you configure a parser, it may bedifficult to change the example source.

You may select the following types of example source:

File: Browse to a file on the local computer or network.URL: Specify the URL of a document on a network or web site.Text: Type a text string, which the parser will use as an example source. You

might use this option, for example, for short source documents that consistof a single text line.

None: Do not use an example source. In that case, you must configure theanchors completely in the IntelliScript, instead of parsing by example.

Content typeSelect the content type of the source documents, for example, ASCII or Binary.If you don't find the exact document type, you can refine your choice later byediting the IntelliScript.

Page 25: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 3. Parsers

14

Document preprocessor(Optional) Select a document processor, which converts the source documentsto a type that is amenable for parsing.

The wizard suggests processors that seem appropriate for the content type.For example, if you select a Microsoft Word example source, the wizardsuggests processors that converts Word documents to text, HTML, or XMLformats.

Afterwards, you can edit the IntelliScript and select among the full set ofprocessors. For more information, see Chapter 4, Document Processors.

FormatSelect the format of the source documents, for example, tab-delimited orHTML.

The wizard suggest formats that seem appropriate for the content type. If youselected a document processor, the format is that of the processor output,rather than the original source document.

After you complete the wizard, you can edit the IntelliScript and select amongthe full range of formats. For information, see Chapter 5, Formats.

Completing the Parser Configuration

After you have entered the wizard options, click Finish to create the parser. Theparser is displayed in the Conversion Agent Explorer view and the Componentview of Conversion Agent Studio.

To complete the parser configuration:

1. Display the parser in the IntelliScript editor. You can do this by double-clicking the parser name in the Component view, or by double-clicking theTGP script file in the Conversion Agent Explorer.

IntelliScript editor

2. Under the contains line, add a sequence of nested anchors and actions. Fordetailed instructions, see the following chapters of this book.

Page 26: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 3. Parsers

15

3. Run and test the parser (see Running a Parser below), and modify theIntelliScript as required.

Creating a Parser by Editing the IntelliScript

Instead of using the New Parser wizard, you can create a parser by editing theIntelliScript directly. The result is identical to a parser created by using the wizard.

1. At the top (global) level of the IntelliScript, select the three-dots (...) symbol.Press Enter and type a name for the parser.

2. To the right of the name, press Enter. Select a Parser component from thedrop-down list.

3. Expand the tree under the Parser component. Assign its properties, such asthe example_source and the format. For an explanation of the properties, seethe Parser Component Reference below.

4. Under the contains line, add a sequence of nested anchors and actions. Fordetailed instructions, see the following chapters of this book.

5. Run and test the parser (see Running a Parser below), and modify theIntelliScript as required.

Running a Parser

To run a parser in Conversion Agent Studio, follow this procedure. For additionalinformation, see Chapter 14, Running and Testing Projects.

1. In the IntelliScript editor or in the Component view, right-click the parser thatyou want to run, and choose Set as Startup Component.

Alternatively, you can set the startup component in the Run > Run command.

Page 27: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 3. Parsers

16

2. On the Conversion Agent Studio menu, choose the Run > Run command orthe Run > Run MyParser command (where MyParser is the name of the parserthat you have set as the startup component).

3. After a few seconds, Conversion Agent Studio displays the Events view.Examine it for any warnings, failures, or errors.

4. To display the parsing results, double-click the file Results\output.xml in theConversion Agent Explorer view.

If you are using Windows XP SP2 or higher, Internet Explorer may display a warningabout active content in the XML results file. You can safely ignore the warning.

Platform-Independent Parsers

Conversion Agent runs on both Windows and Unix systems. Most parser featuresrun equally well on both platforms.

There are a few exceptions to this rule, however. If you plan to run a parser onboth Windows and Unix, here are a few tips that can help ensure platformindependence.

Document ProcessorsUse document processors that do not have platform-specific system requirements.For information, see Chapter 4, Document Processors.

For example, use the ExcelToXml processor instead of ExcelToHTML . The former isplatform-independent; the latter requires that Microsoft Excel be installed on thecomputer.

Custom Components

Conversion Agent supports custom document processors, transformers, andactions. You should use platform-independent versions of the custom components,such as:

ExternalJavaPreProcessor (programmed in Java)

Page 28: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 3. Parsers

17

ExternalPreProcessor or ExternalTransformer (programmed in C++ andcompiled for both Windows and Unix)

Do not use ExternalCOMPreProcessor and ExternalCOMAction components, whichare supported only on Windows.

Newline Markers

When defining Marker anchors, avoid platform-specific newline sequences such as\n\r (a newline character followed by a carriage return character, which iscommonly used in Windows).

Instead, configure a Marker with the built-in NewlineSearch component, whichsearches for both the \n\r sequence and the \n or \r character alone.

EncodingConfirm that the input, output, and working encoding are supported on theplatforms. For lists of the supported encodings, see Chapter 13, Project Properties.

File Paths

Use relative (as opposed to absolute) file paths. Remember that file paths on Unixare case-sensitive.

Parser Component Reference

The main Parser component is documented in this section. For subcomponentssuch as input ports, see the following sections of this chapter.

Parser

A Parser is a component that converts a source document to XML.

A Parser contains many nested components. Directly under the contains line ofthe Parser , you can nest anchors and actions. In other locations, under variousParser properties, you can assign components such as formats, delimiters,document processors, and transformers. For detailed information about all thesecomponents, see the following chapters of this book.

Example

The following is an example of a parser that processes tab-delimited textdocuments. For a complete description of this parser, see the chapter Basic ParsingTechniques in the book Getting Started with Conversion Agent.

Page 29: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 3. Parsers

18

Basic Propertiesexample_source

The example source document, which you use to configure the parseroperation. The document should be representative of the source documentsthat the parser will process. (For a discussion of the kinds of documents thatConversion Agent can process, see Documents in Chapter 1, DesigningConversion Agent Data Transformations.)

The value of the property is an input port such as LocalFile (a file on yourcomputer) or Text (a text string).

When you create a new Parser, you can assign this property in the New Parserwizard. Alternatively, you can edit the property in the IntelliScript.

Nested within this property, you can assign a preprocessor, which convertsthe source documents to a format that the parser can accept (see Chapter 4,Document Processors).

If you leave the example_source property blank, the parser does not use anexample source. In that case, you must configure the anchors completely in theIntelliScript, instead of learning by example.

To view the document in the example pane, choose IntelliScript > OpenExample Source on the menu, or right-click the parser and choose OpenExample Source.

formatThis property lets you specify the format of the source document, such aswhether the document contains text, HTML, or binary code; the delimitersthat separate data fields in the document; a format preprocessor thatConversion Agent should apply to the document; and transformers that theparser should apply by default to all Content anchors in the parser.

When you create a new Parser, you can assign this property in the New Parserwizard. You can customize the format by editing the IntelliScript.

Page 30: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 3. Parsers

19

For details of the available format components, see Chapter 5, Formats.

Advanced Propertiesreject_recurring_pages

If selected, the parser does not parse the same page twice in the sameexecution. This is useful, for example, if a parser is following the links on aweb site, and you want to prevent it from parsing duplicate links to the samepage.

The ResetVisitedPages action resets the history list and allows a parser toprocess a page again, even if reject_recurring_pages is selected (see Chapter9, Actions). You might do this, for example, if you want to post different inputdata to the same web page.

no_initial_phaseIf selected, the parser runs without an initial phase. Components that areconfigured to run in the initial phase run in the main phase, instead (see How aParser Searches for Anchors in Chapter 7, Anchors).

sources_to_extractA specification of the source documents that the parser should process.

The value of the property is an input port such as LocalFile (a file on yourcomputer) or Text (a text string). To specify multiple sources, you can assignthe value DocList (a list of files) or FileSearch (a wildcard search for files).

If you assign sources_to_extract, and you run the parser in ConversionAgent Studio, the parser processes the specified documents. If you leavesources_to_extract blank, the parser processes the example_source.

When you deploy a parser as a service, an application that runs the service canoverride sources_to_extract. See the Conversion Agent Engine Developer'sGuide for details.

serialization_modeThis property specifies how the Create Serializer command should process theportions of the example source that the parser does not output to XML, whenyou create a serializer from a parser. For a full explanation, see Controlling Howthe Create Serializer Command Works in Chapter 10, Serializers.

The possible values of the serialization_mode are:

serialization_mode Explanation

Full The Create Serializer command copies the non-XML text to theserializer configuration.

Page 31: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 3. Parsers

20

serialization_mode Explanation

Outline The Create Serializer command copies only the delimiters of thenon-XML text to the serializer configuration.Under the Outline option, you can select the use_markersoption. This causes the Create Serializer command to copy thecontent of the Marker anchors but only the delimiters of other non-XML text.

nameA name that you assign to the parser. The name is used in the event log.

remarkA comment describing the parser.

example_valuesThis property contains simulated values (ExampleValue components) thatanother parser might pass to this parser. The property is useful whendesigning a parser that is to be activated by another parser. Conversion Agentuses the property only when it learns the example source; it ignores theproperty when it parses a source document.

Under this property, specify the data holders (see Chapter 6, Data Holders) thatthe main parser passes to this parser, and their simulated values.

The following properties are useful in situations where the parser must selectspecific occurrences of data holders. For an explanation, see Chapter 12, Locators,Keys, and Indexing.

source

target

Online Samples

In the Samples folder, you can find many online samples of parsers.

For online tutorial exercises on creating parsers, see Getting Started with ConversionAgent.

Input Port Component Reference

An input port is a component that specifies an input to Conversion Agent, such as asource document. The input can be a document that is stored on the localcomputer, on a network, or in a string. The input can also be a list of documents.

In a Parser, the values of the example_source and sources_to_extract propertiesare input ports.

Page 32: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 3. Parsers

21

DocList

A document list.

This input port is used, for example, in the sources_to_extract properties of aParser. It lets you specify multiple source documents that the parser shouldprocess.

Within this component, you can nest multiple input ports such as FileSearch,LocalFile, or Text, each of which specifies a single file.

FileSearch

The criteria for a file search.

This input port is used, for example, in the sources_to_extract properties of aParser. It lets you specify source documents using wildcards.

Basic Propertiesdirectory

The folder to be searched.

wildcardThe search criterion. You may use * as a wildcard character. For example,*.txt finds all text files. The default is *.*, which finds all files in thedirectory.

Advanced Properties

recursiveIf selected, the search includes subfolders of the specified directory.

pre_processorThe name of a preprocessor that the parser should apply to the files (seeChapter 4, Document Processors).

LocalFile

A file on the local computer.

This input port is used, for example, in the example_source andsources_to_extract properties of a Parser.

Basic Propertiesfile_name

Browse to the file.

Page 33: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 3. Parsers

22

Advanced Properties

simulated_urlA URL that Conversion Agent should assign to the file. This property instructsConversion Agent to treat the file as if it were located on a web server. If thefile contains relative links, Conversion Agent resolves the links relative to theURL.

The host-name portion of the URL is not case sensitive. Internally, ConversionAgent processes HTTP host names as lower case.

pre_processorThe name of a preprocessor that the parser should apply to the files (seeChapter 4, Document Processors).

Text

A text string.

This input port is used, for example, in the example_source andsources_to_extract properties of a Parser.

Basic Propertiesquote

The text string.

Advanced Properties

simulated_urlA URL that Conversion Agent should assign to the string. This propertyinstructs Conversion Agent to treat the string as if it were a file located on aweb server. If the string contains relative links, Conversion Agent resolves thelinks relative to the URL.

pre_processorThe name of a preprocessor that the parser should apply to the files (seeChapter 4, Document Processors).

sizeA static size for the text buffer. This property is typically used when workingwith binary sources. The default is -1, which means that the buffer isdynamically sized.

URL

The URL of a document that is available on a web server.

This input port is used, for example, in the example_source andsources_to_extract properties of a Parser.

Page 34: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 3. Parsers

23

Basic Properties

stable_urlThe URL address, for example, http://www.example.com/index.html.

The host name (www.example.com) is not case sensitive. Internally, ConversionAgent processes HTTP host names as lower case.

Advanced Properties

post_dataData that the parser should post to the URL. To determine the correct formatof the data string, you can use the technique described in the SubmitFormaction.

retriesIf the parser cannot access the URL on the first attempt, the number of retriesthat it performs before reporting a failure (default = 0).

seconds_to_waitThe number of seconds to wait between retries (default = 60).

pre_processorThe name of a preprocessor that the parser should apply to the files (seeChapter 4, Document Processors).

Page 35: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 4. Document Processors

24

Document Processors

Document processors are components that convert the format of a completedocument, to another format that is desired for processing.

You can use a document processor as a pre-processor, which converts the format of asource document prior to a data transformation. For example, if the sourcedocument of a parser is in the PDF format, you might apply the PdfToTxt_3_00processor. This converts the source document to text, which is much easier to parsethan the binary PDF format.

Do not confuse document processors with format preprocessors, which are described inChapter 5, Formats.

Installation

The document processors are supplied in an optional setup component. If youwish to use the processors, be sure to select the option to install the DocumentProcessors when you run the Conversion Agent setup.

Additional processors, which are not documented in this chapter, may be available. Forinformation about processors that work with PostScript, Microsoft Project, and otherdocument types, please contact SAP support.

Defining Document Processors

You can define a document processor that pre-processes the source document of aparser, serializer, or mapper. To do this:

1. Assign the example_source property of the data transformation (if it isn'talready assigned).

2. The value of the example_source is an input port, such as LocalFile or Text(see the Input Port Component Reference in Chapter 3, Parsers).

3. Assign the pre_processor property of the input port.

Conversion Agent applies the processor that you define under example_source toall sources on which you run the data transformation.

4

Page 36: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 4. Document Processors

25

Note that the New Parser wizard automates this procedure for you. It prompts youto define an example source document and a pre-processor.

You can also define a pre-processor in the sources_to_extract property of aparser. The processor that you define there applies only to the source documentsthat you define in sources_to_extract, and not to any other document that theparser processes.

Display of Document Processor Output

If you assign a document processor to the example source, the example pane of theIntelliScript editor displays the processor output.

Document Processor Quick Reference

AFPToXMLConverts the IBM Advanced Function Presentation print-stream format toXML.

ExcelToDataXmlConverts Microsoft Excel data to XML, without preserving Excel formulas,formatting, or code.

ExcelToHtmlConverts Microsoft Excel documents to HTML.

ExcelToTextMLConverts Microsoft Excel files to the TextML XML schema.

ExcelToTxtConverts Microsoft Excel documents to plain text.

ExcelToXmlConverts Microsoft Excel documents to XML, while preserving the Excelformulas, formatting, and optionally macro code.

ExpandFrameSetOpens an HTML frameset, letting a parser run on the content of the frames.

ExternalCOMPreProcessorRuns a custom document processor, implemented as a COM DLL.

ExternalJavaPreProcessorRuns a custom document processor, implemented in Java.

ExternalPreProcessorRun a custom document processor, implemented as a C++ DLL.

PdfToTxt_3_00Converts Adobe Acrobat (PDF) documents to plain text.

Page 37: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 4. Document Processors

26

PowerpointToHtmlConverts Microsoft PowerPoint documents to HTML.

PowerpointToTextMLConverts Microsoft PowerPoint presentations to the TextML XML schema.

ProcessByTransformersRuns transformers as document processors.

ProcessorPipelineRuns a sequence of document processors on a single document.

RtfToTextMLConverts RTF files to the TextML XML schema.

WordPerfectToTextMLConverts Corel WordPerfect documents to the TextML XML schema.

WordToHtmlConverts Microsoft Word documents to HTML.

WordToRtfConverts Microsoft Word documents to RTF.

WordToTextMLConverts Microsoft Word files to the TextML XML schema.

WordToTxtConverts Microsoft Word documents to plain text.

WordToXmlConverts Microsoft Word documents to XML.

XmlToExcelConverts XML documents to Microsoft Excel.

Document Processor Component Reference

This section describes the document processor components, which are available inConversion Agent.

AFPToXML

This document processor converts the IBM Advanced Function Presentation print-stream format to XML.

The processor output is in the UTF-8 encoding. Therefore, if a parser receives inputfrom the processor, you must set the input encoding of the project to UTF-8 (seeEncoding Page in Chapter 13, Project Properties).

Page 38: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 4. Document Processors

27

ExcelToDataXml

This document processor converts Microsoft Excel documents (version 97 orhigher) to XML.

The XML contains the data and the results of formulas that existed in the originalExcel document. It does not preserve the formulas themselves, formattinginformation, or macro code. In cases where the latter information is required, youshould use ExcelToXml rather than ExcelToDataXml.

The XML representation conforms to a subset of the ExcelToXml.xsd schema,which you can find in the doc subdirectory of the Conversion Agent installationdirectory. The schema file is provided for your information; you can use theprocessor without adding the schema to your project.

The processor output is in the UTF-8 encoding. Therefore, if a parser receives inputfrom the processor, you must set the input encoding of the project to UTF-8 (seeEncoding Page in Chapter 13, Project Properties).

The processor accesses its input directly, not via Microsoft Excel. You do not needto install Excel on the computer.

The processor is implemented in Java. If you experience any difficulty using theprocessor, confirm that you have configured the Java Runtime Environmentcorrectly (see the System Requirements and Testing and Troubleshooting chapters ofthe Conversion Agent Administrator's Guide).

ExcelToHtml

This document processor converts Microsoft Excel documents to HTML.

The processor uses the Excel save-as-HTML feature to perform the conversion.Therefore, it operates only on a Microsoft Windows platform where Excel (version97 or higher) is installed. Due to Excel limitations, the processor does not supportmultithreading.

ExcelToTextML

This document processor converts Microsoft Excel files (version 97 or higher) to theTextML XML schema.

The processor accesses its input directly, not via Microsoft Excel. You do not needto install Excel on the computer.

The processor is implemented in Java. If you experience any difficulty using theprocessor, confirm that you have configured the Java Runtime Environmentcorrectly (see the System Requirements and Testing and Troubleshooting chapters ofthe Conversion Agent Administrator's Guide).

ExcelToTxt

This document processor converts Microsoft Excel documents to plain text.

Page 39: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 4. Document Processors

28

The processor uses the Excel save-as-text feature to perform the conversion.Therefore, it operates only on a Microsoft Windows platform where Excel (version97 or higher) is installed. Due to Excel limitations, the processor does not supportmultithreading.

ExcelToXml

This document processor converts Microsoft Excel documents (version 97 orhigher) to XML.

The XML preserves the data, formulas, formatting, and optionally the macro codethat existed in the original Excel document. If only the data is required, you shouldconsider using the ExcelToDataXmlprocessor, which offers smaller output andbetter performance.

The XML representation conforms to the ExcelToXml.xsd schema, which you canfind in the doc subdirectory of the Conversion Agent installation directory. Theschema file is provided for your information; you can use the processor withoutadding the schema to your project.

The processor output is in the UTF-8 encoding. Therefore, if a parser receives inputfrom the processor, you must set the input encoding of the project to UTF-8 (seeEncoding Page in Chapter 13, Project Properties).

The processor accesses its input directly, not via Microsoft Excel. You do not needto install Microsoft Excel on the computer.

The processor is implemented in Java. If you experience any difficulty using theprocessor, confirm that you have configured the Java Runtime Environmentcorrectly (see the System Requirements and Testing and Troubleshooting chapters ofthe Conversion Agent Administrator's Guide).

Basic Properties

include_sheetsDefines the sheets of the Excel workbook to include in the XML. In the XML,each sheet is represented by a <sheet> element.

In the list under this property, you may enter any of the following values:

All: includes all sheetsThe sheet namesData holders, which contain the sheet names

If you list a sheet that doesn't exist in the workbook, the processor generates a<sheet> element containing a warning message. The other sheets areprocessed normally.

include_empty_cellsDeselect this property to omit empty cells from the XML.

include_macro_informationSelect this property to include Excel macro code in the XML.

Page 40: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 4. Document Processors

29

ExpandFrameSet

This document processor opens all the frames of an HTML document.

This processor is appropriate if the source document of a parser is an HTMLframeset. The parser runs on the content of all the frames.

ExternalCOMPreProcessor

This component lets you run a custom document processor.

Because this component uses the Microsoft COM architecture to activate theprocessor, it runs only on Microsoft Windows platforms.

Creating a Custom COM ProcessorYou should program the custom processor as an ActiveX DLL, containing thefollowing function:

function pre_process(ByVal in_file As String) As String

Here, in_file is the content of the source document. The function returns theprocessed text.

Register the DLL on the Conversion Agent computer.

You may then use the DLL in the ExternalCOMPreProcessor component.

Optionally, you can add the processor to the drop-down component list thatConversion Agent displays (for instructions, see the chapter about Using theIntelliScript Editor in the book Conversion Agent Studio in Eclipse).

Basic Properties

ProgIDThe ProgID of the class.

ExternalJavaPreProcessor

This component lets you run a custom document processor, which is implementedin Java.

The processor requires a Java Runtime Environment. If you experience anydifficulty using the processor, confirm that you have configured the Java RuntimeEnvironment correctly (see the System Requirements and Testing and Troubleshootingchapters of the Conversion Agent Administrator's Guide).

This component is supported for backwards compatibility with existing custom processors.For new custom processors, we recommend using the approach described in the chapter onExternal Components in the Conversion Agent Engine Developer's Guide.

Page 41: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 4. Document Processors

30

Creating a Custom Java Processor

To implement the custom document processor:

1. Create a new Java project and package, for example, namedMyJavaPreprocessor.

2. Create a class, for example, named JavaDemoPreprocessor.

3. In the class, define a method having the following syntax. The method canhave any name.

public static String main(String input_file, String output_file)

The input_file parameter is the path of the source document, on which theprocessor should operate. The output_file parameter is the path of atemporary file, where the processor should write its output.

The function should return an appropriate extension, which Conversion Agentappends to the name of the temporary file. For example, if the output of theprocessor is XML, the function should return the string "xml".

4. Create a jar file containing the class.

5. Store the jar file in the externLibs\user subfolder of your Conversion Agentinstallation folder.

You may then use the jar file in the ExternalJavaPreProcessor component.

Optionally, you can add the processor to the drop-down component list thatConversion Agent displays (for instructions, see the chapter about Using theIntelliScript Editor in the book Conversion Agent Studio in Eclipse).

Example

The following sample is the source code of a processor that repairs numeric valuesby removing commas between the numbers. You can copy the sample code intoyour implementation and edit it as required.

package MyJavaPreprocessor;

import java.io.*;

public class JavaDemoPreprocessor {

private static final int MAX_SIZE = 4096;public static String main(String input_file, String output_file){

try {FileInputStream in = new FileInputStream(input_file);FileOutputStream out = new FileOutputStream(output_file);int bytes_read=0;while(bytes_read != -1){

byte [] in_buf = new byte[MAX_SIZE];byte [] out_buf= new byte[MAX_SIZE];

Page 42: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 4. Document Processors

31

bytes_read = in.read(in_buf);int j = 0;for (int i=1;i<bytes_read;i++){

if (in_buf[i] == ','){

if (Character.isDigit((char)in_buf[i-1]) &&Character.isDigit((char)in_buf[i+1]))

{// Do Nothing

}else

out_buf[j++] = in_buf[i];}else

out_buf[j++] = in_buf[i];}out.write(out_buf, 0, j);in.close();out.close();

}} catch (FileNotFoundException e) {

e.printStackTrace();} catch (IOException e) {}//return output file extension typereturn “txt”;

}}

Basic Properties

jclassThe path of the Java class, for example,MyJavaPreprocessor/JavaDemoPreprocessor.

jmethodThe method to run, for example, main.

Online Sample

For an online sample of the Java code, which is similar to the above example, seethe following file in the Conversion Agent installation folder:

Samples\SDK\ExternalPreprocessor\External_JavaPreprocessor.java

ExternalPreProcessor

This component lets you run a custom document processor, which is implementedas a C++ DLL.

Page 43: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 4. Document Processors

32

This component is supported for backwards compatibility with existing custom processors.For new custom processors, we recommend using the approach described in the chapter onExternal Components in the Conversion Agent Engine Developer's Guide.

Creating a Custom Processor

To create a processor that is suitable to run with the ExternalPreProcessorcomponent, follow these steps.

The instructions are for the Microsoft Visual C++ compiler, running on a MicrosoftWindows platform. For compilation instructions on non-Windows platforms,please contact SAP support.

1. Copy the online-sample file External_Preprocessor.cpp. You can find the fileunder the Conversion Agent installation folder, at the location:

Samples\SDK\ExternalPreprocessor\External_Preprocessor.cpp

2. Using the Visual C++ compiler, create a Win32 dynamic-link library project,and insert the C++ file into the project.

3. The C++ file contains the following function, which implements thepreprocessing:

declspec(dllexport) bool process_buffer(istream& in, ostream& out)

In the sample implementation, the function repairs numeric values, removingcommas between the values. Replace the sample code with yourimplementation.

4. Compile the DLL.

5. Store the DLL in the externLibs\user subfolder of your Conversion Agentinstallation folder.

You may then use the DLL in the ExternalPreProcessor component.

Optionally, you can add the processor to the drop-down component list thatConversion Agent displays (for instructions, see the chapter about Using theIntelliScript Editor in the book Conversion Agent Studio in Eclipse).

Basic Properties

import_dllBrowse to the custom processor DLL in the ExternLib\Users folder.

PdfToTxt_3_00

This document processor converts Adobe Acrobat (PDF) files to text.

The processor output is in the UTF-8 encoding. Therefore, if a parser receives inputfrom the processor, you must set the input encoding of the project to UTF-8 (seeEncoding Page in Chapter 13, Project Properties).

The processor does not require any Adobe Acrobat software on the computer.

Page 44: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 4. Document Processors

33

In the New Parser wizard, this processor is called PDF to Unicode (UTF-8).

If you open a project that was created in a previous Conversion Agent version, you mayobserve that it uses the older PdfToTxt_2_02 processor, which generates ASCII output.PdfToTxt_2_02 is supplied for backwards compatibility only; in new projects, you shoulduse PdfToTxt_3_00.

PowerpointToHtml

This document processor converts Microsoft PowerPoint documents to HTML.

The processor uses the PowerPoint save-as-HTML feature to perform theconversion. Therefore, it operates only on a Microsoft Windows platform wherePowerPoint (version 97 or higher) is installed. Due to PowerPoint limitations, theprocessor does not support multithreading.

PowerpointToTextML

This document processor converts Microsoft PowerPoint (*.ppt) presentations(version 97 or higher) to the TextML XML schema.

The processor accesses its input directly, not via PowerPoint. You do not need toinstall PowerPoint on the computer.

The processor is implemented in Java. If you experience any difficulty using theprocessor, confirm that you have configured the Java Runtime Environmentcorrectly (see the System Requirements and Testing and Troubleshooting chapters ofthe Conversion Agent Administrator's Guide).

ProcessByTransformers

This component lets you run transformers as document processors.

The component runs a transformer or a sequence of transformers on the entiredocument (instead of the normal transformer usage, which is to run on the textretrieved by an anchor). The parser then operates on the output of thetransformers.

Conversion Agent offers a large number of transformers (see Chapter 8,Transformers). Hence, the ProcessByTransformers component greatly expands theset of processing operations that you can apply to a document.

Basic Propertiestransformers

The sequence of transformers that the component should run.

Page 45: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 4. Document Processors

34

ProcessorPipeline

This component lets you run a sequence of document processors on a document.The parser runs on the output of the sequence.

Within this component, enter the sequence of processors.

RtfToTextML

This document processor converts RTF files to the TextML XML schema.

The processor output is in the UTF-8 encoding. Therefore, if a parser receives inputfrom the processor, you must set the input encoding of the project to UTF-8 (seeEncoding Page in Chapter 13, Project Properties).

WordPerfectToTextML

This document processor converts Corel WordPerfect documents to the TextMLXML schema.

The processor output is in the UTF-8 encoding. Therefore, if a parser receives inputfrom the processor, you must set the input encoding of the project to UTF-8 (seeEncoding Page in Chapter 13, Project Properties).

WordPerfect does not need to be installed on the computer.

WordToHtml

This document processor converts Microsoft Word documents to HTML.

The processor uses the Word save-as-HTML feature to perform the conversion.Therefore, it operates only on a Microsoft Windows platform where Word (version97 or higher) is installed. Due to Word limitations, the processor does not supportmultithreading.

Online SampleFor a tutorial exercise illustrating how to use this processor, see the chapter ParsingWord and HTML Documents in Getting Started with Conversion Agent.

WordToRtf

This document processor converts Microsoft Word documents to RTF.

The processor uses the Word save-as-RTF feature to perform the conversion.Therefore, it operates only on a Microsoft Windows platform where Word (version97 or higher) is installed. Due to Word limitations, the processor does not supportmultithreading.

Page 46: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 4. Document Processors

35

WordToTextML

This document processor converts Microsoft Word files (version 97 or higher) tothe TextML XML schema.

The processor accesses its input directly, not via Microsoft Word. You do not needto install Word on the computer.

The processor is implemented in Java. If you experience any difficulty using theprocessor, confirm that you have configured the Java Runtime Environmentcorrectly (see the System Requirements and Testing and Troubleshooting chapters ofthe Conversion Agent Administrator's Guide).

WordToTxt

This document processor converts Microsoft Word documents to plain text.

The processor uses the Word save-as-text feature to perform the conversion.Therefore, it operates only on a Microsoft Windows platform where Word (version97 or higher) is installed. Due to Word limitations, the processor does not supportmultithreading.

The output is encoded according to the system code page.

WordToXml

This document processor converts Microsoft Word documents (version 97 orhigher) to XML.

The processor output is in the UTF-8 encoding. Therefore, if a parser receives inputfrom the processor, you must set the input encoding of the project to UTF-8 (seeEncoding Page in Chapter 13, Project Properties).

The processor accesses its input directly, not via Microsoft Word. You do not needto install Word on the computer.

The processor is implemented in Java. If you experience any difficulty using theprocessor, confirm that you have configured the Java Runtime Environmentcorrectly (see the System Requirements and Testing and Troubleshooting chapters ofthe Conversion Agent Administrator's Guide).

XmlToExcel

This document processor converts XML documents to Microsoft Excel format.

The processor operates on an XML representation of an Excel workbook. The XMLrepresentation must be in the UTF-8 encoding and it must conform to theExcelToXml.xsd schema. You can find the schema in the doc subdirectory of theConversion Agent installation directory. The schema file is provided for yourinformation; you can use the processor without adding the schema to your project.

Page 47: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 4. Document Processors

36

The processor reverses the operation of ExcelToXml. For example, you can useExcelToXml to convert an Excel workbook to XML. You can then alter some of theXML data and use XmlToExcel to convert the data back to an Excel workbook.

The processor writes its output directly, not via Microsoft Excel. You do not needto install Excel on the computer.

The processor is implemented in Java. If you experience any difficulty using theprocessor, confirm that you have configured the Java Runtime Environmentcorrectly (see the System Requirements and Testing and Troubleshooting chapters ofthe Conversion Agent Administrator's Guide).

TextML XML Schema

Some of the document processors convert documents to an XML vocabulary calledTextML. This is a simple XML vocabulary for saving document content withoutlayout.

The following is a sample TextML file.

<?xml version="1.0" encoding="UTF-8"?><document>

<docinfo><title>TextML Sample</title><author>Tex Tomiller</author><company>Acme Gizmos, Inc.</company><modified>2004-03-14T14:39:00</modified><created>2004-03-12T09:15:00</created><last_author>Tex Tomiller</last_author><word_count>16</word_count><char_count>105</char_count><version>2</version>

</docinfo><docbody>

<p>This is a sample of the TextML XML vocabulary.</p><p>TextML saves document content without layout information.<p>

</docbody></document>

You can find the TextML schema (textML.xsd) in the doc subfolder of theConversion Agent installation folder.

Page 48: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 5. Formats

37

Formats

When you define a parser, you should specify the format of the documents that theparser should process. For example, you can select TextFormat or HtmlFormat.

The following components are nested within the format:

Component Description

Delimiters A hierarchy of characters or strings that organize the information in thedocument, such as newlines and tabs.

Formatpreprocessor

A component that cleans up the source before the parser starts searching foranchors.

Defaulttransformers

A list of transformations that the parser applies to the output of each anchor.

This chapter describes the formats, delimiters, and format preprocessors that areavailable for your use. The subject of default transformers is discussed briefly here;for the details, see Chapter 8, Transformers.

Defining the Document Format

When you use the New Parser wizard, you are prompted to define the basicdocument format. This assigns the format with its default properties. You cancustomize the format by editing the Parser/format property in the IntelliScript.

The format is a component that has properties of its own, such as the delimiters,format preprocessor, and default transformers. By configuring the properties, youcan support an extraordinarily broad range of source documents.

5

Page 49: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 5. Formats

38

The supported format components are:

BinaryFormatCustomFormatHtmlFormatRtfFormatTextFormatXmlFormat

For the details of these formats, see the Format Component Reference below.

Standard Properties of Formats

The format components have a standard set of properties, which are explainedbelow.

Basic Properties

delimitersA hierarchy of characters or strings that organize the information in thedocument, such as newlines, spaces, tabs, commas, or vertical bars. You canalso use a regular expression (a wildcard pattern) to define the delimiters.

The delimiter concept is applicable both to rigidly structured documents,which use delimiter characters to separate the data fields, or to looselystructured text or HTML documents, which can use delimiters such asnewlines and syntactic markup.

The delimiter concept also encompasses positionally-structured data, wherethe fields are located at fixed offsets from one another.

The value of this property is a delimiters component, which contains apredefined list of delimiters. For example, the TabDelimited componentdefines the newline and tab characters as delimiters. Some delimiterscomponents, such as TabDelimited or DelimiterHierarchy, let you edit the listof delimiters. The Positional delimiters component defines an offset-basedstructure.

For a description of the delimiters components, see the Delimiters ComponentReference below.

Page 50: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 5. Formats

39

pre_processorAn optional format preprocessor, which converts the source to the format thatthe parser can process.

The format preprocessor acts on the source after any document processor thatyou may have defined (see Chapter 4, Document Processors). The purpose of theformat preprocessor is to clean up whitespace or markup, before the parserstarts to search for anchors.

By default, several of the predefined formats are configured with theHtmlProcessor format preprocessor, which collapses multiple whitespacecharacters to a single space.

For a detailed description of the available preprocessors, see the FormatPreprocessor Component Reference below.

default_transformersA list of transformers that the parser applies in sequence to the output of eachanchor. The purpose of the transformers is typically to clean up the output andremove markup codes.

For example, the RemoveTags transformer is typically one of the defaulttransformers of an HTML parser. The transformer removes HTML tags fromthe output of each anchor.

For a detailed description of the transformers, see Chapter 8, Transformers.

Advanced Properties

nameA name that you assign to the format. The name is used in the event log.

remarkA comment describing the format.

Format Component Reference

This section documents the format components that you can assign to the formatproperty of a Parser.

BinaryFormat

This format is suitable for parsing binary files. It is also suitable for text files, whichyou want to treat as a buffer of binary bytes.

PropertiesSee Standard Properties of Formats above.

By default, the delimiters property has a value of Positional . The pre_processorand default_transformers properties are empty.

Page 51: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 5. Formats

40

CustomFormat

This is a generic format, which you can use to process any type of sourcedocument. You must define the delimiters, default transformers, etc., yourself.

Example

A source document has the following typical structure:

Ron Lehrer && 547329876:27Evelyn Kern && 9875424: 53

Each line of the document is a record containing a person's name, ID number, andage. The fields are separated by the symbols && and :. The fields contain multiplespace characters at random locations.

One way to parse this document is by using a CustomFormat. In the delimitersproperty of the format, you might assign a DelimiterHierarchy containing thesymbols:

newline&&:

In the default_transformers property, you might assign theCMSUserGd08.doc#Ch7HtmlProcessor, which removes the extra spaces from theoutput.

PropertiesSee Standard Properties of Formats above.

By default, the delimiters, pre_processor, and default_transformers propertiesare empty. You should configure them yourself.

HtmlFormat

This format is suitable for parsing HTML files.

Page 52: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 5. Formats

41

This format is also recommended for processing Microsoft Office documents. Forthis purpose, you should assign a document processor such as WordToHtml orExcelToHtml, which converts the Office document to HTML.

Properties

See Standard Properties of Formats above.

By default, the delimiters property has a value of SGML, which recognizes theHTML delimiters such as < and >. The pre_processor is HtmlProcessor. Thedefault_transformers are:

RemoveTags: removes HTML tags from the output

HtmlEntitiesToASCII: converts HTML entities such as &lt; and &quot; to theirplain text equivalents (< and ", respectively)

HtmlProcessor: normalizes the whitespace, changing any sequence of tabs,newlines, and spaces to a single space character

RemoveMarginSpace: removes leading and trailing space

RtfFormat

This format is suitable for parsing RTF files.

PropertiesSee Standard Properties of Formats above.

By default, the delimiters property has a value of RTF, which recognizes thestandard RTF delimiter characters such as \. The pre_processor is RtfProcessor.The default_transformers are:

RtfToASCII: removes RTF control words from the output

RemoveRtfFormatting: removes RTF formatting instructions from the text.

HtmlProcessor: normalizes the whitespace, changing any sequence of tabs,newlines, and spaces to a single space character

RemoveMarginSpace: removes leading and trailing space

TextFormat

This format is suitable for parsing text files.

In combination with a document processor, this format is also suitable forprocessing other types of documents. For example, you can use the PdfToTxt_3_00,WordToTxt, or ExcelToTxt processor to process Adobe Acrobat, Microsoft Word, orMicrosoft Excel documents with this format.

Properties

See Standard Properties of Formats above.

Page 53: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 5. Formats

42

By default, the delimiters property has a value of DelimiterHierarchy, which letsyou define your own set of delimiters. The pre_processor is empty. Thedefault_transformers are:

HtmlProcessor: normalizes the whitespace, changing any sequence of tabs,newlines, and spaces to a single space character

RemoveMarginSpace: removes leading and trailing space

XmlFormat

This format is suitable for parsing XML files.

Parsing an XML file means converting an XML source document to an XML outputdocument. To do this, Conversion Agent treats the source XML as ordinary text.You can define delimiters, anchors, etc. just as you do for any other kind of sourcedocument.

This is different from serialization, where an XML source document is converted tonon-XML output. In serialization, Conversion Agent uses the XSD schema and theformal XML syntax rules to interpret the source document (see Chapter 10,Serializers).

PropertiesSee Standard Properties of Formats above.

By default, the delimiters property has a value of SGML, which recognizes the XMLdelimiters such as < and >. The pre_processor is HtmlProcessor . Thedefault_transformers are:

RemoveTags: removes XML tags from the output

HtmlEntitiesToASCII: converts XML entities such as &lt; and &gt; to theirplain text equivalents (< and >, respectively).

HtmlProcessor: normalizes the whitespace, changing any sequence of tabs,newlines, and spaces to a single space character

RemoveMarginSpace: removes leading and trailing space

Delimiters Component Reference

This section documents the delimiters components, which you can assign to thedelimiters property of a format.

Conversion Agent uses the delimiters for purposes such as determining the searchcriteria of Content anchors. Specifically, this applies to Content anchors that areconfigured with the LearnByExample option (which is the default).

For example, suppose you configure a format with the TabDelimited delimiterscomponent. This defines a hierarchy using the following characters as delimiters:

Page 54: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 5. Formats

43

NewlineTab

You might define a Content anchor that is located two tab characters after thepreceding Marker anchor in the example source, like this:

MARKER<tab>abc<tab>CONTENT

When Conversion Agent processes a source document, it searches for the Contenttwo tabs after the Marker .

In a second example, you might define a Content anchor that is located threenewlines and one tab after a Marker anchor, in the example source.

MARKERabc<tab>defghi<tab>jkl<tab>mnoppqrst<tab>CONTENT

Within the intermediate lines, the tabs are not counted because the newlines arehigher in the hierarchy.

Editing the Delimiter Hierarchy

Many of the delimiters components, such as TabDelimited or CommaDelimiteddisplay a predefined hierarchy of delimiters, which you can edit as required.

The DelimiterHierarchy component does not have a predefined hierarchy. Youcan insert whatever delimiters you need.

Some delimiter components, such as SGML or PostScript, have a built-in hierarchy,which you cannot edit.

CommaDelimited

This delimiters component defines the following delimiter hierarchy:

NewlineComma

CommaDelimited is suitable, for example, if each line of a text file contains a record,and each record contains data fields separated by commas.

You can add additional delimiters or edit the predefined hierarchy. The procedureis the same as for the DelimiterHierarchy component.

Page 55: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 5. Formats

44

Example

In the example source document, suppose that a Content anchor follows a Markeranchor by two lines. In the third line, there are three commas (plus any other text)before the Content anchor, like this:

MARKERabcdef, ghijabc, def,ghi,CONTENT

If you assign the CommaDelimited component, the parser learns from the examplesource that the Content anchor always follows the Marker by two newlines andthree commas. In another source document, the parser will successfully find thefollowing Content anchor:

MARKERxyz, uvw, rst

,,,CONTENT

DelimiterHierarchy

This delimiters component lets you define a custom delimiter hierarchy.

Under DelimiterHierarchy, you can add any number of delimiters. For eachdelimiter, select either Delimiter (located between anchors) orEnclosingDelimiters (a pair of delimiters that surround an anchor).

Example

In the example source document, suppose that the anchors are separated bycommas and surrounded by brackets, like this:

MARKER,,[CONTENT]

You might define a DelimiterHierarchy that contains:

comma (defined as a Delimiter component)[] (defined as an EnclosingDelimiters component)

From this example, the parser learns that the Content anchor follows the Marker bytwo commas and is surrounded by brackets. In another source document, theparser will successfully find the following Content anchor:

MARKER,abc,def[CONTENT]

Online Sample

For an online sample, see Samples\Projects\EDI\EDI.cmw. The sample uses aDelimiterHierarchy to define the newline and * characters as delimiters, in an EDIsource document.

Page 56: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 5. Formats

45

HL7

This delimiters component defines the delimiter hierarchy that is used for parsingHL7 messages:

newlinevertical bar (|)caret (^) or tab

You can add additional delimiters or edit the predefined hierarchy. The procedureis the same as for the DelimiterHierarchy component.

The HL7 messaging standard permits a message to define its own delimiters. Youcan parse the delimiter declaration of an HL7 message and create a dynamicdelimiter definition. To do this:

1. Use Content anchors to retrieve the delimiter characters from the HL7 messageheader. Store the characters in variables.

2. Add Delimiter components under the HL7 component.

3. To each Delimiter component, assign TextSearch.

4. Under the TextSearch component, assign one of the variables to the textproperty.

Online SampleFor a tutorial exercise illustrating HL7 parsing, see the chapter Defining an HL7Parser in Getting Started with Conversion Agent.

Positional

This delimiters component specifies that the parser should interpret the sourcedocument without using delimiters. Instead, it should locate each anchor bycounting the characters from the beginning of the search scope (see Chapter 7,Anchors, for an explanation of search scope).

Example

In the example source document, suppose that a Content anchor follows a Markeranchor by five characters (including spaces, tabs, etc.), like this:

MARKERab cdCONTENTefg

If you assign the Positional component, the parser learns from the example sourcethat the Content anchor always follows the Marker by five characters, and that it isseven characters long. In another source document, the parser will successfullyfind the following Content anchor:

MARKERd<tab>cbaCONTENTzy,xwv

Page 57: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 5. Formats

46

Online Sample

For a tutorial exercise illustrating the Positional format, see the chapter PositionalParsing of a PDF Document in Getting Started with Conversion Agent.

Using Positional Parsing Together with Delimiters

You cannot add delimiters to the Positional component.

Sometimes, you may wish to define a parser that uses delimiters to locate someanchors, and uses a positional definition for other anchors. To do this, you shouldselect one of the other delimiters components (not Positional). To define thelocation of an anchor positionally, you can assign the OffsetSearch option in theanchor properties. For details, see Chapter 7, Anchors.

PostScript

This delimiters component defines a delimiter hierarchy that is used for parsingAdobe PostScript documents.

You cannot edit the delimiter hierarchy of the PostScript component.

RTF

This delimiters component defines a delimiter hierarchy that is used for parsingRTF documents.

You cannot edit the delimiter hierarchy of the RTF component.

SGML

This delimiter component defines a delimiter hierarchy that is used for parsingSGML documents. It is recommended for parsing HTML and XML, which arederivatives of SGML.

You cannot edit the delimiter hierarchy of the SGML component.

SpaceDelimited

This delimiters component defines the following delimiter hierarchy:

NewlineString of one or more space characters

SpaceDelimited is suitable, for example, if each line of a text file contains a record,and each record contains data fields separated by spaces.

You can add additional delimiters or edit the predefined hierarchy. The procedureis the same as for the DelimiterHierarchy component.

Page 58: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 5. Formats

47

Example

In the example source document, suppose that a Content anchor follows a Markeranchor by two lines. In the third line, there are two single-space characters and onestring containing multiple spaces before the Content anchor, like this:

MARKERabcdefabc def ghi CONTENT

If you assign the SpaceDelimited component, the parser learns from the examplesource that the Content anchor always follows the Marker by two lines and threestrings of spaces. In another source document, the parser will successfully find thefollowing Content anchor:

MARKERxyz

ghi def abc CONTENT

TabDelimited

This delimiters component defines the following delimiter hierarchy:

NewlineTab

TabDelimited is suitable, for example, if each line of a text file contains a record,and each record contains data fields separated by tabs.

You can add additional delimiters or edit the predefined hierarchy. The procedureis the same as for the DelimiterHierarchy component.

Example

In the example source document, suppose that a Content anchor follows a Markeranchor by two lines. In the third line, there are three tab characters (plus any othertext) before the Content anchor, like this:

MARKERabcdefabc<tab> de,f<tab>ghi<tab>CONTENT

If you assign the TabDelimited component, the parser learns from the examplesource that the Content anchor always follows the Marker by two lines and threetabs. In another source document, the parser will successfully find the followingContent anchor:

MARKERxyz

<tab><tab><tab>CONTENT

Online Sample

For a tutorial exercise illustrating tab-delimited parsing, see the chapter BasicParsing Techniques in Getting Started with Conversion Agent.

Page 59: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 5. Formats

48

Delimiter Subcomponent Reference

This section documents subcomponents that are used within delimiterscomponents (see the Delimiters Component Reference).

Delimiter

This subcomponent defines a delimiter character or string, which separatesanchors. You can add Delimiter subcomponents within a delimiter hierarchy (see,for example, the DelimiterHierarchy component).

Example

The TabDelimited component contains two Delimiter subcomponents. The firstuses NewlineSearch to define the newline character as a delimiter. The second usesa TextSearch to define the tab character as a delimiter. (The tab is graphicallyrepresented as a « character.)

The SpaceDelimited component also contains two Delimiter subcomponents. Thefirst is identical to that of TabDelimited. The second uses a PatternSearch to defineany string of one or more spaces as a delimiter. (The regular expression [ ]+means "one or more space characters".)

Page 60: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 5. Formats

49

Basic Properties

SearchThe delimiter definition. The value is one of the following searchercomponents (see the Searcher Component Reference in Chapter 7, Anchors):

Subcomponent Description

NewlineSearch The delimiter is a newline

PatternSearch Uses a regular expression to define the delimiter

TextSearch An explicit character or string, or a string that youretrieve dynamically from the source document

EnclosingDelimiters

This subcomponent defines a pair of delimiter characters or strings, whichsurround anchors. You can add EnclosingDelimiters subcomponents within adelimiter hierarchy (see, for example, the DelimiterHierarchy component).

The component is useful, for example, to define the { } delimiters that surroundblocks of C program code.

Example

See the DelimiterHierarchy component.

Basic Propertiesopening

The opening delimiter.

closingThe closing delimiter.

escape_sequenceA prefix in the source document, such as a backslash character \, which causesthe parser to ignore an instance of the opening or closing delimiter.

Format Preprocessor Component Reference

This section documents the format preprocessor components, which you canassign to the pre_processor property of a format (see the Format ComponentReference above).

Do not confuse format preprocessors with document processors (see Chapter 4,Document Processors). The differences are as follows:

Page 61: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 5. Formats

50

You can assign a document processor to the pre_processor property of aninput port (under the example_source or sources_to_extract property of theparser). You can assign a format preprocessor only to the pre_processorproperty of a format.

Conversion Agent runs a document processor on the source document beforeit performs any other operations. The example pane of the IntelliScript editordisplays the output of the document processor.

Conversion Agent runs a format preprocessor on the text that is displayed inthe example pane, before it searches for anchors. The output of the formatpreprocessor is not displayed.

HtmlProcessor

This format preprocessor (which is also available as a transformer, see Chapter 8,Transformers) normalizes whitespace according to HTML conventions. It convertsany sequence of tabs, line breaks, and space characters to a single space character.

You can use this preprocessor to normalize whitespace in any type of text. It is notrestricted to HTML documents.

RtfProcessor

This format preprocessor normalizes the code of RTF files. It is also available as atransformer (see Chapter 8, Transformers).

Page 62: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 6. Data Holders

51

Data Holders

A data holder is an object that has one of the following types:

An XML element

An XML attribute

A variable

Data holders of the first two types—XML elements and attributes—are used forpermanent storage. A parser, for example, stores its output in data holders of thesetypes.

Data holders of the third type—variables—are used for temporary storage. Forexample, a parser can store data that it extracts from a source document in avariable, for further processing before creating the output.

A common feature of all data holders is that they have XSD data types. In the caseof elements and attributes, the data holders are defined in an XSD schema, whichyou must supply. Variables are defined in an internal schema, which you maycustomize by adding user-defined variables.

This chapter explains how to create data holders and how to use data holders andXSD in Conversion Agent.

XSD Schemas

When you create a parser or serializer, you must supply one or more XSD schemasthat define the structure of the XML. The schema defines the data holders (of theelement and attribute types) that the parser or serializer can use.

You must add the schema to your project. You can then map the content of adocument to elements and attributes that are defined in the schema.

About XSD

XSD is the commonly-used name for XML Schema, which is the industry-standardlanguage for XML schema definitions. XSD originally stood for XML schemadescription, but this term is not used in the official XML Schema standard. Theschema files typically have *.xsd filenames.

The XSD standard is maintained by the World Wide Web Consortium(http://www.w3.org). Since the standard was first published in 2001, XSD has rapidlyreplaced earlier schema description languages, such as DTD and XDR.

6

Page 63: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 6. Data Holders

52

The following is a simple example of an XSD schema:

<?xml version="1.0" encoding="Windows-1252"?><xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:element name="Person"><xs:complexType>

<xs:sequence><xs:element name="Name" minOccurs="0">

<xs:complexType><xs:sequence>

<xs:element name="First" minOccurs="0" type="xs:string"/><xs:element name="Last" minOccurs="0" type="xs:string"/>

</xs:sequence></xs:complexType>

</xs:element><xs:element name="Id" minOccurs="0" type="xs:string"/><xs:element name="Age" minOccurs="0" type="xs:string"/>

</xs:sequence><xs:attribute name="gender" type="xs:string"/>

</xs:complexType></xs:element>

</xs:schema>

The schema defines the elements and attributes that can occur in an XMLdocument. The syntax lets a schema author specify the hierarchy and sequence ofelements, whether the elements are mandatory or required, their data types, theirpossible values, and many other features.

The above sample schema defines an XML structure such as the following:

<Person gender="M"><Name>

<First>Ron</First><Last>Lehrer</Last>

</Name><ID>547329876</ID><Age>27</Age>

</Person>

Page 64: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 6. Data Holders

53

If you trace through the schema, you can observe the correspondence betweendefinitions such as

<xs:element name="Person">

or

<xs:attribute name="gender" type="xs:string"/>

and the elements and attributes of the XML.

The elements and attributes have XSD data types, such as xs:complexType (forelements that contain nested elements) or xs:string (meaning that they containstring data). The elements have many other properties, such as their requiredsequence and the minimum number of times that must occur in an XML document(minOccurs).

Where To Learn XSD

For a brief explanation of the XSD syntax, see the chapter on Creating an XSDSchema in Getting Started with Conversion Agent.

For comprehensive information, we recommend the following sources:

http://www.w3.orgThe web site of the World Wide Web Consortium, which created andmaintains the XML Schema standard.

http://www.w3schools.comSee this site for an excellent tutorial introduction to XSD.

How to Create XSD Schemas

You can create a schema in any XSD editor, for example, Altova's XML Spy(http://www.altova.com).

Typically, an XSD editor has a user-friendly interface, which lets you create andedit schemas even if you don't know the XSD syntax. Some editors also let youconvert an existing DTD or XDR schemas to XSD, or to create an XSD schema froma sample XML file.

If you know the XSD syntax, you can also edit XSD schemas in a text editor such asNotepad.

Conversion Agent Studio uses the Xerces 2.2 XML parser to validate XSD schemas. Inrare cases, due to implementation differences between XML parsers, a schema that passesvalidation in Conversion Agent may fail in other tools, or vice versa.

Encoding of the XSD Schema

You should save the schema in one of the supported input encodings (see Chapter13, Project Properties).

Page 65: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 6. Data Holders

54

The schema encoding must be compatible with the work encoding, which you usefor text that you enter in the IntelliScript. This means that:

The schema encoding is identical to the work encoding,

or

Every character in the schema has an equivalent in the work encoding. Forexample, if the schema uses the UTF-8 encoding, and the work encoding isWindows-1252, the schema must not contain Unicode characters that have noWindows-1252 equivalent.

When you add a schema from an external location to a project, Conversion Agenttranslates the project copy of the schema to the work encoding.

Included XSD Files

An XSD file can reference additional XSD files. This feature lets you maintain alarge schema in a modular fashion.

Namespaces

If you plan to work with XML namespaces, you should assign the targetNamespaceattribute of the schema. In Conversion Agent Studio, you can edit the alias that isassigned to the namespace (see Chapter 13, Project Properties).

To prevent ambiguities, Conversion Agent Studio displays a warning if you try toadd two schemas that use the same namespace alias for different namespaces. Itassigns a different alias to one of the namespaces.

Conversion Agent Studio also warns if you try to add two schemas that use anempty alias for different namespaces.

Mixed Content

Conversion Agent supports XML elements that have mixed content (both characterdata and nested elements). You may use the mixed attribute in a schema.

Conversion Agent distinguishes between character data before and after eachelement (see Mapping Mixed Content below).

Unsupported XSD Features

The current Conversion Agent version does not support certain uses of XSDfeatures. The following table lists the limitations, as of the time this book waswritten.

We are working to remove the limitations. If you need any of the unsupported features,please check with SAP support for possibly updated information.

Page 66: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 6. Data Holders

55

XSD feature Description of limitation

redefine Redefining types and groups is not supported and should not be used.

uniquekeykeyref

Identity constraints may be used in a schema, but are ignored.

nilnillable

Nil elements may be used in a schema, but are ignored.

group (in place of anentity)

Model groups in place of XML entities, for example,<city>Montr<c:eacute/>al</city>, are not supported and shouldnot be used.Other uses of groups are supported.

longunsignedLong

Data holders having the XSD type long or unsignedLong currentlysupport integers with absolute values up to 2147483647. Larger valuesare not supported and may give incorrect results.

Precision of Numerical Data

Conversion Agent stores xs:decimal and xs:float data as strings, preserving theprecision of the data.

In calculations, Conversion Agent converts decimal and float data to double-precision floating point, and it rounds the result to 15 decimal digits. This meansthat decimal data may lose some precision. For example, the result of xs:decimal5.28 * 1 may be displayed as 5.28000000000001.

Conversion Agent normalizes xs:decimal values. For example, it stores 0004 as 4, -0 as 0, and 1.200 as 1.2.

Adding XSD Schemas to a Conversion Agent Project

You must add at least one XSD schema to every Conversion Agent parser,serializer, or mapper project. The schema specifies what XML structure the projectneeds to process.

You can store the schema at any network location. Conversion Agent copies theschema to the project folder.

If a schema includes other schema, you should add the main schema. When you dothis, Conversion Agent adds the included schemas automatically.

Page 67: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 6. Data Holders

56

Adding an Existing Schema

To add an XSD schema to a project:

1. In the Conversion Agent Explorer view, right-click the XSD node of a projectand choose Add File.

2. Browse to the XSD file, which can be at any network location. If the XSD file isnot in your project folder, Conversion Agent copies the file to the projectfolder.

3. The XSD folder of the Conversion Agent Explorer displays the schema file. Ifthe schema references any other XSD files, the Include folder of theConversion Agent Explorer displays their names.

4. Optionally, if the schema defines a target namespace, you can edit thenamespace alias in the Conversion Agent project properties. For instructions,see Chapter 13, Project Properties.

Creating a New Schema

To create a new schema file in Conversion Agent Studio:

1. In the Conversion Agent Explorer, right-click the XSD node and choose New >XSD.

2. The Conversion Agent Explorer displays the new file with a default namesuch as untitled1.xsd. You should enter a meaningful name immediatelywhen you create the file. To prevent errors in the references that a projectcreates to the schema, Conversion Agent Studio does not allow you to changethe name of an existing schema file.

Editing a Schema

If you double-click an XSD file in the Conversion Agent Explorer, ConversionAgent Studio opens it for editing.

By default, Conversion Agent Studio provides a simple, text-based schema editor.

Alternatively, you can configure Conversion Agent Studio to open an editor ofyour choice, for example, XML Spy or Notepad. To do this, see the chapter onConversion Agent Studio Preferences in the book Conversion Agent Studio in Eclipse.

Reloading a Schema after Editing

If you edit or modify an existing schema, Conversion Agent Studio may promptyou to reload the schemas. This ensures that the edited schema is availablethroughout the project.

You can reload the schemas at any time by right-clicking the XSD node in theConversion Agent Explorer, and choosing Reload XSDs.

Page 68: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 6. Data Holders

57

Validating Data Holders

If you edit a schema that belongs to an existing project, data holders referenced inthe project may become invalid. For example, an anchor may reference an XPathexpression that no longer element that no longer exists in the schema.

To identify any problems of this sort, run the Conversion Agent > Validatecommand. Any validation errors are displayed in the Tasks view.

Viewing a Schema

In the Schema view of Conversion Agent Studio, you can view the elements andattributes of all schemas that belong to a project.

To help visualize a schema, you can generate a sample XML document thatillustrates the schema elements.

Schema View

To display the content of a schema, use the Schema view. The view displays aheading for each namespace that is defined in the project. Under the heading, itdisplays the elements and attributes that are defined in the schema.

The namespace display includes:

Default entries for the Worldwide Web Consortium schema namespaces(beginning with http://www.w3.org). In most projects, you can ignore theseentries.

A Variables namespace, which is used to define variables that you can use inyour project (see Variables below).

An entry for each target namespace that is defined in the schemas that youhave added to the project.

If you add one or more schemas that do not define a target namespace, they aredisplayed under the no target namespace heading.

Page 69: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 6. Data Holders

58

Schema view

Displaying an XML Sample of a Schema

You can generate and display a sample XML file that illustrates a schema. To dothis, right-click the schema in the Conversion Agent Explorer, and choose CreateExample XML.

The example illustrates features such as:

The data type of an element or attribute: "a" for string data, "1" for integerdata, "1.1" for floating point, etc.

The multiplicity of an element. If an element can occur more than once, theexample displays it more than once.

Page 70: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 6. Data Holders

59

Using a Schema to Map Anchors

When you define a parser, you must map Content anchors to output data holders.When you define a serializer, you must map input data holders toContentSerializer serialization anchors.

When you edit the data_holder property of an anchor, Conversion Agent displaysa Schema view. You should select the appropriate data holder.

Alternatively, when you configure a parser, you can drag text from the examplesource to a data holder in the Schema view. Conversion Agent Studio creates aContent anchor, which maps the example text to the data holder.

IntelliScript Representation of Data Holders

In the IntelliScript, data holders are identified by a modified XPath expression,such as:

data_holder = /Person/*s/Name/*s/First

Do not try to type this value. If you wish to modify the mapping, select thedata_holder property and press Enter. This opens a Schema view, where you canselect the new mapping.

The Conversion Agent XPath syntax is slightly different from the standard XPathsyntax, which is Person/Name/First. Conversion Agent inserts *s, *c, and *a,which refer to the XSD terms sequence, choice, and all. The modifications resolveambiguities when Conversion Agent uses XSD to construct XML output.

Mapping Mixed Content

If the schema supports mixed content, Conversion Agent considers each element tohave before and after data holders. For example, consider the following mixedcontent:

Page 71: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 6. Data Holders

60

<Deal>We are pleased to offer you a price of<Price>34</Price>dollars. This is a special price for<Partner>

<Name>Acme Gizmos, Inc.</Name><ID>98765</ID>

</Partner>valid only until December 31.

</Deal>

Conversion Agent considers this structure to contain data holders in the followinglocations:

Immediately after the <Deal> tag, before any of the sub-elements.

Before the Price element

The Price element

After the Price element

Before the Partner element

The Partner/Name and Partner/ID elements

After the Partner element

Immediately before the </Deal> tag, after all the sub-elements.

You might map the text "We are pleased to offer you a price of" to the dataholder before the Price element. You might map "dollars. " to the data holderafter Price, and "This is a special price for " to the data holder beforePartner.

The Schema view displays the mixed-content data holders.

Page 72: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 6. Data Holders

61

The IntelliScript displays the mixed content in a representation such as:

data_holder = /Deal/*s/Price/$text_before

Generating Valid XML

By default, Conversion Agent generates XML that is valid according to the XSDschema that you have defined.

The Conversion Agent approach to validation differs from the conventionalapproach, which is used in most XML applications. In the conventional approach,an application generates an XML document, and then checks that the output isconforms to a schema. The schema is applied after the generation, when the XMLdocument already exists.

In the Conversion Agent approach, the schema is used as a guide while the XML isbeing generated. The schema is applied during the generation, and not afterwards.This approach helps Conversion Agent data transformations to succeed, byensuring the validity of the output at the relevant locations as the transformationproceeds.

The following section, describing the Role of XSD in Parsing, illustrates theapproach.

Role of XSD in Parsing

This section explains some of the ways in which a parser uses XSD to ensure that itoutputs valid XML.

The discussion presents examples of the behavior. For an explanation of theparameters that control the behavior, see XML Generation in Chapter 13, ProjectProperties.

Sequence of ElementsWhen Conversion Agent runs a parser, it organizes the output in the sequence thatis required by the XSD schema.

For example, a schema may require that a LastName element precede a FirstNameelement. Conversion Agent creates the output in the locations defined by theschema, even if the anchors that produce the output are defined in the oppositesequence.

Number of Occurrences

A parser may attempt to insert multiple instances of an element in the outputXML. Conversion Agent uses the schema to determine whether the new instancesshould be appended or should overwrite the existing elements. The parser deletesany excess elements, beyond those that the schema permits, and writes warnings inthe event log.

Page 73: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 6. Data Holders

62

In another example, suppose the schema defines an element without specifying aminOccurs or maxOccurs attribute. According to the XSD standard, the defaultminOccurs and maxOccurs values are 1, which means that the element must occurexactly once in the parser output. If the element is missing from the output, theparser can add it.

For more information, see Multiple-Occurrence Data Holders below.

Missing or Empty Elements

In the project properties, you can configure whether a parser should insert emptyelements to comply with an XSD schema.

XSD Data Types

Conversion Agent ensures that the text it stores in a data holder has the requiredXSD type. For example, if a Content anchor retrieves the string "oranges 5 for adollar", and the XSD type of the data holder is xs:integer, the anchor stores onlythe integer 5 in the data holder.

For more information, see Using XSD Data Types to Narrow the Search Criteria inChapter 7, Anchors.

Role of XSD in Serialization

A serializer checks that its input is valid according to the XML schema. If the inputis invalid, the serializer attempts to correct the error and process the sourceanyway.

For example, if there are more occurrences of an element than the schema permits,the serializer may ignore the excess elements and process the valid ones. It writesappropriate warnings in the event log.

Variables

Variables are temporary data holders, which you can use in place of XML elementsor attributes. Variables are useful, for example, if you need to store a valuetemporarily during the operation of a parser, but you don't need to output thevalue in the XML.

For example, suppose you want a parser to read two Content anchors andconcatenate their values. You might map each Content anchor to a user-definedvariable. You can then use an action to concatenate the variables and output theresult to an XML element.

In addition to the user-defined variables, Conversion Agent has several pre-defined system variables. The system variables are used, for example, to storeinformation that is needed for certain actions.

Page 74: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 6. Data Holders

63

User-Defined Variables

To define a variable, use the Conversion Agent > Insert > Variable command onthe menu, or add a Variable component at the global level of the IntelliScript.Select the XSD data type that the variable can store, such as xs:string orxs:integer.

The variable is displayed under the Variables namespace in the Schema view.

System Variables

Several system variables are defined in every Conversion Agent project. Thefollowing paragraphs describe the variables and the ways in which they are used.

Variables Used to Access Source DocumentsSeveral of the system variables store data that actions can use when they accesssource documents. For example, the RunParser action can use:

VarLinkURLA file path or a URL address.

VarPostDataA string containing form data that should be submitted.

The following variables are used in the SubmitForm and SubmitFormGet actions:

VarFormActionThe URL to which a form should be posted (the action attribute of the HTML<form> element.

Page 75: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 6. Data Holders

64

VarFormDataA string containing the form data that should be submitted. This is a multiple-occurrence variable (see Multiple-Occurrence Data Holders). Each occurrence is acomplete instance of the form data.

Read-Only Access VariablesThe following variables are read-only. A data transformation can use them torevisit a source document.

VarRequestedURLThe path or URL of the source document that a parser is processing.

VarCurrentURLThe path or URL of the current file that a parser is processing.

Usually, this is the same as VarRequestedURL. If the parser is configured withcertain preprocessors, VarCurrentURL may point to a temporary file rather thanthe original source document. VarRequestedURL always points to the sourcedocument.

VarCurrentPostThe form data that a parser submitted in order to retrieve the current page.

Read-Only System Time Variables

VarSystem is a read-only variable that returns system information. It is a structure(represented in the XPath format), which contains several nested variables:

VarSystem/ExecStartTime/YearVarSystem/ExecStartTime/MonthVarSystem/ExecStartTime/MonthNameVarSystem/ExecStartTime/DayVarSystem/ExecStartTime/DayNameVarSystem/ExecStartTime/HourVarSystem/ExecStartTime/MinuteVarSystem/ExecStartTime/SecondVarSystem/ExecStartTime/Millisecond

The nested variables store the year, month, day, etc., when the parser beganexecution. The variables are useful, for example, if you want a parser to insert atimestamp in its output.

Mapping Anchors to Variables

You can map a Content anchor to a variable, in the same way that you map to anyother data holder.

Do not map an anchor to a read-only system variable.

Page 76: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 6. Data Holders

65

Using Variables in Actions

Variables are often used as the input of actions. You can use a variable in the sameway as you use other data holders. For instructions, see Chapter 9, Actions.

Initializing Variables at Runtime

Optionally, you can initialize the values of variables before a data transformationruns. There are a few ways to do this:

In the IntelliScript: You can configure the initialization property of theVariable component.

The initial values that you set by this approach are used when you run thedata transformation either in Conversion Agent Studio or as a deployedConversion Agent service.

In the Run > Run command of Conversion Agent Studio: In this command,you can click the Details button and set the initial values.

The values that you set in this way are used when you test the datatransformation in Conversion Agent Studio. For testing purposes only, thevalues override the ones that you set in the initialization property. Theyhave no effect when you run the data transformation as a deployed service.

An application can pass the initial values as service parameters to a deployedConversion Agent service at runtime.

For example, an application that calls the Conversion Agent Java API or C++API to run a Conversion Agent service can set the initial values of thevariables. For more information about how to pass the service parameters, seethe Conversion Agent Engine Developer's Guide and the API references.

The service parameters override the initialization property of the variables.If the IntelliScript specifies an initial value, and you also pass a value from anapplication, the latter value is used.

Variable Component Reference

This section documents the Variable component, which you can add to a project.

Variable

A Variable component is a user-defined variable.

You can use variables for temporary storage, in the same locations that you use anXML element or attribute. For example, you can map a Content anchor to avariable, and you can use a variable as the input of an action.

Page 77: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 6. Data Holders

66

Variables have XSD data types. They are displayed in the Schema view and in theIntelliScript. You can define a variable only at the top (global) level of theIntelliScript.

Basic Propertiesval_type

The data type that the variable can store. You can select one of the standardtypes (xs:string, xs:integer, etc.). Alternatively, you can select a custom XSDdata type that is defined in a schema belonging to the project. A custom typecan be either simple (containing only data) or complex (containing nestedelements or attributes).

Advanced Properties

listSelect this option to create a multiple-occurrence variable (see Multiple-Occurrence Data Holders).

initializationAn initial value for the variable, assigned when the data transformation starts.Select InitialValue, and enter the value.

You can initialize variables that have simple data types. Initialization ofcomplex-type variables is not supported.

Service parameters, which an application passes to a Conversion Agent serviceat runtime, override the initialization property. You can use theinitialization property to set a default, which the data transformation usesif an application does not pass the value as a service parameter. For moreinformation, see Initializing Variables at Runtime above.

Multiple-Occurrence Data Holders

In an XSD schema, you can use the maxOccurs attribute to set the maximumnumber of times that sibling elements can occur in an XML document. Likewise,you can define a variable that can occur either once or multiple times. An elementor variable that can occur only once is called a single-occurrence data holder. Anelement or variable that can occur more than once is called a multiple-occurrencedata holder.

Single- and multiple-occurrence data holders behave differently when ConversionAgent stores data in them, for example, when you map Content anchors to a dataholder.

In a single-occurrence data holder, each assignment overwrites the precedingassignment.

In a multiple-occurrence data holder, each assignment generates a newoccurrence of the data holder.

Page 78: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 6. Data Holders

67

To understand this, suppose that your XSD schema defines an XML element calledFirstName. If maxOccurs = 1, this is a single-occurrence data holder. If a parser mapsmore than one Content anchor to the FirstName element, the output contains onlythe final mapping.

For example, suppose the source document is a list of first names, each of which isa Content anchor that is mapped to FirstName:

Jack Jennie Larissa

The output contains only the final mapping:

<FirstName>Larissa</FirstName>

Now suppose that maxOccurs = unbounded. This is a multiple-occurrence dataholder. If you map multiple Content anchors to the element, the parser generates alist of names. The output is:

<FirstName>Jack</FirstName><FirstName>Jennie</FirstName><FirstName>Larissa</FirstName>

The same principle applies to variables. If you map multiple anchors to a multiple-occurrence variable, each anchor generates a new occurrence of the variable. This isuseful, for example, to prepare input for the AppendListItems and CombineValuesactions, which concatenate the occurrences.

AttributesAn XML attribute is always a single-occurrence data holder. An attribute cannot bemultiple-occurrence because XML does not permit the same attribute to appearmore than once in the same element.

An attribute can have an XSD list type, which is a space-separated list. The namesattribute in the following element is an example:

<Countries names=”USA Canada Mexico”/>

Conversion Agent treats the attribute as a single-occurrence data holder with anXSD list type (for an example of how to use this feature, see Using XSD Data Typesto Narrow the Search Criteria in Chapter 7, Anchors).

IndexingBy default, Conversion Agent accesses multiple-occurrence data holdersequentially. You can access a multiple-occurrence data holder non-sequentially byusing the indexing feature. For information, see Chapter 12, Locators, Keys, andIndexing.

Destroying the Occurrences

Sometimes, you may wish to destroy all existing occurrences of a multiple-occurrence data holder, and start creating new occurrences from the beginning ofthe list. This can be useful, for example, if you are parsing an iterative structure,and you want to keep only the last iteration. You want to destroy the occurrencesthat store data from the earlier iterations.

Page 79: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 6. Data Holders

68

You can achieve this effect by defining a single-occurrence data holder, whichcontains a nested, multiple-occurrence element. When you re-use the single-occurrence data holder, the nested occurrences are destroyed.

The following procedure presents an example:

1. Add the following schema to a project. The schema defines a custom XSD datatype called MyListType. The type contains a nested, multiple-occurrenceelement called item.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"elementFormDefault="qualified" attributeFormDefault="unqualified">

<xs:complexType name="MyListType"><xs:sequence>

<xs:element name="item" type="xs:string"maxOccurs="unbounded"/>

</xs:sequence></xs:complexType>

</xs:schema>

2. Define a single-occurrence variable called MyList, which has the data typeMyListType.

3. Use the variable as the target of an iterative structure (see Chapter 12, Locators,Keys, and Indexing).

Each iteration re-uses the single occurrence of MyList . At the start of theiteration, the nested item elements are destroyed. Anchors within the iterativestructure (for example, a nested RepeatingGroup) start assigning the itemelements from the beginning of the list.

Page 80: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

69

Anchors

Anchors are the components that let a parser hook into specific locations in asource document, for the purpose of finding data and storing it in data holders. Ananchor is a signpost that you place in a document, indicating the position of thedata.

This chapter explains the different types of anchors and how you can use them inparsers.

Marker and Content Anchors

The most commonly used anchors are called Marker and Content anchors. Theseanchors are often used as a pair: a Marker anchor labels a location in a document,and a Content anchor retrieves text from the location.

To understand these anchors, imagine a printed questionnaire. The first linetypically asks for the person's last name and first name, with each label followedby a blank space to receive the information. In Conversion Agent terminology, theprinted labels Last Name and First Name are Marker anchors, and the blank spacesare Content anchors. The anchors provide a means to home in on the data, for thepurpose of extracting it from the source document.

Other Anchor Types

In addition to Marker and Content anchors, Conversion Agent provides many otheranchor types, which you can use to parse documents in different ways. Forexample, Group and RepeatingGroup anchors help you specify the organization ofthe data fields. An Alternatives anchor lets you specify multiple kinds of data thatmight occur at a particular location in a source document.

How Anchors and Delimiters Work Together

In Conversion Agent Studio, you define the anchors in the example sourcedocument. The parser learns how to parse the document by examining the anchors

7

Page 81: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

70

and the delimiters that separate them. (For an explanation of delimiters, seeChapter 5, Formats).

For example, suppose you have specified that your document uses a tab-delimitedformat. A line in the example source reads

First name:<tab>Ron

where <tab> is a tab character.

You can define First name: as a Marker anchor. You can define Ron as a Contentanchor. The parser learns from these definitions that it should search a sourcedocument for the string First name:. It should then skip over a single tab delimiterand retrieve the text that follows the tab.

Suppose you run the parser on another source document, which contains thefollowing text:

First name:<tab>Jack

The parser finds the anchors as above and retrieves the text Jack.

Now suppose that the source document reads:

First name:<tab>Jack<tab>Age:<tab>34

The parser still retrieves the text Jack, rather than Jack<tab>Age<tab>34 . Thisworks because you have defined the tab character as a delimiter. Conversion Agentunderstands that the Content anchor starts after the first tab and ends before thesecond tab. Of course, you might define some additional anchors that retrieveJack's age, which is 34.

The above examples describe the default behavior of the anchors and delimiters. The anchorshave many properties that let you override the default. For instance, you can define aContent anchor that skips over tabs, even in a tab-delimited format. Please see the AnchorComponent Reference for details.

Mapping Content Anchors to Data Holders

A Content anchor stores the text that it extracts from a source document in a dataholder. For example, you might configure a Content anchor to store its results in anXML element called FirstName. If the Content anchor retrieves the text Jack, theparser would produce the following output:

<FirstName>Jack</FirstName>

More precisely, you might specify that the anchor should store the retrieved text atthe path /Person/*s/FirstName, which refers to the XSD schema (see Chapter 6,Data Holders). The actual parser output would be:

<Person><FirstName>Jack</FirstName>

</Person>

Page 82: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

71

On the other hand, suppose that the XSD schema defines FirstName as an attributeof the Person element. You might map the Content anchor to /Person/@FirstName .The output would be:

<Person FirstName="Jack" />

You must map to a data holder that has an appropriate data type. For example, donot map Jack to an XML element that has an XSD integerdata type, or to an XMLelement that has a complex data type containing nested elements. For an exceptionto this rule, see Using XSD Data Types to Narrow the Search Criteria, below.

You do not actually type the path (/Person/*s/FirstName, etc.) in Conversion AgentStudio. When you edit a property whose value is a data holder, Conversion Agent Studiodisplays a Schema view, where you can select the data holder. Conversion Agent Studiodisplays the path for you in the IntelliScript.

Mapping to Variables

You can map an anchor to a data holder that is an XML element, an XML attribute,or a variable. The variable option is useful if you want to use the data in asubsequent processing step, but you do not want the raw data to be included in theparser output.

For example, suppose you want to extract several numbers from a sourcedocument and output their sum in the XML. You don't want the individualnumbers in the output. You can map the Content anchors that retrieve the numbersto variables, and use a CalculateValue action to compute and output the sum (seeChapter 9, Actions).

You might also map to a variable that you use in a subsequent anchor, for example,to define a dynamic search text for a Marker anchor.

Mapping to Multiple-Occurrence Data Holders

If you map Content anchors to a single-occurrence data holder, then eachassignment of the data holder overwrites the previous assignment.

If you map to a multiple-occurrence data holder, then each assignment generates anew occurrence of the element. For example, if each Content anchor retrieves aperson's name, the output is a list of names:

<FirstName>Jack</FirstName><FirstName>Jennie</FirstName><FirstName>Larissa</FirstName>

For more information, see Multiple-Occurrence Data Holders in Chapter 6, DataHolders.

Page 83: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

72

Mapping to Mixed-Content Elements

If an XML element can contain mixed content (both character data and nestedelements), the Schema view displays before and after data holders for the elements.This lets you map a Content anchor to character data that is located before or aftera particular nested element (see Using a Schema to Map Anchors in Chapter 6, DataHolders).

Defining Anchors

When you define a Parser component, you must add a sequence of anchors. Theparser operates by searching for the anchors in the source document and byrunning the operations that you have configured the anchors to perform.

Where to Define Anchors

In the IntelliScript, the anchors are nested within a Parser configuration.

Under the "contains" line of a Parser, you can nest a sequence of anchors.

If you press Enter at the indicated location, Conversion Agent displays a drop-down list, which includes the anchors (and other components) that you can add.

After you add the anchors, the IntelliScript displays the sequence of anchors. Inaddition, Conversion Agent highlights the anchors in the example source.

Page 84: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

73

Some types of anchors can contain nested anchors. For example, you can nestanchors within an Alternatives or RepeatingGroup anchor. For details, see theAnchor Component Reference.

Sequence of Anchors

The sequence of the anchors should be the sequence of text in the sourcedocument.

For example, suppose that the source document is:

First Name: RonLast Name: Lehrer

Assuming that you define First Name and Last Name as Marker anchors, and thatyou define Ron and Lehrer as Content anchors, the required sequence of anchors inthe parser configuration is:

Anchor Text in the source document

Marker First Name

Content Ron

Marker Last Name

Content Lehrer

Exception: Variable Source SequenceSome source documents may have a variable sequence. For example, suppose thatthe source document may have either of the following structures:

First Name: RonLast Name: Lehrer

Page 85: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

74

or

Last Name: LehrerFirst Name: Ron

In such cases, you can use the marking property to change the search scope of theanchors (see How a Parser Searches for Anchors).

Select-and-Click Procedure for Marker and Content Anchors

You can add Marker and Content anchors by a select-and-click procedure. Forinstructions on how to do this, see Getting Started with Conversion Agent.

In brief:

1. Select the anchor text in the example source file.

2. Right-click and choose the option to Insert Marker or Insert Content. Thisopens the New Element window for a Marker or Content anchor, respectively.

3. In the IntelliScript editor or in the IntelliScript Assistant view, set the anchorproperties. For a detailed explanation of the properties, see the AnchorComponent Reference below.

Drag-and-Drop Procedure for Content Anchors

You can define Content anchors by the following drag-and-drop procedure:

1. In the example pane, select the anchor text.

2. Drag the text to a data holder in the Schema view.

3. This creates a Content anchor, which is mapped to the data holder. Edit theIntelliScript and set the other anchor properties.

You can also drag and drop from the example pane to the IntelliScript pane. Forexample, you can drag to the text property of an anchor that is defined with theTextSearch option.

Using the IntelliScript to Define Anchors

You can create any type of anchor by editing the IntelliScript. The procedure isidentical to editing any other component in the IntelliScript:

1. At the anchor location, press Enter.

2. Select or type the anchor name.

3. Press Enter again to confirm your selection.

4. Edit the anchor properties.

Page 86: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

75

Standard Anchor Properties

In this section, we review certain properties that are found in many anchors. Foradditional properties that are specific to particular anchors, see the AnchorComponent Reference.

nameA name that you assign to the anchor. Conversion Agent includes the name inthe event log. This can help you find an event that was caused by theparticular anchor.

remarkA comment describing the anchor.

disabledIf selected, the parser ignores the anchor. This is useful for testing anddebugging, or for making minor modifications in a parser without deleting theexisting anchors.

Disabling an anchor disables all its nested components (nested anchors,transformers, etc.)

optionalBy default, if an anchor fails, the parent component (such as a Parser in whichthe anchor is nested) fails. If you select the optional property, the parentcomponent does not fail.

You can select the optional property to define an anchor that may or may notexist in a source document. If the anchor does not exist, the Parser continues.

If the anchor is nested within a Group anchor, the optional property keeps theGroup from failing. If the anchor is in a RepeatingGroup, the property keeps aniteration of the RepeatingGroup from failing.

directionThe direction in which Conversion Agent searches for the anchor, within thesearch scope (see How a Parser Searches for Anchors below). If direction =forward, the parser finds the first instance of the anchor within the searchscope. If direction = backward, the parser finds the last instance.

For example, suppose the search scope for a Marker anchor contains fiveinstances of the word Balance. If direction = forward, the parser finds thefirst instance of Balance. If direction = backward, it finds the last instance.

For a Marker anchor, you can modify this behavior by using the countproperty. For example, if direction = backward and count = 2, the parserfinds the second to last instance.

markingSpecifies whether an anchor should be used as a reference point to find thesucceeding anchor. The options are full (places a reference point before andafter the current anchor), begin position (before only), end position (afteronly), and none (neither).

Page 87: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

76

You can use this property to control the search scope for the succeedinganchor. For an explanation and examples, see How a Parser Searches forAnchors).

phaseThe processing phase during which Conversion Agent should search for theanchor (initial, main, or final). For an explanation, see How a Parser Searchesfor Anchors.

no_initial_phaseThis property applies to components that have nested anchors. If the propertyis selected, the anchor has no initial phase. This overrides the option phase =initial in the immediately nested anchors, and changes it to main.

How a Parser Searches for Anchors

To design a parser correctly, it is important that you understand how ConversionAgent searches for the anchors in the parser configuration. There are three mainconcepts:

Search phase

Search scope

Search criteria

This section explains the concepts, and how you can control each of them bysetting the anchor properties.

Search Phases

Conversion Agent searches for a sequence of anchors in three phases:

Initial

Main

Final

By default, all Marker anchors are in the initial phase and all Content anchors are inthe main phase. This means that Conversion Agent first finds the Marker anchors,and it locates the Content anchors between them.

To understand this, consider a parser that processes the following sourcedocument:

First name: Ron Last name: Lehrer

Suppose you have defined the anchors in the following way, with default anchorproperties:

Page 88: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

77

Anchor Text in the source document Phase

Marker First name: Initial

Content Ron Main

Marker Last name: Initial

Content Lehrer Main

In the initial phase, Conversion Agent searches for the Marker anchors:

It searches for First name:.

It searches for Last name: at a location that follows First name:.

In the main phase, Conversion Agent searches for the Content anchors:

It searches for the Ron anchor at a location between First name: and Lastname:.

It searches for the Lehrer anchor at a location after Last name:.

Nested PhasesAnchors that have nested anchors, such as Group, have nested phases. For example,if a Group anchor runs in the parser's main phase, a Marker anchor that is nested inthe Group runs in a nested initial phase. The nested initial phase is part of theparser's main phase, but it is before the other anchors in the Group.

Another example is a RepeatingGroup anchor, which searches for both separatorsand for nested anchors. In order to identify the nested anchors correctly, it searchesfor the separators before it searches for the nested anchors.

Search Scope and Search Criteria

The Search Phases example illustrates the concepts of search scope and search criteria.The search scope is the portion of a document in which Conversion Agent searchesfor an anchor. The search criteria are the rules by which Conversion Agent finds theanchor within the search scope.

In the initial phase, Conversion Agent starts searching for the Marker anchorcontaining First name: at the beginning of the document. The search scope for thisanchor is the entire document.

There is a single search criterion: the anchor must contain the text First name:

The search scope for the Last name: anchor starts at the end of First name:, andextends to the end of the document. The search criterion is that the anchor mustcontains the text Last name:.

In the main phase, the parser interpolates the Content anchors between the Markeranchors. The search scope for the Ron anchor extends from the end of the First

Page 89: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

78

name: anchor to the beginning of the Last name: anchor. Assuming that the parseruses a space-delimited format, the search criteria are to retrieve all the text in thesearch scope, after the leading space character and before the second spacecharacter.

The search scope for the Lehrer anchor is from the end of Last Name: to the end ofthe document. The search criteria are similar to those for the Ron anchor.

Let's add this analysis to the anchor table. The table now describes the completemethod by which the parser finds the anchors.

Anchor Text in thesourcedocument

Phase Search scope Search criteria

Marker Firstname:

Initial Entire document Text = First name:

Content Ron Main End ofFirst name:to start of Lastname:

After the leading spaceBefore the next space

Marker Last name: Initial End ofFirst name:to end of document

Text = Last name:

Content Lehrer Main End ofLast name:to end of document

After the leading spaceBefore the next space

Adjusting the Search Phase

By assigning the phase property of an anchor, you can change the phase in whichConversion Agent searches for the anchor.

Consider the following source document:

CONTENT<10 characters>MARKER

In this example, the Marker anchor is located 10 characters after the Content anchor.

By default, Conversion Agent searches for the Marker in the initial phase, and itsearches for the Content in the main phase. This won't work here, becauseConversion Agent cannot find the Marker unless it has already found the Content!

The solution is to change the phase property of one of the anchors. You can changethe Content to the initial phase, or the Marker to the main phase. In either case,Conversion Agent finds the anchors.

Adjusting the Search Scope

There are two ways to adjust the search scope for an anchor:

Page 90: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

79

By setting the phase property of the anchor or the surrounding anchors

By setting the markingproperty of the surrounding anchors

Phase Property

If a Content anchor lies between two Marker anchors, then by default, the searchscope for the Content is the segment between the Marker anchors.

If you change all the anchors to the same phase, the search scope of the Content isno longer bounded by the second Marker ; it is from the end of the first Marker tothe end of the document.

As an example, consider the following source document:

Tree Fig Date<tab>October 27, 2003 (pruned)Tree Date Palm Date April 27, 2003<tab>(planted)

The example assumes that the source document has a loose structure, containingvarying numbers of spaces, tabs, or other symbols interspersed in the text, so wecannot easily use the spaces and tabs as delimiters. An example like this mightarise in parsing word-processor documents.

We can parse this document using a RepeatingGroup anchor, which contains nestedMarker and Content anchors. The Marker anchors are the strings Tree and Date. TheContent anchors are everything between the Marker anchors, including the spacesand tabs.

The problem in parsing this document is in the second iteration of theRepeatingGroup, which parses the second line. If we leave the Marker anchors in theinitial phase, Conversion Agent incorrectly considers the first instance of the wordDate to be a Marker. In the main phase, it fails to find Date Palm because the searchscope is between the two Marker anchors, and there is no text between them.

A possible solution is to move the Marker for Date to the main phase, and to definethe Content anchor (Date Palm) using an expression that searches for a tree nameof one or two words. In the initial phase of the RepeatingGroup, Conversion Agentfinds the Marker for Tree. In the main phase, it finds Date Palm followed by theMarker for Date.

With the new phase setting, we have changed the search scope for the tree name.The scope is now from Tree to the end of the iteration, and Conversion Agent findsDate Palm successfully.

Marking Property

Consider the following source-document structure:

MARKER%%%CONTENT A^^^CONTENT B

Let's suppose that the sequence of Content A and Content B varies among thesource documents. In some documents, Content B precedes Content A.

In that case, the search criteria are:

Page 91: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

80

Content A and Content B both follow the Marker anchor.

Content A begins with %%%, and Content B begins with ^^^.

By default, the search scope for Content A is from the end of the Marker to the endof the document. The search scope for Content B is from the end of Content A tothe end of the document. This won't work because in some source documents,Content A and Content B are reversed.

The solution is to change the search scope for Content B . You can do this by settingthe marking property of Content A. The marking property specifies whereConversion Agent should place the reference points, which determine the start andend of the search scope.

The default setting is marking = full, which means that Conversion Agent placesreference points before and after each anchor. The search scope for Content Bbegins at the last reference point, which is the one following Content A. This leadsto incorrect parsing, as we have seen.

You need to prevent Conversion Agent from placing reference points aroundContent A. You can do that by setting the marking property of Content A to none.As a result, the search scope for Content B starts at the end of the Marker. Thisallows Conversion Agent to find Content B, even if it precedes Content A .

The following table describes all four possible values of the marking property. TheResult column assumes that you assign the marking value to Content A in theabove example.

markingproperty

Explanation Result

full Conversion Agent placesreference marks at the beginningand end of the current anchor.This is the default behavior.

Conversion Agent seeks the next anchorafter the end of the current anchor (ContentB follows Content A).

beginposition

Conversion Agent places areference mark only at the start ofthe current anchor.

Conversion Agent seeks the next anchorafter the start of the current anchor (ContentB overlaps or follows Content A).

endposition

Conversion Agent places areference mark only at the end ofthe current anchor.

Conversion Agent seeks the next anchorafter the end of the current anchor (ContentB follows Content A).

none Conversion Agent does not placeany reference marks at thecurrent anchor.

Conversion Agent seeks the next anchorafter the end of the preceding anchor(Content B follows Marker, without regardto Content A).

Page 92: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

81

There are a few circumstances where you must use an anchor that marks a reference point.An example is the separator of a RepeatingGroup. If the separator does not mark, it doesnothing. Conversion Agent Studio displays a warning if you attempt to use a non-markinganchor in a location where marking is required.

Online SamplesIn the Conversion Agent Samples folder, openProjects\Marking_Mode\Marking_Mode.cmw. The sample demonstrates the use ofthe marking property to alter the search scope for a Content anchor.

For a second example, see Projects\NonMarker\NonMarker.cmw. This sample usesthe marking = none option, permitting two Content anchors to overlap. The samplealso illustrates the use of direction = backward to search from the end of thescope.

Adjusting the Search Criteria

Conversion Agent can search for anchors according to a large number of searchcriteria, for example:

According to the delimiter locations, which Conversion Agent learns from theexample source

According to the positional offset (number of characters) from a precedingreference point

By searching for particular text

By searching for a pattern (a regular expression, which is an enhancedwildcard search)

By searching for a specified data type

By searching for an attribute value

You can combine these search criteria in almost any way. For example, you mightspecify that a Content anchor begins two tabs after a Marker anchor, and that it is10 characters long. If you do this, you are using a delimiter criterion to define thebeginning of the Content anchor, and an offset criterion to define the end.

The components that perform these searches are called searcher components. Theiruse is described in the Anchor Component Reference (especially in the Content andMarker anchors). The details of the searcher components are explained in theSearcher Component Reference.

Using XSD Data Types to Narrow the Search Criteria

By default, in addition to the other search criteria, Conversion Agent searches for aContent anchor according to the XSD type of its data holder.

For example, suppose that the search scope of a Content anchor is the followingstring.

Page 93: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

82

The students' grades were 81, 56, and 95, respectively.

Further suppose that you define no other search criteria for the anchor. If you mapthe anchor to a data holder that has a type of xs:string, the anchor retrieves theentire string.

If the data holder has a type of xs:integer, Conversion Agent searches for the firstsubstring that matches the data type. Assuming that you configure the anchor withdirection = forward, the anchor retrieves the integer 81. If direction = backward(in other words, the search is from the end of the search scope), the anchorretrieves 95.

Now suppose the data holder has a type of xs:integer, and the schema restrictsthe data holder to values less than 60. Conversion Agent searches for an integerthat conforms with the restriction, and returns 56.

XSD Types in Combination with Other Search Criteria

You can combine an XSD-type criterion with other search criteria. In the aboveexample, suppose you configure the Content anchor to search for the regularexpression

[",.*,"]

which searches for two commas, separated by any characters other than a newline.The search finds the substring

, 56,

If the XSD type of the data holder is xs:integer, the anchor retrieves 56.

List Types

A data holder can have an XSD list type, which is a space-separated list.Conversion Agent filters the text retrieved by the Content anchor to match the XSDtypes of the list items.

Suppose, for example, that the schema defines an attribute called grades , which isa list of xs:integer items. If you map the above Content anchor to grades, theanchor returns a list of the integers in the string, or 81 56 95. If the grades attributebelongs to an element called Students, the XML output is:

<Students grades=”81 56 95" />

If you define the Content anchor with direction = backward, the list is reversed:

<Students grades=”95 56 81" />

Decimal TypeIf a data holder has the xs:decimal type, Conversion Agent assumes that thedecimal separator is a period. If your locale setting uses a comma as the decimalseparator, an xs:decimal search may fail.

Page 94: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

83

Type Search with Closing Marker

If a Content anchor has a closing_marker property, but does not have anopening_marker, Conversion Agent returns the substring that is closest to theclosing_marker, which matches the XSD type of the data holder.

In the above example, if you define the word respectively as the closing_marker,and the data holder has a type of xs:integer, the anchor retrieves 95.

Online SampleIn the Conversion Agent Samples folder, open Projects\Pattern\Pattern.cmw.

The sample is a parser containing a single Content anchor. The anchor is mappedto an XML element, which the XSD schema restricts to a pattern (an xs:patternelement, which uses a regular expression to define the acceptable charactersequences). The anchor outputs the portion of the source document that matchesthe pattern.

Disabling the Data-Type SearchYou can disable the data-type search by selecting the disable_XSD_type_searchproperty of the Content anchor. If you do that, the anchor search according to theother criteria, without regard to the XSD type of the data holder.

If the result does not have the proper type, it cannot be stored in the data holderand the anchor fails. You can use transformers to convert the result to the propertype, and prevent the failure (see Chapter 8, Transformers).

For example, suppose that the source document contains a date in the dd-mm-yyyyformat, and you want to store the date in an xs:date data holder. You can do thisin the following way:

1. Define a Content anchor that retrieves the dd-mm-yyyy data, ignoring themismatch with the xs:date type.

2. Configure the anchor with a DateFormat transformer, which converts theresult to xs:date.

Anchors that Contain Nested Anchors

An interesting question is how a parser searches for an anchor that has nestedanchors, such as a Group anchor.

Conversion Agent does not search for a Group, and then search for the nestedanchors. Rather, it searches (within the search scope) for the nested anchors. Theextent of the Group is defined by the nested anchors that Conversion Agent finds.

For example, suppose a parser has the following sequence of anchors. We assumethat the anchors have default phase, marking, and optional properties.

Marker AGroup

Marker BContent C

Page 95: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

84

Marker DMarker E

Conversion Agent searches first for Marker A and Marker E. The search scope of theGroup is the region between Marker A and Marker E.

Within the search scope of the Group , Conversion Agent then searches for Marker Band Marker D. The region between these Marker anchors is the search scope forContent C.

Within the latter search scope, Conversion Agent searches for Content C .

You can view these relationships in the example pane of Conversion Agent Studio.The example pane highlights the nested anchors, helping you visualize the extentof the Group.

Anchor Quick Reference

The following table briefly describes the anchors that Conversion Agent supports.For complex information, see the Anchor Component Reference below.

We have attempted to categorize the anchors according to their maincharacteristics or purpose. Within each category, the anchors are listed inalphabetical order.

Simple AnchorsThe anchors in this category are used to define simple text elements in a document.

ContentRetrieves text from a specified location in a source document and stores thetext in a data holder

MarkerDefines a reference point in the source text, which the parser uses to search forother anchors.

Grouping Anchors

These anchors group a set of nested anchors together.

DelimitedSectionsDefines sections of a document, which are delimited by a separator.

EnclosedGroupDefines a bounded segment of the source document.

GroupBinds a set of anchors together for processing as a unit.

RepeatingGroupParses a repetitive section of a document.

Page 96: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

85

Other Anchors

AlternativesSpecifies alternative anchors that may exist at a particular location in a sourcedocument.

EmbeddedParserActivates a secondary parser that runs on a segment of the source document.

FindReplaceAnchorMarks text for replacement (used with the TransformByParser transformer).

HtmlFormDefines an HTML form. The anchor submits the form to a web server and runsa secondary parser on the server response.

Anchor Component Reference

This section describes the anchor components that are available in ConversionAgent.

Alternatives

The Alternatives anchor lets you define a set of alternative, nested anchors. Youcan define a criterion for which alternative the parser should accept. Only theaccepted anchor affects the parser output. The other anchors (whether failed orsuccessful) have no effect on the parser output.

Example

Suppose you are parsing a document in which the date can appear in either of thefollowing patterns:

21/10/03October 21, 2003

To process this content, you can define an Alternatives anchor that contains twoContent anchors, which store their output in different XML elements. Each XMLelement is constrained to accept one of the date patterns. The Alternatives anchoris configured with selector = ScriptOrder .

When the parser runs the Alternatives anchor, it tests the first Content anchor. Ifthe date matches the pattern of the first anchor, the Content anchor succeeds. If thedate does not match the pattern, the Content anchor fails, and the Alternativesanchor tests the second Content anchor. In this way, the parser can process bothdate patterns.

How to Define

Add an Alternatives anchor by editing the IntelliScript. Nested with theAlternatives anchor, add the alternative anchors.

Page 97: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

86

Basic Properties

selectorThe criterion for deciding which alternative to accept. The options are:

selector property Explanation

ScriptOrder Conversion Agent tests the nested anchors in the sequence thatthey are defined in the IntelliScript. It accepts the first nestedanchor that succeeds.If all the nested anchors fail, the Alternatives anchor fails.

DocumentOrder Conversion Agent tests all the nested anchors. It accepts eitherthe first or last successful nested anchor, according to thelocations of the anchors in the source document.If all the nested anchors fail, the Alternatives anchor fails.

NameSwitch Conversion Agent searches for the nested anchor whose nameproperty is specified in a data holder (select from a Schema view).It ignores the other nested anchors.If the named nested anchor fails, the Alternatives anchor fails.

Advanced PropertiesFor explanations of the following properties, see Standard Anchor Properties.

name

remark

disabled

optional

marking

phase

Using Alternatives to Select a Secondary Parser

You can use an Alternatives anchor to control which of several secondary parsersprocesses a document. The main parser can use this feature to process sourcedocuments of multiple types.

For example, suppose that the home page of a newspaper web site has links tonews articles. Following each link, the article is labeled News, Business, or Sports.You want to parse the articles, using a different parser for each type, like this:

<a href="PrincessWeds.html">Norwegian Princess Weds</a> - News<a href="BanksMerge.html">Local Banks to Merge</a> - Business<a href="HomeTeamWins.html">Bears Trounce Antelopes</a> - Sports

Page 98: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

87

One way to do this is as follows:

1. The main parser retrieves the filename of an article and stores it in a variable.

2. The main parser contains an Alternatives anchor, which is configured withthe DocumentOrder option.

3. The Alternatives anchor contains nested Group anchors.

4. Each Group anchor is configured with a Marker anchor and a RunParser action,as follows:

- The first Group contains a Marker that searches for the string News. The Groupis configured with a RunParser action, which runs a secondary parser calledNewsParser.

- The second Group contains a Marker that searches for Business and runsBusinessParser.

- The third Group contains a Marker that searches for the Sports and runsSportsParser.

The Alternatives anchor tests all three Group anchors. It accepts the Groupcontaining the first Marker that occurs after the filename. The Group runs theappropriate parser on the file.

Online Sample

In the Conversion Agent Samples folder, openProjects\Alternatives\Alternatives.cmw. The sample uses Alternatives anchorsto parse different name and date formats that may exist in a source document.

Content

A Content anchor retrieves text from the source document. It stores the retrievedtext in a data holder.

ExampleFor many examples and exercises using this anchor, see Getting Started withConversion Agent.

How to Define

You can create a Content anchor by the select-and-click approach, by the drag-and-drop approach, or by editing the IntelliScript directly. For instructions, see DefiningAnchors above.

Basic Propertiesopening_marker

A searcher component, which labels the start of a region, in which ConversionAgent should search for the Content anchor.

Page 99: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

88

Defining this property is similar to defining a Group, which contains a Markerfollowed by a Content anchor.

The possible property values are NewlineSearch, PatternSearch,OffsetSearch, and TextSearch. For example, a NewlineSearch means thatConversion Agent should search for the anchor after a newline character. ATextSearch means to search after a specified text string. For details, see theSearcher Component Reference below.

closing_markerA searcher component, which labels the end of a region, in which ConversionAgent should search for the Content anchor.

Defining this property is similar to defining a Group, which contains a Contentanchor followed by a Marker. Defining both opening_marker andclosing_marker is similar to defining a Group that contains a Marker ContentMarker sequence.

The property values are the same as for opening_marker.

valueSpecifies a searcher component, which searches for the text retrieved by theContent anchor (see the Searcher Component Reference below).

The search is between opening_marker and closing_marker. If opening_markeris not defined, the search is from the preceding anchor or reference point. Ifclosing_marker is not defined, the search is to the end of the document (or tothe end of a region defined in another way, such as the end of a Group). Formore information about the search algorithm, see How a Parser Searches forAnchors.

The options are:

value Explanation

(empty) The Content anchor retrieves the entire search scope.

AttributeSearch The Content anchor retrieves the value from an expression ofthe type AttributeName = .... This is useful, for example, toretrieve attribute values from an XML or HTML source document.

LearnByExample The parser learns what text to retrieve according to the parserformat and the example source.For example, if the parser has a tab-delimited format, it counts thenumber of tabs from the start of the search scope to the exampletext. It retrieves the text between the corresponding tabs in thesource document.

PatternSearch The Content anchor retrieves the first text that matches aspecified regular expression.

TypeSearch The Content anchor retrieves the first text that matches aspecified XSD data type.

Page 100: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

89

data_holderA data holder where the anchor should store the retrieved text (select from aSchema view).

In addition to the searcher components, Conversion Agent uses the XSD type of thedata_holder as a search criterion. For information, see Using XSD Data Types toNarrow the Search Criteria above.

Advanced Properties

allow_empty_valuesIf selected, the Content anchor can be empty. The data_holder is assigned anempty value.

This can occur, for example, if the anchor is configured with value =LearnByExample and there is nothing between the delimiters. It can also occurif there is nothing between the opening_marker and the closing_marker.

If allow_empty_values is not selected in these situations, the anchor fails.

disable_XSD_type_searchIf not selected, the anchor searches for data within its search scope thatmatches the XSD type of the data holder (see Using XSD Data Types to Narrowthe Search Criteria).

If selected, the anchor searches without regard to the XSD type. If the result(following the application of any transformers) does not have the proper type,the data cannot be stored in the data holder and the anchor fails.

ignore_default_transformersThis property applies to anchors that retrieve content from the sourcedocument. If selected, the anchor does not apply the parser's defaulttransformers to the content (see Chapter 8, Transformers).

transformersThis property applies to anchors that retrieve content from the sourcedocument. The property is a sequence of transformers that Conversion Agentshould apply to the retrieved text (see Chapter 8, Transformers).

For explanations of the following properties, see Standard Anchor Properties.

Page 101: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

90

name

remark

disabled

optional

direction

marking

phase

The direction property has multiple effects in a Content anchor. If direction =backward:

Conversion Agent searches backwards from the end of the search scope for theopening_marker and closing_marker (opening_marker still precedesclosing_marker, however).

The searcher component searches backward from the end of the search scope.

If the searcher component is LearnByExample, it counts the delimitersbackward from the end of the search scope.

Online Sample

In the Conversion Agent Samples folder, open Projects\Content\Content.cmw. Thesample illustrates several uses of the opening_marker, closing_marker, and valueproperties to configure Content anchors.

DelimitedSections

The DelimitedSections anchor is used for parsing sectioned data, which aredelimited by a separator.

Within the DelimitedSections, you should nest other anchors. Each nested anchoris responsible for parsing a single section.

ExampleAn employee resume form contains several sections, each of which is preceded bya line of hyphens:

----------------------------Jane PalmerEmployee ID 123456----------------------------Professional Experience...----------------------------Education...

Page 102: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

91

You can define the sectioned region as a DelimitedSections anchor, with the lineof hyphens as the separator. Because the line of hyphens precedes each section,you should define the separator_position as before.

Within the DelimitedSections anchor, nest three Group anchors. The first Groupparses the Jane Palmer section, the second Group parses the ProfessionalExperience section, and so forth.

Optional SectionsIn the above example, suppose that the second section, Professional Experience,is missing from some source documents. Its separator (the line of hyphens) isalways present.

----------------------------Jane PalmerEmployee ID 123456--------------------------------------------------------Education...

To handle this situation, you should configure the DelimitedSections in thefollowing way:

In the second Group anchor, select the optional property. This means that ifthe Group fails, it should not cause the DelimitedSections to fail.

In the DelimitedSections anchor, set using_placeholders = always. Thismeans that the anchor should look for the separator of the optional section,even if the section itself is missing.

Now suppose that if the Professional Experience section is missing, its separatoris also missing.

----------------------------Jane PalmerEmployee ID 123456----------------------------Education...

In this case, you should configure the DelimitedSections as follows:

In the second Group anchor, select the optional property.

In the DelimitedSections anchor, set using_placeholders = never. Thismeans that the anchor should not look for the separator of a missing section.

How to Define

Add a DelimitedSections anchor by editing the IntelliScript. Nested with theDelimitedSections anchor, add a sequence of anchors that parse the sections.

Page 103: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

92

Basic Properties

separator_positionPosition of the separator relative to the sections.

The following table explains the values. The examples assume that theseparator is a vertical-line character ( | ).

separator_position Explanation Example

before There is a separator before each section (includingthe first section).

|1|2|3|4

after There is a separator after each section (including thelast section).

1|2|3|4|

between There is a separator between the successivesections (not before the first section and not after thelast section).

1|2|3|4

around There are separators before and after each section(including the first and last sections).

|1|2|3|4|

using_placeholdersThis property specifies whether the DelimitedSections should look for theseparator of an optional section that is missing from the source document.

The following table explains the values. The examples assume that the sourcedocument has the structure |1|2|3|4 . The examples illustrate the structure ifsections 2 and 4 are missing.

using_placeholders Explanation Example

always The separator of a missing section always exists. |1||3|

never The separator of a missing section never exists. |1|3

when necessary The separator of a missing internal section alwaysexists. The separator of a missing terminal sectionnever exists.

|1||3

separatorAn anchor (typically a Marker) that delimits the sections.

Advanced PropertiesFor explanations of the following properties, see Standard Anchor Properties.

Page 104: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

93

name

remark

disabled

optional

marking

phase

Online Sample

In the Conversion Agent Samples folder, openProjects\DelimitedSections\DelimitedSections.cmw. The sample illustrates aDelimitedSections anchor, which parses sections that are separated by a | symbol.Each section is parsed by a single Content anchor.

EmbeddedParser

The EmbeddedParser anchor uses a secondary parser to parse its search scope.

It is permitted for a parser to call itself recursively.

Example

A document is tab-delimited, except for one section that is comma-delimited.

To parse the document, you can define a main parser that uses the TabDelimitedformat. Define another parser, which uses the CommaDelimited format. Use anEmbeddedParser anchor to run the second parser within the execution of the firstparser.

How to Define

You can define an EmbeddedParser by editing the IntelliScript.

Basic Propertiesparser

The name of the secondary parser, which must be defined in the same project.

schema_connectionsConnects the output of the secondary parser to the output of the main parser.The property contains a list of Connect subcomponents, which specify thecorrespondence between data holders in the output of the two parsers (see theAnchor Subcomponent Reference).

Advanced Properties

source_transformersA sequence of transformers, which the parser should apply to the search scopebefore the secondary parser processes it.

Page 105: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

94

For explanations of the following properties, see Standard Anchor Properties.

name

remark

disabled

optional

marking

phase

Online Sample

In the Conversion Agent Samples folder, openProjects\EmbeddedParser\EmbeddedParser.cmw. The sample uses a main parser todetermine the location of an address. It then runs an EmbeddedParser, which parsesthe address.

EnclosedGroup

The EnclosedGroup anchor lets you define a bounded region that contains nestedanchors.

The boundaries are specified by two opening and closing anchors. In the case ofnested boundaries (such as parentheses or HTML tags), the EnclosedGroup findsthe matching boundaries.

An EnclosedGroup is similar to a Content anchor with an opening_marker andclosing_marker. However:

The Content anchor retrieves the entire content between the opening andclosing, without further parsing.

The EnclosedGroup lets you further parse the content between the opening andclosing.

Example

You can define an HTML table as an EnclosedGroup, with the <table> and</table> tags as the opening and closing. The nested anchors parse the content ofthe table.

Suppose the <table> element contains a nested <table> element (that is, the nestedtable is located within a cell of the parent table). The EnclosedGroup anchormatches the parent <table> tag with the parent </table> tag. It does not match theparent <table> tag with the nested </table> tag, which would be amisidentification of the table.

For a tutorial exercise that uses this anchor, see the chapter Parsing Word and HTMLDocuments in Getting Started with Conversion Agent.

Page 106: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

95

How to Define

You can define an EnclosedGroup anchor by editing the IntelliScript. Add thenested anchors that parse the content.

Basic Properties

openingThe opening anchor of the EnclosedGroup (typically a Marker anchor).

closingThe closing anchor of the EnclosedGroup (typically a Marker anchor).

Advanced PropertiesThe following properties are useful in situations where the anchor must selectspecific occurrences of data holders. For an explanation, see Chapter 12, Locators,Keys, and Indexing.

source

target

For explanations of the following properties, see Standard Anchor Properties.

name

remark

disabled

optional

marking

phase

no_initial_phase

FindReplaceAnchor

This anchor is intended for use within a parser that is activated by theTransformByParser transformer. The anchor marks text in the source, and specifiesa replacement for the text. When the parsing is done, the TransformByParsertransformer uses the markings to modify the text.

FindReplaceAnchor identifies the text to replace in the following way:

If FindReplaceAnchor does not contain any nested anchors, it replaces thecomplete text within its search scope. For example, if FindReplaceAnchor isbetween two Marker anchors, it marks the text between them.

If FindReplaceAnchor contains nested anchors, it replaces the text spanned bythe nested anchors. For example, if FindReplaceAnchor contain a Marker , itreplaces the Marker. If it contains two Marker anchors, it replaces the segmentfrom the first Marker to the second (including the Marker anchors themselves).

Page 107: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

96

You can configure the anchor with a static replacement string, or with a string thatthe parser retrieved dynamically from the source document.

Example

You have a text document, to which you want to add line numbers.

1. Create a parser, and add a RepeatingGroup to it.

2. Within the RepeatingGroup , add a FindReplaceAnchor.

3. Within the FindReplaceAnchor , add a Marker anchor, and set its searchproperty to NewlineSearch . This causes the FindReplaceAnchor to mark everynewline in the document.

4. Configure the RepeatingGroup to store its current_iteration in a variable. Setthe replace_with property of the FindReplaceAnchor to the variable.

5. Add a global instance (not within the parser) of a TransformByParsertransformer. Set the parser property of TransformByParser to the parser.

6. Run the TransformByParser instance as a stand-alone transformer (see UsingTransformers as Runnable Components in Chapter 8 Transformers).

7. The transformer outputs a modified version of the original file, which containsline numbers. You can find the output in the Results folder of the project.

Basic Propertiesreplace_with

Type the replacement string. Alternatively, you may click the browse buttonand select a data holder that contains the text.

Advanced Properties

on_partial_matchIf the FindReplaceAnchor does not find all its nested, non-optional anchors,and on_partial_match has a value of fail, the FindReplaceAnchor fails.

If on_partial_match has a value of skip, the FindReplaceAnchor removes thearea spanned by the successful nested anchors from its search scope and triesto find all the nested anchors again. It may succeed on the second try. Ititerates this procedure, as long as there is a partial match.

The following properties are useful in situations where the anchor must selectspecific occurrences of data holders. For an explanation, see Chapter 12, Locators,Keys, and Indexing.

source

target

For explanations of the following properties, see Standard Anchor Properties.

Page 108: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

97

name

remark

disabled

optional

marking

phase

no_initial_phase

Online Sample

For an online sample of this anchor, see TransformByParser in Chapter 8,Transformers.

Group

The Group anchor binds a sequence of anchors and actions together. It lets youapply properties to all the nested components, together.

For example, a Group lets you define operations that Conversion Agent shouldperform on a set of anchors, or it lets you control the phase of the nested anchors.

How to Define

You can define a Group by editing the IntelliScript. Add nested anchors (andoptionally actions) that parse the content of the Group .

Optional Group

You can use the optional property of a Group to prevent Conversion Agent fromattempting to retrieve text from a missing section of a document.

For example, to parse the source

First name: Ron

you might define First name: as a Marker and Ron as Content. If some sourcedocuments do not contain the first-name data, you can put the Marker and Contentin a Group and make it optional.

If First name: is not found, the Group immediately fails. The parser does notsearch for the Content anchor. This is the behavior we desire.

There is a difference between making the Group optional and making its nestedanchors optional. If you make both the Marker and Content optional, instead of theGroup, Conversion Agent ignores the Marker failure and searches for the Content.This might result in retrieving irrelevant text.

Page 109: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

98

Advanced Properties

absentIf selected, the Group succeeds only if one of its nested, non-optional anchorsor actions fails. You can use this feature to test for the absence of nestedanchors.

on_partial_matchIf the Group does not find all its nested, non-optional anchors, andon_partial_match has a value of fail, the Group fails.

If on_partial_match has a value of skip, the Group removes the area spannedby the successful nested anchors from its search scope and tries to find all thenested anchors again. It may succeed on the second try. It iterates thisprocedure, as long as there is a partial match.

search_orderThe order in which to process the nested anchors. The options are:

search_order Explanation

top-down Processing the nested anchors in the orderthat is defined in the IntelliScript.

bottom-up Process the nested anchors in reverse order.This is useful if you need to retrieve data froma later anchor, which affects how you processan earlier anchor.

The following properties are useful in situations where the anchor must selectspecific occurrences of data holders. For an explanation, see Chapter 12, Locators,Keys, and Indexing.

source

target

For explanations of the following properties, see Standard Anchor Properties.

Page 110: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

99

name

remark

disabled

optional

marking

phase

no_initial_phase

Online Sample

In the Conversion Agent Samples folder, openProjects\persistent_search\persistent_search.cmw.

The sample illustrates a Group that is configured with the on_partial_match =skip property. The Group contains two Marker anchors:

The first Marker searches for the text A.

The second Marker searches for a pattern (a string containing any number of *characters), and has the adjacent property, which means that it must beadjacent to the first Marker.

On the first pass, the Group finds an A character at the beginning of the sourcedocument. It does not find the second Marker adjacent to the A character, however.

The Group therefore reduces its search scope by eliminating the first A character,and searches again for the two adjacent Marker anchors. It continues this procedureuntil it successfully finds a string A*, which contains the adjacent Marker anchors.

You can observe the behavior in the event log. The log records that the Group failson the first two trials, and succeeds on the third.

Try experimenting with the on_partial_match and adjacent settings. You can seethe effect in the color coding of the example source.

You can also try running the sample, although the result file is empty because theparser does not contain Content anchors. If you set on_partial_match = fail, youcan observe in the event log that the parser fails, because the Group cannot find theadjacent anchors.

HtmlForm

HtmlForm is an anchor that marks a <form> element in an HTML source document.It submits the form to a web server, which is specified in the action attribute of theform. It then activates a secondary parser, which parses the server response.

Within the field_filters property of HtmlForm, you can modify the fields andvalues of the form. The HtmlForm anchor collects the possible values of the formfields, combines them, and submits all the combinations. You can distribute thesubmissions over multiple computers.

Page 111: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

100

HtmlForm appends the parsed output of the web-server responses to the mainparser output.

Configuration TipsWe suggest the following approach for configuring an HtmlForm anchor:

1. Prepare and run the anchor without any filters.

2. In the example pane, confirm that the HtmlForm anchor highlights the correctform.

3. Examine the Results\_HtmlForm.xml file, which contains the form data thatwas submitted. Confirm that the anchor included all the fields.

4. Now you can add filters or adjust the fields as desired.

How to Define

You can define an HtmlForm by editing the IntelliScript.

Basic Propertiesnext_parser

The name of a secondary parser, which parses the server response.

field_filtersAdds fields and their values to the form data, which the anchor submits.Specify a sequence of AddField , ModifyField , and RemoveFieldsubcomponents, which generate the desired fields (see the AnchorSubcomponent Reference).

Be sure to use the same field names as in the original HTML form.

clickSpecifies the HTML element that a simulated user clicks to submit the form.The options are ImageClick and SubmitClick (see the Anchor SubcomponentReference).

Advanced Properties

part_to_submitSelect the portion of the possible field-value combinations to submit. Theoptions are SubmitAll, SegmentIndex, and SegmentSize (see the AnchorSubcomponent Reference).

retriesThe number of retries, if the anchor cannot connect to the web server on thefirst attempt.

seconds_to_waitThe interval in seconds between retries.

Page 112: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

101

js_functionThe name of a JavaScript function that exists in the source document. Theanchor calls the function before it submits the form.

js_paramsA list of data holders containing parameters of js_function (select from theSchema view). The parameters must be in the same order as in the functiondeclaration.

For explanations of the following properties, see Standard Anchor Properties.

name

remark

Marker

A Marker defines a location in a source document. It is used as a reference point,from which Conversion Agent searches for the succeeding anchors.

Example

For many examples and exercises using this anchor, see Getting Started withConversion Agent.

How to DefineYou can define a Marker by the select-and-click method or by editing theIntelliScript. For instructions, see Defining Anchors above.

Basic Propertiessearch

Defines the search criteria for the Marker . The search criteria determine wherethe Marker is located within the search scope (see How a Parser Searches forAnchors). For example, a NewlineSearch locates the Marker at a newlinecharacter. A TextSearch locates the Marker at a specified string.

The value of this property is one of the following searcher components (see theSearcher Component Reference).

search Explanation

NewlineSearch Searches for a newline character.

TextSearch Searches for a predefined text string, or a text string that isdynamically defined (contained in a data holder).

PatternSearch Searches for a string that matches a specified regular expression.

Page 113: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

102

search Explanation

OffsetSearch Skips a predefined number of characters following the precedingreference point, or a number of characters that is dynamicallydefined (contained in data holder).The Marker is the point following the skipped characters.

TypeSearch Searches for a string that conforms to a specified XSD data type.

Advanced Properties

adjacentIf selected, the Marker must be adjacent to the anchor at the beginning of itssearch scope (if direction = backward, adjacent to the anchor at the end of itssearch scope). If not selected (which is the default), Conversion Agent can skipover text until if finds the Marker.

absentIf selected, the Marker is a test that the specified text or pattern is absent fromthe document. If Conversion Agent finds the Marker, the parser or groupcontaining the Marker fails.

countThe occurrence number to find. The default is 1. To set the Marker at thesecond newline following the preceding anchor, for example, set search =NewlineSearch and count = 2.

For explanations of the following properties, see Standard Anchor Properties.

name

remark

disabled

optional

direction

marking

phase

By default, the phase property of a Marker is initial, which means thatConversion Agent scans a document for Marker anchors before it searches forContent anchors.

Online Sample

In the Conversion Agent Samples folder, open Projects\Markers\Markers.cmw. Thesample demonstrates Marker anchors that search for:

A predefined text string

A newline character

Page 114: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

103

An offset

A data type

A regular expression

If you run the parser, note that the result file is empty because the configurationdoes not have any Content anchors.

RepeatingGroup

The RepeatingGroup anchor parses a repetitive region of a source document.

The repeating units (called iterations) are typically delimited by a separator. TheRepeatingGroup contains a sequence of nested anchors and actions, which parseeach iteration.

The RepeatingGroup anchor treats all iterations in the same way. To parse a semi-repetitive region containing sections that require differing treatment, you can use aDelimitedSections anchor, instead.

Example

For detailed examples and exercises using this anchor, see Getting Started withConversion Agent.

How to Define

You can define a RepeatingGroup by editing the IntelliScript. Add the nestedanchors (and optionally actions) that parse each iteration of the RepeatingGroup.

Search for Iterations

By default, a RepeatingGroup searches for iterations from the beginning to the endof its search scope (see How a Parser Searches for Anchors). Optionally, you can setthe iteration_order property for a reverse search.

In each iteration:

If the RepeatingGroup is configured with a separator, it searches for the nextseparator. Then, it searches for the anchors, which lie between a pair ofseparators.

If the RepeatingGroup is not configured with a separator, it searches only forthe anchors.

End of a RepeatingGroup

You can signal the end of a RepeatingGroup in ways such as the following:

The RepeatingGroup can continue until the end of the document.

Page 115: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

104

You can insert a Marker after the RepeatingGroup. You should configure theMarker in an earlier search phase than the RepeatingGroup (this is the default).This causes the parser to search for the Marker first, and use it to limit thesearch scope of the RepeatingGroup (see Adjusting the Search Phase).

You can set the count property, which limits the search to a maximum numberof iterations.

If the RepeatingGroup does not have a separator, it ends when the parsercannot find any more iterations.

Success or Failure of a RepeatingGroup

If a RepeatingGroup cannot find the non-optional anchors in an iteration, theiteration fails.

When an iteration fails, the RepeatingGroup can either end, fail, or skip the failediteration. The behavior is as follows:

If the RepeatingGroup does not have a separator, the RepeatingGroup ends.Provided that there was at least one successful iteration prior to the failediteration, the RepeatingGroup succeeds.

If the RepeatingGroup has a separator, and the skip_failed_iterationsproperty is not selected, the RepeatingGroup fails.

If the RepeatingGroup has a separator, and the skip_failed_iterationsproperty is selected, Conversion Agent skips over the failed iteration andproceeds with the next iteration. Provided that at least one iteration succeeds,the RepeatingGroup succeeds.

Event Log of a RepeatingGroupThe Conversion Agent event log records events for every iteration of aRepeatingGroup.

If the skip_failed_iterations property is selected, the RepeatingGroup maygenerate an optional failure event (symbolized by a icon, which follows thesuccessful iterations. A failure event ( icon) may be nested within the optionalfailure. These events occur because the RepeatingGroup cannot find additionaliterations to parse. The events are normal and not a cause for concern.

For example, if the source document contains two iterations, the log may displayan optional failure and a failure because the RepeatingGroup cannot find a thirdinstance of its separator.

Basic Propertiesseparator_position

Position of the separator relative to the iterations of the RepeatingGroup.

The following table explains the options. The examples assume that theseparator is a vertical-line character ( | ).

Page 116: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

105

separator_position Explanation Example

before There is a separator before each iteration (including thefirst iteration).

|1|2|3

after There is a separator after each iteration (including the lastiteration).

1|2|3|

between There is a separator between the successive iterations (notbefore the first iteration and not after the last iteration).

1|2|3

around There are separators before and after each iteration(including the first and last iterations).

|1|2|3|

separatorAn anchor (typically a Marker) that delimits the iterations.

If you leave this property empty, the RepeatingGroup does not look for adelimiter between the iterations. Instead, it assumes that an iteration isfinished when it has found all the nested anchors. It then starts to parse thenext iteration from the top of the nested anchor sequence.

You can build a complex separator by inserting a Group in the separatorproperty, instead of a Marker.

Advanced Propertiesskip_failed_iterations

This option has an effect only if the RepeatingGroup has a separator.

By default, this option is selected. This means that the RepeatingGroup skipsover a failed iteration and proceeds with the next iteration. Provided that atleast one iteration succeeds, the RepeatingGroup succeeds.

If you deselect the option, the RepeatingGroup fails if any iteration fails.

search_orderThe order in which to process the nested anchors within each iteration. Theoptions are:

search-order Explanation

top-down Processing the nested anchors in the orderthat is defined in the IntelliScript.

bottom-up Process the nested anchors in reverse order.This is useful if you need to retrieve data froma later anchor, which affects how you processan earlier anchor.

Page 117: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

106

iteration_orderThe order in which to process the iterations. The options are the same as forsearch_order, but apply to the iterations rather than to the anchors within aniteration.

countThe number of iterations to run. You may enter a number, or click the browsebutton and select a data holder that contains the number. If blank, theiterations continue until the search scope is exhausted.

If count = 0, the RepeatingGroup does not search for iterations. In this case, theRepeatingGroup succeeds, but it does not produce any output.

current_iterationA data holder, where the RepeatingGroup should output the number of thecurrent iteration (select from a Schema view).

on_partial_matchThis option controls the behavior if, in a particular iteration, theRepeatingGroup finds some but not all of its nested, non-optional anchors.

In such a case, if on_partial_match has a value of fail, the iteration fails.

If on_partial_match has a value of skip, the RepeatingGroup removes the areaspanned by the successful nested anchors from its search scope and tries tofind all the nested anchors again. It may succeed on the second try.

The removal-retry procedure is repeated until the iteration succeeds, or untilthere is no longer a partial match. In the latter case, the iteration fails.

The following properties are useful in situations where the anchor must selectspecific occurrences of data holders. For an explanation, see Chapter 12, Locators,Keys, and Indexing.

source

target

For explanations of the following properties, see Standard Anchor Properties.

name

remark

disabled

optional

no_initial_phase

phase

marking

Online SamplesIn the Conversion Agent Samples folder, openProjects\Dynamic_And_RepeatingGroup\Dynamic_And_RepeatingGroup.cmw. Thesample uses a RepeatingGroup to iterate over the lines of a document.

Page 118: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

107

Some lines of the source document contain a parenthesized footnote reference,such as "(1)". The RepeatingGroup contains a Group, whose purpose is to parse thefootnote and insert its content in the output.

The Group contains a Content anchor that retrieves the footnote reference andstores it in a variable. The Group then performs a RunParser action, which activatesa secondary parser. The secondary parser finds the footnote referenced by thevariable, parses it, and inserts the result in the output.

For additional samples, see the book Getting Started with Conversion Agent. Thebook contains examples of:

Various configuration of the separator property

The use of a RepeatingGroup without an empty separator property

A nested iterative structure, which is parsed by a RepeatingGroupwithin aRepeatingGroup

Searcher Component Reference

This section documents the searcher components, which find text in a sourcedocument. The searcher components are used in many locations throughoutConversion Agent, for example:

To define the location of anchors

To define delimiter characters or strings (see Chapter 5, Formats)

To define the find_what string of a Replace transformer (see Chapter 8,Transformers)

AttributeSearch

This component searches a source document for a specified attribute, which occursin an expression of the type:

AttributeName = value

or

AttributeName = "value"

The component retrieves the value.

The component is a possible setting of the value property, which belongs to theContent anchor. For more information, see the Content anchor.

Example

An HTML document contains the element:

<img src='MyPicture.gif'>

Page 119: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

108

You can use AttributeSearch to retrieve the value of the src attribute. It returnsthe text MyPicture.gif .

Supported Attribute Syntax

AttributeSearch supports attribute strings containing an equals sign. Optionally,the equals sign can be surrounded by spaces. The attribute can be surrounded bydouble quotes, single quotes, or no quotes.

For example, suppose that AttributeSearch is configured to search for an attributecalled time. In all of the following examples, it returns the same value, 12:55:33.

time = 12:55:33time=12:55:33time = '12:55:33'time='12:55:33'time = "12:55:33"time="12:55:33"

Basic Propertiesatt

The attribute name.

Advanced Properties

match_caseIf selected, AttributeSearch considers the attribute name to be case sensitive.

Online Sample

In the Conversion Agent Samples folder, open Projects\Content\Content.cmw. Thesample illustrates the use of an AttributeSearch to parse a text document that hasa variable = value structure.

LearnByExample

This component learns how to search for text by examining the text location in theexample source document. It uses the parser format to interpret the sourcedocument.

For example, if the parser has a tab-delimited format, LearnByExample counts thenumber of tabs from the search start to the example text. It searches for text in thesource document that lies at the same number of tabs from the start of the searchscope.

The component is a possible settings of the value property, which belongs to theContent anchor. For more information, see the Content anchor.

Note that if the Content anchor is configured with direction = backward,Conversion Agent counts the delimiters from the end of the search scope.

Page 120: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

109

Basic Properties

exampleThe text in the example source document at the anchor location.

NewlineSearch

This component searches for a newline (a linefeed character, a carriage returncharacter, or both).

Anchors can use NewlineSearch to find newline markers. A Delimiter componentcan use NewlineSearch to find newline delimiters.

OffsetSearch

This component defines the number of characters between a reference point (forexample, the end of a Marker anchor) and an anchor (for example, the beginning ofa Content anchor). The number of characters can be either predefined or (in somecomponents where OffsetSearch is used) retrieved dynamically from a dataholder.

For more information, see the Marker and Content anchors.

Basic Propertiesoffset

The number of characters between the reference point and the anchor.

In some of the locations where OffsetSearch is used (such as in a Markeranchor), Conversion Agent Studio displays a browse button next to the offsettext box. Clicking the button display a Schema view. In the Schema view, youcan select a data holder that contains the number of characters.

Advanced Properties

allow_smaller_offsetIf the offset is beyond the search scope (for example, an offset defining thelength of a Content anchor, at the end of a document), allow a smaller offset.

PatternSearch

This component searches for a string that conforms to a regular expression.

Regular expressions are a way to define a text search criterion—something like awildcard search, but with greatly enhanced syntax. For more information about theuse of regular expressions in Conversion Agent, and for detailed references, see theRegularExpression transformer in Chapter 8, Transformers.

Anchors can use PatternSearch to find markers or content. The Delimitercomponent can use PatternSearch to find delimiters. The Replace transformer canuse PatternSearch to find the text to be replaced.

Page 121: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

110

Example

Suppose you want to define the string %%% (containing one or more % symbols) as adelimiter. Within the Delimiter component, you can use PatternSearch with thefollowing regular expression:

%+

In another example, suppose you want to define a comma and a semicolon asalternative delimiters, at the same level of the delimiter hierarchy. You can use theregular expression:

[,;]

Basic Propertiespattern

The regular expression.

Advanced Properties

escape_sequenceA prefix in the source document, such as a backslash character \, which causesthe parser to ignore an instance of the pattern.

SegmentSearch

This component searches for opening and closing markers in a text string. Itreturns the segment from the opening marker to the closing marker (including themarkers themselves).

The subcomponent is used in the Replace transformer to find text that is to bereplaced (see Chapter 8, Transformers).

Basic PropertiesOpening

The search criterion for the opening marker. The options are searchercomponents (TextSearch, PatternSearch, NewlineSearch, or OffsetSearch).

ClosingThe search criterion for the closing marker.

TextSearch

This component searches for an explicit string.

In some locations where TextSearch is used, it can also search for a string that isdefined dynamically, such as a string that the parser retrieves from the sourcedocument.

Page 122: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

111

Anchors can use TextSearch to find markers. The Delimiter component can useTextSearch to find delimiters. The Replace transformer can use TextSearch to findtext that is to be replaced.

Example

To define the string percent-percent-tab as a delimiter, create a Delimiter componentand set its search property to TextSearch. In the TextSearch/text property, type:

%%

Then press Ctrl+a, and type the ASCII code of a tab character:

009

Specifying a Search String Dynamically

In some of the locations where TextSearch is used (such as in a Delimitercomponent or a Marker anchor), Conversion Agent Studio displays a browsebutton to the right of the text box. Clicking the button display a Schema view. Inthe Schema view, you can select a data holder that contains the search text.

For example, suppose that you want to find repeated instances of the first word ina document. You can define a Content anchor that retrieves the first word andstores it in a variable. You can then define Marker anchors that use TextSearch tofind other instances of the variable's value.

Basic Propertiestext

The string to find. In locations where dynamic search is supported, you canspecify a data holder that contains the string (click the Browse button, andselect from a Schema view).

To type control characters, press Ctrl+a and enter their ASCII codes.

The IntelliScript displays a tab as «. It displays other special characters as anASCII code prefixed with a dot, for example, 010 for a newline.

Page 123: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

112

Advanced Properties

match_caseIf selected, text is required to match the text property exactly, with the sameupper and lower-case letter.

escape_sequenceA prefix in the source document, such as a backslash character \, which causesthe parser to ignore an instance of the string. In locations where dynamicsearch is supported, you can specify a data holder that contains the escapesequence (click the Browse button, and select from a Schema view).

Online SampleIn the Conversion Agent Samples folder, openProjects\Dynamic_And_RepeatingGroup\Dynamic_And_RepeatingGroup.cmw.

A Marker anchor, in the GetRemarkParser component of this sample, uses adynamically defined TextSearch to find a footnote at the end of the sourcedocument. For a full description of the sample, see the RepeatingGroup anchor.

TypeSearch

This component searches for an anchor of a specified XSD data type.

The component is a possible settings of the value property, which belongs to theContent anchor. For more information, see the Content anchor.

Basic Propertiesval_type

The XSD data type.

Anchor Subcomponent Reference

This section describes subcomponents that you can assign as the values of certainanchor properties.

AddField

This is an option of the HtmlForm property field_filters. It adds a field to besubmitted with an HTML form.

Optionally, you can define multiple values. The HtmlForm anchor submits allpossible combinations of the values, with the values of other fields, to the webserver.

Page 124: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

113

Basic Properties

field_nameName of the field.

filterThe way to generate the field values. The options are:

filter Explanation

ExcludeValues (Not applicable in the AddField component).

UseDataHolder Assigns a value that is contained in a data holder (selectfrom a Schema view). To assign multiple values, use amultiple-occurrence data holder.

UseValues Assigns one or more explicit values.

Connect

This component specifies a correspondence between two data holders. The twodata holders must have the same XSD data type.

Connect is used in the EmbeddedParser anchor to specify where a secondary parsershould store its output in the output of the main parser. It is used inEmbeddedSerializer to specify how the input data holders of a secondary serializerare related to the input data holders of the main serializer. It is used inEmbeddedMapper for a similar purpose, on both the input and output data holders.

Example

A secondary parser outputs an XML element called ID. You want the main parserto store this result in a variable called VarID. You can connect ID to VarID.

For an additional example, see the EmbeddedSerializer component (Chapter 10,Serializers).

Basic Properties

data_holderA data holder that is referenced in the main parser or serializer (select from aSchema view).

embedded_data_holderA data holder that is referenced in the secondary parser or serializer (selectfrom a Schema view).

Page 125: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

114

ImageClick

This subcomponent submits a form by simulating a user who clicks an image inthe HTML form.

The subcomponent is a possible value of the click property in the HtmlFormanchor.

The pixel_x and pixel_y properties are useful if the image has an area map(hotspot links). The properties indicate the location in the image where the userclicked.

image_nameThe name attribute of the image, which is specified in the HTML code.

pixel_xThe x-coordinate where the user clicked (in pixels from the left edge).

pixel_yThe y-coordinate where the user clicked (in pixels from the top edge).

ModifyField

This is an option of the HtmlForm property field_filters. It modifies the value of afield that is defined in the HTML code of a form.

Optionally, you can define multiple values. The HtmlForm anchor submits allpossible combinations of the values, with the values of other fields, to the webserver.

Basic Propertiesfield_name

Name of the field.

filterThe way to generate the field values. The options are:

filter Explanation

ExcludeValues Removes previously defined values.

UseDataHolder Assigns a value that is contained in a data holder (selectfrom a Schema view). To assign multiple values, use amultiple-occurrence data holder.

UseValues Assigns one or more explicit values.

Page 126: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

115

RemoveField

This is an option of the HtmlForm property field_filters. It removes a field that isdefined in the HTML code of a form.

Basic Propertiesfield_name

Name of the field.

SegmentIndex

This is an option of the HtmlForm property part_to_submit, which is used todistribute the form submissions between several computers.

SegmentIndex divides the set of field-value combinations into a specified numberof portions, and specifies which portion to submit. On another computer, you canconfigure a SegmentIndex that submits a different portion.

Basic Propertiesparts

The number of portions into which the combinations should be divided.

selected_partThe portion to submit (1 means the first portion, etc.)

SegmentSize

This is an option of the HtmlForm property part_to_submit, which is used todistribute the form submissions between several computers.

SegmentSize divides the set of field-value combinations into portions of a specifiedsize, and specifies which portion to submit. On another computer, you canconfigure a SegmentSize that submits a different portion.

Basic Properties

part_sizeThe number of combinations in each portion (by default, 2).

selected_partThe portion to submit (1 means the first portion, etc.).

SubmitAll

This subcomponent is an option of the HtmlForm property part_to_submit, which isused to distribute the form submissions between several computers.

This subcomponent submits all combinations of the field values from the samecomputer.

Page 127: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 7. Anchors

116

SubmitClick

This subcomponent submits a form by simulating a user who clicks a submitbutton.

The subcomponent is a possible value of the click property in the HtmlFormanchor.

Basic Properties

submit_nameThe name of the submit button.

Page 128: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

117

Transformers

Transformers are components that modify data.

You can use transformers within components such as anchors, serializationanchors, and actions. The transformers modify the output of the components. Forexample, if you use a transformer within a Content anchor, it modifies the data thatthe anchor extracts from the source document.

You can also use transformers as document processors or as stand-alone, runnablecomponents. In those cases, the transformers modify the complete content of adocument.

You can use the out-of-the-box transformers supplied with Conversion Agent, oryou can define custom transformers.

This chapter explains how to use transformers and provides detailed informationon the transformers available in Conversion Agent.

Defining Transformers

You can define transformers in the following locations of the IntelliScript:

In the transformers property of an anchor or a serialization anchor

In the default_transformers property of a format or of a serializer

In the ProcessByTransformers document processor

In the transformers property of certain actions

At the global level, as a stand-alone, runnable component that modifies asource document.

The following sections explain the use of transformers in each of these locations.

Using Transformers in Anchors

You can use transformers in an anchor that creates XML output, such as Content.In the IntelliScript, you should nest the transformer components within thetransformers property of the anchor.

The input of a transformer is the raw output of the anchor, before the anchorinserts the output in a data holder.

8

Page 129: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

118

For example, suppose you are parsing the following source document:

First name: RonLast name: Lehrer

You want to create XML output in ALL CAPS, like this:

<Person><FirstName>RON</FirstName><LastName>LEHRER</LastName>

</Person>

To do this, you can configure the Content anchors, which retrieve the strings Ronand Lehrer, with the ChangeCase transformer.

Sequences of Transformers

You can configure an anchor with a sequence of transformers. Each transformermodifies the output of the preceding transformer.

In the Ron Lehrer example, suppose you want the following output:

<Person><FirstName>- RON -</FirstName><LastName>- LEHRER -</LastName>

</Person>

To do this, you might configure the Content anchors with the ChangeCase andAddString transformer, which change the case and add the hyphens, respectively.

Default Transformers

Very often, you want the same transformers to run on all the Content anchors in aparser. You can configure the format component of the parser with defaulttransformers. This saves you the trouble of adding the same transformers to eachanchor in the parser.

To do this, you should nest the transformers in the default_transformers propertyof the format (see Chapter 5, Formats).

Many of the predefined format components include default transformers. Forexample, the HtmlFormat component has default transformers that remove HTMLtags from the output and convert HTML entities to plain text. If you wish tochange the default transformers, you can edit the default_transformers propertyof the format.

If an anchor has its own transformers, they run after the default transformers.

You can cancel the default transformers for particular anchors. To do this, set theignore_default_transformers property of the anchor.

Page 130: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

119

Using Transformers as Document Processors

You can run a transformer or a sequence of transformers as a document processor.

For example, you might run the RemoveTags transformer as a processor on anHTML document. The transformer removes the HTML tags from the document,before a parser starts to search for anchors in the document.

To do this, configure the parser format component with theProcessByTransformers document processor, and nest the transformers within thecomponent.

Using Transformers in Serialization Anchors

You can use transformers in serializers that write to the output document, such asContentSerializer. The transformers modify the data before the serializer writes itto the document.

For example, suppose a ContentSerializerwrites the content of a data holdercalled DoctorName to an output document. You might configure theContentSerializer with an AddString transformer, which adds the prefix "Dr. "to the content. Suppose the DoctorName data holder has the following content:

Albert Schweitzer

The transformer modifies the content, resulting in the following output:

Dr. Albert Schweitzer

You can add transformers to the default_transformers property of a serializer.The transformers that you add here run in all the ContentSerializer serializationanchors before they write to the output document.

Using Transformers in Actions

Certain actions, such as SetValue and Map, let you apply transformers to theiroutput. For details, see Chapter 9, Actions.

Using Transformers as Runnable Components

You can define a transformer at the global (top) level of the IntelliScript. You canthen run the transformer as a stand-alone component, which modifies a sourcedocument. The transformer runs as the startup component, not within a parser or aserializer.

For instructions on defining global components, see the chapter on Using theIntelliScript Editor in the book Conversion Agent Studio in Eclipse.

To run a globally-defined transformer in Conversion Agent Studio:

1. Set the transformer as the startup component.

2. Choose the Conversion Agent > Run command.

Page 131: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

120

3. You are prompted to select the source document, which the transformershould process.

4. Conversion Agent Studio displays the Events view, where you can review theevents.

5. The output file is stored in the Results folder of the project. It has a filenamesuch as Transformation of <filename>.txt, where <filename> is the sourcefile. You can open the file in any suitable application.

Standard Transformer Properties

In this section, we review certain properties that are found in many transformers.For additional properties that are specific to particular transformers, see theTransformer Component Reference.

nameA name that you assign to the transformer. Conversion Agent includes thename in the event log. This can help you find an event that was caused by theparticular transformer.

remarkA comment describing the transformer.

disabledIf selected, Conversion Agent ignores the transformer. This is useful for testingand debugging, or for making minor modifications in a project withoutdeleting the existing transformers.

optionalThis property means that if the transformer fails, its parent component (suchas an anchor in which the transformer is nested) does not fail.

If you deselect the optional property, and the transformer fails, it causes theparent component to fail.

Transformer Quick Reference

AbsURLConverts a relative path (such as a relative URL) to an absolute path.

AddEmptyTagsTransformerAn XML-to-XML transformer, which adds empty elements if elements aremissing from the XML.

AddStringAdds strings before and/or after the input text.

Base64DecodeConverts the base64 MIME encoding to a binary string.

Page 132: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

121

Base64EncodeConverts a binary string to the base64 MIME encoding.

BidiConvertReverses strings in languages that are written from right to left.

BigEndianUniToUniConverts big-endian Unicode to little-endian.

CDATADecodeDecodes a CDATA section of an XML document.

CDATAEncodeEncodes a CDATA section of an XML document.

ChangeCaseChanges the text to upper case or lower case.

CreateGuidGenerates a GUID identifier.

CreateUUIDGenerates a UUID identifier.

DateFormatFormats a date or time.

Dos96HebToAsciiConverts Hebrew text from the MS DOS code page to the Windows code page.

EbcdicToAsciiConverts EBCDIC to ASCII text.

EncodeAsUrlEncodes spaces and special characters, as required in a URL.

EncoderConverts text from one code page to another.

ExternalTransformerRuns a custom transformer that is implemented as a DLL.

FormatNumberFormats a number by adding a sign, decimal point, leading and trailing zeros,and a unit.

FromFloatConverts a floating point number from binary to an ASCII stringrepresentation.

FromIntegerConverts an integer from binary to an ASCII string representation.

FromPackDecimalConverts a number from packed decimals to an ASCII string representation.

Page 133: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

122

FromSignedDecimalConverts a number from signed decimals to an ASCII string representation.

hebrewBidiReverses Hebrew text from RTL to LTR.

HebrewDosToWindowsTransformerConverts from the Hebrew MS-DOS to Windows code page.

HebrewEBCDICOldCodeToWindowsConverts Hebrew text from EBCDIC to the Windows-1255 code page.

hebUniToAsciiConverts Hebrew text from Unicode UTF-16 to the Windows-1255 code page.

hebUtf8ToAsciiConverts Hebrew text from UTF-8 to the Windows-1255 code page.

HtmlEntitiesToASCIIConverts HTML entities to plain text.

HtmlProcessorNormalizes whitespace in an HTML document.

InjectFPInserts a decimal point in a number.

InjectStringInserts a string into text.

JavaTransformerRuns a custom transformer that is implemented in Java.

LookupTransformerLooks up a value in a table.

NormalizeClosingTagsChanges <tag /> to <tag></tag> in XML input.

ODBCLookupReplaces the text with data retrieved from a database.

RegularExpressionModifies the text by using a regular expression.

RemoveMarginSpaceTrims leading and trailing space characters.

RemoveRtfFormattingRemoves all RTF formatting characters within the text.

RemoveTagsRemoves HTML tags.

ReplaceReplaces or deletes specified text.

Page 134: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

123

ResizePads text to a specified size.

ReverseTransformerReverses a string.

RtfProcessorNormalizes RTF code.

RtfToASCIIConverts RTF input to plain text.

SubStringReturns a substring of the input.

ToFloatConverts a floating point number from an ASCII string representation tobinary.

ToIntegerConverts an integer from an ASCII string representation to binary.

ToPackDecimalConverts a number from an ASCII string representation to packed decimals.

ToSignedDecimalConverts a number from an ASCII string representation to signed decimals.

TransformationStartTimeOutputs the date and/or time at which the data transformation startedrunning.

TransformByParserRuns a parser on the input text, replacing segments of the text.

TransformByServiceRuns a Conversion Agent service on the input text.

TransformerPipelineApplies a sequence of transformers to the text.

WestEuroUniToAsciiConverts text in western-European languages from Unicode UTF-16 to theWindows-1252 code page.

XSLTTransformerApplies an XSLT transformation to XML input text.

Transformer Component Reference

This section documents the transformers that are available in Conversion Agent.

Page 135: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

124

AbsURL

This transformer converts a relative file path or URL to an absolute path.

For example, if the input is test.html and the base URL ishttp://www.example.com, the output is http://www.example.com/test.html.

If the input is an absolute path, the transformer does not alter it.

Basic PropertiesBaseUrl

The base path or URL.

Advanced PropertiesFor explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

optional

AddEmptyTagsTransformer

This is an XML to XML transformer. The transformer checks if all the elementsdefined in the XSD schema exist in the XML input. If not, it adds empty elementsto the XML.

Basic Properties

root_elementThe root element of the XML (select from the Schema view).

Advanced PropertiesFor explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

AddString

This transformer adds strings before and/or after the input text.

Basic Properties

preThe string to add before the text.

Page 136: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

125

postThe string to add after the text.

Advanced Properties

For explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

optional

Online Sample

In the Conversion Agent Samples folder, openProjects\Transformers_Example\Transformers_Example.cmw. The first Contentanchor in the parser is configured with an AddString transformer.

Base64Decode

This transformer converts the base64 MIME encoding to a binary string.

Advanced Properties

toleranceThis property controls how the transformer processes whitespace characters ornon-base64 sections of its input. The default is ignore_white_spaces.Alternatively, you can choose ignore_none or ignore_non_base64.

For explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

optional

Base64Encode

This transformer converts a binary string to the base64 MIME encoding. This isuseful, for example, when you want to save binary data in XML.

Advanced Properties

For explanations of the following properties, see Standard Transformer Properties:

Page 137: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

126

name

remark

disabled

optional

BidiConvert

This transformer reverses strings that are written in right-to-left (RTL) languages,such as Hebrew and Arabic.

The input must be in the RTL format. The output is LTR.

Advanced PropertiesFor explanations of the following properties, see Standard Transformer Properties:

disabled

BigEndianUniToUni

This transformer converts big-endian Unicode to little-endian.

The transformer is supported for backwards compatibility with projects that havebeen upgraded from previous Conversion Agent versions. It is not available foruse in new projects. Instead, you should set the byte order in the project properties(see Encoding Page in Chapter 13, Project Properties).

CDATADecode

This transformer decodes a CDATA section of an XML document. For example, itconverts

<![CDATA[100 < 200]]>

to

100 < 200

Of course, if you write the result to XML, Conversion Agent re-encodes it using thestandard XML encoding:

100 &lt; 200

Advanced Properties

For explanations of the following properties, see Standard Transformer Properties:

Page 138: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

127

disabled

optional

CDATAEncode

This transformer encodes a CDATA section of an XML document. For example, itconverts

100 < 200

to

<![CDATA[100 < 200]]>

Advanced PropertiesFor explanations of the following properties, see Standard Transformer Properties:

disabled

optional

ChangeCase

The ChangeCase transformer changes text to upper case, lower case, or first-letter-capitalized.

The transformer works on English characters. It may fail on some non-Englishcharacters. For example, it does not convert the lower-case German ß to the upper-case SS.

Basic Properties

case_typeThe output case (all_caps, all_lower, or first_cap).

Advanced PropertiesFor explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

Online Sample

In the Conversion Agent Samples folder, openProjects\Transformers_Example\Transformers_Example.cmw. The third Contentanchor in the parser is configured with a ChangeCase transformer.

Page 139: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

128

CreateGuid

This transformer generates a GUID identifier. The GUID is guaranteed to beunique on every generation.

The transformer ignores its input. The GUID is not related to the input in any way.

The GUIDs may have a non-standard format on Unix platforms. For a fully Unix-compatible transformer, use CreateUUID instead.

CreateUUID

This transformer generates a UUID identifier. The UUID is guaranteed to beunique on every generation and is compatible with both Windows and Unixplatforms.

The transformer ignores its input. The GUID is not related to the input in any way.

Advanced Properties

For explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

optional

DateFormat

This transformer formats a date or time.

Example

Suppose you configure a DateFormat transformer with:

input_format = "d/M/yy"output_format = "MM/dd/yyyy"

If the input is

13/3/05

the output is

03/13/2005

Supported Formats

The transformer uses the ICU conventions to represent the date and time format.The following table lists the symbols that you can use in the format patterns. Foradditional information, please see:

http://icu.sourceforge.net/apiref/icu4c/classSimpleDateFormat.html

Page 140: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

129

Patternsymbol

Meaning Type Examples

G Era designator Text AD

y Year Number 1996

u Extended year Number -200 (for 201 BC)

M Month in year Text or number July07

d Day in month Number 10

h Hour in AM/PM (1-12) Number 12

H Hour in day (0-23) Number 0

m Minute in hour Number 30

s Second in minute Number 55

S Fractional second Number 978

E Day of week Text Tuesday

e Day of week (local 1-7) Number 2

D Day in year Number 189

F Day of week in month Number 2 (for the 2nd Wednesday inJuly)

w Week in year Number 27

W Week in month Number 2

a AM/PM marker Text PM

k Hour in day (1-24) Number 24

K Hour in AM/PM (0-11) Number 0

z Time zone Time Pacific Standard Time

Z Time zone (RFC 822) Number -0800

v Time zone (generic) Text Pacific Time

g Julian day Number 2451334

A Milliseconds in day Number 69540000

' ' The text within single quotes isinterpreted as a literal string

Text 'Today is 'dd/MM/yyyygenerates output such asToday is 15/03/2005

Page 141: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

130

Patternsymbol

Meaning Type Examples

'' Literal single quote Text 'o''clock'generates the outputo'clock

The count of pattern symbols further determines the format:

For text: Four or more pattern symbols means to use the full form. Fewer thanfour means to use a short or abbreviated form if it exists. For example, if EEEEproduces Monday , EEE produces Mon).

For numbers: The number of pattern symbols is the minimum number ofdigits. Shorter numbers are zero-padded. For example, if m produces 6, mmproduces 06).

For years: The two-digit year is yy, and the four-digit year is yyyy. Forexample, if yy produces 05, yyyy produces 2005.

For months: If M produces 1, then MM produces 01, MMM produces Jan, and MMMMproduces January.

All non-alphabetic characters are interpreted as literals, even if they are notenclosed in single quotes. For example, dd/MM/yyyy HH:mm produces 15/03/200513:15 (containing the /, space, and : characters).

Basic Propertiesinput_format

The format of the input date, for example, d/M/yy. You can type the format, orbrowse to a data holder that contains the format.

output_formatThe format of the output date, for example, MM/dd/yyyy. You can type theformat, or browse to a data holder that contains the format.

Advanced Properties

For explanations of the following properties, see Standard Transformer Properties:

disabled

optional

Dos96HebToAscii

This transformer converts Hebrew text from the MS DOS code page to theWindows-1255 code page.

Page 142: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

131

EbcdicToAscii

This transformer converts EBCDIC to ASCII text.

EncodeAsUrl

This transformer encodes spaces and special characters, as required in a URL. Thecharacters are encoded as hexadecimal, preceded by a % symbol.

For example, the transformer converts

http://www.example.com?name=Ron Lehrer

to

http://www.example.com?name=Ron%20Lehrer

Advanced Properties

For explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

Online Sample

In the Conversion Agent Samples folder, openProjects\Transformers_Example\Transformers_Example.cmw. The fourth Contentanchor in the parser is configured with an EncodeAsUrl transformer.

Encoder

This transformer converts text from one code page to another.

Basic Properties

input_code_pageThe input code page (select from the list).

output_code_pageThe output code page (select from the list).

Advanced Properties

For explanations of the following properties, see Standard Transformer Properties:

Page 143: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

132

name

remark

disabled

optional

ExternalTransformer

This component lets you run a custom transformer, which is implemented as aC++ DLL.

This component is supported for backwards compatibility with existing customtransformers. For new custom transformers, we recommend using the approach describedin the chapter on External Components in the Conversion Agent Engine Developer'sGuide.

Creating a Custom Transformer

To create a DLL that is suitable to run with the ExternalTransformer component,follow these steps:

The instructions are for the Microsoft Visual C++ compiler, running on a MicrosoftWindows platform. For compilation instructions on non-Windows platforms,please contact SAP support.

1. Copy the online-sample file ExternalTransformerExample.c. You can find thefile under the Conversion Agent installation folder, at the location:

Samples\SDK\ExternalTransformer\ExternalTransformerExample.c

2. Using the Visual C++ compiler, create a Win32 dynamic-link library project,and insert the C file into the project.

3. The C file contains the following function, which implements the transformerprocessing:

__declspec(dllexport) int transform(const char* in, char** out)

In the sample implementation, the function reverses the text. Replace thesample code with your implementation.

4. Implement the buffer-release function, which is also in the sample code:

__declspec(dllexport) void release_buf(char* buf)

5. Compile the DLL.

6. Store the DLL in the externLibs\user subfolder of your Conversion Agentinstallation folder.

You may then use the DLL in the ExternalTransformer component.

Optionally, you can add the transformer to the drop-down component list thatConversion Agent displays (for instructions, see the chapter about Using theIntelliScript Editor in the book Conversion Agent Studio in Eclipse).

Page 144: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

133

Basic Properties

import_dllBrowse to the custom transformer DLL in the externLibs\user folder.

Advanced Properties

For explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

optional

FormatNumber

This transformer formats a number by adding a sign, decimal point, leading ortrailing zeros, and unit.

Basic Properties

signAdds a plus or minus sign at the beginning or end of the number. The optionsare un_signed (deletes a sign if present), leading_sign, trailing_sign,negative sign only, and as in source (does not change the input sign).

insert_decimal_pointSets the decimal point symbol. The options are none, point, and comma.

unit_typeAdds a unit after the number. Select a unit such as meter, cm, mm, or inch. If youdo not want to add a unit, select undefined.

size_of_integer_partPads the integer part with leading zeros to the indicated size.

number_of_decimalsPads the decimal part with trailing zeros to the indicated size.

Advanced PropertiesFor explanations of the following properties, see Standard Transformer Properties:

Page 145: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

134

name

remark

disabled

optional

FromFloat

This transformer converts a floating point number from binary to an ASCII stringrepresentation.

Advanced Properties

sizeSize of the number: single_precision_32_bit or double_precision_64_bit .

For explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

optional

FromInteger

This transformer converts an integer from binary to an ASCII string representation,in decimal, octal, or hexadecimal.

Basic Propertiessize

Size in bytes of the binary representation. The supported values are 1 to 8.

Advanced Properties

signedIf selected, the transformer adds a sign to the number.

to_baseThe base of the output: decimal, octal, hexadecimal, lowercase hexadecimal.

For explanations of the following properties, see Standard Transformer Properties:

Page 146: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

135

name

remark

disabled

optional

FromPackDecimal

This transformer converts a number from packed decimals to an ASCII stringrepresentation.

Advanced Properties

For explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

optional

FromSignedDecimal

This transformer converts a number from signed decimals to an ASCII stringrepresentation.

Advanced Properties

insert_sign_symbolAdds a sign symbol (plus or minus) before or after the number. The optionsare no, before, and after.

For explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

optional

hebrewBidi

This transformer reverses Hebrew text from RTL to LTR. It is similar toBidiConvert, but it uses a different algorithm for the reversal, which may produceslightly different results.

Page 147: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

136

HebrewDosToWindowsTransformer

This transformer converts Hebrew documents from the MS DOS Hebrew codepage to the Windows Hebrew code page.

HebrewEBCDICOldCodeToWindows

This transformer converts Hebrew text from EBCDIC to the Windows-1255 codepage.

hebUniToAscii

This transformer converts Hebrew text from Unicode UTF-16 to the Windows-1255code page.

hebUtf8ToAscii

This transformer converts Hebrew text from Unicode UTF-8 to the Windows-1255code page.

HtmlEntitiesToASCII

This transformer converts HTML entities to plain text. For example, it converts&copy; or &#169; to a copyright symbol (© ).

Supported EntitiesThe transformer supports the ISO 8859-1 (Latin-1) entities that are defined in theHTML 4.0 reference (http://www.w3.org/TR/1998/REC-html40-19980424/sgml/entities.html). The supported entities include:

&amp;, &lt;, &gt;, and &quot; (& < > ", respectively)

Numeric character codes &#0; to &#255;

Entities for Latin-1 characters (&nbsp; = non-breaking space, &copy; =copyright, etc.)

The transformer does not support extended characters (codes > 255 or non-Latin-1characters).

Output Encoding for Upper-ASCII CharactersIf the transformer output contains upper-ASCII characters, be sure to select anoutput encoding that supports the characters, such as Windows-1252 or UTF-8.

If the output is XML, we recommend that you include an encoding attribute in theXML processing instruction. Otherwise, Conversion Agent Studio may be unableto display the characters.

To set the encoding, see Encoding Page in Chapter 13, Project Properties.

Page 148: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

137

Advanced Properties

For explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

HtmlProcessor

This transformer (which is also available as a format preprocessor, see Chapter 5,Formats) normalizes whitespace according to HTML conventions. It converts anysequence of tabs, line breaks, and space characters to a single space character.

You can use this transformer to normalize whitespace in any type of text. It is notrestricted to HTML text.

InjectFP

This transformer inserts a decimal point at a specified location in a number. Forexample, the transformer can convert 12345 to 123.45.

Basic Properties

digits_after_decimal_pointThe number of digits after the decimal point.

Advanced PropertiesFor explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

optional

InjectString

The InjectString transformer inserts a string into text.

Basic Properties

injection_placeThe location in the text to insert the string (0 to insert the string before thetext).

string_to_injectThe string to insert.

Page 149: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

138

Advanced Properties

For explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

JavaTransformer

This component lets you run a custom transformer, which is implemented in Java.

This component is supported for backwards compatibility with existing customtransformers. For new custom transformers, we recommend using the approach describedin the chapter on External Components in the Conversion Agent Engine Developer'sGuide.

Creating a Custom TransformerTo implement a custom transformer in Java, follow these steps:

1. Create a new Java project and package, for example, named MyTransformer

2. Create a class that implements a static method with the following syntax. Themethod can have any name.

public static byte[] Transform(byte[] in)

3. Create a jar file containing the class.

4. Store the jar file in the externLibs\user subfolder of your Conversion Agentinstallation folder.

You may then use the jar file in the JavaTransformer component.

Optionally, you can add the transformer to the drop-down component list thatConversion Agent displays (for instructions, see the chapter about Using theIntelliScript Editor in the book Conversion Agent Studio in Eclipse).

Example

The following example is a transformer that changes text to upper case.

package MyTransformer;

public class TransformerTest{public static byte[] Transform(byte[] in){

String str = new String(in);String ret = str.toUpperCase();return ret.getBytes();

Page 150: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

139

}}

Basic Properties

java_classThe path of the Java class, for example, MyTransformer/TransformerTest .

methodThe method to run, for example, Transform.

Advanced PropertiesFor explanations of the following properties, see Standard Transformer Properties:

disabled

optional

LookupTransformer

This transformer looks up a value in a table.

Example

You can configure a LookupTransformer to look up values in a table such as thefollowing:

key value

1 George Washington

2 John Adams

3 Thomas Jefferson

4 James Madison

If the input to the transformer is 3, the transformer outputs the value ThomasJefferson.

Basic Propertieslook_at

Under this property, you can define the lookup table. You can select one of thefollowing values:

Page 151: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

140

look_at Explanation

InlineTable Lets you define the table in the IntelliScript.

XMLLookupTable Lets you specify an XML file, which contains the table data.

If you use the same lookup table repeatedly, you should consider defining theInlineTable or the XMLLookupTable component at the global level of theIntelliScript (see Defining a Global Component in the book Conversion AgentStudio in Eclipse). You can then reference the table by name in the look_atproperty.

Advanced PropertiesFor explanations of the following properties, see Standard Transformer Properties:

optional

NormalizeClosingTags

For XML input, this transformer removes shorthand closing tags from emptyelements. It changes <tag /> to <tag></tag> .

The transformer does not correct incorrect XML. It converts well-formed XMLfrom one style of closing tag to another.

Advanced Properties

For explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

ODBCLookup

The ODBCLookup transformer uses the input text to query a database. It replaces thetext with the query result.

Basic Properties

db_connectionThe database connection. The value is an ODBC_Text_Connectionsubcomponent, which specifies a DSN, user name, etc.

Advanced Properties

queryA SQL SELECT or EXEC query that retrieves the data from the database. Use ? torepresent the input text, for example:

Page 152: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

141

SELECT Name FROM Employees WHERE Id = ?

The query must retrieve a single field, which is the transformer output.

For explanations of the following properties, see Standard Transformer Properties:

disabled

optional

RegularExpression

The RegularExpression transformer does a pattern-match search on the input text.It replaces the found text with a specified string.

Regular ExpressionsThe transformer uses a regular expression to define the search pattern. Regularexpressions are a way to define a search criterion—something like a wildcardsearch, but with greatly enhanced syntax.

For detailed information about regular expressions, see the book Mastering RegularExpressions by Jeffrey E. F. Friedl and Andy Oram (O'Reilly, 1997). You can also getcopious information by searching for regular expressions on the Internet.

Conversion Agent uses the Regex++ implementation of regular expressions,copyright (c) 1998-2003 by Dr. John Maddock (Version 3.12, 18 April 2000). Forinformation about this implementation, seehttp://www.boost.org/libs/regex/doc/index.html. For a summary of the regularexpression syntax supported by Regex++, seehttp://www.boost.org/libs/regex/doc/syntax.html.

For another use of regular expressions in Conversion Agent, see the PatternSearchcomponent in Chapter 7, Anchors.

Example

Suppose you want to replace a string of lower-case letters, beginning and endingwith AA, with the single character Z. To do this, use a RegularExpressiontransformer configured with:

exp = AA[a-z]+AAreplacement = Z

This replaces AAexampleAA with Z .

Basic Properties

expA regular expression for the search criterion.

replacementThe replacement text.

Page 153: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

142

Advanced Properties

For explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

optional

Using Parentheses and Parameters to Preserve Portions of the Found TextIn the exp property, you can enclose portions of the regular expression inparentheses. In the replacement property, you can use:

$0 to identify the entire found text that matches the regular expression

$1 to identify the substring that matches the first parenthesized portion of theregular expression

$2 to identify the substring that matches the second parenthesized portion ofthe regular expression

And so forth

For example, suppose you set:

exp = abc([0-9]+)(def)replacement = $1

This replaces abc5624def with 5624.

Alternatively, suppose you set:

exp = abc([0-9]+)(def)replacement = $2ZYX$1

This replaces abc5624def with defZYX5624.

RemoveMarginSpace

This transformer deletes leading and trailing space characters from the text.

Advanced PropertiesFor explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

RemoveRtfFormatting

The transformer removes RTF formatting instructions from the text.

Page 154: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

143

Advanced Properties

For explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

RemoveTags

This transformer removes HTML tags from the input text.

It replaces the tags at internal locations of the text with a separator string, such as aspace character. It does not insert the separator string at the beginning or end ofthe text. Adjacent multiple tags are transformed into a single separator.

Basic Properties

replace_withThe separator string.

Advanced Properties

For explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

optional

Replace

This transformer finds and replaces strings in the input text. Leaving thereplace_with property empty deletes the found text.

Basic Propertiesfind_what

The text to find. The value is one of the following searcher components (seethe Searcher Component Reference in Chapter 7, Anchors):

NewlineSearch: Finds a newline character.PatternSearch: Finds text that matches a regular expression.SegmentSearch: Finds a segment from a specified opening marker to a

closing marker.TextSearch: Finds a specified string.

replace_withThe replacement string.

Page 155: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

144

Advanced Properties

occurrenceSpecifies which occurrences to replace: all, first, or last.

For explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

optional

Online Sample

In the Conversion Agent Samples folder, openProjects\Transformers_Example\Transformers_Example.cmw. The second and fifthContent anchors in the parser are configured with Replace transformers.

Resize

This transformer fits the input text to a specified size. It pads or truncates the textas required.

Basic Properties

sizeThe desired size. Type an integer, or click the browse button and select a dataholder that contains an integer.

padding_characterThe padding character, for example " " (a space character). Type the character,or click the browse button and select a data holder that contains a character.

alignThe text alignment within the resized string. The options are left (padding ortrimming is on the right) and right (padding or trimming is on the left).

ReverseTransformer

This transformer reverses a string. For example, it transforms 1234 to 4321.

RtfProcessor

This transformer normalizes RTF code. It is also available as a format preprocessor(see Chapter 5, Formats).

Page 156: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

145

RtfToASCII

This transformer converts RTF input to plain text. It removes RTF control wordsfrom the text.

Advanced Properties

For explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

SubString

This transformer returns a substring of the input, starting and ending at specifiedlocations.

Basic Properties

beginThe start location (the beginning of the input is 0).

endThe end location.

Advanced PropertiesFor explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

ToFloat

This transformer converts a floating point number from an ASCII stringrepresentation to binary.

Advanced Propertiessize

Size of the number: single_precision_32_bit or double_precision_64_bit .

For explanations of the following properties, see Standard Transformer Properties:

Page 157: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

146

name

remark

disabled

optional

ToInteger

This transformer converts an integer from an ASCII string representation (decimal,octal, or hexadecimal) to binary.

Basic Properties

sizeSize in bytes of the binary representation. The supported values are 1 to 8.

Advanced Propertiessigned

If selected, the input has a plus or minus sign.

from_baseThe base of the input: decimal, octal, hexadecimal, lowercase hexadecimal.

For explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

optional

ToPackDecimal

This transformer converts a number from an ASCII string representation to packeddecimals.

Advanced PropertiesFor explanations of the following properties, see Standard Transformer Properties:

Page 158: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

147

name

remark

disabled

optional

ToSignedDecimal

This transformer converts a number from an ASCII string representation to signeddecimals.

Advanced Properties

insert_sign_symbolAdds a sign symbol (plus or minus) before or after the number. The optionsare no, before, and after.

For explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

optional

TransformationStartTime

This transformer outputs the date and/or time at which the data transformationstarted running.

The transformer ignores its input. It copies the date and time from the VarSystemvariable, and formats the output according to your specification.

Basic Properties

formatThe format of the date and time. You can type the format, or browse to a dataholder that contains the format.

For an explanation of the supported formats, see the DateFormat transformer.

Advanced PropertiesFor explanations of the following properties, see Standard Transformer Properties:

Page 159: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

148

disabled

optional

TransformByParser

This transformer runs a parser on its input text. The parser should containFindReplaceAnchor components, which mark segments of the text for replacement.When the parser completes execution, the transformer performs the replacements.

The transformer output is the modified text. Conversion Agent ignores any XMLoutput that the parser generates.

Example

For an example of this transformer, see the FindReplaceAnchor in Chapter 7,Anchors.

Advanced Properties

parserThe name of the parser.

For explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

optional

Online SampleIn the Conversion Agent Samples folder, openProjects\TransformByParser\TransformByParser.cmw.

The sample uses TransformByParser to replace every instance of the string ~NL~with a carriage return - linefeed sequence.

To execute the sample:

1. Set MyTransformByParser as the startup component.

2. Run the transformer.

3. At the prompt, select the source file Report.edi .

4. The transformer stores its output in Results\Transformation of Report.edi.You can compare the output with the source in Notepad.

Page 160: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

149

TransformByService

This transformer runs a Conversion Agent service on its input (see Chapter 15,Deploying Conversion Agent Services). The output of the transformer is the output ofthe service.

For example, if you use the transformer to invoke a parser service, the output ofthe transformer is an XML string.

Basic Propertiesservice_name

The name of the service.

Advanced Properties

disable_automatic_encodingIf selected, Conversion Agent does not apply the input and output encodingsthat are defined in the service (see Chapter 13, Project Properties).

usernameA user name, which the service can use for authentication, for example, if theservice accesses a protected web site.

passwordA password, which the service can use for authentication, for example, if theservice accesses a protected web site.

parametersA list of initial values, which Conversion Agent should assign to variables thatare defined in the service (known as service parameters; see Initializing Variablesat Runtime in Chapter 6, Data Holders). In each element of the list, specify thename of a variable and its value.

For explanations of the following properties, see Standard Transformer Properties:

disabled

optional

TransformerPipeline

This transformer applies a sequence of nested transformers to its input.

Advanced Properties

For explanations of the following properties, see Standard Transformer Properties:

Page 161: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

150

name

remark

disabled

optional

WestEuroUniToAscii

This transformer converts text in western-European languages from Unicode UTF-16 to the Windows-1252 code page.

The transformer is supported for backwards compatibility with projects that havebeen upgraded from previous Conversion Agent versions. It is not available foruse in new projects. Instead, you should set the encoding in the project properties(see Encoding Page in Chapter 13, Project Properties).

XSLTTransformer

This transformer applies an XSLT transformation to XML input text.

Example

You use a parser to extract data from an XML document. A Content anchorretrieves a complete, well-formed branch of the XML tree. You can configure theContent anchor with an XSLTTransformer, which runs an XSLT transformation onthe branch.

Advanced Properties

xslt_fileThe path and filename of the XSLT file.

For explanations of the following properties, see Standard Transformer Properties:

name

remark

disabled

optional

Transformer Subcomponent Reference

This section describes subcomponents that you can assign as the values of certaintransformer properties.

Page 162: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

151

InlineTable

This component lets you define a lookup table in the IntelliScript. The table is usedby the LookupTransformer.

Basic Propertiestable

Under this property, enter a sequence of entry components. In each entry ,specify key and value strings. For example, you might specify:

key value

1 George Washington

2 John Adams

3 Thomas Jefferson

4 James Madison

Advanced Propertiesmatch_case

If selected, the key string is considered to be case-sensitive.

ODBC_Text_Connection

The subcomponent defines a database connection. It is used, for example, in theODBCLookup transformer.

Before using this component, use the operating system tools to define a DSN forthe database connection.

Advanced Properties

DSNThe data source name of the connection.

usernameUser name for the database connection.

passwordPassword of the user.

timeoutTime in seconds to wait for the database response.

Page 163: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 8. Transformers

152

XMLLookupTable

This component lets you specify an XML file, which contains a lookup table. Thetable is used by the LookupTransformer .

The XML file must have the following syntax:

<LookupTable><Entry key="..." value="..." />...

</LookupTable>

ExampleThe following is an XML lookup table:

<LookupTable><Entry key="1" value="George Washington" /><Entry key="2" value="John Adams" /><Entry key="3" value="Thomas Jefferson" /><Entry key="4" value="James Madison" />

</LookupTable>

Basic Properties

xml_file_nameBrowse to the XML file.

Advanced Propertiesmatch_case

If selected, the key string is considered to be case-sensitive.

Page 164: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

153

Actions

Actions are components that perform operations on data that Conversion Agenthas extracted from a source document. Some examples of the supported actionsare:

Arithmetic computations

String concatenations

Submitting forms to a web server

Activating a secondary parser

Querying a database

You can use the out-of-the-box actions supplied with Conversion Agent, or youcan define custom actions.

This chapter explains how to use actions and documents the actions that areavailable in Conversion Agent.

How Actions Work

An action takes its input from the data holders that are currently available. Asingle action can have multiple inputs.

If the action is embedded in a parser, the available data holders are the ones thatthe parser has generated. In a serializer, the data holders are the ones that exist inthe input XML, plus any additional data holders that the serializer has generated.For a mapper, the data holders can be in either the input or the output.

The action performs operations on the input and generates output. You canconfigure many actions to store their output in data holders.

In most actions, the input and output data holders must contain the data directly;that is, they must not contain nested elements. A few actions work with dataholders that contain nested elements, with multiple-occurrence data holders, orwith other special types. These details are explained in the Action ComponentReference.

An action can have additional effects, such as writing to a file, updating a database,or submitting data to an external application.

9

Page 165: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

154

Comparison between Actions and Transformers

Some actions perform operations that are similar to transformers, for example,modifying a string or querying a database. However, actions differ fromtransformers in some fundamental ways. The following table summarizes thedifferences.

Transformers Actions

Input A transformer has a single input,which is a string.

The input is implemented by theaction. For example, an action canhave multiple inputs, which are dataholders.

Output The output of a transformer is astring.

The output is implemented by theaction. For example, an action cancreate output data holders.

Side effects A transformer has no side effects,other than modifying the input string.

An action can have side effects,such as updating a database.

Defining Actions

You can define actions by editing the IntelliScript. You can insert the actions underthe contains line of components such as a Parser, Serializer, Mapper, Group,RepeatingGroup, etc. Essentially, you can insert the actions in any location whereyou can insert anchors, serialization anchors, or mapper anchors.

The actions run in sequence with the anchors that you specify in the same location.In a parser, you can set the phase property of an action, which controls whether itruns in the initial, main, or final stage of the parsing procedure. For an explanationof the phase, see Search Phases in Chapter 7, Anchors.

Standard Action Properties

In this section, we review certain properties that are found in many actions. Foradditional properties that are specific to particular action, see the Action ComponentReference.

nameA name that you assign to the action. Conversion Agent includes the name inthe event log. This can help you find an event that was caused by theparticular action.

remarkA comment describing the action.

Page 166: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

155

disabledIf selected, Conversion Agent ignores the action. This is useful for testing anddebugging, or for making minor modifications in a project without deletingthe existing actions.

optionalBy default, if an action fails, the parent component (such as a Parser in whichthe action is nested) fails. If you select the optional property, the parentcomponent does not fail.

phaseThe processing phase during which Conversion Agent should execute theaction (initial, main, or final). This property has an effect only if the action isused in a parser.

Action Quick Reference

AddEventActionWrites a message in the event log.

AppendListItemsConcatenates a list of strings that are stored in a multiple-occurrence dataholder.

AppendValuesConcatenates strings.

CalculateValuePerforms a computation defined in a JavaScript expression.

CombineValuesGenerates all possible concatenations from multiple-occurrence data holders.

CreateListFills a multiple-occurrence data holder with specified data.

DateAddIncrements a date.

DateDiffComputes the difference between two dates.

DownloadFileDownloads a file.

DownloadFileToDataHolderDownloads the content of a file into a data holder.

DumpValuesA debugging tool for dumping extracted data.

Page 167: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

156

EnsureConditionEvaluates a JavaScript expression. If the expression is false, the action fails.

ExcludeItemsDeletes values from a multiple-occurrence data holder.

ExternalCOMActionRuns a custom action that is implemented as an ActiveX DLL.

JavaScriptFunctionRuns a JavaScript function.

MapCopies a data holder, optionally running transformers on the value.

ODBCActionRuns a database query.

ResetVisitedPagesResets the list of visited pages, permitting repeat visits to a page.

RunMapperRuns a mapper.

RunParserRuns a parser.

RunSerializerRuns a serializer.

SetValueFill a data holder with predefined content.

SortSorts the occurrences of a multiple-occurrence data holder.

SubmitFormSubmits an HTML form using the Post method and parses the response.

SubmitFormGetSubmits an HTML form using the Get method and parses the response.

WriteValueWrites a value to a location such as a file, MSMQ queue, or database.

XSLTMapRuns an XSLT transformation.

Action Component Reference

This section documents the actions that are available in Conversion Agent.

Page 168: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

157

AddEventAction

This action adds a message to the event log.

Basic Propertiesseverity

The severity level of the message. The options are notification, warning,failure, or fatal error.

messageThe message string.

Advanced PropertiesFor explanations of the following properties, see Standard Action Properties:

name

remark

disabled

phase

AppendListItems

The AppendListItems action concatenates the strings in a multiple-occurrence dataholder.

To prepare the input for this action, see Mapping to Multiple-Occurrence Data Holdersin Chapter 7, Anchors.

ExampleA source document contains the following space-separated text:

H E L L O

When you parse the document, you wish to remove the spaces and store the resultin an XML element called Greeting.

One way to do this is to create a multiple-occurrence variable called VarLetter.Create several Content anchors, which retrieve the individual letters and storethem in occurrences of VarLetter.

Then, use the AppendListItems action to concatenate the occurrences of VarLetterand store the result in the Greeting element. The result is:

<Greeting>HELLO</Greeting>

Basic Properties

inputThe multiple-occurrence data holder (select from a Schema view).

Page 169: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

158

outputA data holder to store the output (select from a Schema view).

Advanced Properties

For explanations of the following properties, see Standard Action Properties:

name

remark

disabled

optional

phase

Online Sample

In the Conversion Agent Samples folder, openProjects\AppendListItems\AppendListItems.cmw. The sample uses aRepeatingGroup to store values in a multiple-occurrence variable. It then uses as anAppendListItems action to concatenate the values.

AppendValues

The AppendValues action concatenates strings.

ExampleA parser has generated the following XML:

<Name><First>Ron</First><Last>Lehrer</Last>

<Name>

You can configure an AppendValues action that outputs:

<FullName>Ron Lehrer</FullName>

Basic Properties

inputA list of data holders containing the values to be appended (select from aSchema view).

outputA data holder to store the output (select from a Schema view).

Advanced Properties

skip_unfound_valuesIf selected, and one of the input data holders is missing, the action continues.If not selected, the action fails.

Page 170: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

159

For explanations of the following properties, see Standard Action Properties:

name

remark

disabled

optional

phase

CalculateValue

The CalculateValue action performs a computation that is defined by a JavaScriptexpression.

For example, you can use the action to compute a sum of numerical values, or toconcatenate string values.

JavaScript Syntax

For information about the JavaScript syntax that CalculateValue supports, seeEnsureCondition.

Precision

For an explanation of how CalculateValue handles the precision of xs:decimaland xs:float values, see Precision of Numerical Data in Chapter 6, Data Holders.

Example

A parser has generated the following XML:

<ItemOrdered><Name>Gizmo</Name><Quantity>100</Quantity><Price>25<Price>

</ItemOrdered>

You can use this action to generate the output:

<ItemOrdered><Name>Gizmo</Name><Quantity>100</Quantity><Price>25<Price><Total>2500</Total>

</ItemOrdered>

To do this, define the Name and Quantity element as input parameters. Specify theJavaScript expression $1 * $2, and store the result in the Total element.

Basic Propertiesparams

Data holders that contain the input parameters (select from a Schema view).

Page 171: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

160

expressionThe JavaScript expression. Use $1, $2, etc. (up to $9), to represent the inputparameters.

For information about the supported JavaScript syntax, see theEnsureCondition action.

resultA data holder to store the output (select from a Schema view).

Advanced Propertiesfailure_action

The behavior in the event of a failure. The options are Ignore (continue theparser) and HaltExecution (stop the parser).

For explanations of the following properties, see Standard Action Properties:

name

remark

disabled

optional

phase

Online SampleIn the Conversion Agent Samples folder, openProjects\CalculateValue\CalculateValue.cmw. The sample retrieves threenumbers from a source document and stores them in variables. It uses aCalculateValue action to compute a mathematical function of the numbers.

CombineValues

The CombineValues action generates all possible combinations from lists of strings,which are stored in multiple-occurrence data holders (see Multiple-Occurrence DataHolders in Chapter 6, Data Holders). It concatenates the strings in each combination,generating an output list.

The input of this action must include one or more multiple-occurrence dataholders. Optionally, it may also include single-occurrence data holders.

The output is a multiple-occurrence data holder. Each occurrence of the dataholder stores a combination.

This action is useful, for example, to prepare the VarFormData variable, which isrequired by the SubmitForm action.

ExampleIn a multiple-occurrence variable called VarDay, you have stored the list Monday,Tuesday. In a multiple-occurrence variable called VarTime, you have stored

Page 172: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

161

morning, afternoon. In a single-occurrence variable called VarSpace, you havestored a space character.

Suppose you run CombineValues on VarDay, VarSpace, and VarTime, with an outputdata holder called DayTime. The output is:

<DayTime>Monday morning</DayTime><DayTime>Monday afternoon</DayTime><DayTime>Tuesday morning</DayTime><DayTime>Tuesday afternoon</DayTime>

Basic Propertiesinput

From a Schema view, select the data holders containing the input. Typically, atleast one of the inputs should be a multiple-occurrence data holder.

outputA multiple-occurrence data holder, where the action stores its output (selectfrom a Schema view).

Advanced PropertiesFor explanations of the following properties, see Standard Action Properties:

name

remark

disabled

optional

phase

Online Sample

In the Conversion Agent Samples folder, openProjects\CombineValues\CombineValues.cmw. The sample retrieves lists of days,months, and years from a source document. It uses a CombineValues action togenerate all possible dates from the lists.

For an additional online sample, see the SubmitForm action.

CreateList

This action inserts data in a list. The output is a multiple-occurrence data holdercontaining the list (see Multiple-Occurrence Data Holders in Chapter 6, Data Holders).

Nested in this component, enter the data values.

Example

If the input data values are

Jack

Page 173: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

162

JennieLarissa

the action can create the output

<Name><First>Jack</First><First>Jennie</First><First>Larissa</First>

</Name>

Basic Properties

data_holderThe multiple-occurrence data holder, where the action should store the list(select from a Schema view).

Advanced PropertiesFor explanations of the following properties, see Standard Action Properties:

name

remark

disabled

phase

DateAdd

This action increments a date.

Basic Properties

input_formatThe date format, for example, dd/MM/yy (see DateFormat in Chapter 8,Transformers). You can type the format, or browse to a data holder thatcontains the format. If you omit the format, the system default is assumed.

input_dateThe date to be incremented. You can type the date, or select a data holdercontaining the date from a Schema view.

Page 174: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

163

num_of_daysThe number of days to add. You can type a positive or negative integer, orselect a data holder containing the number from a Schema view.

outputThe data holder to store the output date (select from a Schema view).

Advanced Properties

For explanations of the following properties, see Standard Action Properties:

name

remark

disabled

optional

phase

DateDiff

This action computes the difference between two dates.

Basic Propertiesdate_format1, date_format2

The formats of the two dates, for example, dd/MM/yy (see DateFormat inChapter 8, Transformers). You can type the format, or browse to a data holderthat contains the format. If you omit the format, the system default is assumed.

date1, date2The two dates. You can type the date, or select a data holder containing thedate from a Schema view.

outputThe data holder to store the difference, in days (select from a Schema view).

Advanced Properties

For explanations of the following properties, see Standard Action Properties:name

remark

disabled

optional

phase

DownloadFile

This action downloads a file to the local computer. The file can be specifieddynamically, in the source document.

Page 175: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

164

Basic Properties

file_urlA data holder that stores the file path or URL (select from a Schema view).

target_pathThe folder path to store the downloaded file. If you leave the property blank,the file is stored in the Results folder of the project.

Advanced Properties

transformersA sequence of transformers that the action applies to the path or URL, beforedownloading.

For explanations of the following properties, see Standard Action Properties:

name

remark

disabled

optional

phase

Online SampleIn the Conversion Agent Samples folder, openProjects\DownloadFile\DownloadFile.cmw.

To run the sample, you must have an Internet connection. The sample retrieves theURL of a file. It then uses the DownloadFile action to download the file to theResults folder of the project.

DownloadFileToDataHolder

This action downloads a file and stores its content in a data holder.

If the file contains symbols such as < and > , the action converts them to entitiessuch as &lt; and &gt;, in order to generate valid XML.

The file must be located on a web server.

Basic Propertiesfile_url

A data holder that stores the URL of the file (select from a Schema view).

outputThe data holder to store the downloaded content (select from a Schema view).

Advanced PropertiesFor explanations of the following properties, see Standard Action Properties:

Page 176: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

165

name

remark

disabled

optional

phase

DumpValues

This action is a debugging tool, which writes data to a<DumpValues>...</DumpValues> element.

Nested in the action, insert the data holders that should be dumped.

Advanced Properties

outputThe file in which to write the output. The options are ResultFile (the defaultoutput file of the project) or OutputFile (specify a file path).

For explanations of the following properties, see Standard Action Properties:

name

remark

disabled

phase

EnsureCondition

This action evaluates a JavaScript expression. If the expression is false , the actionfails.

JavaScript SyntaxInformation about JavaScript syntax is available in many books about webdevelopment. On the Internet, you can find a tutorial introduction on theW3Schools site:

http://www.w3schools.com

A definitive syntax reference for JavaScript is on the ECMA site (ECMAScript is aninternational JavaScript standard):

http://www.ecma-international.org/publications/standards/Ecma-262.htm

The internal Conversion Agent JavaScript processor supports standard JavaScriptexpressions containing the following features:

The unary and binary operators:

() + - * / % == != < <= > >= && ||

Page 177: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

166

The ternary ?: operator.

The following methods:

charAtindexOflastIndexOflengthsubstringtoString

If you apply these methods to a literal having a simple data type, you mustenclose the literal in parentheses, for example:

123.toString(); //Wrong(123).toString(); //Right

"Hello, World".substring(3,7); //Wrong("Hello, World").substring(3,7); //Right

The following functions:

Math.ceilMath.floorMath.maxMath.minMath.powMath.sqrtparseFloatparseInt

The internal JavaScript processor does not support features such as the following:

The unary and binary operators:

++ -- typeof void >> >>> << === !== ~ & | ^

Assignment operators:

= += -= *= /= >>= >>>= <<= &= |= ^=

The comma operator ( , ).

The values NaN, null, infinity, or -0 (negative 0).

Data types other than string, number, and boolean.

The Date object.

The equalsIgnoreCase function.

Expressions that use these features can nonetheless be used in some cases. If theinternal JavaScript processor cannot process an expression, the systemautomatically uses an external JavaScript processor, which is supplied withConversion Agent, to interpret the expression.

Page 178: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

167

Basic Properties

conditionA JavaScript expression to be evaluated. In the expression, use $1, $2, etc. (upto $9), to refer to the params. For example, the following expression checkswhether the first parameter has the value Ron Lehrer:

$1 == "Ron Lehrer"

paramsA list of data holders, containing parameters that you can use in the condition(select from a Schema View.)

Advanced PropertiesFor explanations of the following properties, see Standard Action Properties:

name

remark

disabled

phase

ExcludeItems

This action deletes specified values from a multiple-occurrence data holder (seeMultiple-Occurrence Data Holders in Chapter 6, Data Holders).

Nested in the action, specify the values to exclude.

Basic Properties

data_holderThe multiple-occurrence data holder (select from a Schema view).

Advanced Properties

For explanations of the following properties, see Standard Action Properties:

name

remark

disabled

optional

phase

ExternalCOMAction

The ExternalCOMAction component runs a custom action. The custom action isimplemented as a COM (ActiveX) DLL, or as a .NET DLL with the COMinteroperability (interop) feature.

Page 179: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

168

Because this component uses the Microsoft COM (ActiveX) architecture to activatethe custom action, it runs only on Microsoft Windows platforms.

Creating the Custom DLL

You should program the custom action as a COM component containing thefollowing function:

function Run(ByVal inp as String, ByVal design_mode as Boolean) as String

The inp parameter is the input string, which the action should process. TheExternalCOMAction component passes the input string to the function and receivesthe return value as output. The function can have any desired side effects, such asinteracting with a third-party system.

The design_mode parameter is True if the action is activated within ConversionAgent Studio. If the custom action requires a long processing time or has sideeffects that interfere while you are designing a parser, the function can performdifferent operations based on the design_mode value.

Register the COM component on the Conversion Agent computer. You may thenuse the DLL in the ExternalCOMAction component.

Optionally, you can add the custom action to the drop-down component list thatConversion Agent displays (for instructions, see the chapter about Using theIntelliScript Editor in the book Conversion Agent Studio in Eclipse).

Implementation in Visual Studio 6In Microsoft Visual Studio 6, you should implement the custom action as a COMDLL component that exposes the IDispatch interface (in Visual Basic: as anActiveX DLL project).

Use the regsvr32 utility to register the DLL on the Conversion Agent computer.

Implementation in Visual Studio .NETOptionally, you can implement the custom action in Microsoft Visual Studio .NET.Conversion Agent must be installed on the development computer.

Create a class library project, which references the file ICMAction.dll in theConversion Agent installation folder. Create a class that implements the ICMActioninterface.

Configure the project with the COM interop feature, which lets Conversion Agentaccess the DLL as a COM component. For a description of this feature, see the topicIntroduction to COM Interop and other topics under the interop index entry, in theMicrosoft MSDN Library.

Compile the .NET assembly, and copy the DLL to the Conversion Agent computer.Use the regasm utility to register the DLL file.

For the convenience of Conversion Agent users who may not be familiar VisualStudio .NET, the following is a step-by-step procedure for implementing a customaction in the C# language. The instructions for other .NET languages, such asVisual Basic .NET, are similar.

Page 180: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

169

1. Open Visual Studio .NET and create a new Class Library project.

2. Add a reference to the file ICMAction.dll.

In the Add Reference window, you can find the reference on the .NET tab(component name ICMAction). Alternatively, you can browse to ICMAction.dllin the Conversion Agent installation folder.

3. Add a class that implements the ICMAction interface.

You can copy the following sample code. You should change the namespaceand class names (CMActionExample and CCMActionExample, respectively) tomeaningful names for your projects.

using System;using System.Runtime.InteropServices; //Enables COM interopusing Itemfield.ContentMaster; //For ICMAction interface

namespace CMActionExample{

//Prevents automatic creation of class interface.//Causes class to be exported to COM only as an implementor//of the ICMAction interface[ClassInterface(ClassInterfaceType.None)]

public class CCMActionExample : ICMAction{public CCMActionExample(){}

public string Run(string inp, bool design_mode){//ToDo: Insert code here}

}}

4. Implement the Run function, inserting code that performs the desired action.

For example, suppose you want the custom action to count the characters inthe input, and return the result. The following Run function performs thisaction:

public string Run(string inp, bool design_mode){

Int32 res = inp.Length;return res.ToString();

}

Page 181: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

170

5. In the Solution Explorer, right-click the project and edit its properties.

a. In the left pane of the properties window, expand the tree and selectConfiguration Properties / Build.

b. In the right pane, in the Outputs section, set the Register for COM Interopproperty to true.

6. Right-click the project and choose the Build option.

This generates the DLL file that you need for use in the ExternalCOMActioncomponent.

On the computer where you developed the .NET project, Visual Studio .NETregistered the DLL when you built the project. There are no additional installationsteps.

To run the custom action in Conversion Agent on another computer, perform thefollowing steps to install the custom DLL:

1. The Microsoft .NET Framework, version 1.1 or higher, must be installed on thecomputer.

2. Copy your custom DLL to any convenient location on the computer. Therecommended location is the Conversion Agent program folder.

3. Open a command prompt, and use the regasm utility to register your DLL. Theutility is located in the Windows folder, in the subfolderMicrosoft.NET\Framework\<version>.

For example, enter the following command:

regasm <path>\YourCustomDLL.dll /codebase

The regasm utility displays a message, indicating that the DLL wassuccessfully registered.

Basic PropertiesCOM

A COMClass component, which defines the COM object implementing thecustom action (see the Action Subcomponent Reference below).

inputA data holder storing the input of the action (select from a Schema view).

outputA data holder where the action should store its output (select from a Schemaview).

Advanced PropertiesFor explanations of the following properties, see Standard Action Properties:

Page 182: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

171

name

remark

disabled

optional

phase

Online SampleFor an online sample of a Visual Studio .NET project, which implements a customaction in the C# language, see the following location in the Conversion Agentinstallation folder:

Samples\SDK\CMACTION

JavaScriptFunction

This action executes a JavaScript function, for example, a function located in anHTML source document. You can pass parameters to the function, and you canstore the return value of the function.

Basic Properties

function_to_executeThe name of the function.

resultA data holder, in which to store the return value of the function (select from aSchema view).

paramsA list of data holders containing the input parameters of the function. Theparameters must be in the same order as in the function declaration.

Advanced Propertiesrefresh

If selected, Conversion Agent recompiles the function for each page that aparser processes. If not selected, Conversion Agent assumes that the functionis the same on all the pages, and it compiles the function only on the first page.

For explanations of the following properties, see Standard Action Properties:

name

remark

disabled

optional

phase

Page 183: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

172

Map

This action copies a value from one data holder to another.

When copying a simple data holder (one that has no nested elements), the sourceand destination must have compatible data types. The action can applytransformers to the copied value.

Optionally, you can use the action to copy data holders that contain nestedelements. In that case, the source and destination must have identical internalstructures and identical XSD types.

Basic Properties

sourceThe source data holder (select from a Schema view).

targetThe destination data holder (select from a Schema view).

transformersA sequence of transformers that modify the value. Do not assign this propertyif the source and destination are complex XML elements.

Advanced PropertiesFor explanations of the following properties, see Standard Action Properties:

name

remark

disabled

optional

phase

Online Sample

In the Conversion Agent Samples folder, open Projects\CopyValue\CopyValue.cmw.The sample uses a Map action to copy a complex element, which contains anattribute and nested elements.

ODBCAction

This action runs a SQL query on a database. For example, it can perform a SELECTquery that retrieves data, or it can perform an INSERT or UPDATE query that addsdata to the database.

Example

A source document contains an employee ID number, which you have stored in avariable called EmpID. You want to retrieve the employee's name from a database

Page 184: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

173

and store the result in the following XML structure, which is defined in your XSDschema:

<Person><Name>

<First>...</First><Last>...</Last>

<Name></Person>

To do this, you can run an ODBCAction . You can configure the action as follows:

In this example:

The db_connection property defines the database connection.

The output_record defines the data holder where the action should store theretrieved data.

The sql_statement is the SQL query that retrieves the data.

The input_parameters property contains the EmpID variable, which is the inputof the action.

Basic Propertiesdb_connection

An ODBC_XML_Connection subcomponent, which defines the ODBC provider(typically a database).

sql_statementThe SQL query, for example:

SELECT Name FROM Employees WHERE Id = ?

Use the ? symbol to represent an input parameter. If there is more than oneinput parameter, each ? symbol represents the next parameter in sequence, forexample:

SELECT Name FROM Employees WHERE Id = ? AND Gender = ?

In this case, the two ? symbols represent the first and second inputparameters, respectively.

Page 185: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

174

The SQL syntax must be valid for the ODBC provider. Please see the provideror database documentation for details.

Advanced Properties

on_sql_no_dataThe result if the SQL query does not retrieve any data. The value can besuccess (the action does not fail) or fail (the action fails).

output_recordAn XML element, defined in the XSD schema, where the action should storeany data that the SQL query retrieves. The element must contain nestedelements (at the top level of nesting), whose names are identical to the outputfields of the query. Select the element from a Schema view.

If the SQL query retrieves multiple records, the schema should permitmultiple occurrences of the XML element (see Multiple Occurrence Data Holdersin Chapter 6, Data Holders).

retryThe number of retries if the first connection attempt fails.

input_parametersA list of data holders that contain the input parameters (select from an XSDView window Schema view).

For explanations of the following properties, see Standard Action Properties:

name

remark

disabled

optional

phase

ResetVisitedPages

This action clears the list of visited pages of specified secondary parsers.

This action is used with the reject_recurring_pages property of a Parsercomponent (see Chapter 3, Parsers). ResetVisitedPages allows multiple visits tothe same page, even if reject_recurring_pages is selected. You might do this, forexample, if you want to post different input data to the same web page.

Basic Propertiesparsers

Specify the parsers to be reset.

Advanced Properties

For explanations of the following properties, see Standard Action Properties:

Page 186: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

175

name

remark

disabled

phase

RunMapper

This action runs a mapper.

For example, you can use this action in a parser to run a mapper, which modifiesthe parsed data.

Basic Propertiesmapper

The mapper. You may select the name of an existing Mapper component, oryou may create a Mapper component at this location of the IntelliScript (seeChapter 10, Serializers).

inputA data holder storing XML text on which to run the mapper (select from aSchema view).

The data holder must have a simple data type such as xs:string. The value ofthe string can be XML text of any complexity. (To run a mapper on a dataholder that has a complex type, see EmbeddedMapper in Chapter 11, Mappers.)

If you omit this property, the mapper uses the data holders available in thescope of the action. For example, if the action is nested in a parser, the mapperruns on the output of the parser. If the action is within a Group , it runs on theoutput of the Group.

Advanced Properties

For explanations of the following properties, see Standard Action Properties:

name

remark

disabled

optional

phase

RunParser

This action activates a parser, which parses a dynamically specified source.

In a parser, for example, you can use this action, to follow the links in an HTMLfile and run a secondary parser on the link destinations.

Page 187: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

176

In a serializer, you can use this action to parse bits of unstructured data that existin the input.

The output of RunParser is appended to the output of the main component thatactivated it (such as a parser or serializer).

The RunParser action differs from the EmbeddedParser anchor, in that RunParserparses a new source, whereas EmbeddedParser parses a section of an existingsource.

See also the SubmitForm and SubmitFormGet actions, which submit HTML form datato a URL.

Example

An HTML file has a link to a second file. A Content anchor stores the file path orURL of the link destination in the VarLinkURL system variable. The RunParseraction accesses the destination file and runs a secondary parser on it.

For another example, where the main parser selects which parser to run accordingto text in the source document, see the Alternatives anchor in Chapter 7, Anchors.

Posting Data to the URLOptionally, the action can post data to the URL. This simulates the submission ofan HTML form to a web server, and it parses the result.

To do this, you must store the data in the VarPostData system variable. To correctlyprepare the data in VarPostData, we recommend the following procedure:

1. Save a copy of the HTML page containing the form on your local computer.

2. Edit the copy, changing the form action attribute to your email address. Forexample, if the form element reads <form method="POST"action="http://example.com/MyServer.exe">, change it to <formmethod="POST" action="mailto:[email protected]">.

3. Open the copy in your browser, fill in the form, and click the submit button.This sends an email containing the form data to your address.

4. The body of the email is a string containing the form data. Assign this string tothe VarPostData variable.

Basic Propertiesnext_parser

The name of the parser to run (recursive calls to the same parser arepermitted).

Advanced Propertiesinput_source_as_text

This property specifies the type of data that the input_source data holdercontains.

Page 188: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

177

If input_source_as_text is selected, input_source contains a text string thatshould be parsed. If not selected, input_source contains a file path or a URL.

input_sourceIf input_source_as_text is selected, input_source is a data holder thatcontains a string to be parsed.

If input_source_as_text is not selected, input_source is a data holdercontaining the path or URL of the document to be parsed. The default value isthe VarLinkURL system variable.

If the VarPostData system variable contains a value, the value is posted to theURL. If VarPostData is empty, the action accesses the URL without postingany data.

pre_processorA document processor that the parser should apply to the source.

retriesThe number of times to retry if the request fails.

seconds_to_waitThe interval in seconds between retries.

include_stringsStrings that must be present in the input_source value. If a specified string isnot present, the action does not access the source or activate the secondaryparser. For example, if you want to follow links only with the same web site,you might add the domain name to include_strings.

exclude_stringsStrings that must not be present in the input_source. If a string is present, theaction does not access the source or activate the secondary parser.

For explanations of the following properties, see Standard Action Properties:

name

remark

disabled

optional

phase

RunSerializer

This action runs a serializer. The output of the serializer is stored in a data holder.

For example, you can use this action in a parser to run a serializer, which modifiesthe parsed data.

Page 189: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

178

Basic Properties

serializerThe serializer. You may select the name of an existing Serializer component,or you may create a Serializer component at this location of the IntelliScript(see Chapter 10, Serializers).

inputA data holder storing XML text on which to run the serializer (select from aSchema view).

The data holder must have a simple data type such as xs:string. The value ofthe string can be XML text of any complexity. (To run a serializer on a dataholder that has a complex type, see EmbeddedSerializer in Chapter 10,Serializers.)

If you omit this property, the serializer uses the data holders available in thescope of the action. For example, if the action is nested in a parser, theserializer runs on the output of the parser. If the action is within a Group, itruns on the output of the Group.

outputA data holder to store the serializer output (select from a Schema view).

Advanced Properties

For explanations of the following properties, see Standard Action Properties:

name

remark

disabled

optional

phase

Online SampleIn the Conversion Agent Samples folder, openProjects\RunSerializer\RunSerializer.cmw.

To observe how the sample works, set MainParser as the startup component andrun it. MainParser contains a RepeatingGroup, which parses pairs of names andstores them in variables. After each iteration, the RepeatingGroup executes aRunSerializer action, which concatenates the variables with some predefined text.The action stores its output in an XML element, which is added to the parseroutput.

SetValue

This action fills a data holder with predefined content.

Page 190: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

179

The assignment overwrites any existing content (except for a multiple-occurrencedata holder, see Multiple-Occurrence Data Holders in Chapter 6, Data Holders).

Basic Properties

quoteThe content to assign.

data_holderThe data holder (select from a Schema View).

Advanced Properties

transformersA list of transformers that are applied to the content.

For explanations of the following properties, see Standard Action Properties:

name

remark

disabled

phase

Sort

This action sorts the occurrences of a multiple-occurrence data holder. The outputis saved to the original data holder (see Multiple-Occurrence Data Holders in Chapter6, Data Holders).

You can sort any multiple-occurrence data holder in the scope of the project, forexample:

The output of a parser

The input of a serializer

The input or output of a mapper

A variable

If you run the action on a XML element that contains attributes or nested elements,you can use them as sort keys.

LimitationYou cannot use the Sort action if a Key is defined on the multiple-occurrence dataholder (see Chapter 12, Locators, Keys, and Indexing).

Basic Properties

recurring_elementThe multiple-occurrence data holder that should be sorted.

Page 191: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

180

by_fieldsThe sort keys, in decreasing order of precedence. For each field, select the dataholder and an ascending or descending sort. You can select the multiple-occurrence data holder itself, or any of its nested elements or attributes.

Advanced Properties

For explanations of the following properties, see Standard Action Properties:

name

remark

disabled

phase

SubmitForm

This action submits HTML form data to a URL and parses the response.

SubmitForm uses the HTTP Post method to submit the form. To use the HTTP Getmethod, use the SubmitFormGet action, instead. See also the RunParser action,which can submit an HTML form.

The output of SubmitForm is appended to the output of the main component thatactivated it (such as a parser or serializer).

SubmitForm is an alternative to using the HtmlForm anchor. HtmlForm is easier to usebecause it performs some of the data-preparation steps automatically. SubmitFormgives you greater control because it lets you configure these steps yourself.

Using SubmitForm

To use the SubmitForm action, you should:

1. Store the URL to which you want to submit the form (the action attribute ofan HTML <form> element) in the VarFormAction system variable.

2. Store the form data in the VarFormData system variable. You can determine thecorrect format of the data in the following way:

a. Save a copy of the HTML page containing the form on your local computer.b. Edit the copy, changing the form action attribute to your email address. For

example, if the form element reads <form method="POST"action="http://example.com/MyServer.exe">, change it to <formmethod="POST" action="mailto:[email protected]">.

c. Open the copy in your browser, fill in the form, and click the submit button.This sends an email containing the form data to your address.

d. The body of the email is a string containing the form data. Assign this stringto the VarFormData variable.

3. Run the SubmitForm action. The action submits the data that you stored inVarFormData to the location that you stored in VarFormAction.

Page 192: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

181

Submitting Multiple Copies of a Form

VarFormData is a multiple-occurrence variable (see Multiple-Occurrence Data Holdersin Chapter 6, Data Holders). This means that you can create multiple occurrences ofVarFormData, each storing a different set of post data.

If you do this, SubmitForm posts each occurrence of VarFormData independently,and it parses each of the web-server responses.

You can use the CombineValues action to prepare the VarFormData occurrences. Forexample, if you know the possible values of each form field, CombineValues canprepare all possible combinations of the values.

Basic Propertiesaction

Select OpenURL, which specifies how to parse the web-server response (see theOpenURL subcomponent in the Action Subcomponent Reference).

Advanced Properties

For explanations of the following properties, see Standard Action Properties:

name

remark

disabled

optional

phase

Online SampleIn the Conversion Agent Samples folder, openProjects\SubmitForm\SubmitForm.cmw. The sample works in the following way:

1. The main parser, which is called Flower_form_parser, retrieves options fromthe HTML order form of an online florist. The options include several flowertypes and price ranges.

2. The parser runs a CombineValues action, which prepares all possiblecombinations of the flower-type and price-range options.

3. The parser runs a SubmitForm action, which posts the combinations to an webapplication.

4. The SubmitForm action activates a secondary parser, which parses theresponses from the web application. The parsing output is added to the outputof the main parser.

You cannot run this sample because the web application does not exist.

SubmitFormGet

This action submits HTML form data to a URL and parses the response.

Page 193: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

182

SubmitFormGet is identical to SubmitForm, except that it uses the HTTP Get methodinstead of Post. For details, see SubmitForm.

WriteValue

This action writes the value of a data holder to a location such as a file, a database,or an MSMQ queue.

If the data holder is an XML element, the action writes both the element and anynested elements.

Basic Properties

inputThe data holder to write (select from a Schema view).

outputThe output location. The options are listed in the following table (for details,see the Action Subcomponent Reference).

output Explanation

MSMQOutput Writes to an MSMQ queue.

OutputDataHolder Writes to a data holder.

OutputFile Writes to a file.

ResultFile Writes to the default results file of thedata transformation.

OutputCOM Lets you use a custom COMcomponent to output the data.Do not select this option directly.Instead, select the display name ofthe custom COM component. Forinstructions, see OutputCOM.

ExternalOutputWriter Lets you use a custom component tooutput the data. For details, pleasecontact SAP support.

Advanced Properties

no_tagsBy default, the action surround the value that it writes with XML tags.

If you select no_tags , the XML tags are omitted. This is appropriate only ifinput is a simple data holder, containing no nested elements or attributes.

Page 194: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

183

transformersA list of transformers that modify the value before writing. The input to thetransformers is the complete input data holder, including XML tags.

For explanations of the following properties, see Standard Action Properties:

name

remark

disabled

optional

phase

Online Samples

In the Conversion Agent Samples folder, open Projects\Splitter\Splitter.cmw.

The sample demonstrates how to split a file into two files. A parser uses aRepeatingGroup to retrieve the records of an HL7 file. It uses a Map action to createunique filenames for each record, and a WriteValue action to write the records tothe files. You can find the output files (MyOutput1.txt and MyOutput2.txt) in theResults folder of the project.

A practical use of the splitting technique is in sending large messages to an MSMQqueue for subsequent processing by Microsoft BizTalk Server. To avoid exceedingthe MSMQ and BizTalk size limits, you can split the messages using aRepeatingGroup and WriteValue. For a sample, seeProjects\BizTalkSplitter\BizTalkSplitter.cmw.

Before you run the BizTalkSplitter example, use the MSMQ administration toolsto create a queue where the parser will store its output. Edit the output property ofthe WriteValue action, inserting your queue path.

XSLTMap

This action runs an XSLT transformation.

The input and output are branches of an XML document (either the outputdocument of a parser or the input document of a serializer).

ExampleSuppose that the following XML is the result of a parser:

<Person><First>Ron</First><Last>Lehrer</Last>

</Person>

You can use the XSLTMap action, with an appropriate XSLT file, to convert this to:

<Person Name="Lehrer, Ron" />

Page 195: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

184

Basic Properties

inputThe XML element at the root of the branch to be transformed (select from aSchema view).

outputThe XML element at the root of the branch that should store the output (selectfrom a Schema view).

xslt_fileBrowse to the XSLT file.

Advanced Properties

For explanations of the following properties, see Standard Action Properties:

name

remark

disabled

optional

phase

Action Subcomponent Reference

This section describes subcomponents that you can assign as the values of certainaction properties.

COMClass

This subcomponent is used, for example in an ExternalCOMAction , to define acustom COM component.

Basic PropertiesProgID

The ProgID of the COM component.

If you developed the custom action in Visual Basic 6, the ProgID typically hasthe form dll_name.class. If you used Visual Studio .NET with the COMinteroperability option, the ProgID has the form namespace.class.

For example, if you developed a .NET namespace with the nameCMActionExample, and the class name is CCMActionExample, the ProgID isCMActionExample.CCMActionExample.

Page 196: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

185

Advanced Properties

thread_safeDeselect this option if the COM component is incompatible withmultithreading. This causes Conversion Agent to synchronize calls to thecomponent (at the cost of slower performance).

MSMQOutput

This subcomponent specifies that a stream should be written to an MSMQ messageand sent to a queue.

The subcomponent is used in the WriteValue action to specify the output location.

Basic Properties

output_idThe identifier (such as a path) of the MSMQ queue. You may type theidentifier, or click the Browse button and select a data holder that contains theidentifier.

Advanced Properties

appendThis property is not in use.

For explanations of the following properties, see Standard Action Properties:

name

remark

ODBC_XML_Connection

The subcomponent defines a database connection. It is used, for example, in anODBCAction.

Before using this component, use the operating system tools to define a DSN forthe database connection.

Advanced PropertiesDSN

The data source name of the connection.

usernameUser name for the database connection.

passwordPassword of the user.

timeoutTime in seconds to wait for the database response.

Page 197: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

186

OpenURL

This subcomponent is used within the SubmitForm and SubmitFormGet actions, tospecify how to parse a web-server response.

Basic Propertiesnext_parser

The name of a parser to run on the web-server response.

Advanced Propertiesretries

The number of retries if the first request fails.

seconds_to_waitThe interval in seconds between retries.

For explanations of the following properties, see Standard Action Properties:

name

remark

disabled

optional

OutputCOM

The OutputCOM option of the WriteValue action lets you use a custom COMcomponent to create output from Conversion Agent. The component can performany desired operations: modifying the data, writing the data to multiple locations,interacting with an information system, etc.

Because OutputCOM uses the Microsoft COM technology, it operates only onMicrosoft Windows systems.

Note that OutputCOM works differently from other Conversion Agent components.It is a template for a custom component, and not a component that you can usedirectly. You cannot configure a WriteValue action with the OutputCOM option, andnest a custom component within OutputCOM. Instead, you must program a customCOM component, add it to the drop-down list, and select its name in the outputproperty of WriteValue. The following paragraphs explain the procedure.

Programming the Custom COM Component

To create the custom COM component, program an ActiveX DLL containing thefollowing function:

Public Function process_output( _ByVal output_id As String, _ByVal outContent As String, _ByVal mode As String) _

Page 198: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

187

As String

The WriteValue action passes the following parameters to the function:

output_idAn identifier for the desired output location (the value of the output_idproperty of the OutputCOM component).

outContentThe content that the WriteValue action is outputting.

modeIf the append property of the component is not selected in the IntelliScript, mode= "CREATE". If the append property is selected, mode = "APPEND".

The function can perform any desired operations. Conversion Agent ignores thereturn value of the function.

Install and register the DLL on the Conversion Agent computer.

Adding the Custom COM Component to the Drop-Down ListYou must add the custom component to the drop-down list that Conversion Agentdisplays under the WriteValue action. To do this:

1. In Notepad, create a text file.

2. Type a line such as the following in the file:

profile DisplayName ofPT OutputCOMT("MyProject.MyClass")

Here, DisplayName is the name that you would like to display in the drop-down list, and MyProject.MyClass is the ProgID of the component.

3. Save the file with an extension of *.tgp (for example, MyClass.tgp), in theprogram subdirectory ConversionAgent/AutoInclude/User.

Configuring WriteValue to Use the Custom COM Component

To use the custom COM component:

1. Configure a WriteValue action.

2. In the output property of the action, select the DisplayName that youconfigured above.

3. Assign the output_id and append properties of the DisplayName component.

You can then run the Conversion Agent data transformation. The WriteValueaction activates the custom component.

Page 199: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

188

Basic Properties

output_idAn identifier for the location where the custom COM component should writeits output. You can use this parameter, for example, to pass the name of anoutput file to the custom component.

You can type an identifier, or you can select a data holder that contains theidentifier.

Advanced Propertiesappend

If selected, the custom COM component should append its output to theexisting content of the output location (instead of overwriting).

For explanations of the following properties, see Standard Action Properties:

name

remark

OutputDataHolder

This subcomponent specifies how to write a stream to a data holder. Thesubcomponent is used in the WriteValue action to specify the output location.

Basic Properties

data_holderThe data holder (select from a Schema view).

Advanced Propertiestransformers

A sequence of transformers that modify the stream before writing.

For explanations of the following properties, see Standard Action Properties:

name

remark

OutputFile

This subcomponent specifies that a stream should be written to a file. The file canbe specified dynamically, for example by extracting the path from the sourcedocument.

The subcomponent is used in the DumpValues and WriteValue actions to specify theoutput location.

Page 200: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 9. Actions

189

Basic Properties

fileThe filename, optionally including a path. You may type the name, or click theBrowse button and select a data holder that contains the name.

The path can be absolute or relative. In the latter case, Conversion Agentresolves the path relative to the output folder of the request (if you run theproject within Conversion Agent Studio, with respect to the project Resultsfolder).

Advanced Properties

appendIf selected, the data is appended to the existing content of the file (instead ofoverwriting).

For explanations of the following properties, see Standard Action Properties:

name

remark

ResultFile

This subcomponent specifies that a stream should be written to the normal outputfile of a project.

The subcomponent is used in the DumpValues and WriteValue actions to specify theoutput location.

Page 201: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 10. Serializers

190

Serializers

Serialization is the opposite of parsing. A parser converts a source document,which can be in any format, to an XML file. A serializer converts an XML file to anoutput document that can be in any format. The output of a serializer, for example,can be a text, HTML, or HL7 document, or even another XML document.

You can create a serializer from an existing parser, by instructing ConversionAgent to invert the parser configuration. Alternatively, you can create a serializerby using the New Serializer wizard and by editing the IntelliScript.

You can also use a combination of these methods: you can create a serializer from aparser, and then edit the serializer configuration in the IntelliScript.

By any method, it is usually easier to create a serializer than a parser. This isbecause the XML input is completely structured. The structure makes it easy toidentify the required data and write it, in a sequential procedure, to the output. Aparser, in contrast, may need to process unstructured or semi-structured input—atask that can be much more complex than serialization.

The main components that are nested in a serializer are called serialization anchors.The function of the serialization anchors is to identify the XML data and write it tothe output. Serialization anchors are analogous to the anchors that are used in aparser (which are called simply anchors), except that they work in the oppositedirection.

This chapter explains the procedures for creating serializers, and it describes theserialization anchors.

Creating a Serializer from a Parser

To create a serializer automatically from a parser:

1. In Conversion Agent Studio, open a project that contains an existing parser.

2. Right-click the parser name in the IntelliScript, and choose Create Serializerfrom the pop-up menu.

Conversion Agent notifies you that the serializer has been successfullycreated, and it displays the serializer in the IntelliScript.

The name of the serializer is derived from that of the parser, with the suffix_serializer. For example, if you create a serializer from Parser1, the serializeris called Parser1_serializer.

10

Page 202: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 10. Serializers

191

3. Conversion Agent stores the serializer in a new TGP script file, which has aname such as Parser1_auto_generated_serializer.tgp. Use the ConversionAgent Explorer of the Component view to open the new serializer file in anIntelliScript editor.

4. Test the serializer (see Running a Serializer below), and edit the IntelliScript ifrequired (see Troubleshooting an Auto-Generated Serializer).

Online Samples

In the Conversion Agent Samples folder, openProjects\Serialization\TabDelimited\TabDelimited.cmw.

The sample demonstrates a full parser/serializer cycle, using an auto-generatedserializer. You can run the sample in the following way:

1. Set MyHL7Parser as the startup component, and run it. This generates anoutput file Results\output.xml.

2. Now set MyHL7Parser_serializer as the startup component, and run it. At theprompt, browse to Results\output.xml as the input. The original input file isregenerated.

A variant of this project is in Projects\Serialization\HL7\HL7.cmw. You cangenerate the serializer yourself and try the above experiment.

Controlling How the Create Serializer Command Works

When you run the Create Serializer command, Conversion Agent converts theContent anchors of the parser to ContentSerializer serialization anchors.

By default, the command converts all other text in the example source toStringSerializer serialization anchors. Assuming that the other text containsboilerplate text, this means that the output of the serializer contains all theboilerplate that was in the original example source.

For example, suppose the parser runs on tab-delimited source documents havingthe following structure:

Page 203: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 10. Serializers

192

Name (first and last):<tab>Ron Lehrer

Assume that the anchors are defined in the following way:

Source text Anchor

Name Marker

(first and last):<tab> (not marked as an anchor)

Ron Lehrer Content

The XML output of the parser is:

<FullName>Ron Lehrer<FullName>

Now, suppose that you generate a serializer from this parser, and you run theserializer on the following input:

<FullName>Larissa Chan<FullName>

The output of the serializer is:

Name (first and last):<tab>Larissa Chan

Serialization Mode

The example source may contain text that you don't want in the serializer output.In that case, you can modify the behavior of the Create Serializer command, in away that does not generate the StringSerializer serialization anchors.

To do this, set the serialization_mode property of the Parser component (seeChapter 3, Parsers). The possible values of the serialization_mode are:

FullThe Create Serializer command copies the non-XML text to the serializerconfiguration.

OutlineThe Create Serializer command copies only the delimiters of the non-XML textto the serializer configuration.

Under the Outline option, you can select the use_markers option. This causesthe Create Serializer command to copy the content of the Marker anchors butonly the delimiters of other non-XML text.

The following table illustrates the results of the serialization_mode settings.

Page 204: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 10. Serializers

193

serialization_modeproperty of theparser:

The Create Serializer command converts: Sample serializeroutput:

outlineWith use_markersnot selected

Content anchors to ContentSerializerserialization anchorsThe delimiters of other text in the examplesource to StringSerializer serializationanchors

<tab>Larissa Chan

outlineWith use_markersselected

Content anchors to ContentSerializerserialization anchorsThe complete text of Marker anchors toStringSerializer serialization anchorsThe delimiters of other text in the examplesource to StringSerializer serializationanchors

Name<tab>LarissaChan

full (the default) Content anchors to ContentSerializerserialization anchorsAll other text in the example source toStringSerializer serialization anchors

Name (first andlast):<tab>LarissaChan

Of course, you can edit the auto-generated serializer configuration to furthermodify the output.

Troubleshooting an Auto-Generated Serializer

Often, you can use an automatically generated serializer directly. In some cases,you may need to edit the serializer configuration for it to work correctly.

The following paragraphs list some typical circumstances under which you need toedit the serializer, and the suggested editing steps.

Root TagOn the XML Generation tab of the project properties, there is an option to AddXML Root Element. The effect of this option is to nest the parser output in aspecified root element (see XML Generation Page in Chapter 13, Project Properties).

If this option is selected, and you try to run an auto-generated serializer on theparser output, it cannot find the input XML elements because of the nesting.

The solution is to set the root_tag property of the serializer to the same value as inthe project properties. The serializer then finds its input nested under the root.

Page 205: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 10. Serializers

194

Parser is configured to add XML root element

Assign the root_tag property of the serializer

VariablesIf the parser uses a variable to store intermediate results, an auto-generatedserializer may fail. To solve the problem, review the serializer logic, and removethe variable if necessary.

Additional ComponentsThe Create Serializer command inverts the anchors of a parser. It does not invertcomponents such as document processors, transformers, or actions.

For example, suppose that a parser uses a PdfToTxt_3_00 document processor toconvert PDF source documents to text. The parser contains anchors that transformthe text to XML.

Page 206: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 10. Serializers

195

The auto-generated serializer transforms the XML back to text. It does not convertthe text to PDF. You can obtain PDF output by submitting the text to the AdobeAcrobat Distiller or to any other PDF generation utility.

In another example, suppose that a parser uses an AddString transformer to add aprefix to the output of a Content anchor. The auto-generated serializer does notremove the prefix. If you need to remove it, you can edit the serializer and insert acomponent such as a Replace transformer.

Creating a Serializer by Using the New SerializerWizard

You can use the New Serializer wizard to create a serializer. The procedure isanalogous to that for creating a parser (see Using the New Parser Wizard inChapter 3, Parsers).

Opening the Wizard

To create a new project that contains a serializer, choose File > New > Project onthe Conversion Agent Studio menu. In the left pane of the New Project window,select Conversion Agent. In the right pane, select Serializer Project.

To create a new serializer in an existing project, choose File > New > Serializer onthe menu.

Wizard Options

The New Serializer wizard prompts you for options such as the following:

Serializer nameA name for the serializer.

Script nameA name for a TGP script file, where the wizard stores the serializer definition.

Schema file path(Optional) The name of an XSD schema, which defines the data holders wherethe parser will store its output.

If you omit this step, you can add an XSD schema to the project afterwards, byright-clicking in the Conversion Agent Explorer view.

Completing the Serializer Configuration

After you have entered the wizard options, click Finish to create the serializer. Theserializer is displayed in the Conversion Agent Explorer view and the Componentview of Conversion Agent Studio.

To complete the serializer configuration:

1. Display the serializer in the IntelliScript editor.

2. Under the contains line, add a sequence of serialization anchors and actions.

Page 207: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 10. Serializers

196

3. Run and test the serializer (see Running a Serializer below), and modify theIntelliScript as required.

Creating a Serializer by Editing the IntelliScript

Like all other Conversion Agent components, you can create a serializer by editingthe IntelliScript directly.

Add an XSD file to the project, which defines the schema of the XML filesthat you want to serialize (see Chapter 6, Data Holders).

1. At the global (top) level of the IntelliScript, add a Serializer component.

2. Edit the properties of the Serializer as required (see the Serializer ComponentReference below).

3. Nested within the Serializer, add a sequence of serialization anchors (seeDefining Serialization Anchors). Optionally, you may also add actions.

4. Test the serializer (see Running a Serializer below), and modify the IntelliScriptif required.

Online SampleIn the Conversion Agent Samples folder, openProjects\ManualSerializer\ManualSerializer.cmw. The sample illustrates aserializer that was created by editing the IntelliScript. You can run the serializer onthe input file Example XML of Person.xml .

Creating a Serializer within a RunSerializer Action

In addition to defining a serializer at the global level, it is possible to define aserializer within a RunSerializer action. For details, see Chapter 9, Actions.

Running a Serializer

To run a serializer in Conversion Agent Studio:

1. Set the serializer as the startup component.

2. On the menu, choose Run > Run.

3. You are prompted to open the input XML file.

If you created the serializer from a parser, a convenient test file is the parseroutput. Browse to the output file (by default, Results\output.xml in theproject folder).

Page 208: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 10. Serializers

197

Note: Alternatively, you can set the example_source property of the serializer.This lets you test a serializer repeatedly on the same input, without needing tobrowse to the file each time.

4. When the execution is complete, Conversion Agent Studio displays the Eventsview. Examine the events for any failures or warnings.

5. To view the serialization results, open the results file, which is located in theResults folder of the project.

Serialization Anchors

The main components that you can use in a serializer are called serialization anchors.These are analogous to the anchors that are used in a parser (which are calledsimply anchors), except that they work in the opposite direction. Anchors read datafrom locations in the source document and write the data to XML. Serializationanchors read XML data and write the data to locations in the output document.

Please note that a serialization anchor is not an anchor, despite their similar names.You cannot use anchors in a serializer, and you cannot use serialization anchors in aparser.

The most important serialization anchors are called ContentSerializer andStringSerializer. A ContentSerializer writes the content of a specified dataholder to the output document; it is the inverse of a Content anchor, which readscontent from a source document. A StringSerializer writes a predefined string tothe output; it is the inverse of a Marker anchor, which finds a predefined string in asource document.

Example of Serialization Anchors

The following example illustrates three serialization anchors of a serializer.

The first anchor is a StringSerializer, which instructs the serializer to write thefollowing text in the output document:

First Name:<tab>

Page 209: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 10. Serializers

198

The second anchor is a ContentSerializer. The anchor seeks the value of thePerson/Name/First element in the XML, and writes the value in the output.

The third anchor is a StringSerializer, which writes the string:

<newline>Last Name:<tab>

As always, the IntelliScript represents the newline and tab using ASCII codes and «,respectively. To edit the special characters, see Using the IntelliScript Editor in the bookConversion Agent Studio in Eclipse .

Now, assume that you run the serializer on the following XML:

<Person gender="M"><Name>

<First>Ron</First><Last>Lehrer</Last>

</Name><Id>547329876</Id><Age>27</Age>

</Person>

From the illustrated serialization anchors, the output is:

First Name<tab>Ron<newline>Last Name<tab>

The display of this text is:

Of course, the serializer contains additional serialization anchors, which are notshown in the above illustration. The complete output of the serializer is:

Defining Serialization Anchors

To define serialization anchors, edit the IntelliScript under a Serializer component.

Page 210: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 10. Serializers

199

Sequence of Serialization Anchors

A serializer executes the serialization anchors in the sequence of their definitions.

Serialization anchors write data sequentially, always appending it to the end of theoutput document. Of course, you can alter the order by changing the sequence inthe serializer configuration.

You can intersperse actions with the serialization anchors. The actions are executedas part of the sequence.

Standard Serializer Properties

In this section, we review certain properties that are found in the Serializercomponent and in many serialization anchors. For additional properties that arespecific to particular components, see the Serializer Component Reference andSerialization Anchor Component Reference.

nameA name that you assign to the component. Conversion Agent includes thename in the event log. This can help you find an event that was caused by theparticular component.

remarkA comment describing the component.

disabledIf selected, Conversion Agent ignores the component. This is useful for testingand debugging, or for making minor modifications in a project withoutdeleting the existing components.

optionalBy default, if a component fails, the parent component (such as a Serializerin which a serialization anchor is nested) fails. If you select the optionalproperty, the parent component does not fail.

Page 211: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 10. Serializers

200

Serializer Quick Reference

The main serialization component is:

SerializerThe main component of a serializer, which converts XML to an outputdocument format.

Within a Serializer component, you can nest the following serialization anchors:

AlternativeSerializersSpecifies alternative serialization anchors that may be appropriate, dependingon the structure of the XML.

ContentSerializerSerializes XML data and writes it to the output document.

DelimitedSectionsSerializerSerializes sections of data, writing a separator string between them.

EmbeddedSerializerRuns a secondary serializer.

GroupSerializerBinds a set of serialization anchors together for processing as a unit.

RepeatingGroupSerializerCreates a repetitive structure in the output document.

StringSerializerWrites a specified string to the output document.

Serializer Component Reference

This section documents the top-level Serializer component. For serializationanchors, see the Serialization Anchor Component Reference.

Serializer

A Serializer converts XML documents to output documents in any format.

Advanced Properties

example_sourceA sample XML source document. When you run the serializer in ConversionAgent Studio, it operates on the sample document.

The value of the property is an input port such as LocalFile (a file on yourcomputer), URL (a file on a web site), or Text (a text string). If you assign aLocalFile, Conversion Agent copies the file to the project directory.

Page 212: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 10. Serializers

201

If you leave the example_source property blank, Conversion Agent promptsyou for a source document when you run the serializer.

To view the sample document in the example pane, choose IntelliScript >Open Example Source on the menu.

Nested within the example_source property, you can assign a preprocessor,which converts the source documents to a format that the serializer can accept.For example, the example_source might be a Microsoft Excel workbookconfigured with the ExcelToXml preprocessor (see Chapter 4, DocumentProcessors).

output_file_extensionThe file extension of the generated output file, including the leading period,for example:

.txt

root_tagThe name of a root XML element, which is not in the XSD schema of theproject.

For example, if the top-level element of the schema is Person, but the XMLinput nests Person in an element called InputWrapper, enter root_tag =InputWrapper.

default_transformersA list of transformers that the Serializer applies to all serialized data.

The following properties are useful in situations where the serializer must selectspecific occurrences of data holders. For an explanation, see Chapter 12, Locators,Keys, and Indexing.

source

target

For explanations of the following properties, see Standard Serializer Properties:

name

remark

Serialization Anchor Component Reference

This section describes the serialization anchors, which you can use in a Serializer.

AlternativeSerializers

This serialization anchor lets you define a set of alternative, nested serializationanchors. You can define a criterion for which alternative the parser should accept.Only the accepted alternative affects the serializer output. The other serializationanchors (whether failed or successful) have no effect on the serializer output.

Page 213: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 10. Serializers

202

Example

The input XML may contain a Product element or a Service element, but not both.You wish to serialize whichever element is in the input.

In an AlternativeSerializers serialization anchor, and set its selector propertyto ScriptOrder.

Within the AlternativeSerializers, nest two ContentSerializer serializationanchors. Configure one of them to process the Product element, and the other toprocess Service.

Basic Propertiesselector

The criterion for deciding which alternative to accept. The options are:

selector property Explanation

ScriptOrder Conversion Agent tests the nested serialization anchors in thesequence that they are defined in the IntelliScript. It accepts thefirst one that succeeds.If all the nested serialization anchors fail, theAlternativeSerializers component fails.

NameSwitch Conversion Agent searches for the nested serialization anchorwhose name property is specified in a data holder (select from aSchema view). It ignores the other nested serialization anchors.If the named serialization anchor fails, theAlternativeSerializers component fails.

Advanced Properties

For explanations of the following properties, see Standard Serializer Properties:

name

remark

disabled

optional

ContentSerializer

This serialization anchor writes the serialized data to the output document.

Basic Propertiesopening_str

A string that the anchor should write before the data_holder.

Page 214: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 10. Serializers

203

closing_strA string that the anchor should write after the data_holder.

data_holderThe data holder containing the data (select from a Schema view).

Advanced Properties

allow_empty_valuesIf selected, the data_holder can be empty. If not selected, and the data_holderis empty, the ContentSerializer fails.

ignore_default_transformersIf selected, the default transformers of the Serializer are not applied to theserialized data.

transformersA list of transformers that are applied to the serialized data.

For explanations of the following properties, see Standard Serializer Properties:

name

remark

disabled

optional

DelimitedSectionsSerializer

This serialization anchor processes sections of data. Between each section of theoutput, the DelimitedSectionsSerializer writes a separator string.

Within the DelimitedSectionsSerializer, you should nest other serializationanchors. Each nested serialization anchor is responsible for outputting a singlesection.

Example

The XML input contains an employee resume. You wish to write the data to anoutput text document in the following format:

----------------------------Jane PalmerEmployee ID 123456----------------------------Professional Experience...----------------------------Education...

Page 215: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 10. Serializers

204

You can define a DelimitedSectionsSerializer, with the line of hyphens as theseparator. Because you want a line of hyphens before each section, setseparator_position = before.

Within the DelimitedSectionsSerializer, nest three GroupSerializercomponents. The first GroupSerializer writes the Jane Palmer section, the secondwrites the Professional Experience sections, and so forth.

Optional SectionsIn the above example, suppose that the second section, Professional Experience,is missing from some input XML documents. Nevertheless, you want to write itsseparator (the line of hyphens).

----------------------------Jane PalmerEmployee ID 123456--------------------------------------------------------Education...

To handle this situation, you should configure the DelimitedSectionsSerializerin the following way:

In the second GroupSerializer, select the optional property. This means thatif the GroupSerializer fails, it should not cause theDelimitedSectionsSerializer to fail.

In the DelimitedSectionsSerializer, set using_placeholders = always. Thismeans to write the separator of an optional section, even if the section itself ismissing.

Now suppose that if the Professional Experience section is missing, you do notwant to write its separator:

----------------------------Jane PalmerEmployee ID 123456----------------------------Education...

In this case, you should configure the DelimitedSectionsSerializer as follows:

In the second GroupSerializer, select the optional property.

In the DelimitedSectionsSerializer, set using_placeholders = never. Thismeans not to write the separator of a missing section.

Basic Propertiesseparator_position

Position of the separator relative to the sections.

Page 216: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 10. Serializers

205

The following table explains the values. The examples assume that theseparator is a vertical-line character ( | ).

separator_position Explanation Example

before Write a separator before each section (includingthe first section).

|1|2|3|4

after Write a separator after each section (includingthe last section).

1|2|3|4|

between Write a separator between the successivesections (not before the first section and not afterthe last section).

1|2|3|4

around Write separators before and after each section(including the first and last sections).

|1|2|3|4|

using_placeholdersThis property specifies whether the DelimitedSectionsSerializer shouldwrite the separator of an optional section that is missing from the XML input.

The following table explains the values. The examples assume that the basicoutput structure is|1|2|3|4. The examples illustrate the output if sections 2and 4 are missing.

using_placeholders Explanation Example

always Always write the separator of a missing section. |1||3|

never Never write the separator of a missing section. |1|3

when necessary Always write the separator of a missing internalsection. Never write the separator of a missingterminal section.

|1||3

separatorThe separator string.

Advanced PropertiesFor explanations of the following properties, see Standard Serializer Properties:

name

remark

disabled

optional

Page 217: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 10. Serializers

206

EmbeddedSerializer

This serialization anchor activates a secondary Serializer, which writes its outputin the same output document.

ExampleThe XML input is a family tree. The input contains Person elements, which arerecursively nested as shown:

<Person> <!-- Parent -->...<Person> <!-- Child -->

...<Person> <!-- Grandchild -->

...</Person>

</Person></Person>

A Serializer can use an EmbeddedSerializer component to call itself recursively,until all levels of nesting are exhausted.

Basic Properties

serializerThe name of the secondary serializer (choose from the list). The serializer mustbe defined at the global level of the IntelliScript.

schema_connectionsConnects the data holders that are referenced in the secondary serializer to thedata holders that are referenced in the main serializer. The property contains alist of Connect subcomponents, which specify the correspondence (see theConnect component in Chapter 7, Anchors.).

If all the data holders in the main and secondary serializers are identical, youcan omit this property. If there are any differences between the data holders,you must connect the data holders explicitly (even the ones that are identical).

In the recursive example described above, Person should be connected toPerson/Person. This instructs the secondary instance of the serializer toprocess a nested level of the input. It is sufficient to connect just the parentelement (Person), and not the nested elements (Person/*s/Name,Person/*s/BirthDate, etc.), provided that the two Person elements have thesame XSD type.

Advanced PropertiesFor explanations of the following properties, see Standard Serializer Properties:

Page 218: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 10. Serializers

207

name

remark

disabled

optional

Online Sample

For a sample, where an EmbeddedSerializer calls a serializer recursively, seeDefining a Serializer in the book Getting Started with Conversion Agent.

GroupSerializer

The GroupSerializer serialization anchor binds its nested serialization anchorstogether. You can set properties of the GroupSerializer, which affect the membersof the group.

Advanced PropertiesThe following properties are useful in situations where the serialization anchormust select specific occurrences of data holders. For an explanation, see Chapter12, Locators, Keys, and Indexing.

source

target

For explanations of the following properties, see Standard Serializer Properties:

name

remark

disabled

optional

RepeatingGroupSerializer

This serialization anchor writes a repetitive structure to the output document.

The anchor is useful if the XML data contains a multiple-occurrence data holder(see Multiple-Occurrence Data Holders in Chapter 6, Data Holders). TheRepeatingGroupSerializer iterates over the occurrences of the data holder andoutputs the data.

Within the RepeatingGroupSerializer, you should nest serialization anchors thatprocess and output each occurrence of the data holder. Optionally, you can definea separator, which the RepeatingGroupSerializer writes to the output between theiterations.

Example

The XML input contains the following structure:

Page 219: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 10. Serializers

208

<Persons><Person>

<Name>John</Name><Age>35</Age>

</Person><Person>

<Name>Larissa</Name><Age>42</Age>

</Person>...

</Persons>

A RepeatingGroupSerializer, using a newline character as a separator, can outputthis data to:

John 35Larissa 42

You can iterate over several multiple-occurrence data holders in parallel, forexample, you can iterate over a list of men and a list of women, and output a list ofmarried couples. To do this, within the repeating group, add a ContentSerializerfor each data holder.

Basic Properties

separator_positionPosition of the separator relative to the iterations.

The following table explains the values. The examples assume that theseparator is a vertical-line character ( | ).

separator_position Explanation Example

before Write a separator before each iteration (includingthe first iteration).

|1|2|3

after Write a separator after each iteration (includingthe last iteration).

1|2|3|

between Write a separator between the successiveiterations (not before the first iteration and notafter the last iteration).

1|2|3

around Write separators before and after each iteration(including the first and last iterations).

|1|2|3|

separatorA serialization anchor that outputs the separator (typically aStringSerializer). Leave this property empty if you do not want to output aseparator.

Page 220: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 10. Serializers

209

Advanced Properties

countThe number of iterations to run. You may enter a number, or click the browsebutton and select a data holder that contains the number. If blank, theiterations continue until the input is exhausted.

current_iterationA data holder, where the RepeatingGroupSerializer should output thenumber of the current iteration (select from a Schema view). You can use aContentSerializer to write the number to the output.

The following properties are useful in situations where the serialization anchormust select specific occurrences of data holders. For an explanation, seeChapter 12, Locators, Keys, and Indexing.

source

target

For explanations of the following properties, see Standard Serializer Properties.

name

remark

disabled

optional

StringSerializerThis serialization anchor writes a predefined string to the output document.

Basic Properties

strThe string to write.

Advanced Properties

For explanations of the following properties, see Standard Serializer Properties:

name

remark

disabled

Page 221: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 11. Mappers

210

Mappers

Mappers are components that convert an XML source document to another XMLstructure or schema.

A mapper processes the XML input like a serializer, and generates the XML outputlike a parser. Because both the input and the output are fully structured XML, theconfiguration is very easy.

The principles of mapper operation are similar to those of a serializer. Before youread this chapter, we recommend that you read Chapter 10, Serializers, where theprinciples are discussed in depth.

Within a mapper, you can nest mapper anchors and actions. Mapper anchors areanalogous to anchors (which are used in parsers) and to serialization anchors(which are used in serializers).

This chapter explains how to configure the mapper and mapper anchorcomponents.

Creating a Mapper

To create a mapper, you should edit the IntelliScript. The general procedure is asfollows:

1. Add XSD schemas to the project. You need a schema for the input XML andfor the output XML.

It is permitted to use either the same schema or different schemas for the inputand the output.

2. At the global (top) level of the IntelliScript, add a Mapper component.

3. Assign the source and target properties of the Mapper, and edit the otherproperties as required (see the Mapper Component Reference below).

4. Nested within the Mapper, add a sequence of actions and mapper anchors (seeComponents Nested within a Mapper).

5. Test the mapper (see Running a Mapper below), and modify the IntelliScript ifrequired.

11

Page 222: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 11. Mappers

211

Creating a Mapper within a RunMapper Action

In addition to defining a mapper at the global level, it is possible to define amapper within a RunMapper action. For details, see Chapter 9, Actions.

Components Nested within a Mapper

Within a Mapper, you should nest the following components:

Any number of Map actions, which retrieve a data holder from the output andwrite the content to the output.

Optionally, any number of mapper anchors (see the Mapper Anchor ComponentReference below).

The Map actions and the mapper anchors can be in any sequence. You can alsoinsert any other desired actions in the sequence.

Notice that the work of the mapper—writing to the output XML—is done by Mapactions rather than by mapper anchors. This may seem a little different fromparsers and serializers, where the output is created by anchors and serializationanchors, respectively. Actually, this is just a terminology issue. The Map actioncould have been defined as a mapper anchor. It is defined as an action because it isuseful in other circumstances, which are not related to mappers.

Mapper Example

To illustrate the mapper configuration, we present a simple example.

Source XMLThe input of the mapper is an XML document, which contains a list of personalnames and their associated ID numbers.

<Persons><Person ID="10">Bob</Person><Person ID="17">Larissa</Person><Person ID="13">Marie</Person>

</Persons>

Output XML

The desired output of the mapper is an XML list of the names and ID numbers,with no association between them.

<SummaryData><Names>

<Name>Bob</Name><Name>Larissa</Name>

Page 223: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 11. Mappers

212

<Name>Marie</Name></Names><IDs>

<ID>10</ID><ID>17</ID><ID>13</ID>

</IDs></SummaryData>

Mapper Configuration

The following mapper configuration performs the desired data transformation:

The RepeatingGroupMapping iterates over the Person elements of the input. It usesMap actions to write the data to the Name and ID elements of the output.

Running a Mapper

To run a mapper in Conversion Agent Studio:

1. Set the mapper as the startup component.

2. On the menu, choose Run > Run.

3. You are prompted to open the input XML file.

Page 224: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 11. Mappers

213

Note: Alternatively, you can set the example_source property of the mapper.This lets you test a mapper repeatedly on the same input, without needing tobrowse to the file each time.

4. When the execution is complete, Conversion Agent Studio displays the Eventsview. Examine the events for any failures or warnings.

5. To view the mapping results, open the output.xml file, which is in the Resultsfolder of the project.

Standard Mapper Properties

In this section, we review certain properties that are found in the Mappercomponent and in many mapper anchors. For additional properties that arespecific to particular components, see the Mapper Component Reference and theMapper Anchor Component Reference.

nameA name that you assign to the component. Conversion Agent includes thename in the event log. This can help you find an event that was caused by theparticular component.

remarkA comment describing the component.

disabledIf selected, Conversion Agent ignores the component. This is useful for testingand debugging, or for making minor modifications in a project withoutdeleting the existing components.

optionalBy default, if a component fails, the parent component (such as a Mapper inwhich a mapper anchor is nested) fails. If you select the optional property, theparent component does not fail.

Mapper Quick Reference

The main mapping component is:

MapperThe main component of a mapper, which converts XML to XML.

Within a Mapper component, you can nest the following mapper anchors:

AlternativeMappingsDefines a set of nested mappings, one of which is valid for the current XMLcontext.

EmbeddedMapperActivates a secondary mapper.

Page 225: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 11. Mappers

214

GroupMappingBinds its nested mapper anchors and actions together.

RepeatingGroupMappingMaps repetitive XML structures.

Mapper Component Reference

The runnable Mapper component is documented in this section. For mapperanchors, see the Mapper Anchor Component Reference.

Mapper

A Mapper performs XML to XML transformations. It converts a source XMLdocument to an output document, which has a different XML structure.

You must use the source and target properties to identify the root elements of theXML documents. For example, suppose that the document element of the sourceXML is Persons , and the document element of the output is SummaryData. Youshould set the source and target as follows:

Basic Properties

sourceUnder this property, insert a Locator component, and select the root of thesource XML from a Schema view (for information about other options of thisproperty, see Chapter 12, Locators, Keys, and Indexing).

targetUnder this property, insert a Locator component, and select the root of theoutput XML from a Schema view (for information about other options of thisproperty, see Chapter 12, Locators, Keys, and Indexing).

Page 226: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 11. Mappers

215

Advanced Properties

example_sourceA sample XML source document. When you run the mapper in ConversionAgent Studio, it operates on the sample document.

The value of the property is an input port such as LocalFile (a file on yourcomputer), URL (a file on a web site), or Text (a text string). If you assign aLocalFile, Conversion Agent copies the file to the project directory.

If you leave the example_source property blank, Conversion Agent promptsyou for a source document when you run the mapper.

To view the sample document in the example pane, choose IntelliScript >Open Example Source on the menu.

Nested within the example_source property, you can assign a preprocessor,which converts the source documents to a format that the mapper can accept.For example, the example_source might be a Microsoft Excel workbookconfigured with the ExcelToXml preprocessor (see Chapter 4, DocumentProcessors).

root_tagThe name of a root XML element, which is not in the XSD schema of the input.

For example, if the top-level element of the schema is Person, but the XMLinput nests Person in an element called InputWrapper, enter root_tag =InputWrapper.

For explanations of the following properties, see Standard Mapper Properties:

name

remark

Mapper Anchor Component Reference

This section describes the mapper anchors, which you can use in a Mapper.

AlternativeMappings

This mapper anchor lets you define a set of alternative, nested mapper anchors.You can define a criterion for which alternative the parser should accept. Only theaccepted alternative affects the serializer output. The other mapper anchors(whether failed or successful) have no effect on the mapper output.

ExampleThe input XML may contain a Product element or a Service element, but not both.You wish to process whichever element is in the input.

In an AlternativeMappings serialization anchor, and set its selector property toScriptOrder.

Page 227: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 11. Mappers

216

Within the AlternativeMappings, nest two ContentSerializer serializationanchors. Configure one of them to process the Product element, and the other toprocess Service.

Basic Propertiesselector

The criterion for deciding which alternative to accept. The options are:

selector property Explanation

ScriptOrder Conversion Agent tests the nested mapper anchors in thesequence that they are defined in the IntelliScript. It accepts thefirst one that succeeds.If all the nested mapper anchors fail, theAlternativeMappings component fails.

NameSwitch Conversion Agent searches for the nested mapper anchor whosename property is specified in a data holder (select from a Schemaview). It ignores the other nested mapper anchors.If the named mapper anchor fails, the AlternativeMappingscomponent fails.

Advanced PropertiesFor explanations of the following properties, see Standard Mapper Properties:

name

remark

disabled

optional

EmbeddedMapper

This mapper anchor activates a secondary Mapper, which stores its output in thesame output document.

Example

The XML input is a family tree. The input contains Person elements, which arerecursively nested as shown:

<Person> <!-- Parent -->...<Person> <!-- Child -->

...<Person> <!-- Grandchild -->

Page 228: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 11. Mappers

217

...</Person>

</Person></Person>

A Mapper can use an EmbeddedMapper component to call itself recursively, until alllevels of nesting are exhausted.

Basic Propertiesmapper

The name of the secondary mapper (choose from the list). The mapper must bedefined at the global level of the IntelliScript.

schema_connectionsConnects the data holders that are referenced in the secondary mapper to thedata holders that are referenced in the main mapper. The property contains alist of Connect subcomponents, which specify the correspondence (see theConnect component in Chapter 7, Anchors.).

If all the data holders in the main and secondary mappers are identical, youcan omit this property. If there are any differences between the data holders,you must connect the data holders explicitly (even the ones that are identical).

In the recursive example described above, Person should be connected toPerson/Person. This instructs the secondary instance of the mapper to processa nested level of the input. It is sufficient to connect just the parent element(Person), and not the nested elements (Person/*s/Name, Person/*s/BirthDate,etc.), provided that the two Person elements have the same XSD type.

Advanced PropertiesFor explanations of the following properties, see Standard Mapper Properties:

name

remark

disabled

optional

GroupMapping

The GroupMapping mapper anchor binds its nested mapper anchors and actionstogether. You can set properties of the GroupMapping, which affect the members ofthe group.

Advanced Properties

The following properties are useful in situations where the mapper anchor mustselect specific occurrences of data holders. For an explanation, see Chapter 12,Locators, Keys, and Indexing.

Page 229: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 11. Mappers

218

source

target

For explanations of the following properties, see Standard Mapper Properties:

name

remark

disabled

optional

RepeatingGroupMapping

This mapper anchor processes a repetitive structure in the input or output.

The anchor is useful if the XML input and/or output contains a multiple-occurrence data holder (see Multiple-Occurrence Data Holders in Chapter 6, DataHolders). The RepeatingGroupMapping iterates over occurrences of the data holders.

Within the RepeatingGroupMapping, you should nest mapper anchors and actionsthat process each occurrence of the data holder.

Example

For an example of a RepeatingGroupMapping, see the Mapper Example above.

Advanced Propertiescount

The number of iterations to run. You may enter a number, or click the browsebutton and select a data holder that contains the number. If blank, theiterations continue until the input is exhausted.

current_iterationA data holder, where the RepeatingGroupMapping should output the number ofthe current iteration (select from a Schema view).

Advanced Properties

The following properties are useful in situations where the mapper anchor mustselect specific occurrences of data holders. For an explanation, see Chapter 12,Locators, Keys, and Indexing.

source

target

For explanations of the following properties, see Standard Mapper Properties:

name

remark

disabled

optional

Page 230: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

219

Locators, Keys, and Indexing

In designing a data transformation, a frequent issue is how to locate the dataholders that you wish to process. If the same data holders can occur multiple timesin an XML structure, there can be ambiguities in identifying the occurrences. Thischapter explains how to use the Locator and Key components to resolve theambiguities.

The components described in this chapter let you identify the occurrences ofmultiple-occurrence data holders in three ways:

Sequentially. Each iteration of a component processes the next occurrence of thedata holder.

By occurrence number. For example, a component can select the thirdoccurrence of a data holder.

By a key, such as an attribute or a nested element, which uniquely identifiesthe occurrence of the data holder.

The sequential approach is the default. However, it is subject to some complexities,which you can control by using the Locator component.

The occurrence-number and key approaches are collectively known as indexing.The term is analogous to the index of a book, where you use a page number or asubject key to identify the location of information. You can implement the indexingby using components called LocatorByOccurrence, LocatorByKey, and Key.

You can use the locator and key components in parsers, serializers, or mappers.You can use the components to identify the occurrences of data holders in theinput, the output, or both.

The locator components are nested in the source and target properties of variousother components, such parsers, serializers, and mappers. We have referred tothese properties in previous chapters, but without a proper explanation. Themeaning and usage of the source and target properties is explained here.

Example of Locators

To understand the issues involved in identifying data holders, consider thefollowing example. The example illustrates the use of:

The target property

12

Page 231: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

220

The Locator component

We will explain the broad outline of the example here. In the following sections ofthe chapter, we will go back and explain how the target and the Locator work indetail.

Input and OutputSuppose that the output schema of a parser supports the following structure:

<Report><Company>

<Employee>John</Employee><Employee>Leslie</Employee><Employee>Pedro</Employee>

</Company><Company>

<Employee>Marie</Employee><Employee>Larry</Employee><Employee>Frances</Employee>

</Company></Report>

The source document, which the parser should process, is a list that contains asingle employee per company (let's say, the CEO of each company):

JohnMarie

The output of the parser should be:

<Report><Company>

<Employee>John</Employee></Company><Company>

<Employee>Marie</Employee></Company>

</Report>

Incorrect Solution

Suppose that you use the following RepeatingGroup to parse the source document:

Page 232: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

221

The output is incorrect:

The problem is that both Company and Employee are multiple-occurrence elements.The RepeatingGroup creates multiple Employee elements correctly, but it doesn'tknow that each Employee element should be nested in a separate Company element.

Correct Solution

To resolve the ambiguity, you can assign the target property of theRepeatingGroup.

The target identifies the data holder that the RepeatingGroup should create. Thetarget contains a Locator component, which points to the Company element. Thismeans that in each iteration of the RepeatingGroup should create a new occurrenceof the Company element.

If you configure the RepeatingGroup in this way, the output is correct:

Page 233: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

222

Example of Indexing by Key

To further introduce the data-holder identification issues, we present an exampleof indexing by key.

The example is a mapper, which uses indexing to identify the occurrences of dataholders in both its input and its output. On the input side, the indexing is used tomatch the corresponding data from different parts of an XML structure. On theoutput side, the indexing is used to find the correct location of an element in anXML structure.

The example illustrates the use of:

The source and target properties

The Locator, Key, and LocatorByKey components

In the following sections of the chapter, we will explain the detailed operation ofthese properties and components.

Source XMLThe input XML is a report listing the names of parents and their children.

For each parent, the XML lists a first name, a last name, and an ID.

For each child, the XML lists a first name, a hobby, and the ID of the parent.

<Report><Parents>

<Parent id="1" firstName="John" lastName="Smith"/><Parent id="2" firstName="Jane" lastName="Doe"/>

</Parents><Children>

<Child name="Eric" hobby="Swimming" parentID="1"/><Child name="Elizabeth" hobby="Biking" parentID="2"/><Child name="Mary" hobby="Painting" parentID="1"/><Child name="Edward" hobby="Swimming" parentID="2"/>

</Children></Report>

Page 234: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

223

Output XMLThe desired output is a list of hobbies, and the children who engage in each hobby.

<Hobbies><Hobby name="Swimming">

<Person firstName="Eric" lastName="Smith"/><Person firstName="Edward" lastName="Doe"/>

</Hobby><Hobby name="Biking">

<Person firstName="Elizabeth" lastName="Doe"/></Hobby><Hobby name="Painting">

<Person firstName="Mary" lastName="Smith"/></Hobby>

</Hobbies>

Outline of the Data Transformation ApproachThe data transformation is configured according to the following approach:

1. The input data is stored in the Child and Parent elements of the source XML.The corresponding Child and Parent elements are identified as follows:

id attribute of Parent = parentID attribute of Child

2. The data transformation creates Hobby and Person elements, where it stores itsoutput. Each Person is nested in the corresponding Hobby, as follows:

name attribute of Hobby = hobby attribute of Child

3. Write the child's first name into the Person element.

4. Write the parent's last name into the Person element.

Mapper Configuration

The IntelliScript uses Key components to define identifiers for the data holders:

The first Key specifies that the id attribute is a unique identifier of a Parentelement.

The second Key specifies that the name attribute is a unique identifier of a Hobbyelement.

Page 235: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

224

The IntelliScript then defines a Mapper, which has the following configuration:

The components of the Mapper configuration are described as follows. We havenumbered the components as in the Outline of the Data Transformation Approachabove.

Page 236: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

225

1. The source property of the RepeatingGroupMapping specifies that each iterationshould obtain its input from two data holders:

- From an occurrence of the Child element- From the corresponding occurrence of the Parent element

2. The target property of the RepeatingGroupMapping specifies that each iterationshould store its output in two data holders:

- In an occurrence of the Person element- In the corresponding occurrence of the Hobby element

3. The first Map action copies the name attribute of the Child to the firstNameattribute of the Person.

4. The second Map action copies the lastName attribute of the Parent into thelastName attribute of the Person.

Use of Indexing

The example uses indexing by key to identify the occurrences of the Parent andHobby data holders.

In the source property of the RepeatingGroupMapping, the indexing identifiesthe occurrence of Parent that corresponds to a Child.

In the target property, the indexing identifies the occurrence of Hobby where aPerson should be nested.

Source and Target Properties

The source and target properties exist in components such as the following:

In parsers:

ParserGroupRepeatingGroupEnclosedGroupFindReplaceAnchor

In serializers:

SerializerGroupSerializerRepeatingGroupSerializer

Page 237: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

226

In mappers:

MapperGroupMappingRepeatingGroupMapping

In all these categories, the meaning and usage of the properties is identical:

The sourceproperty identifies existing data holders that a data transformationshould use.

The target property identifies data holders that may or may not already exist.If they exist, the data transformation uses them. If they do not exist, the datatransformation creates them.

After you define the source and/or the target, the subsequent components use theidentified data holders. For example, if you define the target of a Group, theanchors nested within the Group use the data holders that the target identifies.

In the following sections, the meaning and usage are explained in detail.

There are properties called source and target in some other components, such as Map (seeChapter 9, Actions). These properties have a different meaning and usage from the above.For an explanation, please see the components where the properties are used.

Source Property

The source property identifies existing occurrences of data holders. The value ofthe source property is a list of the following components:

source Explanation

Locator Identifies a single-occurrence or multiple-occurrence data holder.In the latter case, each iteration accesses the next occurrence, insequence.

LocatorByKey Identifies an occurrence of a multiple-occurrence data holder by akey.

LocatorByOccurrence Identifies an occurrence of a multiple-occurrence data holder bynumber.

Page 238: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

227

Default Behavior

If you do not assign the source property of a component, the component identifiesdata holders in the following way:

If there is only one occurrence of the data holder, Conversion Agent uses theexisting occurrence.

If there are multiple occurrences of the data holder, the behavior is as follows:

- In an iterative context (for example, within a RepeatingGroupSerializer),each iteration accesses the next occurrence of the data holder, in sequence.

- In a non-iterative context (for example, a GroupSerializer that is not nestedwithin an iterative component), the component accesses the first occurrenceof the data holder.

Ambiguities in the Default Behavior

There can be some ambiguities in the default behavior. Ambiguities can arise, forexample, in the following circumstances.

In cases where a multiple-occurrence element is nested within anothermultiple-occurrence element. This is illustrated in Example 1 below.

In cases where the XSD schema permits alternative data holders (defined withxs:choice).

In cases where the XSD schema permits a data holder to be missing (definedwith minOccurs = 0).

In such cases, it is prudent to assign the source property explicitly.

Data Holder Must Exist

The source property identifies a data holder that already exists in the scope of thedata transformation. If the data holder does not exist, the component containingthe source property fails.

For example, suppose that the source property of a Group contains a non-optionalLocatorByOccurrence, which points to the third occurrence of a data holder. If onlytwo occurrences exist, the Group fails.

Using the Source Property for Input or OutputTypically, a component uses the source property to identify where it should obtaininput. For example, a GroupSerializer can use the property to identify anoccurrence that it should serialize.

It is also possible to use the property to identify where the component should storeoutput. For example, suppose that a parser has already created 10 occurrences ofan XML element. After the occurrences have been created, a Group anchor assignsan attribute in one occurrence of the element. The Group can use the sourceproperty to identify the occurrence.

Page 239: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

228

Example 1: Nested Multiple-Occurrence Data Holders

Suppose that the input schema of a serializer supports the following structure:

<Report><Company>

<Employee>John</Employee><Employee>Leslie</Employee><Employee>Pedro</Employee>

</Company><Company>

<Employee>Marie</Employee><Employee>Larry</Employee><Employee>Frances</Employee>

</Company></Report>

You want to iterate over all the Employee elements, and produce the followingoutput:

JohnLesliePedroMarieLarryFrances

At first thought, you might create a RepeatingGroupSerializer, and configure it tooutput the Employee data holder:

This does not work correctly! By default, each iteration selects a new instance ofEmployee within the same Company. The result is the output:

JohnLesliePedro

In other words, the RepeatingGroupSerializer accesses only the first Company.

Page 240: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

229

You can solve the problem by nesting the RepeatingGroupSerializer insideanother RepeatingGroupSerializer. To resolve any potential ambiguities, you canconfigure the source properties explicitly:

Each iteration of the outer RepeatingGroupSerializer processes a differentoccurrence of Company. Each iteration of the nested RepeatingGroupSerializerprocesses a different occurrence of Employee. The result is the desired output.

Alternatively, suppose you want to iterate only over the second Employee elementin each Company. The desired output is:

LeslieLarry

You can do this by configuring a single RepeatingGroupSerializer , whose sourceis Company. This causes each iteration to access the next instance of Company. Withinthe iteration, you can configure a GroupSerializer , whose source property uses aLocatorByOccurrence to select the second Employee . This generates the desiredoutput.

Page 241: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

230

Example 2: Indexing

The Example of Indexing by Key, at the beginning of this chapter, illustrates how touse the source property with indexing. The source property of theRepeatingGroupMapping is configured as follows:

The source property identifies two data holders:

It uses a Locator component to identify an occurrence of Child. Each iterationprocesses the next occurrence of Child, sequentially.

It uses a LocatorByKey component to identify an occurrence of Parent. Thiscauses each iteration to process the occurrence of Parent that corresponds tothe occurrence of Child.

Page 242: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

231

Target Property

The target property identifies an occurrence of a data holder, which may or maynot already exist. If the occurrence exists, the component uses it. If the occurrencedoes not exist, the component creates it.

The value of the target property is a list of the following components:

target Explanation

Locator Identifies a single-occurrence or multiple-occurrence data holder.In the latter case, each iteration creates a new occurrence.

LocatorByKey Identifies an occurrence of a multiple-occurrence data holder by anindexing key. If the occurrence does not yet exist, it is created.

LocatorByOccurrence Identifies an occurrence of a multiple-occurrence data holder bynumber.If the occurrence does not yet exist, it is created along with anyneeded intervening occurrences. For example, if four occurrencesexist, and LocatorByOccurrence specifies the tenthoccurrence, occurrences 5-9 are also created, but left empty.

Default Behavior

If you do not assign the target property of a component, the component identifiesdata holders in the following way:

If the schema permits only a single occurrence of the data holder, ConversionAgent accesses or creates the occurrence.

If the data holder can have multiple occurrences, the behavior is as follows:

- In an iterative context (for example, within a RepeatingGroup), each iterationcreates a new occurrence of the data holder.

- In a non-iterative context (for example, a Group that is not nested within aniterative component), the component creates one new occurrence of the dataholder.

Ambiguities in the Default Behavior

There can be some ambiguities in the default behavior. Ambiguities can arise, forexample, in the following circumstances.

In cases where a multiple-occurrence element is nested within anothermultiple-occurrence element. This is illustrated in Example 1 below.

In cases where the XSD schema permits alternative data holders (defined withxs:choice).

In cases where the XSD schema permits a data holder to be missing (definedwith minOccurs = 0).

Page 243: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

232

In such cases, it is prudent to assign the target property explicitly.

Data Holder Can Be Created

The target property identifies a data holder that may or may not already exist inthe scope of the data transformation. If the data holder does not exist, it is created.

For example, suppose that the target property of a Group contains a LocatorByKey,which points to a particular occurrence of a data holder. If the occurrence alreadyexists, the Group uses it. If the occurrence does not exist, the Group creates it.

Using the Target Property for Input or OutputTypically, a component uses the target property to identify where it should storeoutput. For example, a Group can use the property to identify an occurrence whereit should store data.

It is also possible to use the property to identify where a component should obtaininput. For example, suppose that a GroupSerializer contains an action, whichcomputes data and stores it in a variable. The GroupSerializer then contains aContentSerializer, which writes the variable to the output. You can use thetarget property to create the occurrence of the variable, which theGroupSerializer should use. The variable then serves as the input of theContentSerializer.

Example 1: Nested Multiple-Occurrence Data Holders

The Example of Locators, at the start of this chapter, illustrates how to use the targetproperty to differentiate between parent and child multiple-occurrence dataholders. The example is the exact inverse of the serializer example, which ispresented in Example 1 of the Source Property above.

Example 2: IndexingThe Example of Indexing by Key, above in this chapter, illustrates how to use thetarget property with indexing. The target property of the RepeatingGroupMappingis configured as follows:

The target property identifies two data holders:

Page 244: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

233

It uses a Locator component to identify an occurrence of Person. Each iterationcreates a new occurrence of Person.

It uses a LocatorByKey component to identify the occurrence of the Hobbyelement, where the occurrence of Person should be nested. If the Hobbyelement already exists, the data transformation uses it. If the Hobby elementdoes not yet exist, the data transformation creates it.

Standard Locator and Key Properties

In this section, we review certain properties that are found in the locator and keycomponents. For additional properties that are specific to particular components,see the Locator and Key Component Reference.

disabledIf selected, Conversion Agent ignores the component. This is useful for testingand debugging, or for making minor modifications in a project withoutdeleting the existing components.

optionalBy default, if a component fails, its parent component fails. If you select theoptional property, the parent component does not fail.

remarkA comment describing the component.

Locator and Key Component Quick Reference

The locator and key components are:

KeyDefines a unique identifier for a data holder.

LocatorIdentifies a single-occurrence or multiple-occurrence data holder.

LocatorByKeyIdentifies an occurrence of a multiple-occurrence data holder by using a key.

LocatorByOccurrenceIdentifies an occurrence of a multiple-occurrence data holder by number.

Locator and Key Component Reference

This section documents the locator and key components that are available inConversion Agent.

Page 245: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

234

Key

A Key component defines one or more attributes or elements, which together serveas a unique identifier of their parent element.

How to DefineYou can define a Key only at the global level of the IntelliScript. This lets youreference the Key anywhere in the project.

ExampleThe Example of Indexing by Key (above in this chapter) defines a key for the Hobbyelement in the following structure:

<Hobbies><Hobby name="Swimming">

<Person firstName="Eric" lastName="Smith"/><Person firstName="Edward" lastName="Doe"/>

</Hobby><Hobby name="Biking">

<Person firstName="Elizabeth" lastName="Doe"/></Hobby><Hobby name="Painting">

<Person firstName="Mary" lastName="Smith"/></Hobby>

</Hobbies>

The key is the name attribute, which uniquely identifies each Hobby.

Composite KeysOptionally, you can define a list of data holders as a composite key. To do this, nestmultiple data holders under the unique_fields property.

Consider the following example:

<Persons><Person ID="17" SubID="A">Bob</Person><Person ID="17" SubID="B">Jane</Person><Person ID="35" SubID="A">Larry</Person>

</Persons>

Page 246: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

235

Neither the ID attribute nor the SubID attribute identifies a Person elementuniquely. The combination of ID and SubID, however, is a unique identifier. Youcan define ID and SubID as a composite key.

Restrictions on the KeyThe unique_fields must be nested within the recurring_element. They can beattributes of the element, they can be nested elements at any level of nesting, orthey can be attributes of the nested elements.

This means, for example, that Persons/Person/SocialSecurity/@Number can be avalid key for Persons/Person, because @Number is nested within Persons/Person.On the other hand, Persons/Child is not a valid key for Persons/Person because itis not correctly nested.

The unique_fields must identify the closest ancestor that can have multipleoccurrences. For example, if both Parent and Child are multiple-occurrenceelements, then Parent/Child/@name can be a valid key for Parent/Child but not forParent.

The unique_fields must have simple data types. They cannot be structures.

Sibling and Non-Sibling OccurrencesA key uniquely identifies sibling occurrences of an element. It is permitted for non-sibling occurrences to have the same key.

For example, consider the following XML structure:

<Report><Company>

<Employee ID="1">John</Employee><Employee ID="2">Leslie</Employee>

</Company><Company>

<Employee ID="1">Marie</Employee><Employee ID="2">Larry</Employee>

</Company></Report>

The ID attribute can be a valid key for Employee because it uniquely identifies anEmployee within a single Company . The duplication of ID values in different Companyelements does not invalidate the key.

Keys of Reusable ElementsYou can define a key on a reusable element, which is defined in the XSD schema.

For example, suppose that Persons/Person can occur in several different contextswithin the XML. If you define ID as a key for Persons/Person, the key is valid inany context where Persons/Person is used.

Page 247: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

236

Enforced Uniqueness of a Key

Conversion Agent enforces the uniqueness of a Key. This has the followingconsequences:

If two or more sibling occurrences of an input element have the same keyvalues, Conversion Agent considers each occurrence to overwrite the previousoccurrences. It uses only the last occurrence that it encounters.

If an occurrence of an input element is missing a key value, the occurrence isignored.

If Conversion Agent outputs a keyed element, and a sibling element havingthe same key value already exists, the existing occurrence is overwritten.

In all these cases, Conversion Agent generates a warning in the event log.

Display in the Schema View

The Schema view displays a key in the following way:

The symbol

means that the name attribute has been defined as one of the unique_fields. Thesymbol

is called an XPath predicate. It is an XPath representation of the complete keydefinition.

Basic Properties

recurring_elementA multiple-occurrence element, whose occurrences are identified by the key(select from a Schema view).

unique_fieldsThe key (select one or more data holders from a Schema view).

Advanced Properties

For explanations of the following properties, see Standard Locator and Key Properties:

Page 248: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

237

disabled

remark

Locator

This component is used in the source and target properties to identify a dataholder.

You can use it to identify either a single-occurrence or multiple-occurrence dataholder. In the latter case, each iteration of the component that uses the Locatorprocesses the next occurrence of the data holder.

Example

For examples of Locator, see the Example of Locators and the Example of Indexing byKey above.

Basic Properties

data_holderThe data holder that the component identifies (select from a Schema view).

Advanced PropertiesFor explanations of the following properties, see Standard Locator and Key Properties:

disabled

optional

remark

LocatorByKey

This component is used in the source and target properties to identify anoccurrence of a multiple-occurrence data holder.

Before you use this component, you must define a Key at the global level of theIntelliScript. The Key specifies the data holder(s) that uniquely identify theoccurrence.

In the LocatorByKey configuration, you must specify:

The key that you wish to use.

The values of the key fields. You can specify the values either statically (bytyping a value) or dynamically (by selecting a data holder that contains thevalue).

ExampleThe Example of Indexing by Key at the beginning of this chapter illustrates how touse LocatorByKey.

Page 249: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

238

Conflicts Between Locators

In case of conflicts, a nested LocatorByKey overrides a parent locator.

For example, suppose that the target property of a Group contains a LocatorByKey,which points to the third occurrence of an element. A nested Group contains aLocatorByKey, which points to the fifth occurrence. The nested Group uses the fifthoccurrence.

Basic Properties

keyFrom a Schema view, select the XPath predicate representation of the key. Forexample, if you have defined Hobbies/Hobby/@name as a Key, then you canselect Hobbies/Hobby[@name=$1].

paramsUnder this property, specify the values of the $1 , $2, etc. parameters in theXPath predicate.

Type each value, or click the Browse button and select a data holder thatcontains the value.

Advanced Properties

For explanations of the following properties, see Standard Locator and Key Properties:

disabled

optional

remark

LocatorByOccurrence

This component is used in the source property to identify an occurrence of amultiple-occurrence data holder.

The component identifies the occurrence by number. For example, if there are tenoccurrences of a data holder, you can use LocatorByOccurrence to process the thirdoccurrence.

You can specify the occurrence number either statically (by entering a number) ordynamically (by selecting a data holder that contains the number).

ExampleFor an example of LocatorByOccurrence, Example 1 of the Source Property, above.

Conflicts Between Locators

In case of conflicts, a nested LocatorByOccurrence overrides a parent locator.

For example, suppose that the target property of a Group contains aLocatorByOccurrence, which points to the third occurrence of an element. A nested

Page 250: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 12. Locators, Keys, and Indexing

239

Group contains a LocatorByOccurrence , which points to the fifth occurrence. Thenested Group uses the fifth occurrence.

Basic Propertiesrecurring_element

The data holder that the component identifies (select from a Schema view).

occurrence_numberThe number of the occurrence. Type a number, or click the Browse button andselect a data holder that contains the number.

Advanced Properties

For explanations of the following properties, see Standard Locator and Key Properties:

disabled

optional

remark

Page 251: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 13. Project Properties

240

Project Properties

The project properties are options that you can set for the behavior of a project. Theycontrol essential features of the project such as the input and output encodings, theauthentication support, and the XML validation.

The project properties are saved with the project. They affect the behavior in allcircumstances where you run the project:

In the Conversion Agent Studio environment

When you deploy the project as a Conversion Agent service and run it inConversion Agent Engine.

For many projects, you can accept the default values of the project properties.Nevertheless, before you deploy a project as a Conversion Agent service, youshould always review the project properties and confirm that the settings meetyour needs.

Properties versus Preferences

Do not confuse the project properties with the Conversion Agent Studiopreferences:

The preferences affect the display of data transformations in ConversionAgent Studio. They apply to all projects equally.

The project properties affect the operation of a data transformation both inConversion Agent Studio and in Conversion Agent Engine. You can set theproperties independently for each project.

Setting the Project Properties

To set the properties of a project:

1. Open a TGP script file belonging to the project in an IntelliScript editor.

2. On the menu, choose Project > Properties.

Alternatively:

1. Select the project in the Conversion Agent Explorer.

13

Page 252: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 13. Project Properties

241

2. On the menu, choose File > Properties.

Info Page

The Info page of the project properties displays general information, such as thestorage location of the project.

Authentication Page

If the project accesses a location that requires a login, you can store the logininformation on the Authentication page of the project properties. This feature isuseful, for example, if a parser processes source documents that are located on apassword-protected web site.

The options are as follows:

Enable authenticationSelect this option if the remote location requires a login.

Prompt before executionWhen a login is required, Conversion Agent prompts the user to enter a username and password.

Save in projectWhen a login is required, the project automatically submits the user name andpassword that are specified in the login information.

Login informationThe user name and password.

Encoding Page

The Encoding page of the project properties lets you specify how the input, output,and working files of a project are encoded.

Supported EncodingsConversion Agent supports encodings such as:

Encoding Description

Big5 Chinese

Big5-HKSCS Chinese with Hong Kong Supplementary Character Set

EBCDIC-37 US/Canada

Page 253: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 13. Project Properties

242

Encoding Description

EBCDIC-284 Spanish

EBCDIC-424 Hebrew

EUC-KR Korean

GB2312 Chinese

GB18030 Chinese

ISO-8859-1 Latin-1 (English and West European)

ISO-8859-2 Latin-2 (East European)

ISO-8859-3 Latin-3 (South European)

ISO-8859-4 Latin-4 (North European)

ISO-8859-5 Cyrillic

ISO-8859-6 Arabic

ISO-8859-7 Greek

ISO-8859-8 Hebrew

ISO-8859-9 Latin-5 (Turkish)

ISO-8859-15 Latin-9

KSC_5601 Korean

Shift_JIS Japanese

TIS-620 Thai

UTF-16 Unicode

UTF-16BE Unicode

UTF-7 Unicode

UTF-8 Unicode

Windows-874 Thai

Windows-1250 Central European

Windows-1251 Cyrillic

Windows-1252 ANSI English and West European

Windows-1253 Greek

Windows-1254 Turkish

Windows-1255 Hebrew

Windows-1256 Arabic

Windows-1257 Baltic

Windows-1258 Vietnamese

Page 254: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 13. Project Properties

243

The proprietary Hebrew BaseCodePage continues to be supported in projects that wereupgraded from previous Conversion Agent versions. In new projects, you can use one of theother Hebrew code pages.

Additional encodings may be supported. For an up-to-date list, select one of theCustom options on the Encoding page and open the drop-down list. Please contactSAP if you need an encoding that isn't in the list.

LimitationsThe encoding support is subject to the following limitations:

The example pane of the IntelliScript editor may fail to display the East Asianencodings properly (Chinese, Japanese, and Korean).

Conversion Agent interprets multiple-byte encodings (East Asian andUnicode) as binary byte streams. It is not aware of the encoding semantics,such as the breaks between characters.

The Unicode encodings, except for UTF-8, are not supported as workingencodings.

If you define the working encoding as East Asian or UTF-8, you should avoiddefining Marker anchors that contain multiple-byte characters. ConversionAgent may misinterpret a Marker that contains such characters.

Input

The Input area of the Encoding page specifies how the input of a ConversionAgent project is encoded, for example, the source document that is processed by aparser.

Extract code page from sourceIf selected, Conversion Agent uses a code page that is specified in the sourcedocument (for example, in the encoding attribute of an XML document).

If Conversion Agent does not find an encoding specification in the document,it uses the encoding defined in the settings described below.

Use working encodingIf selected, Conversion Agent assumes that the input has the same encoding asthe working files of the project (defined in the Working area of the propertiespage).

CustomSelect the encoding from the drop-down list.

Encoding schemaThe encoding of special characters: none or XML.

In the XML encoding schema, symbols such as < or > are represented asentities (&lt; and &gt;, etc.).

Page 255: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 13. Project Properties

244

Note that serializers and mappers ignore this option. A serializer or mapperassumes that its input uses the XML encoding schema.

Byte orderThe byte order of binary data. The options are Little Endian (default,appropriate for most files on the Windows operating system), Big Endian, orno binary conversion.

WorkingThe Working area of the Encoding page specifies the encoding of the project'sworking files, including the TGP script files and the IntelliScript.

Use Conversion Agent default codepageUses the system default encoding.

CustomSelect the encoding from the drop-down list. The ISO-8859 and Windowsencodings are supported.

Working encoding schemaThe encoding of special characters: none or XML (as for the input encoding).

You must select a working encoding that is compatible with the encoding of your XSDschema. For details, see Encoding of the XSD Schema in Chapter 6, Data Holders.

OutputThe Output area of the Encoding page defines the encoding of the project output.

Use working encodingUse the same encoding as for the working files.

Same as inputUse the same encoding as for the input.

CustomSelect the encoding from the drop-down list.

Encoding schemaThe encoding of special characters: none or XML (as for the input encoding).

Note that parsers and mappers ignore this option. A parser or mapper encodesits output using the XML encoding schema.

Byte orderThe byte order of binary data. The options are Little Endian (default,appropriate for most files on the Windows operating system), Big Endian, orno binary conversion.

Page 256: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 13. Project Properties

245

External Tools Builders Page

The External Tools Builders page is a standard Eclipse page, which is not used byConversion Agent.

Namespaces Page

The Namespaces page of the project properties is used to configure XMLnamespaces.

You must define the namespaces in the targetNamespace attribute of the XSDschemas. In the project properties, you can edit only the namespace alias.

For more information, see Chapter 6, Data Holders.

Output Control Page

The Output Control page of the project properties specifies options for howConversion Agent should generate the project output.

XSLT stylesheet URLBrowse to an XSLT stylesheet, which you want to use to display XML outputof the project. Conversion Agent adds an XSLT processing instruction to theXML output, for example:

<?xml-stylesheet type="text/xsl"href="C:\stylesheets\MyStylesheet.xsl"?>

When you display the XML in Internet Explorer, the browser applies thestylesheet.

Create event logBy default, Conversion Agent Studio generates event logs for the project. Youcan click the Advanced button and define which types of events should beincluded in the log.

If you deselect the option to create event logs, Conversion Agent Studio doesnot generate an event log. The Events view displays only minimalinformation, such as the service initialization and termination.

This property has no effect when you run a Conversion Agent service inConversion Agent Engine. For information about the event logs generated byConversion Agent Engine, see the Conversion Agent Engine Developer's Guide.

Save parsed documentsSpecifies whether Conversion Agent should save a copy of the parseddocuments with the event log. The event log uses the copy to display thesource of an event.

Page 257: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 13. Project Properties

246

Add binary encoding prefix to output fileAdds a binary byte-order mark at the start of the output file. Some Unicodeapplications use the mark to identify the encoding.

Disable automatic outputBy default, a parser or serializer writes all output that it generates to theresults file. If you select this option, the output is not written unless the parseror serializer runs the DumpValues action. This is useful mainly for debugging.

Disable value compressionsDisables XML output optimizations. The optimizations improve performance,especially when processing large documents. Do not select this option unlessadvised by SAP support.

Project References Page

The Project References page is a standard Eclipse properties page, which is notused by Conversion Agent.

XML Generation Page

On the XML Generator page of the project properties, you can specify how theproject ensures that its XML output is valid.

Schema location, No namespace schema locationThese options insert schemaLocation and noNamespaceSchemaLocationattributes in the document element of the output XML.

For example, suppose that you specify the following project properties:

Schema location = http://www.example.com/NS1No namespace schema location = http://www.example.com/NoNSIf the document element of the output XML is called Doc, Conversion Agentoutputs the following code:

<Docxmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"xsi:schemaLocation="http://www.example.com/NS1"xsi:noNamespaceSchemaLocation="http://www.example.com/NoNS">

XML output modeThis option specifies what Conversion Agent should do if a parser does notmap any data to a data holder (an XML element or attribute) that is defined inthe XSD schema.

Page 258: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 13. Project Properties

247

XML output mode Explanation

Full Conversion Agent attempts to add the missing data holders to the XMLoutput. It assigns values to the added data holders as follows:The default value, if the schema defines one.If the data holder has an integer type: 0.If the data holder has a floating type: 0.0.Otherwise, it leaves the data holder empty.

Compact Conversion Agent does not add the missing data holders, and itremoves empty data holders.A data holder containing the number 0 is not considered empty for thispurpose, and is not removed.

As is The XML output contains the data holders that the parser explicitly set.Conversion Agent does not add missing data holders, and it does notremove empty values.

The above options do not cause a parser to produce invalid XML, providedthat you select the validate added option, which is described below. Undercertain conditions, however, the compact or as is options can cause a parser tooutput partial or empty XML. For example, suppose that you choose thecompact mode, and the parser does not create a required element. ConversionAgent removes its parent element, in an attempt to create valid XML. If theparent element is also required, the grandparent is removed. This processcontinues until Conversion Agent reaches an optional element, or until theXML is empty.

Add default values for required elements/attributesThese options instruct a parser to output XML elements or attributes that arerequired by the XSD schema. The options override the compact and as is outputmodes.

Conversion Agent assigns values to the required data holders as in the Fulloutput mode (see the explanation in the above table).

Validate added required XML elements or attributesIf selected, this option causes Conversion Agent to validate the elements orattributes that it adds because of the full or add required options. If adding theelement or attribute would invalidate the output XML, Conversion Agent doesnot add it.

This option is selected by default, and we recommend that you leave itselected. Deselecting the option may result in invalid XML.

<?xml?>The options under this heading add a processing instruction at the beginningof the output XML, for example:

<?xml version="1.0" encoding="Windows-1252"?>

Page 259: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 13. Project Properties

248

Select the XML version and the value of the encoding attribute. For theencoding, you can choose the output encoding (as defined on the EncodingPage) or a custom encoding designator.

Add custom processing instructionsThis option adds custom processing instructions to the XML header. Type theprocessing instructions (including the <? ?> symbols).

Add XML root elementThis option lets you wrap the output XML in a tag that is not configured in theIntelliScript, and may not be defined in the XSD schema.

For example, suppose that the output of a parser is:

<Result>1.0</Result>

If you select the option to add an XML root element, and you set the elementname to OutputWrapper, the project generates the following output:

<OutputWrapper><Result>1.0</Result>

</OutputWrapper>

You must use this option if you run a parser on multiple source documents(see Running on Additional Source Documents in Chapter 14, Running and TestingProjects).

Page 260: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 14. Running and Testing Projects

249

Running and Testing Projects

When you develop a Conversion Agent data transformation, you should test anddebug it thoroughly before you put it into production. Conversion Agent providesseveral tools for testing, debugging, and troubleshooting, such as:

Color-coding a source document in the example pane of the IntelliScript editor

Running the data transformation in Conversion Agent Studio and viewing theresults

Viewing the event log that Conversion Agent generates for each run

Cross-identifying an anchor in the example source, in the IntelliScript, and inthe Events view

Color-Coding the Example Source

As you construct a parser, Conversion Agent studio color-codes the anchors thatyou have defined in the example source.

In the learn-example style, the specific anchors that you use to define thedocument structure are color-coded, for example, an anchor that marks arepeating group.

In the mark-example style, all the anchors that Conversion Agent finds in thedocument are color coded, for example, all iterations of a Marker within arepeating group.

By examining the color-coded text, you can confirm that Conversion Agent isidentifying the anchors correctly.

On the IntelliScript menu or on the toolbar, you can choose the followingcommands, which control the color-coding style:

Learn the Example AutomaticallyEnables automatic color coding in the example pane, in the learn-examplestyle. When you define anchors in the IntelliScript, Conversion Agentautomatically highlights the corresponding location in the example.

Learn ExampleColor-codes the anchors in the example pane in the learn-example style.

14

Page 261: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 14. Running and Testing Projects

250

You can use this command to activate the color coding, if you have deselectedthe option to Lean the Example Automatically. You can also use this commandto return to the learn-example style, after you have displayed the mark-example style.

Mark ExampleRuns the selected parser, and color-codes the anchors in the example pane inthe mark-example style.

Stop Learning or Marking the ExampleStops the color-coding operation. If the example is very long, you can use thisoption to halt the color display and speed up the response.

For detailed instructions and exercises on how to use the color codes fordebugging, see Getting Started with Conversion Agent.

Example pane showing color-coded anchors in the learn-example style.

Example pane showing color-coded anchors in the mark-example style.

Running in Conversion Agent Studio

To test a data transformation, you can run it in the Conversion Agent Studioenvironment.

1. Set the startup component of the project. This is a parser, serializer, mapper, orglobally defined transformer that the project should activate.

You can do this in any of the following ways:

- In the IntelliScript editor, right-click the component and choose Set asStartup Component, or

- In the Component view, right-click the component and choose Set asStartup Component, or

- On the menu, choose Run > Run.

Page 262: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 14. Running and Testing Projects

251

2. Optionally, in the Run > Run command, you can click the Details button andset the initial values (also known as service parameters) of the variables in yourproject.

The initial values that you assign in this way are used when you test theproject in Conversion Agent Studio. For testing purposes, they override theinitialization property of the variables. However, they have no effect whenyou later deploy the project as a Conversion Agent service.

For more information, see Initializing Variables at Runtime in Chapter 6, DataHolders.

3. Run the component in one of the following ways:

- In the Run > Run menu command, click the Run button, or- On the menu, choose Run > Run <StartupComponentName>, where

<StartupComponentName> is the name of the startup component.

4. The data transformation is executed in Conversion Agent Engine. ConversionAgent Studio displays the Events view, which informs you of any problemsthat occurred in the execution (see Viewing the Event Log below).

5. To view the results, double-click the output file, in the Results folder of theConversion Agent Explorer.

For example, the output of a parser is an XML file, usually called output.xml .Conversion Agent opens the file in a Microsoft Internet Explorer window.

Output file of a parser, displayed in an Internet Explorer window.

If the Output File is not Displayed

Occasionally, a serious error may cause Conversion Agent to generate an outputfile that cannot be viewed in the default viewing application. To diagnose theproblem:

Examine the Events view for a description of the problem (see Viewing theEvent Log below).

Try opening the output file in an external application such as Notepad.

Page 263: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 14. Running and Testing Projects

252

If the output file is not created at all, examine the Output Control page of theproject properties, and confirm that the option to Disable Automatic Output isnot selected (see Chapter 13, Project Properties).

Running on Additional Source Documents

By default, Conversion Agent Studio runs a parser on the example source. Youshould test the parser on other source documents, as well.

To do this, set the sources_to_extract property of the parser. The value is a sourceidentifier such as a file path (optionally including * wildcards), a URL, or a list offiles (see Chapter 3, Parsers).

If you select multiple sources, you must select the option to add XML root element,which is located on the XML Generation page of the project properties (see Chapter13, Project Properties). This causes Conversion Agent to nest the results from eachsource in a root element. If you don't do this, the XML that the parser generates isnot well formed because it does not have a unique root.

You can also open a source document in the example pane, by using the IntelliScript > TestDocument command. Then run the IntelliScript > Mark Example command, and confirmthat the parser color-codes the anchors correctly.

Viewing the Event Log

When you run a data transformation in the Conversion Agent Studio environment,the Events view displays the events that occur during the execution. You shouldexamine the events for failure or warning messages.

An event log created by Conversion Agent Studio is stored, by default, at thelocation Results/events.cme in the project folder.

Event-Log Properties

In the project properties, you can configure the events that Conversion Agentwrites to the log. For information, see Output Control Page in Chapter 13, ProjectProperties.

Event Display Preferences

You can customize the event display by using the Window > Preferencescommand. On the Conversion Agent page of the preferences, you can configure:

The types of events that Conversion Agent Studio displays, such asnotifications, warnings, or failures.

Page 264: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 14. Running and Testing Projects

253

Whether the failure events propagate (bubble up) in the events tree.Propagation lets you find the failure events more easily because they arelabeled at the top levels of the tree.

Note that the preferences are independent of the event-log properties. Theproperties control the events that Conversion Agent stores in the log. Thepreferences control how the stored events are displayed.

Event display without propagation

Event display with propagation

Understanding the Event Log

The event log displays the detailed events that occurred during the execution. Forexample, it displays an event for each anchor that a parser found.

To display the events at a particular stage, select the stage in the left pane of theEvents view.

The events are labeled with status icons, which have the following meanings:

Page 265: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 14. Running and Testing Projects

254

InformationA normal operation performed by Conversion Agent.

WarningA warning about a possible error. For example, Conversion Agent generates awarning event if an operation overwrites the existing content of a data holder.The execution continues.

FailureA component failed. For example, an anchor fails if Conversion Agent cannotfind it in the source document. The execution continues.

Optional FailureAn optional component (a component configured with the optional property)failed. For example, an optional anchor is missing from the source document.The execution continues.

Fatal errorA serious error occurred, for example, a parser has an illegal configuration.Conversion Agent halts the execution.

UnknownThe event status cannot be determined.

Warnings, failures, and optional failures may be perfectly normal under somecircumstances. For example, a RepeatingGroup anchor may display an optionalfailure after its last iteration, because it cannot find any more data to parse. If theevent log displays warnings or failures, you should investigate why they occur,and determine whether they are normal or signal a problem.

Using Named Components

We suggest that you assign the name property of the components in your datatransformations. Conversion Agent uses the name to label the events. This can makethe event log and the IntelliScript easier to understand, and it helps you identifythe source of any failures.

IntelliScript with named Marker anchors

Page 266: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 14. Running and Testing Projects

255

Event log with named Marker anchors

Cross-Identifying Events

To help you trace the operation of a data transformation and diagnose problems,you can identify the events and the components that caused them in the followingways:

In the right pane of the Events view, double-click click an event, such as theevent for a Marker or Content anchor. The anchor that caused the event ishighlighted in the IntelliScript and example panes.

In the example pane, right-click an anchor and choose the followingcommands:

- View Instance, to find the anchor definition in the IntelliScript pane- View Event, to find the corresponding event in the Events view

In the IntelliScript, right-click an anchor and choose View Marking. This findsthe corresponding text in the example source.

Effect of Failure Events

If the event log shows that a component has failed, you should be aware of thefollowing consequences.

Failure Causes Parent to FailIf the optional property of the component is not selected, a failure of thecomponent causes its parent to fail. If the parent is also non-optional, its ownparent (the grandparent of the original component) fails, and so forth.

For example, suppose that a Parser contains a Group, and the Group contains aMarker. All the components are non-optional. If the Marker does not exist in thesource document, the Marker fails. This causes the Group to fail, which in turncauses the Parser to fail.

Pictorially, we can represent these relationships in the following way:

Parser //FailedGroup //Failed

Marker //Failed

Page 267: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 14. Running and Testing Projects

256

Optional Failure Does Not Cause Parent to Fail

If the optional property of a component is selected, a failure of the componentdoes not bubble up to the parent.

In the above example, suppose that the Group is optional. The failed Marker causesthe Group to fail, but the Parser does not fail.

Parser //SucceededGroup //Failed

Marker //Failed

Rollback

If a component fails, its effects are rolled back.

For example, suppose that a Group contains three non-optional Content anchors,which store values in data holders. If the third Content anchor fails, the Group fails.Conversion Agent rolls back the effects of the first two Content anchors. The datathat the first two Content anchors already stored in data holders is removed.

The rollback applies only to the main effects of a data transformation, such as aparser storing values in data holders, or a serializer writing to its output file. Therollback does not apply to side effects. In the above example, if the Group containsan ODBCAction that performs an INSERT query on database, the record that theaction added to the database is not deleted.

Group //FailedContent //Data holder is rolled backContent //Data holder is rolled backODBCAction //INSERT query is not rolled backContent //Failed

Setting the Optional Property

You can set the optional property of a component in two ways:

Edit the advanced properties of a component in the IntelliScript

Right-click the component and choosing Make Optional or Make Mandatoryfrom the pop-up menu

Components that Lack an Optional PropertyCertain components lack the optional property because the components never fail,regardless of their input.

An example is the Sort action. If the Sort action finds no data to sort, it simplydoes nothing; it does not report a failure.

Opening a Conversion Agent Engine Event Log

When you deploy a service to Conversion Agent Engine (see Chapter 15, DeployingConversion Agent Services), you can monitor the Engine event logs for errors or

Page 268: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 14. Running and Testing Projects

257

failures. You can open an event log in Conversion Agent Studio by dragging the*.cme file to the Events view.

For more information, see the Event Logs chapter in the Conversion Agent EngineDeveloper's Guide.

Disabling a Component

As you develop and test a data transformation, you may wish to disable certain ofits components temporarily. For example, if a particular anchor fails, you candisable the anchor and test the transformation without it.

You can enable or disable a component by setting its disabled property, in eitherof the following ways:

Edit the advanced properties of a component in the IntelliScript.

Right-click the component and choosing Enable or Disable from the pop-upmenu.

Page 269: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 15. Deploying Conversion Agent Services

258

Deploying Conversion Agent Services

When you finish configuring and testing a project, you should deploy the projectfrom Conversion Agent Studio as a Conversion Agent service. This lets ConversionAgent Engine access and run the project.

You can deploy Conversion Agent service both in the development environmentwhere you use Conversion Agent Studio, and on production servers. Deploying inthe development environment allows you to develop and test applications thatactivate the service. Deploying in the production environment allows yourapplications to run the services on live data.

There is no relation between Conversion Agent services and Windows services. You cannotview or administer Conversion Agent services in the Windows Control Panel.

Preparing a Project for Deployment

Before you deploy a project as a service, you should test the data transformation inConversion Agent Studio. For instructions, see Chapter 14, Running and TestingProjects. Then, you should review points such as:

Setting the startup component

Removing any testing or debugging settings, which may be inappropriate in adeployed service

Setting the Startup Component

Before you deploy the service, you must set one of its top-level components as thestartup component. This is the component that Conversion Agent starts when it runsthe service. It is the same as the startup component that you select when you testthe project in Conversion Agent Studio (see Chapter 14, Running and TestingProjects).

The startup component can be a parser, serializer, transformer, or mapper.Collectively, these are known as runnable components.

15

Page 270: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 15. Deploying Conversion Agent Services

259

Multiple Services with Different Startup Components

If the project contains multiple runnable components, you can deploy it multipletimes under different service names. Before you deploy each service, you can selecta different startup component.

In this way, for example, you can define multiple parsers in the same project, anddeploy services that run the parsers.

Reviewing Development Configurations

For testing and debugging purposes, you may have introduced variousconfigurations or project settings that are unneeded or inappropriate in a deployedConversion Agent service. You should review your project for such settings. Forexample:

You may have inserted components such as WriteValue for debuggingpurposes. You can remove the debugging components.

You may have used the sources_to_extract property of a parser to testmultiple source files quickly. You can delete the property value.

On the XML Generation page of the project properties, you may have selectedthe option to Add an XML Root Element to support multiple outputdocuments from a single run. You can deselect the option.

On the Output Control tab, you may have configured the event logging. Youshould review the settings.

You may have configured the initialization property of variables. If youplan to pass the initial values as service parameters from an application, youcan delete the initialization properties, or you can leave them as defaults(see Initializing Variables at Runtime in Chapter 6, Data Holders).

Of course, if you need any of these options when you run the service, you canleave them unchanged.

Conversion Agent Repository

Conversion Agent services are stored in the Conversion Agent repository directory.Conversion Agent Engine accesses the services that exist in the repository.

Ordinarily, you should set the Conversion Agent repository location when youinstall Conversion Agent. On Windows platforms, the default location is:

c:\Program Files\SAP\ConversionAgent\ServiceDB

On Unix platforms, the default location is:

/opt/SAP/ConversionAgent/ServiceDB

To verify or change the repository location:

Page 271: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 15. Deploying Conversion Agent Services

260

1. On the computer where you plan to deploy the service, open the ConversionAgent Configuration Editor.

2. View or edit the following setting:

CM Configuration/Directory services/File system/Base Path

Deploying a Service in a Development Environment

You can deploy a project as a Conversion Agent service on the developmentcomputer where Conversion Agent Studio is installed. This allows you to develop,test, and run applications that activate the service.

To do this, you must have write privileges for the Conversion Agent repositoryand for the CMReports folder (see Event Logs in the Conversion Agent EngineDeveloper's Guide). In case of doubt, contact your system administrator.

The deployment procedure is as follows:

1. In Conversion Agent Studio, open and select the project.

2. On the menu, choose Project > Deploy.

3. In the Deploy Service window, set the following options:

Service NameThe name of the service. By default, this is the project name.

To ensure cross-platform compatibility, the name must contain only Englishletters (A-Z, a-z), numerals (0-9), underscores ( _ ), and commas ( , ).

Conversion Agent creates a folder having the service name, in therepository location.

LabelA version identifier. The default value is a time stamp, indicating when theservice was deployed.

Startup ComponentThe runnable component that the service should start.

AuthorThe person who developed the project.

DescriptionA description of the service.

4. Click the Deploy button.

Conversion Agent Studio displays a message that the service was successfullydeployed. The service is displayed in the Repository view.

Page 272: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 15. Deploying Conversion Agent Services

261

Updating a Deployed Service

Conversion Agent Studio cannot open a deployed project that is located in theConversion Agent repository. If you wish to modify the project, you should followthis procedure:

1. Open the development copy of the project in Conversion Agent Studio. Editand test it as required.

2. Re-deploy the service to the same location, under the same service name. Youare prompted to overwrite the previously deployed version.

Note that re-deploying overwrites the complete service folder, including anyoutput files or other files that you have stored in it.

Removing a Deployed Service

To remove a Conversion Agent service that you have deployed, right-click theservice in the Repository view, and click Remove.

This removes only the copy in the repository. It has no effect on the developmentcopy of the project in your Conversion Agent Studio workspace.

Deploying a Service to a Production Server

You can deploy a Conversion Agent service from the Conversion Agent Studiocomputer to a remote computer such as a production server. The remote computercan be a Windows or Unix-type platform, where Conversion Agent Engine isinstalled.

To deploy a service to a remote computer, use the following procedure:

1. Deploy the service on the development computer (see Deploying a Service in aDevelopment Environment above).

2. Copy the deployed project folder from the Conversion Agent repository on thedevelopment computer to the repository on the remote computer. (Todetermine the repository locations, see Conversion Agent Repository above.)

3. If you have added any custom components or files to the Conversion AgentautoInclude\user directory, you must copy them to the autoInclude\userdirectory on the remote computer.

For information about using autoInclude\user, see External Components in theConversion Agent Engine Developer's Guide. Also see the topic Customizing theConversion Agent Component List in the book Conversion Agent Studio in Eclipse.

4. Conversion Agent Engine determines whether any services have been revisedby periodically examining (by default, every 30 seconds) the timestamp of afile called update.txt. This file exists in the repository root directory. Thecontent of the file can be empty.

Page 273: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide 15. Deploying Conversion Agent Services

262

If this is the first time that you have deployed a service to the remoterepository, update.txt may not exist. If so, copy it from the local repository.

If update.txt exists, you should update its timestamp as follows.

On Windows: Open update.txt in Notepad and save it.

On Unix: Open a command prompt, change to the repository directory, andenter the following command.

touch update.txt

Alternatively, if the development computer can access the remote file system, youcan change the Conversion Agent repository to the remote location and deploydirectly to the remote computer.

Deploying on Multiple Servers

For advice on how to deploy services on multiple servers or on a cluster server, seethe chapter on Administration and Deployment Policies in the Conversion AgentAdministrator's Guide.

Running a Service

After you deploy a service, you are ready to run it in Conversion Agent Engine.

You can do this in several ways:

By using the Conversion Agent Engine command-line interface. Forinformation, see the Conversion Agent Engine Developer's Guide.

By programming an application that uses the Conversion Agent API to submitsource documents to the Engine. The API is available in several programminglanguages. See the Conversion Agent Engine Developer's Guide and the APIreference documentation.

By posting source documents to the Engine via the CGI interface. See theConversion Agent Engine Developer's Guide.

By using the Conversion Agent process module for SAP XI (see the bookDeploying and Using Conversion Agent).

Page 274: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide Index

263

Index

.

.NET, 167

A

AbsURL, 124actions, 153

compared to transformers, 154custom COM, 167defining, 154input and output, 153properties of, 154side effects, 153

AddEmptyTagsTransformer, 124AddEventAction, 157AddField, 112AddString, 124AFPToXML, 26alternative parsers

selecting, 86AlternativeMappings, 215Alternatives, 85AlternativeSerializers, 201anchors, 69

defining, 72extent of complex, 83finding events, 255finding from event log, 255location in IntelliScript , 72Marker and Content, 69phase, 76properties of, 75quick reference, 84reference, 85relation to delimiters, 70relation to XML, 70serialization, 190, 197using transformers, 117

APIsConversion Agent engine, 262

AppendListItems, 157AppendValues, 158

applicationsConversion Agent services, 258

architectureConversion Agent transformations, 2

arithmetic computations, 159assigning

value to output, 178attributes

data holders, 51AttributeSearch, 107authentication

project properties, 241

B

Base64Decode, 125Base64Encode, 125BidiConvert, 126BigEndianUniToUni, 126BinaryFormat, 39BizTalk

splitting large files for, 183

C

CalculateValue, 159CDATADecode, 126CDATAEncode, 127CGI interface

Conversion Agent Engine, 262ChangeCase, 127CMW files, 5code pages

supported, 241codes pages

XSD schema, 53color coding

Learn Example, 249Mark Example, 249use in debugging, 249

combinationsof lists, 160

CombineValues, 160COMClass, 184

Page 275: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide Index

264

CommaDelimited, 43command-line interface

Conversion Agent Engine, 262components

overview of Conversion Agent, 2concatenation, 157, 158condition

ensuring in source document, 165Connect, 113Content, 69, 87ContentSerializer, 197, 202Conversion Agent Engine

running services in, 262Conversion Agent Studio

overview, 1Conversion Agent Studio

instructions for use, 10CreateGuid, 128CreateList, 161CreateUUID, 128CustomFormat, 40

D

data holders, 51destroying occurrences, 67identifying source and target, 225indexing multiple-occurrence, 219mixed content, 59overview, 3single or multiple occurrence, 66validating, 57

databaselookup transformer, 140querying, 172

databasesconnecting to, 151, 185

DateAdd, 162DateDiff, 163DateFormat, 128dates

format of, 128debugging

Conversion Agent projects, 249default transformers, 118DelimitedSections, 90DelimitedSectionsSerializer, 203Delimiter, 48DelimiterHierarchy, 44delimiters

custom hierarchy, 40relation to anchors, 70

direction propertyof anchors, 75

DocList, 21

document processors, 24custom C++, 31custom COM, 29custom Java, 29defining, 24installation, 24quick reference, 25reference, 26running multiple, 34

documentsoverview, 4

Dos96HebToAscii, 130DownloadFile, 163DownloadFileToDataHolder, 164drag-and-drop

defining anchors, 74DumpValues, 165dynamic offset, 109dynamic search, 111

E

EbcdicToAscii, 131Eclipse

Conversion Agent Studio for, 10EDI

delimiters for parsing, 44elements

data holders, 51EmbeddedMapper, 216EmbeddedParser, 93EmbeddedSerializer, 206enclosed

group, 94EnclosedGroup, 94EnclosingDelimiters, 49EncodeAsUrl, 131Encoder, 131encoding

code page transformer, 131input and output, 241limitations, 243supported, 241XSD schema, 53

EnsureCondition, 165errors

viewing, 253event log

configuring properties, 252Conversion Agent Engine, 257custom events, 157viewing, 252

eventsfinding anchors, 255

example source

Page 276: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide Index

265

in project, 5example_source property, 18

mapper, 215serializer, 200

examplesinstalling and opening online, 9

Excelgenerating from XML, 35parsing as HTML, 27parsing as text, 27parsing as XML, 27, 28

ExcelToDataXml, 27ExcelToHtml, 27ExcelToTextML, 27ExcelToTxt, 27ExcelToXml, 28ExcludeItems, 167ExpandFrameSet, 29external tools builders

project properties, 245ExternalCOMAction, 167ExternalCOMPreProcessor, 29ExternalJavaPreProcessor, 29ExternalPreProcessor, 31ExternalTransformer, 132extracting content

Content anchor , 87

F

failureeffect on parent, 255

failure events, 254generated by RepeatingGroup, 104

failureseffect of, 255viewing, 253

fatal error events, 254files

downloading, 163projects, 5

FileSearch, 21FindReplaceAnchor, 95format preprocessors, 49FormatNumber, 133forms

submitting HTML, 99, 180, 181frameset

parsing HTML, 29FromFloat, 134FromInteger, 134FromPackDecimal, 135FromSignedDecimal, 135

G

getHTTP method, 181

groupperforming actions on, 97repeating, 103

Group, 97GroupMapping, 217GroupSerializer, 207

H

Hebrewcode-page conversion, 136

hebrewBidi, 135HebrewDosToWindowsTransformer, 136HebrewEBCDICOldCodeToWindows, 136hebUniToAscii, 136hebUtf8ToAscii, 136HL7, 45HTML

removing tags, 143submitting form, 180, 181transforming entities, 136

HtmlEntitiesToASCII, 136HtmlForm, 99HtmlFormat, 40HtmlProcessor, 41, 42, 50, 137HTTP

Get and Post data, 63HTTP interface

Conversion Agent Engine, 262

I

iconsevents, 253

ImageClick, 114indexing, 219

example, 222multiple-occurrence data holders, 67quick reference, 233

information events, 254initialization

variables, 65InjectFP, 137InjectString, 137InlineTable, 151IntelliScript

defining anchors in, 74iterations

RepeatingGroup anchor, 103

J

JavaScript

Page 277: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide Index

266

syntax reference, 165JavaScriptFunction, 171JavaTransformer, 138

K

Key, 234keys

properties of, 233

L

LearnByExample, 108list types

mapping to, 82XSD, 67

listscombining, 160creating, 161multiple-occurrence data holders, 66of variables, 67sorting, 179

LocalFile, 21locations

marking in source document, 101Locator, 237LocatorByKey, 237LocatorByOccurrence, 238locators

properties of, 233login

project properties, 241logs

event, 252LookupTransformer, 139loop

RepeatingGroup anchor, 103

M

Map, 172mapper

calling secondary, 216creating, 210running in Conversion Agent Studio, 250

Mapper, 214mapper anchors

properties of, 213reference, 215

mappersdeploying as service, 258properties of, 213quick reference, 213running, 212running in parser, 175using indexing, 222

Marker, 69, 101marking property

of anchors, 75missing text

searching by optional Group, 97mixed content

data holders in, 59in XSD schema, 54mapping to, 72

ModifyField, 114MSMQ

sending to, 183writing to, 182

MSMQOutput, 185multiple occurrence

data holders, 66destroying occurrences, 67variables, 67

multiple-occurrence data holderscombining, 160creating lists in, 161indexing, 219mapping anchors to, 71sorting, 179

N

namespacesproject properties, 245

New Element windowdefining anchors in, 74

NewlineSearch, 109NormalizeClosingTags, 140numbers

formatting, 133

O

ODBC_Text_Connection, 151ODBC_XML_Connection, 185ODBCAction, 172ODBCLookup, 140offset

dynamically defined, 109OffsetSearch, 109online samples

installing and opening, 9OpenURL, 186optional failure

effect on parent, 256optional failure events, 254optional property

events, 254failure handling, 255setting, 256

Page 278: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide Index

267

outputviewing, 251

OutputCOM, 186OutputDataHolder, 188OutputFile, 188

P

packed decimals, 135, 146parameters

passing to data transformation, 65parser

running in Conversion Agent Studio, 250Parser, 17parsers

calling secondary, 93, 206creating, 12deploying as service, 258running, 15running secondary, 175

pathresolving relative, 124

PatternSearch, 109PdfToTxt_2_02, 33PdfToTxt_3_00, 32phase

of anchor search, 76phase property

of anchors, 76phases

nested, 77platform independence

parsers, 16Positional, 45post

HTTP method, 180posted data

retrieving, 63PostScript, 46PowerPoint

parsing as HTML, 33PowerpointToHtml, 33PowerpointToTextML, 33predicate

XPath, 236preprocessors

format, 49preprocessors

document, 24pre-processors

defining, 24ProcessByTransformers, 33processing instructions

adding to output, 247ProcessorPipeline, 34

processorscustom C++, 31custom COM, 29custom Java, 29document, 24installation, 24reference, 26using transformers as, 119

projectconfiguration overview, 7deploying, 8

project properties, 240authentication, 241encoding, 241external tools builders, 245general information, 241namespaces, 245output control, 245setting, 240versus preferences, 240XML generation, 246

projectsarchitecture, 5

projects:, 258properties

of actions, 154of anchors, 75of mappers, 213of serializers, 199of transformers, 120project, 240

Q

quick referenceanchors, 84document processors, 25indexing, 233mappers, 213serializers, 200transformers, 120

R

referenceanchors, 85delimiters, 42document processors, 26format preprocessors, 49formats, 39indexing, 233mapper anchors, 215mappers, 214parsers, 17serialization anchors, 201

Page 279: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide Index

268

reference pointsaround anchors, 75Marker anchor, 101of search scope, 80

Regex++regular expression implementation, 141

regular expressionsexamples , 141learning about, 141

RegularExpression, 141reloading schema, 56RemoveField, 115RemoveMarginSpace, 41, 42, 142RemoveRtfFormatting, 142RemoveTags, 143RepeatingGroup, 103RepeatingGroupMapping, 218RepeatingGroupSerializer, 207Replace, 143replacing text, 148

in source document, 95repository

Conversion Agent, 259requirements analysis, 6ResetVisitedPages, 174Resize, 144ResultFile, 189results

of data transformation, 251results file

debugging if not displayed, 251Results folder, 5retrieving content

Content anchor , 87ReverseTransformer, 144right-to-left text

reversing, 126rollback

after failure, 256root element

adding XML, 248RTF, 46RtfFormat, 41RtfProcessor, 41, 50, 144RtfToASCII, 145RtfToTextML, 34RunMapper, 175runnable components, 258RunParser, 175RunSerializer, 177

S

samplesinstalling and opening online, 9

schemaencoding, 53

schemasadding XSD to project, 55creating in Conversion Agent, 56editing, 56reloading, 56sample XSD, 52validation, 61viewing, 57XSD, 51

searchanchor direction, 75dynamically defined search string, 111

search criteriafor anchors, 77

search scopeadjusting, 79for anchors, 77

searcher components, 81, 107secondary mapper

EmbeddedMapper anchor, 216secondary parser

EmbeddedParser anchor, 93, 206SegmentIndex, 115SegmentSearch, 110SegmentSize, 115select-and-click

defining anchors, 74serialization

using transformers in, 119serialization anchors, 190, 197

defining, 198properties of, 199reference, 201sequence of operation, 199

serialization mode, 19, 192serializer

controlling auto-generation, 191running in Conversion Agent Studio, 250

Serializer, 200creating with wizard, 195

serializers, 190creating from parser, 190deploying as service, 258properties of, 199quick reference, 200running, 196running in parser, 177troubleshooting auto-generated, 193

service parameterspassing to data transformation, 65

servicesConversion Agent types, 4

Page 280: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide Index

269

deploying Conversion Agent, 258deploying in development environment,

260naming restrictions, 260removing, 261root folder, 259running in Conversion Agent Engine, 262updating, 261

SetValue, 178SGML, 46signed decimals, 135, 147single occurrence

data holders, 66Sort, 179sorting

multiple-occurrence data holders, 179source documents

testing in Conversion Agent Studio, 252source property, 225, 226sources_to_extract property, 19SpaceDelimited, 46splitting files, 183startup components

setting, 250strings

concatenating, 157, 158StringSerializer, 197, 209SubmitAll, 115SubmitClick, 116SubmitForm, 180SubmitFormGet, 181SubString, 145support

contacting, 9system time, 64system variables, 63

T

TabDelimited, 47target property, 225, 231test documents

in project, 6testing

Conversion Agent projects, 249Text, 22TextFormat, 41TextML

XML schema, 36TextSearch, 110TGP files, 5time

system, 64times

format of, 128

ToFloat, 145ToInteger, 146ToPackDecimal, 146ToSignDecimal, 147TransformationStartTime, 147TransformByParser, 148TransformByService, 149transformer

running in Conversion Agent Studio, 250TransformerPipeline, 149transformers, 117

as document preprocessors, 119compared to actions, 154custom DLL, 132custom Java, 138default, 118defining, 117deploying as service, 258global stand-alone, 119in serialization, 119properties of, 120quick reference, 120sequences of, 118using as document processors, 33using in anchors, 117

code pages, 131troubleshooting, 249TypeSearch, 112

U

Unixdesigning parsers for, 16

unknown events, 254URL, 22

relative to absolute, 124URLs

specifying connections, 63

V

validating data holders, 57validation

ensuring for XML output, 246XML, 61XML parser output, 61XML serializer input, 62

VarCurrentPost, 64VarCurrentURL, 64VarFormAction, 63VarFormData, 64Variable, 65variables, 62

data holders, 51initialization, 65

Page 281: SAP Conversion Agent by Itemfield · Conversion Agent Studio is the design and configuration environment of the SAP Conversion Agent system. Using Conversion Agent Studio, you can

Conversion Agent Studio User's Guide Index

270

lists, 67mapping anchors to, 64, 71system, 63using in actions, 65

Variablesuser-defined, 63

VarLinkURL, 63VarPostData, 63VarRequestedURL, 64VarSystem, 64

W

warning events, 254warnings

viewing, 253WestEuroUniToAscii, 150Word

parsing as HTML, 34parsing as RTF, 34parsing as text, 35parsing as XML, 35

WordPerfectToTextML, 34WordToHtml, 34WordToRtf, 34WordToTextML, 35WordToTxt, 35WordToXml, 35workflow

typical Conversion Agent development, 6WriteValue, 182

X

XMLadding empty tags, 124as parser input, 12generating sample, 58mapping anchors to, 70processing instruction, 247XSD schemas, 51

XSLT transformation, 150XML attributes

data holders, 51XML elements

data holders, 51XML generation

project properties, 246XML Spy, 53XML validation

ensuring, 246XmlFormat, 42XMLLookupTable, 152XmlToExcel, 35XPath

modified notation, 59XPath predicate, 236XPaths

validating, 57XSD

adding schema to project, 55background, 51creating in Conversion Agent, 56editing, 56editors, 51, 53included schemas, 54IntelliScript representation, 59sample schema, 52schema encoding, 53unsupported features, 54viewing, 57

XSD data typessearching for, 81

XSD schemas, 51in project, 5

XSLTrunning transformations, 183

XSLTMap, 183XSLTTransformer, 150