ds designer guide
8/14/2019 DS Designer Guide
Published by Ascential Software
© 1997-2002 Ascential Software Corporation. All rights reserved.
Ascential, DataStage and MetaStage are trademarks of Ascential Software Corporation or its affiliates and may be registered in other jurisdictions.
Documentation Team: Mandy deBelin
GOVERNMENT LICENSE RIGHTS
Software and documentation acquired by or for the US Government are provided with rights as follows: (1) if for civilian agency use, with rights as restricted by vendor's standard license, as prescribed in FAR 12.212; (2) if for Dept. of Defense use, with rights as restricted by vendor's standard license, unless superseded by a negotiated vendor license, as prescribed in DFARS 227.7202. Any whole or partial reproduction of software or documentation marked with this legend must reproduce this legend.
Table of Contents
Preface
Organization of This Manual
Documentation Conventions
User Interface Conventions
DataStage Documentation
Chapter 1. Introduction
About Data Warehousing
Operational Databases Versus Data Warehouses
Constructing the Data Warehouse
Defining the Data Warehouse
Data Extraction
Data Aggregation
Data Transformation
Advantages of Data Warehousing
About DataStage
Client Components
Server Components
DataStage Projects
DataStage Jobs
DataStage NLS
Character Set Maps and Locales
DataStage Terms and Concepts
Chapter 2. Your First DataStage Project
Setting Up Your Project
Starting the DataStage Designer
Creating a Job
Defining Table Definitions
Developing a Job
Adding Stages
Linking Stages
Editing the Stages
Editing the UniVerse Stage
Editing the Transformer Stage
Editing the Sequential File Stage
Compiling a Job
Running a Job
Analyzing Your Data Warehouse
Chapter 3. DataStage Designer Overview
Starting the DataStage Designer
The DataStage Designer Window
Using Annotations
Description Annotation Properties
Annotation Properties
Specifying Designer Options
Default Options
Expression Editor Options
Graphical Performance Monitor Options
Job Sequencer Options
Printing Options
Prompting Options
Transformer Options
Exiting the DataStage Designer
Chapter 4. Developing a Job
Getting Started with Jobs
Creating a Job
Opening an Existing Job
Saving a Job
Stages
Server Job Stages
Mainframe Job Stages
Parallel Job Stages
Active Stages
File Stages
Database Stages
Links
Linking Server Stages
Linking Parallel Jobs
Linking Mainframe Stages
Link Ordering
Developing the Job Design
Adding Stages
Moving Stages
Renaming Stages
Deleting Stages
Linking Stages
Editing Stages
Using the Data Browser
Using the Performance Monitor
Compiling Server Jobs and Parallel Jobs
Generating Code for Mainframe Jobs
Job Properties
Server Job and Parallel Job Properties
Specifying Job Parameters
Job Control Routines
Specifying Job Dependencies
Specifying Performance Enhancements
Specifying Execution Page Options
Specifying Maps and Locales
Mainframe Job Properties
Specifying Mainframe Job Parameters
Specifying Mainframe Job Environment Properties
Specifying Extension Variable Values
The Job Run Options Dialog Box
Chapter 5. Containers
Local Containers
Creating a Local Container
Viewing or Modifying a Local Container
Using Input and Output Stages
Deconstructing a Local Container
Shared Containers
Creating a Shared Container
Viewing or Modifying a Shared Container Definition
Editing Shared Container Definition Properties
Using a Shared Container in a Job
Converting Containers
Chapter 6. Job Sequences
Creating a Job Sequence
Activities
Triggers
Control Entities
Nested Conditions
Sequencer
Job Sequence Properties
Activity Properties
Job Activity Properties
Routine Activity Properties
Email Notification Activity Properties
Wait-For-File Activity Properties
ExecCommand Activity Properties
Exception Activity Properties
Nested Condition Properties
Sequencer Properties
Compiling the Job Sequence
Chapter 7. Table Definitions
Table Definition Properties
The Table Definition Dialog Box
Importing a Table Definition
Manually Entering a Table Definition
Viewing or Modifying a Table Definition
Using the Data Browser
Stored Procedure Definitions
Importing a Stored Procedure Definition
The Table Definition Dialog Box for Stored Procedures
Manually Entering a Stored Procedure Definition
Viewing or Modifying a Stored Procedure Definition
Chapter 8. Programming in DataStage
Programming in Server Jobs
The Expression Editor
Programming Components
Routines
Transforms
Functions
Expressions
Subroutines
Macros
Programming in Mainframe Jobs
Expressions
Routines
Programming in Parallel Jobs
Appendix A. Editing Grids
Appendix B. Troubleshooting
Index
Preface
This manual describes the features of the DataStage Designer. It is intended for application developers and system administrators who want to use DataStage to design and develop data warehousing applications.
If you are new to DataStage, read the first two chapters for an overview of data warehousing and the concepts and use of DataStage.
The manual contains enough information to get you started in designing DataStage jobs. For more detailed information about particular types of data source or data target, refer to DataStage Server Job Developer's Guide, DataStage Parallel Job Developer's Guide, and XE/390 Job Developer's Guide.
• All punctuation marks included in the syntax (for example, commas, parentheses, or quotation marks) are required unless otherwise indicated.
• Syntax lines that do not fit on one line in this manual are continued on subsequent lines. The continuation lines are indented. When entering syntax, type the entire syntax entry, including the continuation lines, on the same input line.
User Interface Conventions
The following picture of a typical DataStage dialog box illustrates the terminology used in describing user interface elements:
[Figure: a typical DataStage dialog box showing the General tab of the Inputs page, with callouts for Option Button, Button, Check Box, Browse Button, Drop-Down List, and Field.]
The DataStage user interface makes extensive use of tabbed pages, sometimes nesting them to enable you to reach the controls you need from within a single dialog box. At the top level, these are called "pages"; at the inner level, these are called "tabs". In the example above, we are looking at the General tab of the Inputs page. When using context-sensitive online help you will find that each page has a separate help topic, but each tab uses the help topic for the parent page. You can jump to the help pages for the separate tabs from within the online help.
DataStage Documentation
DataStage documentation includes the following:
• DataStage Designer Guide. This guide describes the DataStage Designer, and gives a general description of how to create, design, and develop a DataStage application.
• DataStage Manager Guide. This guide describes the DataStage Manager and describes how to use and maintain the DataStage Repository.
• DataStage Server Job Developer's Guide. This guide describes the specific tools that are used in building a server job, and supplies programmer's reference information.
• DataStage Parallel Job Developer's Guide. This guide describes the specific tools that are used in building a parallel job, and supplies programmer's reference information.
• XE/390 Job Developer's Guide. This guide describes the specific tools that are used in building a mainframe job, and supplies programmer's reference information.
• DataStage Director Guide. This guide describes the DataStage Director and how to validate, schedule, run, and monitor DataStage server jobs.
• DataStage Administrator Guide. This guide describes DataStage setup, routine housekeeping, and administration.
• DataStage Install and Upgrade Guide. This guide contains instructions for installing DataStage on Windows and UNIX platforms, and for upgrading existing installations of DataStage.
These guides are also available online in PDF format. You can read them with the Adobe Acrobat Reader supplied with DataStage. See DataStage Install and Upgrade Guide for details about installing the manuals and the Adobe Acrobat Reader.
Extensive online help is also supplied. This is especially useful when you have become familiar with using DataStage and need to look up particular pieces of information.
Chapter 1. Introduction
This chapter is an overview of data warehousing and DataStage.
The last few years have seen the continued growth of IT (information technology) and the requirement of organizations to make better use of the data they have at their disposal. This involves analyzing data in active databases and comparing it with data in archive systems.
Although consolidating data into a data mart or data warehouse offered the advantage of a competitive edge, its cost was high. It also required the use of data warehousing tools from a number of vendors and the skill to create a data warehouse.
Developing a data warehouse or data mart involves design of the data warehouse and development of operational processes to populate and maintain it. In addition to the initial setup, you must be able to handle ongoing evolution to accommodate new data sources, processing, and goals.
DataStage simplifies the data warehousing process. It is an integrated product that supports extraction of the source data, cleansing, decoding, transformation, integration, aggregation, and loading of target databases. Although primarily aimed at data warehousing environments, DataStage can also be used in any data handling, data migration, or data reengineering projects.
About Data Warehousing
The aim of data warehousing is to make more effective use of the data available in an organization and to aid decision-making processes.
A data warehouse is a central integrated database containing data from all the operational sources and archive systems in an organization. It contains a copy of transaction data specifically structured for query analysis. This database can be accessed by all users, ensuring that each group in an organization is accessing valuable, stable data.
A data warehouse is a "snapshot" of the operational databases combined with data from archives. The data warehouse can be created or updated at any time, with minimum disruption to operational systems. Any number of analyses can be performed on the data, which would otherwise be impractical on the operational sources.
Operational Databases Versus Data Warehouses
Operational databases are usually accessed by many concurrent users. The data in the database changes quickly and often. It is very difficult to obtain an accurate picture of the contents of the database at any one time.
Because operational databases are task oriented (for example, stock inventory systems), they are likely to contain "dirty" data. The high throughput of data into operational databases makes it difficult to trap mistakes or incomplete entries. However, you can cleanse data before loading it into a data warehouse, ensuring that you store only "good" complete records.
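As an illustration of the cleansing step described above, the following Python sketch drops incomplete stock records before they would be loaded. The field names, the sample records, and the `is_complete` rule are assumptions made for this example; they are not part of DataStage.

```python
# Hypothetical required fields for a stock inventory record.
REQUIRED_FIELDS = ("product_code", "quantity", "price")


def is_complete(record):
    """Return True if every required field is present and non-empty."""
    return all(record.get(field) not in (None, "") for field in REQUIRED_FIELDS)


def cleanse(records):
    """Keep only the complete records; incomplete ones are not loaded."""
    return [record for record in records if is_complete(record)]


raw = [
    {"product_code": "A100", "quantity": 5, "price": 9.99},
    {"product_code": "", "quantity": 2, "price": 4.50},         # missing code
    {"product_code": "B200", "quantity": None, "price": 1.25},  # missing quantity
]
clean = cleanse(raw)
print(len(clean), "of", len(raw), "records kept")
```

A real cleansing rule would usually do more than test for empty fields (for example, range checks or reference lookups), but the shape is the same: decide per record, and load only what passes.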
Constructing the Data Warehouse
A data warehouse is created by extracting data from one or more operational databases. The data is transformed to eliminate inconsistencies, aggregated to summarize data, and loaded into the data warehouse. The end result is a dedicated database which contains stable, nonvolatile, integrated data. This data also represents a number of time variants (for example, daily, weekly, or monthly values), allowing the user to analyze trends in the data.
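The sequence just described (extract, transform, aggregate, load) can be sketched in a few lines of Python. This is a minimal illustration under assumed data: the rows, the region-spelling inconsistency, and the `warehouse` dictionary are all hypothetical and only stand in for real sources and targets.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows extracted from an operational database:
# (sale date, region, amount).
extracted = [
    (date(2002, 1, 14), "north", 100.0),
    (date(2002, 1, 14), "NORTH", 50.0),
    (date(2002, 1, 15), "north", 25.0),
    (date(2002, 2, 1), "North ", 10.0),
]


def transform(row):
    """Eliminate inconsistencies: normalize the spelling of the region."""
    day, region, amount = row
    return day, region.strip().lower(), amount


def aggregate(rows, period_of):
    """Sum amounts per (period, region); period_of maps a date to a period."""
    totals = defaultdict(float)
    for day, region, amount in rows:
        totals[(period_of(day), region)] += amount
    return dict(totals)


rows = [transform(row) for row in extracted]

# "Load": keep one summary per time variant.
warehouse = {
    "daily": aggregate(rows, lambda d: d.isoformat()),
    "monthly": aggregate(rows, lambda d: d.strftime("%Y-%m")),
}
print(warehouse["monthly"])
```

Keeping a separate summary per time variant is what lets a user compare daily and monthly values without going back to the operational source.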
The data in a data warehouse is classified based on the subjects of interest to the organization. For a bank, these subjects may be customer, account number, and transaction details. For a retailer, these may include product, price, quantity sold, and order number.
Each data warehouse includes detailed data. However, where only a portion of this detailed data is required, a data mart may be more suitable. A data mart is generated from the data contained in the data warehouse and contains focused data that is frequently accessed or summarized, for example, sales or marketing data.
Data is transformed using routines based on a transformation rule. For example, product codes can be mapped to a common format using a transformation rule that applies only to product codes.
After data has been transformed it can be loaded into the data warehouse in a recognized and required format.
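To make the idea of a column-specific transformation rule concrete, here is a small Python sketch. The "common format" (uppercase letters, a hyphen, and a zero-padded four-digit number), the function names, and the `RULES` table are assumptions chosen for illustration; DataStage defines its own transforms and routines.

```python
import re


def normalize_product_code(raw_code):
    """Map a product code to the assumed common format, e.g. 'AB-0012'."""
    match = re.fullmatch(r"\s*([A-Za-z]+)[\s\-_/]*(\d+)\s*", raw_code)
    if match is None:
        raise ValueError(f"unrecognized product code: {raw_code!r}")
    letters, digits = match.groups()
    return f"{letters.upper()}-{int(digits):04d}"


def unchanged(value):
    """Default rule: pass the value through as it is."""
    return value


# The rule is registered for product codes only.
RULES = {"product_code": normalize_product_code}


def apply_rules(record):
    """Apply the rule for each column, leaving other columns untouched."""
    return {
        column: RULES.get(column, unchanged)(value)
        for column, value in record.items()
    }


print(apply_rules({"product_code": "ab_12", "description": "ab_12 widget"}))
```

Because the rule is keyed by column name, the same text in the description column is left alone, which matches the point that the rule applies only to product codes.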
Advantages of Data Warehousing
A data warehousing strategy provides the following advantages:
• Capitalizes on the potential value of the organization's information
• Improves the quality and accessibility of data
• Combines valuable archive data with the latest data in operational sources
• Increases the amount of information available to users
• Reduces the requirement of users to access operational data
• Reduces the strain on IT departments, as they can produce one database to serve all user groups
• Allows new reports and studies to be introduced without disrupting operational systems
• Encourages users to be self-sufficient
About DataStage
DataStage has the following features to aid the design and processing required to build a data warehouse:
• Uses graphical design tools. With simple point-and-click techniques you can draw a scheme to represent your processing requirements.
• Extracts data from any number or type of database.
• Handles all the meta data definitions required to define your data warehouse. You can view and modify the table definitions at any point during the design of your application.
• Aggregates data. You can modify SQL SELECT statements used to extract data.
• Transforms data. DataStage has a set of predefined transforms and functions you can use to convert your data. You can easily extend the functionality by defining your own transforms to use.
• Loads the data warehouse.
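The feature list above mentions aggregating data by modifying the SQL SELECT statement used for extraction. The following Python sketch shows the general idea with the standard sqlite3 module and an in-memory database; the `sales` table and its columns are hypothetical, and the code does not use any DataStage interface.

```python
import sqlite3

# In-memory stand-in for a source database (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("widget", 3), ("widget", 4), ("gadget", 10)],
)

# A plain extraction returns one row per sale.
plain_select = "SELECT product, quantity FROM sales"
detail_rows = conn.execute(plain_select).fetchall()

# A modified SELECT aggregates the data as it is extracted.
aggregating_select = (
    "SELECT product, SUM(quantity) AS total_quantity "
    "FROM sales GROUP BY product ORDER BY product"
)
totals = conn.execute(aggregating_select).fetchall()

print(len(detail_rows), "detail rows;", len(totals), "aggregated rows")
conn.close()
```

Aggregating in the SELECT statement means fewer rows leave the source database, at the cost of losing the row-level detail downstream.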
DataStage consists of a number of client and server components. For more information, see "Client Components" on page 1-5 and "Server Components" on page 1-6.
DataStage jobs are compiled and run on the DataStage server. The job will connect to databases on other machines as necessary, extract data, process it, then write the data to the target data warehouse. This type of job is known as a server job.
If you have XE/390 installed, DataStage is able to generate jobs which are compiled and run on a mainframe. Data extracted by such jobs is then loaded into the data warehouse. Such jobs are called mainframe jobs.
Client Components
DataStage has four client components which are installed on any PC running Windows 2000 or Windows NT 4.0 with Service Pack 4 or later:
• DataStage Designer. A design interface used to create DataStage applications (known as jobs). Each job specifies the data sources, the transforms required, and the destination of the data. Jobs are compiled to create executables that are scheduled by the Director and run by the Server (mainframe jobs are transferred and run on the mainframe).
• DataStage Director. A user interface used to validate, schedule, run, and monitor DataStage server jobs.
• DataStage Manager. A user interface used to view and edit the contents of the Repository.
• DataStage Administrator. A user interface used to perform administration tasks such as setting up DataStage users, creating and moving projects, and setting up purging criteria.
• Mainframe jobs. These are available only if you have XE/390 installed. A mainframe job is compiled and run on the mainframe. Data extracted by such jobs is then loaded into the data warehouse.
There are two other entities that are similar to jobs in the way they appear in the DataStage Designer, and are handled by it. These are:
• Shared containers. These are reusable job elements. They typically comprise a number of stages and links. Copies of shared containers can be used in any number of server jobs and edited as required.
• Job Sequences. A job sequence allows you to specify a sequence of DataStage jobs to be executed, and actions to take depending on results.
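The notion of a job sequence (run jobs in order, and act on each result) can be sketched generically as follows. This is not how DataStage implements job sequences; the `run_sequence` function, the job names, and the stop-on-failure policy are assumptions made only to illustrate the concept.

```python
def run_sequence(jobs, on_failure):
    """Run jobs in order; on the first failure, call on_failure and stop."""
    completed = []
    for name, job in jobs:
        succeeded = job()
        if not succeeded:
            on_failure(name)
            break
        completed.append(name)
    return completed


failures = []
jobs = [
    ("extract_orders", lambda: True),
    ("load_warehouse", lambda: False),  # simulated failure
    ("publish_report", lambda: True),   # not reached in this run
]
completed = run_sequence(jobs, on_failure=failures.append)
print("completed:", completed, "failed:", failures)
```

Other policies are equally plausible, for example continuing after a failure or branching to a different job; the sketch shows only the simplest one.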
DataStage jobs consist of individual stages. Each stage describes a particular database or process. For example, one stage may extract data from a data source, while another transforms it. Stages are added to a job and linked together using the Designer.
There are three basic types of stage:
• Built-in stages. Supplied with DataStage and used for extracting, aggregating, transforming, or writing data. All types of job have these stages.
• Plug-in stages. Additional stages that can be installed in DataStage to perform specialized tasks that the built-in stages do not support. Only server jobs have these.
• Job Sequence Stages. Special built-in stages which allow you to define sequences of activities to run. Only Job Sequences have these.
The following diagram represents one of the simplest jobs you could have: a data source, a Transformer (conversion) stage, and the final database. The links between the stages represent the flow of data into or out of a stage.
[Diagram: Data Source -> Transformer Stage -> Data Warehouse]
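The three-stage job in the diagram can be modelled in Python as a list of stages whose order stands for the links between them. This is a conceptual sketch only: the `Stage` and `Job` classes, the sample rows, and the `target` list standing in for the data warehouse are hypothetical and do not reflect how DataStage represents jobs internally.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Row = dict


@dataclass
class Stage:
    """One step in a job; process maps incoming rows to outgoing rows."""
    name: str
    process: Callable[[List[Row]], List[Row]]


@dataclass
class Job:
    """Stages in link order: each stage's output flows into the next stage."""
    stages: List[Stage] = field(default_factory=list)

    def run(self):
        rows = []
        for stage in self.stages:
            rows = stage.process(rows)
        return rows


target = []  # stands in for the data warehouse table


def read_source(_rows):
    """Data source stage: ignores its input and produces the source rows."""
    return [{"code": "a1", "qty": "3"}, {"code": "b2", "qty": "5"}]


def convert(rows):
    """Transformer stage: uppercase the code and convert qty to an integer."""
    return [{"code": r["code"].upper(), "qty": int(r["qty"])} for r in rows]


def write_target(rows):
    """Target stage: append the rows to the stand-in warehouse table."""
    target.extend(rows)
    return rows


job = Job([
    Stage("DataSource", read_source),
    Stage("Transformer", convert),
    Stage("DataWarehouse", write_target),
])
job.run()
print(target)
```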
You must specify the data you want at each stage, and how it is handled. For example, do you want all the columns in the source data, or only a select few? Should the data be aggregated or converted before being passed on to the next stage?
You can use DataStage with MetaBrokers in order to exchange meta data with other data warehousing tools. You might, for example, import table definitions from a data modelling tool.
DataStage NLS
DataStage has built-in National Language Support (NLS). With NLS
installed, DataStage can do the following:
Process data in a wide range of languages
Accept data in any character set into most DataStage fields
Use local formats for dates, times, and money
Sort data according to local rules
Convert data between different encodings of the same language
(for example, for Japanese it can convert JIS to EUC)
DataStage NLS is optionally installed as part of the DataStage server. If
NLS is installed, various extra features (such as dialog box pages and
drop-down lists) appear in the product. If NLS is not installed, these
features do not appear.
Using NLS, the DataStage server engine holds data in Unicode format.
This is an international standard character set that contains nearly all the
characters used in languages around the world. DataStage maps data to or
from Unicode format as required.
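The mapping through Unicode described above can be sketched outside DataStage with Python's standard codecs. This is only an illustration of the principle, not how the server engine is implemented. It assumes that "JIS" corresponds to the iso2022_jp codec and "EUC" to euc_jp; the convert_encoding helper is a hypothetical name.

```python
def convert_encoding(data: bytes, source: str, target: str) -> bytes:
    """Convert bytes between two encodings by passing through Unicode."""
    text = data.decode(source)   # map from the source character set to Unicode
    return text.encode(target)   # map from Unicode to the target character set


# Assumption: "JIS" is taken to mean ISO-2022-JP, "EUC" to mean EUC-JP.
jis_bytes = "日本語".encode("iso2022_jp")
euc_bytes = convert_encoding(jis_bytes, "iso2022_jp", "euc_jp")
print(euc_bytes.decode("euc_jp"))
```

The byte sequences differ between the two encodings, but both decode to the same Unicode text, which is why Unicode works as the common intermediate form.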
Character Set Maps and Locales
Each DataStage project has a language assigned to it during installation.
This equates to one or more character set maps and locales which support
that language. One map and one locale are assigned as project defaults.
The maps define the character sets that the project can use.
The locales define the local formats for dates, times, sorting order,
and so on that the project can use.
The DataStage client and server components also have maps assigned to
them during installation to ensure that data is transferred between them
in the correct character set. For more information, see DataStage
Administrator's Guide.
When you design a DataStage job, you can override the project default
map at several levels:
For a job
For a stage within a job
For a column within a stage (for Sequential, ODBC, and generic
plug-in stages)
For transforms and routines used to manipulate data within a
stage
For imported meta data and table definitions
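One way to picture overriding at several levels is a lookup that prefers the most specific setting. The sketch below is an assumption for illustration only: this guide does not state the precedence DataStage applies, and the effective_map function and the map names are hypothetical.

```python
def effective_map(project_default, job=None, stage=None, column=None):
    """Return the most specific map that has been set.

    Assumed precedence (not confirmed by this guide):
    column, then stage, then job, then the project default.
    """
    for candidate in (column, stage, job):
        if candidate is not None:
            return candidate
    return project_default


# Hypothetical map names, used only to show the lookup.
print(effective_map("ISO8859-1"))
print(effective_map("ISO8859-1", job="UTF8"))
print(effective_map("ISO8859-1", job="UTF8", column="SHIFT-JIS"))
```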
The locale and character set information becomes an integral part of the
job. When you package and release a job, the NLS support can be used on
another system, provided that the correct maps and locales are installed
and loaded.
DataStage Terms and Concepts
The following terms are used in DataStage:
Term    Description
administrator    The person who is responsible for the maintenance and configuration of DataStage, and for DataStage users.
after-job subroutine    A routine that is executed after a job runs.
after-stage subroutine    A routine that is executed after a stage processes data.
Aggregator stage    A stage type that computes totals or other functions of sets of data.
Annotation    A note attached to a DataStage job in the Diagram window.
BCPLoad stage    A plug-in stage supplied with DataStage that bulk loads data into a Microsoft SQL Server or Sybase table. (Server jobs only.)
before-job subroutine    A routine that is executed before a job is run.
before-stage subroutine    A routine that is executed before a stage processes any data.
built-in data elements    There are two types of built-in data elements: those that represent the base types used by DataStage during processing and those that describe different date/time formats.
built-in transforms    The transforms supplied with DataStage. See DataStage Server Job Developer's Guide for a complete list.
Change Apply stage    A parallel job stage that applies a set of captured changes to a data set.
Change Capture stage    A parallel job stage that compares two data sets and records the differences between them.
Cluster    Type of system providing parallel processing. In cluster systems, there are multiple processors, and each has its own hardware resources such as disk and memory.
column definition    Defines the columns contained in a data table. Includes the column name and the type of data contained in the column.
Column Export stage    A parallel job stage that exports a column of another type to a string or binary column.
Column Import stage    A parallel job stage that imports a column from a string or binary column.
Combine Records stage    A parallel job stage that combines several columns associated by a key field to build a vector.
Compare stage    A parallel job stage that performs a column by column compare of two pre-sorted data sets.
Complex Flat File stage    A mainframe source stage that extracts data from a flat file containing complex data structures, such as arrays, groups, and redefines.
Compress stage    A parallel job stage that compresses a data set.
container    A group of stages and links in a job design.
Graphical performance monitor    A monitor that displays status information and performance statistics against links in a job open in the DataStage Designer canvas as the job runs in the Director or debugger.
Hashed File stage    A stage that extracts data from or loads data into a database that contains hashed files. (Server jobs only)
Head stage    A parallel job stage that copies the specified number of records from the beginning of a data partition.
Informix XPS stage    A parallel job stage that allows you to read and write an Informix XPS database.
Inter-process stage    A server job stage that allows you to run server jobs in parallel on an SMP system.
job    A collection of linked stages, data elements, and transforms that define how to extract, cleanse, transform, integrate, and load data into a target database. Jobs can either be server jobs or mainframe jobs.
job control routine    A routine that is used to create a controlling job, which invokes and runs other jobs.
job sequence    A controlling job which invokes and runs other jobs, built using the graphical job sequencer.
Join stage    A mainframe processing stage or parallel job active stage that joins two input sources.
Link Collector stage    A server job stage that collects previously partitioned data together.
Link Partitioner stage    A server job stage that allows you to partition data so that it can be processed in parallel on an SMP system.
local container    A container which is local to the job in which it was created.
Lookup stage    A mainframe processing stage and Parallel active stage that performs table lookups.
Lookup File stage    A parallel job stage that provides storage for a lookup table.
mainframe job    A job that is transferred to a mainframe, then compiled and run there.
Make Subrecord stage    A parallel job stage that combines a number of vectors to form a subrecord.
Make Vector stage    A parallel job stage that combines a number of fields to form a vector.
Merge stage    A parallel job stage that combines data sets.
meta data    Data about data, for example, a table definition describing columns in which data is structured.
MetaBroker    A tool that allows you to exchange meta data between DataStage and other data warehousing tools.
MPP    Type of system providing parallel processing. In MPP (massively parallel processing) systems, there are multiple processors, and each has its own hardware resources such as disk and memory.
Multi-Format Flat File stage    A mainframe source stage that handles different formats in flat file data sources.
NLS    National Language Support. With NLS enabled, DataStage can support the handling of data in a variety of character sets.
normalization    The conversion of records in NF2 (nonfirst-normal form) format, containing multivalued data, into one or more 1NF (first normal form) rows.
null value    A special value representing an unknown value. This is not the same as 0 (zero), a blank, or an empty string.
ODBC stage    A stage that extracts data from or loads data into a database that implements the industry standard Open Database Connectivity API. Used to represent a data source, an aggregation step, or a target data table. (Server jobs only)
operator    The person scheduling and monitoring DataStage jobs.
Orabulk stage    A plug-in stage supplied with DataStage that bulk loads data into an Oracle database table. (Server jobs only)
Oracle stage    A parallel job stage that allows you to read and write an Oracle database.
parallel extender    The DataStage option that allows you to run parallel jobs.
parallel job    A type of DataStage job that allows you to take advantage of parallel processing on SMP, MPP, and cluster systems.
Peek stage    A parallel job stage that prints column values to the screen as records are copied from its input data set to one or more output data sets.
plug-in    A definition for a plug-in stage.
plug-in stage    A stage that performs specific processing that is not supported by the standard server job stages.
Promote Subrecord stage    A parallel job stage that promotes the members of a subrecord to a top level field.
Relational stage    A mainframe source/target stage that reads from or writes to an MVS/DB2 database.
Remove duplicates stage    A parallel job stage that removes duplicate entries from a data set.
Repository    A DataStage area where projects and jobs are stored as well as definitions for all standard and user-defined data elements, transforms, and stages.
SAS Data Set stage    A parallel job stage that provides storage for SAS data sets.
SAS stage    A parallel job stage that allows you to run SAS applications from within the DataStage job.
Sample stage    A parallel job stage that samples a data set.
Sequential File stage    A stage that extracts data from, or writes data to, a text file. (Server job and parallel job only)
server job    A job that is compiled and run on the DataStage server.
shared container    A container which exists as a separate item in the Repository and can be used by any server job in the project.
SMP    Type of system providing parallel processing. In SMP (symmetric multiprocessing) systems, there are multiple processors, but these share other hardware resources such as disk and memory.
Sort stage    A mainframe processing stage or parallel job active stage that sorts input columns.
source    A source in DataStage terms means any database, whether you are extracting data from it or writing data to it.
Split Subrecord stage    A parallel job stage that separates a number of subrecords into top level columns.
Split Vector stage    A parallel job stage that separates a number of vector members into separate columns.
stage    A component that represents a data source, a processing step, or the data mart in a DataStage job.
table definition    A definition describing the data you want including information about the data table and the columns associated with it. Also referred to as meta data.
Tail stage    A parallel job stage that copies the specified number of records from the end of a data partition.
Teradata stage    A parallel stage that allows you to read and write a Teradata database.
transform function    A function that takes one value and computes another value from it.
Transformer Editor    A graphical interface for editing Transformer stages.
Transformer stage    A stage where data is transformed (converted) using transform functions.
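The normalization entry in the table above can be made concrete with a small sketch. It assumes a record is held as a Python dictionary in which one field carries a list of values (the multivalued data); the normalize function, the field names, and the sample record are hypothetical and only illustrate turning one NF2 record into several 1NF rows.

```python
def normalize(record, multivalued_field):
    """Expand one record with a multivalued field into one row per value."""
    rows = []
    for value in record[multivalued_field]:
        row = dict(record)              # copy the single-valued fields
        row[multivalued_field] = value  # keep exactly one value per row
        rows.append(row)
    return rows


# Hypothetical NF2 record: one order holding several part codes.
order = {"ORDER_ID": 1001, "PARTS": ["W-15", "W-16", "W-17"]}
rows = normalize(order, "PARTS")
for row in rows:
    print(row)
```

Each resulting row repeats the single-valued fields and carries one value from the multivalued field, which is the shape a first normal form table expects.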
Chapter 2. Your First DataStage Project
This chapter describes the steps you need to follow to create your first data
warehouse, using the sample data provided. The example builds a server
job and uses a UniVerse table called EXAMPLE1, which is automatically
copied into your DataStage project during server installation.
EXAMPLE1 represents an SQL table from a wholesaler who deals in car
parts. It contains details of the wheels they have in stock. There are
approximately 255 rows of data and four columns:
CODE. The product code for each type of wheel.
PRODUCT. A text description of each type of wheel.
DATE. The date new wheels arrived in stock (given in terms of
year, month, and day).
QTY. The number of wheels in stock.
The aim of this example is to develop and run a DataStage job that:
Extracts the data from the file.
Converts (transforms) the data in the DATE column from a
complete date (YYYY-MM-DD) stored in internal data format, to a
year and month (YYYY-MM) stored as a string.
Loads data from the DATE, CODE, and QTY columns into a data
warehouse. The data warehouse is a sequential file that is created
when you run the job.
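As a rough picture of what the finished job does, the three steps above can be sketched in Python. This is not how DataStage runs the job. The sample rows are invented, the dates are held as YYYY-MM-DD strings for simplicity (the internal date format is not modelled here), and the output is written to an in-memory buffer rather than a real sequential file.

```python
import csv
import io

# Hypothetical rows standing in for the EXAMPLE1 table.
source_rows = [
    {"CODE": "WH001", "PRODUCT": "Alloy wheel 15in", "DATE": "1998-03-21", "QTY": 40},
    {"CODE": "WH002", "PRODUCT": "Steel wheel 14in", "DATE": "1998-04-02", "QTY": 12},
]


def transform(row):
    """Keep DATE, CODE and QTY, reducing the date to year and month."""
    return {"DATE": row["DATE"][:7], "CODE": row["CODE"], "QTY": row["QTY"]}


def load(rows, out):
    """Write the rows as a comma-delimited text file."""
    writer = csv.DictWriter(out, fieldnames=["DATE", "CODE", "QTY"], lineterminator="\n")
    for row in rows:
        writer.writerow(row)


buffer = io.StringIO()
load((transform(r) for r in source_rows), buffer)
output = buffer.getvalue()
print(output)
```

The PRODUCT column is dropped because only DATE, CODE, and QTY are loaded into the target file.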
To load a data mart or data warehouse, you must do the following:
Set up your project
Create a job
Develop the job
Edit the stages in the job
Compile the job
Run the job
This chapter describes the minimum tasks required to create a DataStage
job. In the example, you will use the built-in settings and options supplied
with DataStage. However, because DataStage allows you to customize and
extend the built-in functionality provided, it is possible to perform
additional processing at each step. Where this is possible, additional
procedures are listed under a section called Advanced Procedures. These
advanced procedures are discussed in detail in subsequent chapters.
Setting Up Your Project
Before you create any DataStage jobs, you must set up your project by
entering information about your data. This includes the name and location
of the tables or files holding your data and a definition of the columns they
contain. Information is stored in table definitions in the Repository. The
easiest way to enter a table definition is to import directly from the source
data.
If you were working on a large data warehousing project, you would
probably use the DataStage Manager to set up the project. As this example is
simple, and requires you only to import a single table definition, you are
better doing this directly from the DataStage Designer.
Starting the DataStage Designer
To start the DataStage Designer, choose Start ➤ Programs ➤ Ascential
DataStage ➤ DataStage Designer. The Attach to Project dialog box
appears.
This dialog box appears when you start the DataStage Designer, Manager,
or Director client components from the DataStage program folder. In all
cases, you must attach to a project by entering your logon details.
Note: The program group may be called something other than
DataStage, depending on how DataStage was installed.
To connect to a project:
1. Enter the name of your host in the Host system field. This is the name
of the system where the DataStage Server components are installed.
2. Enter your user name in the User name field. This is your user name
on the server system.
3. Enter your password in the Password field.
Note: If you are connecting to the server via LAN Manager, you can
select the Omit check box. The User name and Password fields
gray out and you log on to the server using your Windows NT
Domain account details.
4. Choose the project to connect to from the Project drop-down list box.
This list box displays all the projects installed on your DataStage
server. Choose your project from the list box. At this point, you may
only have one project installed on your system and this is displayed
by default.
5. Select the Save settings check box to save your logon settings.
6. Click OK. The DataStage Designer window appears with the New
dialog box open, ready for you to create a new job.
Creating a Job
When a DataStage project is installed, it is empty and you must create the
jobs you need. Each DataStage job can load one or more data tables in the
final data warehouse. The number of jobs you have in a project depends
on your data sources and how often you want to extract data or load the
data warehouse.
2. Enter Example1 in the Job name field.
3. Enter Example in the Category field.
4. Click OK to save the job. The updated DataStage Designer window
displays the name of the saved job.
Defining Table Definitions
For most data sources, the quickest and simplest way to specify a table
definition is to import it directly from your data source or data warehouse.
In this example, you must specify a table definition for EXAMPLE1.
Importing a Table Definition
The following steps describe how to import a table definition for
EXAMPLE1:
1. In the Repository window of the DataStage Designer, select the Table
Definitions branch, and choose Import ➤ UniVerse Table Definitions
from the shortcut menu. The Import Metadata (UniVerse
Tables) dialog box appears:
2. Choose localuv from the DSN drop-down list box.
3. Click OK. The updated Import Metadata (UniVerse Tables) dialog
box displays all the files for the chosen data source name:
Note: The screen shot shows an example of tables found under
localuv. Your system may contain different files to the ones
shown here.
4. Select project.EXAMPLE1 from the Tables list box, where project is
the name of your DataStage project.
5. Click OK. The column information from EXAMPLE1 is imported into
DataStage. A table definition is created and is stored under the Table
Definitions ➤ UniVerse ➤ localuv branch in the Repository. The
updated DataStage Designer window displays the new table definition
entry in the Repository window.
To view the new table definition, double-click the project.EXAMPLE1
item in the Repository window. The Table Definition dialog box appears.
This dialog box has up to five pages. Click the tabs to display each page.
The General page contains information about where the data is found and
when the definition was created.
Developing a Job
Jobs are designed and developed using the Designer. The job design is
developed in the Diagram window (the one with grid lines). Each data
source, the data warehouse, and each processing step is represented by a
stage in the job design. The stages are linked together to show the flow of
data.
This example requires three stages:
A UniVerse stage to represent EXAMPLE1 (the data source)
A Transformer stage to convert the data in the DATE column from
a YYYY-MM-DD date in internal date format to a string giving just
year and month (YYYY-MM)
A Sequential File stage to represent the file created at run time (the
data warehouse in this example)
Adding Stages
Stages are added using the tool palette. This palette contains icons that
represent the components you can add to a job. By default the tool palette
is docked to the top of the Designer screen, but you can move it anywhere.
The components present depend on what was installed with DataStage. A
typical tool palette is shown below:
[Figure: tool palette with icons labelled Link, Container Input Stage, Transformer Stage, Annotation, BCP Load Stage, UniVerse Stage, Sequential File Stage, Hashed File Stage, Container Stage, Container Output Stage, Description Annotation, Orabulk Stage, Folder Stage, UniData Stage, ODBC Stage, Aggregator Stage, IPC Stage, Link Partition Stage, Link Collector Stage]
5. Save the job design by choosing File ➤ Save.
Keep the Designer open as you will need it for the next step.
Advanced Procedures
For more advanced procedures, see the following topics in Chapter 4:
"Moving Stages" on page 4-22
"Renaming Stages" on page 4-22
"Deleting Stages" on page 4-23
Editing the Stages
Your job design currently displays the stages and the links between them.
You must edit each stage in the job to specify the data to use and what to
do with it. Stages are edited in the job design by double-clicking each stage
in turn. Each stage type has its own editor.
Editing the UniVerse Stage
The data source (EXAMPLE1) is represented by a UniVerse stage. You
must specify the data you want to extract from this file by editing the
stage.
Double-click the stage to edit it. The UniVerse Stage dialog box appears:
This dialog box has two pages:
Stage. Displayed by default. This page contains the name of the
stage you are editing. The General tab specifies where the file is
found and the connection type.
Outputs. Contains information describing the data flowing from
the stage. You edit this page to describe the data you want to
extract from the file. In this example, the output from this stage
goes to the Transformer stage.
To edit the UniVerse stage:
1. Check that you are displaying the General tab on the Stage page.
Choose localuv from the Data source name drop-down list. localuv
is where EXAMPLE1 is copied to during installation.
The remaining parameters on the General and Details tabs are used
to enter logon details and describe where to find the file. Because
EXAMPLE1 is installed in localuv, you do not have to complete these
fields, which are disabled.
2. Click the Outputs tab. The Outputs page appears:
The Outputs page contains the name of the link the data flows along
and the following four tabs:
General. Contains the name of the table to use and an optional
description of the link.
Columns. Contains information about the columns in the table.
Selection. Used to enter an optional SQL SELECT clause (an
Advanced procedure).
View SQL. Displays the SQL SELECT statement used to extract the
data.
3. Choose dstage.EXAMPLE1 from the Available tables drop-down
list.
4. Click Add to add dstage.EXAMPLE1 to the Table names field.
5. Click the Columns tab. The Columns tab appears at the front of the
dialog box.
You must specify the columns contained in the file you want to use.
Because the column definitions are stored in a table definition in the
Repository, you can load them directly.
10. Click OK to save the stage edits and close the UniVerse Stage dialog
box. Notice that a small table icon appears on the output link to
indicate that it now has column definitions associated with it.
11. Choose File ➤ Save to save your job design so far.
Note: In server jobs column definitions are attached to a link. You can
view or edit them at either end of the link. If you change them in a
stage at one end of the link, the changes are automatically seen in
the stage at the other end of the link. This is how column definitions
are propagated through all the stages in a DataStage server
job, so the column definitions you loaded into the UniVerse stage
are viewed when you edit the Transformer stage.
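The behaviour in this note follows naturally if the column definitions belong to the link and both stages simply refer to that link. The sketch below models that idea only; the Link and Stage classes are hypothetical and say nothing about how DataStage stores its meta data. DSLink3 is used as an illustrative link name.

```python
class Link:
    """A link that owns the column definitions for the data it carries."""

    def __init__(self, name):
        self.name = name
        self.columns = []


class Stage:
    """A stage that refers to its links rather than copying their columns."""

    def __init__(self, name, input_link=None, output_link=None):
        self.name = name
        self.input_link = input_link
        self.output_link = output_link


link = Link("DSLink3")
universe = Stage("UniVerse", output_link=link)
transformer = Stage("Transformer", input_link=link)

# Columns loaded at the UniVerse end of the link...
universe.output_link.columns.extend(["CODE", "PRODUCT", "DATE", "QTY"])

# ...are seen at the Transformer end, because both ends share one object.
print(transformer.input_link.columns)
```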
Editing the Transformer Stage
The Transformer stage performs any data conversion required before the
data is output to another stage in the job design. In this example, the
Transformer stage is used to convert the data in the DATE column from a
YYYY-MM-DD date in internal date format to a string giving just the year and
month (YYYY-MM).
There are two links in the stage:
The input from the data source (EXAMPLE1)
The output to the Sequential File stage
To enable the use of one of the built-in DataStage transforms, you will
assign data elements to the DATE columns input and output from the
Transformer stage. A DataStage data element defines more precisely the
kind of data that can appear in a given column.
In this example, you assign the Date data element to the input column, to
specify the date is input to the transform in internal format, and the
MONTH.TAG data element to the output column, to specify that the
transform produces a string of the format YYYY-MM.
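The conversion itself can be pictured with a short sketch. It is not the DataStage transform. It assumes the internal date is a count of days in which day 0 is 31 December 1967 (the convention used by UniVerse-style internal dates); that assumption, the DAY_ZERO constant, and the month_tag function name are not taken from this guide.

```python
from datetime import date, timedelta

# Assumption: internal dates count days from 31 December 1967 (day 0).
DAY_ZERO = date(1967, 12, 31)


def month_tag(internal_date: int) -> str:
    """Convert an internal day count to a YYYY-MM string."""
    calendar_date = DAY_ZERO + timedelta(days=internal_date)
    return calendar_date.strftime("%Y-%m")


print(month_tag(10000))
```

Under that assumption, day 10000 falls on 18 May 1995, so the sketch prints 1995-05.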
Note: If the data in the other columns required transforming, you could
assign DataStage data elements to these columns too.
Double-click the Transformer stage to edit it. The Transformer Editor
appears:
Input columns are shown on the left, output columns on the right. The
upper panes show the columns together with derivation details, the lower
panes show the column meta data. In this case, input columns have
already been defined for input link DSLink3. No output columns have
been defined for output link DSLink4, so the right panes are blank.
The next steps are to define the columns that will be output by the
Transformer stage, and to specify the transform that will enable the stage to
convert the type and format of dates before they are output.
1. Working in the upper-left pane of the Transformer Editor, select the
input columns that you want to derive output columns from. Click on
the CODE, DATE, and QTY columns while holding down the Ctrl
key.
2. Click the left mouse button again and, keeping it held down, drag the
selected columns to the output link in the upper-right pane. Drop the
columns over the Column Name field by releasing the mouse button.
is directly derived from the input DATE column. Select the text
DSLink3 and delete it by pressing the Delete key.
8. Right-click in the Expression Editor box to open the Suggest
Operand menu:
To edit the Sequential File stage:
1. Click the Inputs tab. The Inputs page appears. This page contains:
The name of the link. This is automatically set to the link name
used in the job design.
General tab. Contains the path name of the file, an optional
description of the link, and update action choices. You can use the
default settings for this example, but you may want to enter a file
name (by default the file is named after the input link).
Format tab. Determines how the data is written to the file. In this
example, the data is written using the default settings, that is, as a
comma-delimited file.
Columns tab. Contains the column definitions for the data you
want to extract. This tab contains the column definitions specified
in the Transformer stage's output link.
2. Enter the pathname of the text file you want to create in the File name
field, for example, seqfile.txt. By default the file is placed in the
server project directory (for example,
c:\Ascential\DataStage\Projects\datastage) and is named after the input link,
but you can enter, or browse for, a different directory.
3. Click OK to close the Sequential File Stage dialog box.
4. Choose File ➤ Save to save the job design.
The job design is now complete and ready to be compiled.
Compiling a Job
When you finish your design you must compile it to create an executable
job. Jobs are compiled using the Designer. To compile the job, do one of the
following:
Choose File ➤ Compile.
Click the Compile button on the toolbar.
The Compile Job window appears:
The job is compiled. The result of the compilation appears in the display
area. If the result of the compilation is Job successfully compiled
with no errors you can go on to schedule or run the job. The
executable version of the job is stored in your project along with your job design.
If an error is displayed, click Show Error. The stage where the problem
occurs is highlighted in the job design. Check that all the input and output
column definitions have been specified correctly, and that you have
entered directory paths and file or table names where appropriate.
For more information about the error, click More. Click Close to close the
Compile Job window.
Running a Job
Executable jobs are scheduled by the DataStage Director and run by the
DataStage Server. You can start the Director from the Designer by choosing
Tools ➤ Run Director.
When the Director is started, the DataStage Director window appears with
the status of all the jobs in your project:
Highlight your job in the Job name column. To run the job, choose Job ➤
Run Now or click the Run button on the toolbar. The Job Run Options
dialog box appears and allows you to specify any parameter values and to
specify any job run limits. In this case, just click Run. The status changes
to Running. When the job is complete, the status changes to Finished.
Choose File ➤ Exit to close the DataStage Director window.
Refer to DataStage Director Guide for more information about scheduling
and running jobs.
Advanced Procedures
It is possible to run a job from within another job. For more information,
see "Job Control Routines" on page 4-55 and Chapter 6, "Job Sequences."
Chapter 3. DataStage Designer Overview
This chapter describes the main features of the DataStage Designer. It tells
you how to start the Designer and takes a quick tour of the user interface.
Starting the DataStage Designer
To start the DataStage Designer, choose Start ➤ Programs ➤ Ascential
DataStage ➤ DataStage Designer. The Attach to Project dialog box
appears.
You can also start the Designer from the shortcut icon on the desktop, or
from the DataStage Suite applications bar if you have DataStage XE
installed.
You must connect to a project as follows:
1. Enter the name of your host in the Host system field. This is the name
of the system where the DataStage Server components are installed.
2. Enter your user name in the User name field. This is your user name
on the server system.
3. Enter your password in the Password field.
Note: If you are connecting to the server via LAN Manager, you can
select the Omit check box. The User name and Password fields
gray out and you log on to the server using your Windows NT
Domain account details.
4. Choose the project to connect to from the Project drop-down list box.
This list box displays all the projects installed on your DataStage
server.
5. Select the Save settings check box to save your logon settings.
6. Click OK. The DataStage Designer window appears, by default with
the New dialog box open, allowing you to choose a type of job to
create. You can set options to specify that the Designer opens with an
empty server or mainframe job, or nothing at all, see "Specifying
Designer Options" on page 3-21.
Note: You can also start the DataStage Designer directly from the
DataStage Manager or Director by choosing Tools ➤ Run
Designer.
The DataStage Designer Window
By default, DataStage initially starts with the New dialog box open. You
can choose to create a new job as follows:
Server job. These run on the DataStage Server, connecting to other
data sources as necessary.
Mainframe job. These are available only if you have installed
XE/390. Mainframe jobs are uploaded to a mainframe, where they
are compiled and run.
Parallel job. These are available only if you have installed the
Parallel Extender. These run on DataStage servers that are SMP,
MPP, or cluster systems.
Shared containers. These are reusable job elements. Copies of
shared containers can be used in any number of server jobs and
edited as required.
Job Sequences. A job sequence allows you to specify a sequence of
DataStage server jobs to be executed, and actions to take depending
on results.
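The availability rules in the list above (mainframe jobs need XE/390, parallel jobs need the Parallel Extender) can be summarized as a lookup. The sketch below is only an illustration of that rule. The dictionary, the function name, and the way installed options are represented are assumptions made for this example, not DataStage identifiers.

```python
# Item type -> optional component it depends on (None means always available).
# The strings are labels taken from the text, used here only for illustration.
REQUIRES = {
    "Server job": None,
    "Mainframe job": "XE/390",
    "Parallel job": "Parallel Extender",
    "Shared container": None,
    "Job sequence": None,
}


def available_item_types(installed_options):
    """Return the item types that could be created given the installed options."""
    return [
        item_type
        for item_type, option in REQUIRES.items()
        if option is None or option in installed_options
    ]
```

With no optional components installed, `available_item_types(set())` lists only server jobs, shared containers, and job sequences.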
Or you can choose to open an existing job of any of these types. You can
use the DataStage options to specify that the Designer always opens a new
server or mainframe job, shared container or job sequence when it starts.
The initial appearance of the DataStage Designer is shown below:
The design pane on the right side and the Property browser are both
empty, and a limited number of menus appear on the menu bar. To see a
more fully populated Designer window, choose File ➤ New and choose
the type of job to create from the New dialog box (this process will be
familiar to you if you worked through the example in Chapter 2, "Your
First DataStage Project"). For the purposes of this example, we created a
server job.
Menu Bar
There are nine pull-down menus. The commands available in each menu
change depending on whether you are currently displaying a server job,
parallel job, or a mainframe job.
File. Creates, opens, closes, and saves DataStage jobs. Also sets up
printers, compiles server and parallel jobs, generates and uploads
mainframe jobs, and exits the Designer.
[Figure: the menu as displayed for a mainframe job, a server job, and a parallel job]
Edit. Renames or deletes stages and links in the Diagram window. Defines
job properties (Job Properties item), and displays the stage dialog boxes
(Properties item). For server jobs and shared containers only, allows you
to construct local or shared containers, deconstruct local containers, and
convert local containers to shared containers and vice versa.
View. Determines what is displayed in the DataStage Designer window.
Displays or hides the toolbar, tool palette, status bar, Repository window,
and Property browser. For server jobs and shared containers only, allows
you to display or hide the debug bar. Other commands allow you to
customize the tool palette and refresh the view of the Repository items in
the Repository window.
Diagram. Determines what actions are performed in the Diagram window.
Displays or hides the grid or print lines, enables or disables annotations,
activates or deactivates the Snap to Grid option, and zooms in or out of
the Diagram window. Also turns performance monitoring on for server or
parallel jobs. The snap to grid and zoom properties are applied to the job
or container window currently selected. The settings are saved when
the job or container is saved and restored when it is opened. The other
settings are personal to you, and are saved between DataStage sessions
ready for you to use again. When you change personal settings they affect
all open windows immediately.
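The scoping rule for Diagram menu settings can be sketched as follows: snap to grid and zoom belong to the selected window, while every other setting is personal and is applied to all open windows at once. This is a minimal Python model written for illustration only. The setting names, the `DiagramWindow` class, and the `change_setting` function are assumptions and do not correspond to anything in DataStage itself.

```python
from dataclasses import dataclass, field

# Settings stored with the job or container (per the text: snap to grid and zoom).
PER_WINDOW = {"snap_to_grid", "zoom"}


@dataclass
class DiagramWindow:
    """Hypothetical stand-in for an open job or container window."""
    name: str
    settings: dict = field(default_factory=dict)


def change_setting(name, value, selected, open_windows, personal):
    """Apply a setting to the selected window only, or to every open window."""
    if name in PER_WINDOW:
        # Saved and restored with the selected job or container.
        selected.settings[name] = value
    else:
        # Personal setting: remembered between sessions, affects all windows now.
        personal[name] = value
        for window in open_windows:
            window.settings[name] = value
```

In this model, changing `zoom` on one window leaves the others untouched, while changing a personal setting such as a hypothetical `show_grid_lines` updates every open window immediately.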
The Property Browser
The property browser is located by default in the top left corner of the
DataStage Designer window (you can move it if required). It displays the
properties of the object currently selected in the Diagram window. The
properties given depend on the type of object selected. It allows you to
edit some of the properties without opening a dialog box.
For stages and containers, it gives:
Stage type
Shared container name (shared container stages only)
Name
Description
You can edit the name, and add or edit a description, but you cannot
change the stage type.
For links, the property browser gives:
Name
Input link description
Output link description
You can edit the name, and add or edit a description.
The Repository Window
The Repository window gives details of the items associated with the
current project which are held in the DataStage Repository. The window
provides a subset of the DataStage Manager functionality. From the
Designer you can add, delete, and edit the following:
Data elements
Job and job sequence properties
Mainframe machine profiles
Routines
Shared container properties
Stage type properties
Table definitions
Transforms
Detailed information is in DataStage Developer's Help and DataStage
Manager Guide. A guide to defining and editing table definitions is given
in this guide (Chapter 7) because table definitions are so central to job
design.
In the Designer Repository window you can perform any of the actions
that you can perform from the Repository tree in the Manager. When you
select a category in the tree, a shortcut menu allows you to create a new
item under that category or a new subcategory, or, for Table Definition
categories, import a table definition from a data source. When you select
an item in the tree, a shortcut menu allows you to perform various tasks
depending on the type of item selected:
Data elements, machine profiles, routines, transforms
You can create a copy of these items, rename
them, delete them and display the properties
of the item. Provided the item is not read-only,
you can edit the properties.
Jobs, shared containers
You can create a copy of these items, rename
them, delete them and edit them in the
diagram window.
Stage types
You can add stage types to the diagram
window palette and display their properties.
Provided the item is not read-only, you can
edit the properties.
Table definitions
You can create a copy of table definitions,
rename them, delete them and display the
properties of the item. Provided the item is not
read-only, you can edit the properties. You can
also import table definitions from data sources.
It is a good idea to choose View ➤ Refresh from the main menu bar before
acting on any Repository items to ensure that you have a completely
up-to-date view.
You can drag certain types of item from the Repository window onto a
diagram window or the diagram window area, or onto specific
components within a job:
Jobs: the job opens in a new diagram window or, if dragged to a
job sequence window, is added to the job sequence.
Shared containers: if you drag one onto an open diagram window,
the shared container appears in the job. If you drag a shared
container onto the background a new diagram window opens
showing the contents of the shared container.
Stage types: drag a stage type onto an open diagram window to
add it to the job or container. You can also drag it to the tool palette
to add it as a tool.
Table definitions: drag a table definition onto a link to load the
column definitions for that link. The Select Columns dialog box
allows you to select a subset of columns from the table definition to
load if required.
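The drag-and-drop behavior listed above amounts to a lookup from the kind of item and the place it is dropped to a resulting action. The Python sketch below restates those rules as a table. It is illustrative only: the key strings and the function name are invented for this example, and the text does not say where a job must be dropped to open in a new window, so the "background" target for jobs is an assumption.

```python
# (item type, drop target) -> what happens, paraphrased from the list above.
DROP_RULES = {
    ("job", "background"): "open the job in a new diagram window",  # target assumed
    ("job", "job sequence window"): "add the job to the job sequence",
    ("shared container", "diagram window"): "add the shared container to the job",
    ("shared container", "background"): "open the container contents in a new diagram window",
    ("stage type", "diagram window"): "add the stage to the job or container",
    ("stage type", "tool palette"): "add the stage type to the palette as a tool",
    ("table definition", "link"): "load the column definitions for the link",
}


def drop_action(item_type, target):
    """Return the described action, or None if the text gives no rule for the pair."""
    return DROP_RULES.get((item_type, target))
```

A pair that the text does not describe, such as dropping a table definition on the tool palette, simply returns `None` in this model.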
The Diagram Window
The area to the right of the DataStage Designer holds the Diagram
windows. A Diagram window appears for each job, job sequence, or
shared container that you open in your project. By default the diagram
window has a colored background. You can turn this off using the Options
dialog box (see "Default Options" on page 3-21). The screenshots in this
guide have the background turned off.
The diagram window is the canvas on which you design and display your
job. This window has the following components:
Title bar. Displays the name of the job or shared container.
Page tabs. If you use local containers in your job, the contents of
these containers are displayed in separate windows within the
job's diagram window. Switch between views using the tabs at the
bottom of the diagram window.
Grid lines. Allow you to position stages more precisely in the
window. The grid lines are not displayed by default. Choose
Diagram ➤ Show Grid Lines to enable them.
Scroll bars. Allow you to view the job components that do not fit in
the display area.
Print lines. Display the area that is printed when you choose File ➤
Print. The print lines also indicate page boundaries. When you
cross these, you have the choice of printing over several pages or
scaling to fit a single page when printing. The print lines are not
displayed by default. Choose Diagram ➤ Show Print Lines to
enable them.
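The two printing choices mentioned for print lines, spreading a diagram over several pages or scaling it to fit one page, come down to simple arithmetic. The following Python sketch shows one way to compute each. It is not how the Designer performs the calculation: the function names and the idea of measuring the diagram and page in the same arbitrary units are assumptions made for illustration.

```python
import math


def pages_needed(diagram_w, diagram_h, page_w, page_h):
    """Number of pages used when the diagram is printed at full size."""
    return math.ceil(diagram_w / page_w) * math.ceil(diagram_h / page_h)


def scale_to_fit(diagram_w, diagram_h, page_w, page_h):
    """Scale factor that fits the whole diagram on one page (never enlarges)."""
    return min(1.0, page_w / diagram_w, page_h / diagram_h)
```

For a diagram of 1500 by 900 units and a printable area of 1000 by 700 units, this gives four pages at full size, or a single page at two-thirds scale.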
You can use the resize handle or the Maximize button to resize a diagram
window. To resize the contents of the window, use the zoom commands in
the Diagram shortcut menu. If you maximize a window an additional
menu appears to the left of the File menu, giving access to Diagram
window controls.
By default, any stages you add to the Diagram window will snap to the
grid lines. You can, however, turn this option off by unchecking Diagram ➤
Snap to Grid, clicking the Snap to Grid button in the toolbar, or from
the Designer Options dialog box.
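Snapping to a grid generally means moving a dropped position to the nearest grid intersection. The sketch below shows that idea in Python. It is a generic illustration rather than the Designer's actual behavior: the function name, the 8-unit default spacing, and the rule that exact midpoints round upward are all assumptions.

```python
def snap_to_grid(x, y, spacing=8, enabled=True):
    """Return (x, y) moved to the nearest grid intersection.

    Coordinates are non-negative integers in arbitrary units. When snapping
    is disabled the position is returned unchanged.
    """
    if not enabled:
        return (x, y)
    half = spacing // 2
    # Integer arithmetic: add half a cell, then truncate to a multiple of spacing.
    return (
        ((x + half) // spacing) * spacing,
        ((y + half) // spacing) * spacing,
    )
```

With the default spacing, a stage dropped at (13, 22) would land on (16, 24), while the same drop with `enabled=False` stays at (13, 22).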
The diagram window has a shortcut menu which gives you access to the
settings on the Diagram menu (see "Menu Bar" on page 3-4):
Toolbar
The Designer toolbar contains the following buttons:
The following is an example undocked server job tool palette:
To add a stage to the Diagram window, choose it from the tool palette and
click the Diagram window. The stage is added at the insertion point in the
diagram window. If you click and drag on the diagram window to draw a
rectangle as an insertion point, the stage will be sized to fit that rectangle.
You can also drag stages from the tool palette or from the Repository
window and drop them on the Diagram window.
To link two stages, choose the Link button. Click the first stage, then drag
the mouse to the second stage. The stages are linked when you release the
mouse button.
You can customize the tool palette to add or remove various buttons. You
can add the buttons for plug-ins you have installed, and remove the
buttons for stages you know you will not use. There are various ways in
which you can customize the palette:
In the palette itself.
From the Repository window.
From the Customize Toolbar dialog box.
To customize the tool palette from the palette itself:
To remove an existing item from the palette, select it while holding
down the CTRL and Shift keys, and drag it off the palette (to
anywhere other than a Diagram window).
To move an item to another position in the palette, select it while
holding down the CTRL and Shift keys and drag it to the desired
position.
[Figure: server job tool palette, with buttons labeled Link, Container Input Stage, Transformer Stage, Annotation, BCP Load Stage, UniVerse Stage, Sequential File Stage, Hashed File Stage, Container Stage, Container Output Stage, Description Annotation, Orabulk Stage, Folder Stage, UniData Stage, ODBC Stage, Aggregator Stage, IPC Stage, Link Partition Stage, and Link Collector Stage]
To add an additional item to the palette, choose Customize Palette
from the shortcut menu. The Customize Toolbar dialog box opens
(see below for information about the Customize Toolbar dialog
box).
To customize the palette from the Repository window:
To add an additional item to the palette, drag the item from the tree
in the Repository window to the palette, or select Add to Palette
from the item's shortcut menu.
To customize the palette using the Customize Toolbar dialog box:
1. Choose View ➤ Customize Palette, or choose Customize Palette
from a shortcut menu. The Customize Toolbar dialog box appears.
2. This dialog box lists all available stage types depending on the type
of job (server, mainframe, or job sequence) whose diagram window is
currently active. To add items to a palette, select the icon in the
Available toolbar buttons window and click Add. To remove items, select
them in the Current toolbar buttons window and click Remove.
There are some tools that must always be present on the palette, and
the Remove button is blanked out when these are selected.
3. To arrange the buttons in the palette, select an item in the Current
toolbar buttons list and use the Move Up and Move Down buttons.
4. Click Close to close the dialog box and display the customized tool
palette, or click Reset to reset to the default palette settings.
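The operations offered by the Customize Toolbar dialog box (Add, Remove, Move Up, Move Down, Reset) behave like edits to an ordered list in which some entries are protected. The Python sketch below models that behavior for illustration. The `Palette` class and its methods are invented for this example, and the guide does not say which tools are mandatory, so treating Link as a required tool in the usage note is an assumption.

```python
class Palette:
    """Hypothetical model of a customizable tool palette; not a DataStage API."""

    def __init__(self, default_items, required=()):
        self._default = list(default_items)   # remembered for reset()
        self.required = set(required)         # tools that can never be removed
        self.items = list(default_items)

    def add(self, item):
        """Add an item to the end of the palette if it is not already there."""
        if item not in self.items:
            self.items.append(item)

    def can_remove(self, item):
        """False for required tools, mirroring the blanked-out Remove button."""
        return item in self.items and item not in self.required

    def remove(self, item):
        """Remove an item; return False if the item is required or absent."""
        if not self.can_remove(item):
            return False
        self.items.remove(item)
        return True

    def move_up(self, item):
        i = self.items.index(item)
        if i > 0:
            self.items[i - 1], self.items[i] = self.items[i], self.items[i - 1]

    def move_down(self, item):
        i = self.items.index(item)
        if i < len(self.items) - 1:
            self.items[i + 1], self.items[i] = self.items[i], self.items[i + 1]

    def reset(self):
        """Restore the default palette contents and order."""
        self.items = list(self._default)
```

For instance, a palette created with `Palette(["Link", "Transformer Stage", "ODBC Stage"], required={"Link"})` refuses `remove("Link")` but allows `remove("ODBC Stage")`, and `reset()` brings back the original three items.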
Status Bar
The status bar appears at the bottom of the DataStage Designer window. It
displays one-line help for the window components and information on the
current state of job operations, for example, compilation of server jobs. You
can hide the status bar by choosing View ➤ Status Bar.
Debugger Toolbar
DataStage has a built-in debugger that can be used with server jobs or
shared containers. The debugger toolbar contains buttons representing
debugger functions. You can hide the debugger toolbar by choosing View ➤
Debug Bar. The debug bar has a drop-down list displaying currently
open server jobs, allowing you to select one of these as the debug focus.
Shortcut Menus
There are a number of shortcut menus available which you display by
clicking the right mouse button. The menu displayed depends on where
you clicked.
Background. Appears when you right-click on the background area in
the left of the Designer (i.e. the space around Diagram windows), or in
any of the toolbar/palette background areas. Gives access to the same
items as the View menu (see page 3-5).
Diagram window background. Appears when you right-click on a window
background. Gives access to the same items as the Diagram menu (see
page 3-5).
[Figure: debugger toolbar, with buttons labeled Go, Stop Job, Job Parameters, Edit Breakpoints, Toggle Breakpoint, Clear All Breakpoints, Step to Next Link, Step to Next Row, Debug Window, and View Job Log, and the Target debug job list]
Font. Click this to open a dialog box which allows you to specify a
different font for the annotation text.
Color. Click this to open a dialog box which allows you to specify a
different color for the annotation text.
Background color. Click this to open a dialog box which allows
you to specify a different background color for the annotation.
Border. Select this to specify that the border of the annotation is
visible.
Transparent. Select this to choose a transparent background.
Description Type. Choose whether the Description Annotation
displays the full description or short description from the job
properties.
Annotation Properties
The Annotation Properties dialog box is as follows:
The properties are the same as described for description annotations,
except there are no Description Type options.
The page has three areas:
When Designer starts. Determines whether the Designer automatically
opens a new job when started, or prompts you for a job to
create or open.
Nothing Open. This is the default option. The Designer opens
with no jobs, shared containers, or job sequences open; you can
then decide whether to open an existing item, or create a new
one.
Prompt for. Select this and choose New, Existing or Recent from
the drop-down list. The New dialog box appears when you start
the DataStage Designer, with the New, Existing, or Recent page
on top, allowing you to choose an item to open.
Create new. Select this and choose Server, Mainframe, Parallel,
Sequence job or Shared container from the drop-down list. If
this is selected, a new job of the specified type is automatically
created when the DataStage Designer is started.
New job/container view attributes. Determines whether the snap
to grid option will be on or not for any new jobs, job sequences, or
shared containers that are opened.
Appearance. These options allow you to decide how the Designer
background canvas is displayed and how the stage icons appear on
the canvas.
By default the canvas has a backgrou