DataStage Interview Questions - Part I

1. What are the environment variables in DataStage?
2. How do you check for job errors in DataStage?
3. What are stage variables, derivations and constants?
4. What is pipeline parallelism?
5. Debug stages in PX
6. How do you remove duplicates in a dataset?
7. What is the difference between Job Control and Job Sequence?
8. What is the maximum size of a Data Set stage?
9. Performance in the Sort stage
10. How do you develop an SCD using the Lookup stage?
12. What errors have you experienced with DataStage?
13. What are the main differences between a server job and a parallel job in DataStage?
14. Why do you need the Modify stage?
15. What is the difference between the Sequential File stage and the Data Set stage? When do you use them?
16. Memory allocation while using the Lookup stage
17. What is a phantom error in DataStage? How do you overcome it?
18. Parameter file usage in DataStage
19. Explain the best approach to an SCD Type 2 mapping in a parallel job.
20. How can we improve the performance of a job while handling a huge amount of data?
21. How can we create read-only jobs in DataStage?
22. How do you implement routines in DataStage? Does anyone have any material on DataStage?
23. How will you determine the sequence of jobs to load into the data warehouse?
24. How can we test jobs in DataStage?
25. DataStage - delete header and footer on the source sequential

26. How can we implement Slowly Changing Dimensions in DataStage?
27. Differentiate database data and data warehouse data.
28. How do you run a shell script within the scope of a DataStage job?
29. What is the difference between DataStage and Informatica?
30. Explain job control language such as (DS_JOBS).
32. What is an Invocation ID?
33. How do you connect two stages that do not have any common columns between them?
34. In SAP/R3, how do you declare and pass parameters in a parallel job?
35. What is the difference between a hash file and a sequential file?
36. How do you fix the error "OCI has fetched truncated data" in DataStage?
37. A batch is running and it is scheduled to run in 5 minutes. But after 10 days the time changes to 10 minutes. What type of error is this and how do you fix it?
38. Which partitioning method should we use for the Aggregator stage in parallel jobs?
39. What is the baseline for implementing partitioning or parallel execution in a DataStage job? For example, is it advised only for more than 2 million records?
40. How do we create an index in DataStage?
41. What is the flow of loading data into fact and dimension tables?
42. What is a sequential file that has a single input link?
43. Aggregators: what does the warning "Hash table has grown to xyz" mean?
44. What is a hashing algorithm?
45. How do you load partial data after a job has failed? The source has 10000 records and the job failed after 5000 records were loaded. The status of the job is Aborted. Instead of removing the 5000 records from the target, how can I resume the load?
46. What are the Orchestrate options in the Generic stage, and what are the option names and values? (Name of an Orchestrate operator to call.) Which Orchestrate operators are available in DataStage for the AIX environment?
47. Is a Type 30D hash file GENERIC or SPECIFIC?

48. Is a hashed file an active or a passive stage? When is it useful?
49. How do you extract job parameters from a file?
50. Combined question:
    1. What about system variables?
    2. How can we create containers?
    3. How can we improve the performance of DataStage?
    4. What are the job parameters?
    5. What is the difference between a routine, a transform and a function?
    6. What are all the third-party tools used in DataStage?
    7. How can we implement a lookup in DataStage server jobs?
    8. How can we implement Slowly Changing Dimensions in DataStage?
    9. How can we join one Oracle source and a sequential file?
    10. What are the Iconv and Oconv functions?

51. What are the difficulties faced in using DataStage? Or, what are the constraints in using DataStage?
52. Have you ever been involved in upgrading DS versions, such as DS 5.X? If so, tell us some of the steps you have
53. What are XML files, how do you read data from XML files, and what stage should be used?
54. How do you track performance statistics and improve them?
55. What are the types of views in DataStage Director?
    There are 3 types of views in DataStage Director:
    a) Job View - dates of jobs compiled.
    b) Log View - warning messages, event messages, program-generated messages.
    c) Status View - status of the job's last run.
56. What is the default cache size? How do you change the cache size if needed?
    The default cache size is 256 MB. We can increase it by going into DataStage Administrator, selecting the Tunables tab and specifying the cache size there.
57. How do you pass a parameter to the job sequence if the job is running at night?
58. How do you catch bad rows from the OCI stage?

59. What are QualityStage and ProfileStage?
60. What is the use and advantage of a procedure in DataStage?
61. What are the important considerations when using the Join stage instead of lookups?
62. How do you implement a Type 2 slowly changing dimension in DataStage? Give an example.
63. How do you implement a Type 2 slowly changing dimension in DataStage?
64. What are static hash files and dynamic hash files?
65. What is the difference between DataStage server jobs and DataStage parallel jobs?
66. What is "insert for update" in DataStage?
67. How did you connect to DB2 in your last project?
    Using DB2 ODBC drivers.
68. How do you merge two files in DS?
    Either use the copy command as a before-job subroutine if the metadata of the 2 files is the same, or create a job to concatenate the 2 files into one if the metadata is different.
69. What is the order of execution done internally in the Transformer, with the stage editor having input links on the left-hand side and output links?
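For the same-metadata case in question 68, the merge amounts to plain concatenation of the two files. The following is a minimal Python sketch of that idea, not DataStage code; the function name and file paths are hypothetical, and it assumes both files share the same record layout and that the first file ends with a newline.

```python
import shutil
from pathlib import Path


def concatenate_files(first, second, target):
    """Write the bytes of `first` followed by `second` into `target`."""
    with open(target, "wb") as out:
        for source in (first, second):
            with open(source, "rb") as src:
                # Stream the file so large inputs are not held in memory.
                shutil.copyfileobj(src, out)
    return Path(target)
```

A script like this could be called before the job starts, in the same spirit as the before-job subroutine mentioned in the answer.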

70. How will you call an external function or subroutine from DataStage?
71. What happens if the job fails at night?
72. What are the types of parallel processing?
    Parallel processing is broadly classified into 2 types:
    a) SMP - Symmetric Multiprocessing.
    b) MPP - Massively Parallel Processing.
73. What is DS Administrator used for? Did you use it?
74. How do you do an Oracle 4-way inner join if there are 4 Oracle input files?
75. How do you pass a filename as a parameter for a job?
76. How do you populate source files?
77. How do you handle date conversions in DataStage? Convert a mm/dd/yyyy format to yyyy-dd-mm.
    We use:
    a) the "Iconv" function - internal conversion.
    b) the "Oconv" function - external conversion.
    The function to convert mm/dd/yyyy format to yyyy-dd-mm is Oconv(Iconv(Fieldname,"D/M
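The two-step pattern in question 77 (parse the external string into an internal date, then format that internal date for output) can be illustrated outside DataStage. This is a minimal sketch in Python, not DataStage BASIC; the function name is hypothetical, and the `datetime` value stands in for the internal date that Iconv would produce.

```python
from datetime import datetime


def mdy_to_ydm(value):
    """Convert a mm/dd/yyyy string to yyyy-dd-mm."""
    # Step 1 (analogous to Iconv): external string -> internal date value.
    parsed = datetime.strptime(value, "%m/%d/%Y")
    # Step 2 (analogous to Oconv): internal date value -> output string.
    return parsed.strftime("%Y-%d-%m")


print(mdy_to_ydm("12/31/2020"))  # prints 2020-31-12
```

Going through an intermediate date value means an invalid input such as 13/45/2020 is rejected at the parsing step rather than being silently rearranged.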

78. How do you execute a DataStage job from the command line prompt?
    Using the "dsjob" command as follows: dsjob -run -jobstatus projectname jobname
79. Differentiate primary key and partition key.
    A primary key is a combination of unique and not null. It can be a collection of key values, called a composite primary key. A partition key is just a part of the primary key. There are several methods of
80. How do you install and configure DataStage EE on Sun Microsystems multi-processor hardware running the Solaris 9 operating system?
81. What are all the third-party tools used in DataStage?
82. How do you eliminate duplicate rows?
83. What is the difference between a routine, a transform and a function?
84. Do you know about the INTEGRITY/QUALITY stage?
85. How do you attach an mtr file (MapTrace) via email? The MapTrace is used to record all the execute map errors.
86. Is it possible to calculate a hash total for an EBCDIC file and have the hash total stored as EBCDIC using DataStage? Currently, the total is converted to ASCII, even though the individual records are stored as EBCDIC.
87. If you are running 4-way parallel and you have 10 stages on the canvas, how many processes does DataStage create?
88. Explain the differences between Oracle 8i and 9i.
89. How will you pass a parameter to the job schedule if the job is running at night? What happens if one job fails in the night?
90. What is an environment variable?
91. How do you find duplicate records using the Transformer stage in the server edition?
92. What is a phantom error in DataStage?
93. How can we increment the surrogate key value for every insert into the target database?
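The dsjob call from question 78 can also be driven from a script. Below is a minimal Python sketch under stated assumptions: the helper names are hypothetical, `dsjob` is assumed to be on the PATH of the DataStage server, and the optional `-param name=value` arguments are not part of the answer above, so verify them (and the meaning of the exit code with `-jobstatus`) against the documentation for your installation.

```python
import subprocess


def build_dsjob_command(project, job, params=None):
    """Build the argument list for: dsjob -run -jobstatus project job."""
    command = ["dsjob", "-run", "-jobstatus"]
    # Assumption: job parameters are passed as repeated -param name=value pairs.
    for name, value in (params or {}).items():
        command += ["-param", f"{name}={value}"]
    command += [project, job]
    return command


def run_job(project, job, params=None):
    """Run the job and return the dsjob exit code (requires a DataStage install)."""
    return subprocess.run(build_dsjob_command(project, job, params)).returncode
```

For example, `build_dsjob_command("projectname", "jobname")` yields the same command line as in the answer to question 78; keeping command construction separate from execution makes the command easy to log or inspect before it runs.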

94. What is the use of environment variables?
95. How can we run a batch from the command line?
96. What is a fact load?
97. Explain a specific scenario where we would use range partitioning.
98. What is job commit in DataStage?
99. What are the disadvantages of a staging area?
100. How do you configure api_dump?
102. Does the type of partitioning change for SMP and MPP systems?
103. What is the difference between RELEASE THE JOB and KILL THE JOB?
104. Can you convert a snowflake schema into a star schema?
105. What is the repository?
106. What is fact loading, and how do you do it?
107. What is the alternative way in which we can do job control?
108. Where can we use the Link Partitioner, Link Collector and Inter Process (OCI) stages: in server jobs or in parallel jobs? And is SMP parallel or server?
109. Where can you output data using the Peek stage?
110. Do you know about MetaStage?
111. In which situation do we use the RUN TIME COLUMN PROPAGATION option?
112. What is the difference between DataStage and DataStage TX?
113. Combined question:
    1. Difference between a hash file and a sequential file? What is modulus?
    2. What are the Iconv and Oconv functions?
    3. How can we join one Oracle source and a sequential file?
    4. How can we implement Slowly Changing Dimensions in DataStage?
    5. How can we implement a lookup in DataStage server jobs?
    6. What are all the third-party tools used in DataStage?
    7. What is the difference between a routine, a transform and a function?
    8. What are the job parameters?
    9. Plug-in?
    10. How can we improv
114. Is it possible to query a hash file? Justify your answer.
115. How do you enable the DataStage engine?
116. How can I convert server jobs into parallel jobs?
117. Suppose you have a table "sample" with three columns:

    Cola  Colb  Colc
    1     10    100
    2     20    200
    3     30    300

    Assume Cola is the primary key. How will you fe