May 13, 2014 DataStage Naming Conventions DataStage Naming Conventions follows the guidelines of ETL Naming Conventions. Contents 1 Job Name Prefixes 2 Stage Names 3 Link Names Job Name Prefixes Job prefixes are optional but they help to quickly identify the type of job and can make job navigation and job reporting easier. Parallel jobs – par Server jobs – ser Sequence jobs – seq Batch jobs – bat Mainframe jobs Stage Names The stage type prefix is used on all stage names so it appears on metadata reports that do not include a diagram of the stage or a description of the stage type. The name alone can be used to indicate the stage type. Source and target stage names indentify the name of the entity such as a table name or a sequential file name. The stage name strips out any dynamic part of the name – such as a timestamp, and file extensions. Database stage – db_table name Dataset – ds_datasetname Hash file – hf_hashfilename Sequential file stage – sf_filename The prefix identifies the source type, the rest of the name indicates how to find that source outside of DataStage or how to refer to that source in another DataStage job. Transformation stages Aggregation – AG_CalculatedContent (Prices, SalesAmounts, YTDPrices) Changed Data Capture – CD Funnel – FO_FunnelType (Continuous, round robin) Lookup – LU Pivot – PI Remove Duplicates – RD Sort – SO_SortFields Transformer – TR_PrimaryFunction (HandleNulls, QA, Map) Link Names The link name describes what data is travelling down the link. Link names turn up in process metadata via the link count statistics so it is very important to use names that make process reporting user friendly. Only some links in a job are important to project administrators. The link naming convention has two types of link names: – Links of importance have a five letter prefix followed by a double underscore followed by link details. – Intermediate links have a link name without a double underscore. Links of Importance: – The first primary link in a job consists of SourceType(char2)pri(primary). – Any link from a reference source consists of SourceType(char2)ref(reference). – Any link loading to a target consists of TargetType(char2)UpdateAction(char3). – Any reject link SourceType(char2)rej(reject). Any project can add new links of importance, such as the output count of a remove duplicates or aggregation stage. Example: dbpri__stockitem is the first link in a job. dbups__stockitem is the link loading to a target database table with an upsert option. dbref__orgcodes is a reference lookup to of orgcodes to a database table. dbrej__stockitems is a reject of upserts to the stockitem table. You can then produce a pivot report against the link row count statistics to show the row counts for a particular job using the five letter prefix as for each type of row count.
May 12, 2014 DataStage Environment variables list 1 Buffering 2 Building Custom stages 1.1 APT_BUFFER_FREE_RUN 2.1 DS_OPERATOR_BUILDOP_DIR 1.2 APT_BUFFER_MAXIMUM_MEMORY 2.2 OSH_BUILDOP_CODE 1.3 APT_BUFFER_MAXIMUM_TIMEOUT 2.3 OSH_BUILDOP_HEADER 1.4 APT_BUFFER_DISK_WRITE_INCREMENT 2.4 OSH_BUILDOP_OBJECT 1.5 APT_BUFFERING_POLICY 2.5 OSH_BUILDOP_XLC_BIN 1.6 APT_SHARED_MEMORY_BUFFERS 2.6 OSH_CBUILDOP_XLC_BIN 3 Compiler 4 DB2 Support 3.1 APT_COMPILER 4.1 APT_DB2INSTANCE_HOME 3.2 APT_COMPILEOPT 4.2 APT_DB2READ_LOCK_TABLE 3.3 APT_LINKER 4.3 APT_DBNAME 3.4 APT_LINKOPT 4.4 APT_RDBMS_COMMIT_ROWS 4.5 DB2DBDFT 5 Debugging 5.1 APT_DEBUG_OPERATOR 6 Decimal Support 5.2 APT_DEBUG_MODULE_NAMES 6.1 APT_DECIMAL_INTERM_PRECISION 5.3 APT_DEBUG_PARTITION 6.2 APT_DECIMAL_INTERM_SCALE 5.4 APT_DEBUG_SIGNALS 6.3 APT_DECIMAL_INTERM_ROUND_MODE 5.5 APT_DEBUG_STEP 5.6 APT_DEBUG_SUBPROC 7 Disk IO 5.7 APT_EXECUTION_MODE 7.1 APT_BUFFER_DISK_WRITE_INCREMENT 5.8 APT_PM_DBX 7.2 APT_CONSISTENT_BUFFERIO_SIZE 5.9 APT_PM_GDB 7.3 APT_EXPORT_FLUSH_COUNT 5.10 APT_PM_SHOW_PIDS 7.4 APT_IO_MAP/APT_IO_NOMAP and APT_BUFFERIO_MAP/APT_BUFFERIO_NOMAP 5.11 APT_PM_XLDB 7.5 APT_PHYSICAL_DATASET_BLOCK_SIZE 5.12 APT_PM_XTERM 5.13 APT_SHOW_LIBLOAD 9 Job Monitoring 9.1 APT_MONITOR_SIZE 8 General Job Admin 9.2 APT_MONITOR_TIME 8.1 APT_CHECKPOINT_DIR 9.3 APT_NO_JOBMON 8.2 APT_CLOBBER_OUTPUT 9.4 APT_PERFORMANCE_DATA 8.3 APT_CONFIG_FILE 8.4 APT_DISABLE_COMBINATION 11 Lookup 8.5 APT_EXECUTION_MODE 11.1 APT_LUTCREATE_NO_MMAP 8.6 APT_ORCHHOME 8.7 APT_STARTUP_SCRIPT 12 Network 8.8 APT_NO_STARTUP_SCRIPT 12.1 APT_IO_MAXIMUM_OUTSTANDING 8.9 APT_STARTUP_STATUS 12.2 APT_IOMGR_CONNECT_ATTEMPTS 8.10 APT_THIN_SCORE 12.3 APT_PM_CONDUCTOR_HOSTNAME 12.4 APT_PM_NO_TCPIP 10 Misc 12.5 APT_PM_NODE_TIMEOUT 10.1 APT_COPY_TRANSFORM_OPERATOR 12.6 APT_PM_SHOWRSH 10.2 “APT_EBCDIC_VERSION” on page 65 12.7 APT_PM_USE_RSH_LOCALLY 10.3 APT_DATE_CENTURY_BREAK_YEAR 12.8 APT_RECVBUFSIZE 10.4 APT_IMPEXP_ALLOW_ZERO_LENGTH_FIXED_NULL 12.9 APT_RECVBUFSIZE 10.5 APT_IMPORT_REJECT_STRING_FIELD_OVERRUNS 10.6 APT_INSERT_COPY_BEFORE_MODIFY 13 NLS 10.7 APT_OLD_BOUNDED_LENGTH 13.1 APT_COLLATION_SEQUENCE 10.8 APT_OPERATOR_REGISTRY_PATH 13.2 APT_COLLATION_STRENGTH 10.9 APT_PM_NO_SHARED_MEMORY 13.3 APT_ENGLISH_MESSAGES 10.10 APT_PM_NO_NAMED_PIPES 13.4 APT_IMPEXP_CHARSET 10.11 APT_PM_SOFT_KILL_WAIT 13.5 APT_INPUT_CHARSET 10.12 APT_PM_STARTUP_CONCURRENCY 13.6 APT_OS_CHARSET 10.13 APT_RECORD_COUNTS 13.7 APT_OUTPUT_CHARSET 10.14 APT_SAVE_SCORE 13.8 APT_STRING_CHARSET 10.15 APT_SHOW_COMPONENT_CALLS 10.16 APT_STACK_TRACE 14 Oracle Support 10.17 APT_WRITE_DS_VERSION 14.1 APT_ORACLE_LOAD_DELIMITED 10.18 OSH_PRELOAD_LIBS 14.2 APT_ORACLE_LOAD_OPTIONS 14.3 APT_ORACLE_NO_OPS 15 Partitioning 14.4 APT_ORACLE_PRESERVE_BLANKS 15.1 APT_NO_PART_INSERTION 14.5 APT_ORA_IGNORE_CONFIG_FILE_PARALLELISM 15.2 APT_PARTITION_COUNT 14.6 APT_ORA_WRITE_FILES 15.3 APT_PARTITION_NUMBER 14.7 APT_ORAUPSERT_COMMIT_ROW_INTERVAL APT_ORAUPSERT_COMMIT_TIME_INTERVAL 16 Reading/Writing files 17 Reporting 16.1 APT_DELIMITED_READ_SIZE 17.1 APT_DUMP_SCORE 16.2 APT_FILE_IMPORT_BUFFER_SIZE 17.2 APT_ERROR_CONFIGURATION 16.3 APT_FILE_EXPORT_BUFFER_SIZE 17.3 APT_MSG_FILELINE 16.4 APT_IMPORT_PATTERN_USES_FILESET 17.4 APT_PM_PLAYER_MEMORY 16.5 APT_MAX_DELIMITED_READ_SIZE 17.5 APT_PM_PLAYER_TIMING 16.6 APT_PREVIOUS_FINAL_DELIMITER_COMPATIBLE 17.6 APT_RECORD_COUNTS 16.7 APT_STRING_PADCHAR 17.7 OSH_DUMP 17.8 OSH_ECHO 18 Sorting 17.9 OSH_EXPLAIN 18.1 APT_NO_SORT_INSERTION 17.10 OSH_PRINT_SCHEMAS 18.2 APT_SORT_INSERTION_CHECK_ONLY 21 SAS Support 19 Teradata Support 21.1 APT_HASH_TO_SASHASH 19.1 APT_TERA_64K_BUFFERS 21.2 APT_NO_SASOUT_INSERT 19.2 APT_TERA_NO_ERR_CLEANUP 21.3 APT_NO_SAS_TRANSFORMS 19.3 APT_TERA_NO_PERM_CHECKS 21.4 APT_SAS_ACCEPT_ERROR 19.4 APT_TERA_NO_SQL_CONVERSION 21.5 APT_SAS_CHARSET 19.5 APT_TERA_SYNC_DATABASE 21.6 APT_SAS_CHARSET_ABORT 19.6 APT_TERA_SYNC_USER 21.7 APT_SAS_COMMAND 21.8 APT_SASINT_COMMAND 20 Transport Blocks 21.9 APT_SAS_DEBUG 20.1 APT_AUTO_TRANSPORT_BLOCK_SIZE 21.10 APT_SAS_DEBUG_IO 20.2 APT_LATENCY_COEFFICIENT 21.11 APT_SAS_DEBUG_LEVEL 20.3 APT_DEFAULT_TRANSPORT_BLOCK_SIZE 21.12 APT_SAS_DEBUG_VERBOSE 20.4 APT_MAX_TRANSPORT_BLOCK_SIZE/ APT_MIN_TRANSPORT_BLOCK_SIZE 21.13 APT_SAS_NO_PSDS_USTRING 21.14 APT_SAS_S_ARGUMENT 21.15 APT_SAS_SCHEMASOURCE_DUMP 21.16 APT_SAS_SHOW_INFO 21.17 APT_SAS_TRUNCATION
May 12, 2014 InfoSphere DataStage Jobstatus returned Codes from dsjob Equ DSJS.RUNNING To 0 This is the only status that means the job is actually running Equ DSJS.RUNOK To 1 Job finished a normal run with no warnings Equ DSJS.RUNWARN To 2 Job finished a normal run with warnings Equ DSJS.RUNFAILED To 3 Job finished a normal run with a fatal error Equ DSJS.QUEUED To 4 Job queued waiting for resource allocation Equ DSJS.VALOK To 11 Job finished a validation run with no warnings Equ DSJS.VALWARN To 12 Job finished a validation run with warnings Equ DSJS.VALFAILED To 13 Job failed a validation run Equ DSJS.RESET To 21 Job finished a reset run Equ DSJS.CRASHED To 96 Job has crashed Equ DSJS.STOPPED To 97 Job was stopped by operator intervention (can’t tell run type) Equ DSJS.NOTRUNNABLE To 98 Job has not been compiled Equ DSJS.NOTRUNNING To 99 Any other status
May 10, 2014 Alphabetical list of stages in DataStage The stages in this document are included and available with IBM InfoSphere DataStage unless otherwise noted in the following key. Palette category Description Data Quality1 Available only with IBM InfoSphere QualityStage Data Quality2 Available as a separately licensed add-on module to InfoSphere QualityStage Application Available as an add-on Pack to IBM InfoSphere DataStage and QualityStage The following key includes additional notes about stages in the Designer client. Stage name Description Stage name3 Available in the Stage Types folder in the repository tree, but is not displayed in the palette by default. Alphabetic A – F: Stages in parallel jobs or server jobs Stage name Job type Palette category Delivery and notes Address Verification Parallel Data Quality2 Aggregator Parallel, server Processing BASIC Transformer3 Parallel Processing Superseded by the Transformer stage CASS Parallel Data Quality2 CDC Transaction Parallel Real Time Delivered as the CDC Transaction patch Change Apply Parallel Processing Change Capture Parallel Processing Checksum Parallel Processing Classic Federation Parallel Database Column Export Parallel Restructure Column Generator Parallel Development/Debug Column Import Parallel Restructure Combine Records Parallel Restructure Command Stage Server Processing Compare Parallel Processing Complex Flat File Parallel, server File Compress Parallel Processing Copy Parallel Processing Data Set Parallel File DB2® Connector Parallel, server Database DB2 Enterprise Parallel Database Superseded by the DB2 Connector stage DB2 UDB API3 Parallel, server Database Superseded by the DB2 Connector stage DB2 UDB Load3 Parallel, server Database Superseded by the DB2 Connector stage DB2Z 3 Parallel Database Superseded by the DB2 Connector stage Decode Parallel Processing Difference Parallel Processing Direct Access for PeopleSoft Enterprise Parallel, server Application Distributed Transaction Parallel Database DPID Parallel Data Quality2 DRS Connector Parallel, server Database Dynamic RDBMS Parallel, server Database Superseded by the DRS Connector stage (Dynamic Relational Stage Connector stage) Encode Parallel Processing Essbase Connector Parallel Application Expand Parallel Processing External Filter Parallel Processing External Source Parallel File External Target Parallel File File Set Parallel File Filter Parallel Processing Folder Server File FTP Enterprise Parallel Processing FTP Plug-in Parallel, server Processing Superseded by the FTP Enterprise stage on the parallel canvas Funnel Parallel Processing Alphabetic G – P: Stages in parallel jobs or server jobs Stage name Job type Palette category Delivery and notes Generic Parallel Processing Hashed File Server File Head Parallel Development/Debug Hierarchy for PeopleSoft Enterprise Parallel, server Application Informix® CLI Parallel, server Database Informix Enterprise Parallel Database Informix Load Parallel, server Database Informix XPS Load3 Server Database InterProcess Server Processing Investigate Parallel Data Quality1 ISD Input Parallel, server Real Time ISD Output Parallel, server Real Time iWay Enterprise Parallel Database Java Client Parallel, server Real Time Java Transformer Parallel, server Real Time JD Edwards EnterpriseOne Parallel, server Application Join Parallel Processing Link Collector Server Processing Link Partitioner Server Processing Lookup Parallel Processing Lookup File Set Parallel File Make Subrecord Parallel Restructure Make Vector Parallel Restructure MapStage (for TX) Parallel Application Delivered as IBM WebSphere Transformation Extender for DataStage Match Frequency Parallel Data Quality1 Merge Parallel, server Processing MNS (Multinational Standardization) Parallel Data Quality1 Modify Parallel Processing MS OLEDB Server Database MS SQL Server Load Server Database Netezza Enterprise Parallel Database Documentation is available only with the installed product ODBC Server Database Superseded by the ODBC Connector stage ODBC Connector Parallel, server Database ODBC Enterprise3 Parallel Database, Application Superseded by the ODBC Connector stage Oracle Applications Direct Access Parallel, server Application Oracle Applications Hierarchy Access Parallel, server Application Oracle Connector Parallel, server Database Oracle Enterprise3 Parallel Database Superseded by the Oracle Connector stage Oracle OCI (Oracle Call Interface)3 Server Database Superseded by the Oracle Connector stage Oracle OCI Load3 Parallel, server Database, Application Superseded by the Oracle Connector stage Peek Parallel Development/Debug Pivot Parallel, server Processing Superseded by the Pivot Enterprise stage on the parallel canvas Pivot Enterprise Parallel Processing Promote Subrecord Parallel Restructure Alphabetic Q – Z : Stages in parallel jobs or server jobs Stage name Job type Palette category Delivery and notes QualityStage Legacy Parallel Data Quality1 RedBrick Load Parallel, server Database Reference Match Parallel Data Quality1 Remove Duplicates Parallel Processing Row Generator Parallel Development/Debug Row Merger Server Processing Row Splitter Server Processing Salesforce Parallel Application Sample Parallel Development/Debug (SAP BW) BW Load Parallel, server Application (SAP BW) BW Open Hub Parallel, server Application (SAP R/3) ABAP Extract Parallel, server Application (SAP R/3) BAPI Parallel, server Application (SAP R/3) IDoc Extract Parallel, server Application (SAP R/3) IDoc Load Parallel, server Application SAS Parallel Application SAS Parallel Data Set Parallel Application Sequential File Parallel, server File SERP (Software Evaluation and Recognition Program) Parallel Data Quality2 Siebel BC Pack (Business Component) Parallel, server Application Siebel DS Pack (Direct Access) Parallel, server Application Siebel EIM Pack (Enterprise Integration Manager) Parallel, server Application Slowly Changing Dimension Parallel Processing Sort Parallel, server Processing Split Subrecord Parallel Restructure Split Vector Parallel Restructure SQA (Standardization Quality Assessment) Parallel Data Quality1 SQLServer Enterprise Parallel Database Standardize Parallel Data Quality1 Stored Procedure Parallel, server Database Surrogate Key Generator Parallel Processing Survive Parallel Data Quality1 Switch Parallel Processing Sybase BCP Load Server Database Sybase Enterprise Parallel Database Sybase IQ 12 Load Parallel, server Database Sybase OC Parallel, server Database, Application Tail Parallel Development/Debug Teradata API3 Parallel, server Database Superseded by the Teradata Connector stage Teradata Connector Parallel, server Database Teradata Enterprise 3 Parallel Database Superseded by the Teradata Connector stage Teradata Load3 Parallel, server Database Superseded by the Teradata Connector stage Teradata Multiload3 Parallel, server Database Superseded by the Teradata Connector stage Transformer Parallel, server Processing Unduplicate Match Parallel Data Quality1 UniData® Server Database UniData 6 Server Database Superseded by the UniData stage UniVerse Server Database Wave Generator Parallel Processing Web Services Client Parallel, server Real Time Web Services Transformer Parallel, server Real Time WebSphere® MQ3 Parallel, server Real Time Superseded by the WebSphere MQ Connector stage WebSphere MQ Connector Parallel, server Real Time Write Range Map Parallel Development/Debug XML Parallel, server Real Time Delivered as the XML Transformer patch XML Input Parallel, server Real Time Superseded by the XML stage XML Output Parallel, server Real Time Superseded by the XML stage XML Transformer Parallel, server Real Time Superseded by the XML stage zOS File Parallel File Stages that are available in sequence jobs Stage name Palette category EndLoop Activity Sequence Exception Handler Sequence Execute Command Sequence Job Activity Sequence Nested Condition Sequence Notification Activity Sequence Routine Activity Sequence Sequencer Sequence StartLoop Activity Sequence Terminator Activity Sequence UserVariables Activity Sequence Wait For File Activity Sequence
May 10, 2014 Datastage Webservices pack for SOAP over HTTP V 8.0 and up already has a Wbeservices pack for Datastage. if in 7.5, this needs to be installed (acquired) separatelyStages:WSClient StageWSTransformer StageDatastage can act as a SOAP client. DataStage Job can reach out and invoke a Web Service. Get the WSDL from the webservices team for the API to be invoked(WISD, or RTI, allows you to publish a DataStage Job or other Information Server asset “as” a Web Service)for more information:http://etlimpact.blogspot.in/2012/08/web-service-integration-through.html http://www.dstagebox.com/2011/07/expose-datastage-job-as-a-web-service/ http://infospheredatastage.blogspot.in/2011/09/infosphere-datastage-and-web-service.html
May 10, 2014 Difference between Hash and Modulus partitioning Hash and Modulus techniques are Key based partition techniques. Hash and Modulus techniques are used for different purpose. If Key column data type is textual then we use has partition technique for the job. If Key column data type is numeric, we use modulus partition technique. If one key column numeric and another text then also we use has partition technique. if both the key columns are numeric data type then we use modulus partition technique.
May 10, 2014 Protecting projects in datastage Only Administrator/root can protect or unprotect the project.
May 10, 2014 Routines Routines are stored in theRoutines branch of the DataStage Repository, where you can create, view oredit. The following are different types of routines: 1) Transform functions 2) Before-after job subroutines 3) Job Control routines
May 9, 2014 Cleanup after an aborted load or update DROP TABLE TableName_LOG; DROP TABLE TableName_ET; DROP TABLE TableName_UV; DROP TABLE TableName_WT; RELEASE MLOAD TableName;