September 7, 2013 \ Ananth TM

Datastage Coding Checklist

Category: General

- Ensure that null handling properties are set for all nullable fields. Do not set the null field value to a value that may be present in the source.
- Ensure that all character fields are trimmed before any processing. Extra spaces in the data can cause errors, such as lookup mismatches, that are hard to detect.
- Always save the metadata (for source, target or lookup definitions) in the repository to ensure reusability and consistency.
- If the partition type for the next immediate stage is to be changed, set ‘Propagate partition’ to ‘Clear’ in the current stage.
- Make sure that appropriate partitioning and sorting are used in the stages wherever possible. This improves performance. Make sure that you understand the partitioning being used; otherwise leave it as Auto.
- Make sure that pathname/format details are not hard coded and that job parameters are used for them. These details are generally set as environment variables.
- Ensure that all file names from external sources are parameterized. This saves the developer from having to change the job if the file name changes. File names/datasets created in the job for intermediate purposes can be hard coded.
- Ensure that the environment variable $APT_DISABLE_COMBINATION is set to ‘False’.
- Ensure that $APT_STRING_PADCHAR is set to spaces.
- Parameters used across jobs should have the same name. This helps to avoid unnecessary confusion.
- Use a 4-node configuration file for unit testing/system testing the job.
- If multiple jobs are to be run for the same module, archive the source files in the after-job routine of the last job.
- Check whether the file exists in the landing directory before moving the sequential file. The ‘mv’ command will move the landing directory if the file is not found.
- Verify that the appropriate after-job routine is called in the job.
- Verify that the correct link counts are used in the after-job routine for the ACR log file.
- Check whether the log statements are correct for that job.
- Ensure that the unix files created by any Datastage job are created by the same unix user who ran the job.
- Check the Director log if the error message is not readable.
- Verify that the job name, stage name, link name and input file name are as per standards. Ensure that the job developed adheres to the naming standards defined for the software artifacts.
- The job description must be clear and readable.
- Make sure that the Short Job Description is filled in using ‘Description Annotation’ and that it contains the job name as part of the description. Don’t use Annotation for the job description.
- Check that the parameter values are assigned to the jobs through the sequencer.
- Verify whether Runtime Column Propagation (RCP) is disabled.

Category: File Stages Checklist

- Ensure that reject links are output from the Sequential File stage that reads the data file, to log the records that are rejected.
- Check whether datasets are used instead of sequential files for intermediate storage between jobs. This improves performance in a set of linked jobs.
- Reject records should be stored as sequential files. This makes analysis of rejected records outside Datastage easier.
- Ensure that a dataset from another job uses the same metadata that is saved in the repository.
- Verify that the intermediate files used by downstream jobs have unix read permission for all users.
- For fixed-width files, the final delimiter should be set to “none” in the file format property.
- Verify that all lookup reference files have unix permission 744. This ensures that other users don’t overwrite or delete the reference file.

Category: Processing Stages Checklist

- Stage variables should be in the correct order. For example, a stage variable used in a calculation should be placed higher in the order than the variable that performs the calculation.
- If a processing stage requires a key (such as Remove Duplicates, Merge, Join, etc.), the keys, sorting keys and partitioning keys should be the same and in the same order.
- Make sure that sparse lookups are not used when large volumes of data are handled.
- Check that the lookup keys are correct.
- Do not validate on a null field in a Transformer. Use appropriate data types for the stage variables. Use IsNull(), IsNotNull() or Seq() for such validations.
- In Funnel, all the input links must be hash partitioned on the sort keys.
- Verify whether any Transformer is set to run in sequential mode. It should be in parallel mode.
- RCP should be enabled in the Copy stage before a shared container.
- Verify whether a column generated using Column Generator is created using ‘part’ and ‘part count’.
- In the Remove Duplicates stage, ensure that the correct record (according to the requirements) is retained.

Category: Database Stages Checklist

- Every database object referenced should be accessed through the parameterized schema name.
- Always reference database objects using the schema name.
- Use upper case for column names and table names in SQL queries.
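
The schema-name checkpoints above lend themselves to a quick automated review of the SQL text used in a database stage. The following is a minimal sketch only, assuming job parameters appear in the SQL in the `#NAME#` form; the parameter name `#SCHEMA_NAME#` and the function `unqualified_tables` are hypothetical, and the regular expression is a rough heuristic rather than a real SQL parser.

```python
import re

# Captures the identifier that follows FROM, JOIN, INTO or UPDATE.
# Heuristic only: subqueries, comments and quoted names are not handled.
TABLE_REF = re.compile(r"\b(?:FROM|JOIN|INTO|UPDATE)\s+([#\w$.]+)", re.IGNORECASE)

# A reference counts as parameterized when it looks like #PARAM#.TABLE
# (assumed convention; '#SCHEMA_NAME#' below is a hypothetical parameter).
PARAM_SCHEMA = re.compile(r"^#[\w$]+#\.\w+$")


def unqualified_tables(sql: str) -> list[str]:
    """Return table references that lack a #PARAM# schema prefix."""
    return [ref for ref in TABLE_REF.findall(sql) if not PARAM_SCHEMA.match(ref)]


if __name__ == "__main__":
    query = (
        "SELECT C.CUST_ID, O.ORDER_ID "
        "FROM #SCHEMA_NAME#.CUSTOMER C "
        "JOIN ORDERS O ON C.CUST_ID = O.CUST_ID"
    )
    # ORDERS is reported because it has no parameterized schema prefix.
    print(unqualified_tables(query))
```

A reviewer could run such a check over the SQL extracted from each job and treat any non-empty result as a checklist failure to inspect by hand.
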

Category: Job Sequencer Checklist

- Check that the parameter values are assigned to the jobs through the sequencer.
- For every Job Activity stage in the sequencer, ensure that “Reset if required, then run” is selected where relevant.
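
The parameter checkpoint for sequencers can be illustrated with a small sketch. It assumes the sequencer's parameter values and each job's parameter names have already been collected into plain dictionaries, for example by hand or from a job export. The function name, dictionary layout and sample names are all hypothetical, and nothing here reads from Datastage itself.

```python
def unassigned_parameters(
    sequencer_params: dict[str, str],
    job_params: dict[str, list[str]],
) -> dict[str, list[str]]:
    """Map each job name to the parameters the sequencer does not supply."""
    missing: dict[str, list[str]] = {}
    for job, names in job_params.items():
        gaps = [name for name in names if name not in sequencer_params]
        if gaps:
            missing[job] = gaps
    return missing


if __name__ == "__main__":
    # Hypothetical sequencer-level values and per-job parameter names.
    sequencer = {"SRC_DIR": "/data/landing", "SCHEMA_NAME": "SALES"}
    jobs = {
        "jb_load_customer": ["SRC_DIR", "SCHEMA_NAME"],
        "jb_load_orders": ["SRC_DIR", "TGT_DIR"],
    }
    # jb_load_orders is reported because TGT_DIR is not set by the sequencer.
    print(unassigned_parameters(sequencer, jobs))
```

Because the comparison is by exact name, this kind of check only works when parameters carry the same name across jobs, which is one practical reason for that checkpoint in the General list.
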