Datastage Coding Checklist

Category

Checkpoint

General

 

General

Ensure that null handling properties are set for all nullable fields. Do not set the null field value to a value that may occur in the source data.

General

Ensure that all character fields are trimmed before any processing. Extra spaces in the data can cause errors, such as lookup mismatches, that are hard to detect.

General

Always save the metadata (for source, target or lookup definitions) in the repository to ensure reusability and consistency.

General

If the partition type for the next immediate stage is to be changed, then ‘Propagate partition’ should be set to ‘Clear’ in the current stage.

General

Make sure that appropriate partitioning and sorting are used in the stages wherever possible. This improves performance. Make sure that you understand the partitioning being used; otherwise leave it as Auto.

General

Make sure that pathname/format details are not hard coded and that job parameters are used for them. These details are generally set as environment variables.

General

Ensure that all file names from external sources are parameterized. This saves the developer from having to change the job if the file name changes. File names/Datasets created in the job for intermediate purposes can be hard coded.

General

Ensure that the environment variable $APT_DISABLE_COMBINATION is set to ‘False’.

General

Ensure that $APT_STRING_PADCHAR is set to spaces.

General

Parameters used across jobs should have the same name. This helps to avoid unnecessary confusion.

General

Use a 4-node configuration file for unit testing/system testing the job.

General

If there are multiple jobs to be run for the same module, archive the source files in the after job routine of the last job.

General

Check whether the file exists in the landing directory before moving the sequential file. The ‘mv’ command may move the landing directory itself if the file is not found (for example, when the file name resolves to an empty string).

General

Verify whether the appropriate after job routine is called in the job.

General

Verify that the correct link counts are used in the after job routine for the ACR log file.

General

Check whether the log statements are correct for that job.

General

Ensure that the unix files created by any Datastage job are created by the same unix user who ran the job.

General

Check the Director log if the error message is not readable.

General

Verify that the job name, stage names, link names and input file names are as per standards. Ensure that the job developed adheres to the naming standards defined for the software artifacts.

General

Job description must be clear and readable.

General

Make sure that the Short Job Description is filled using ‘Description Annotation’ and it contains the job name as part of the description. Don’t use Annotation for putting the job description.

General

Check that the parameter values are assigned to the jobs through the sequencer.

General

Verify whether Runtime Column Propagation (RCP) is disabled.
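
The checkpoint above about checking for the file before running ‘mv’ can be sketched as a small guard. This is only an illustration in Python, not a Datastage routine; the function name `move_landing_file` and its arguments are hypothetical.

```python
import shutil
from pathlib import Path


def move_landing_file(landing_dir: str, file_name: str, target_dir: str) -> Path:
    """Move one file out of the landing directory (hypothetical helper).

    Refuses to act unless the source is an existing regular file, so the
    landing directory itself can never be moved by accident.
    """
    # An empty name would make the source path resolve to the landing directory.
    if not file_name or not file_name.strip():
        raise ValueError("file name is empty; refusing to move the landing directory")

    source = Path(landing_dir) / file_name
    if not source.is_file():
        raise FileNotFoundError(f"{source} is not an existing regular file")

    destination = Path(target_dir) / source.name
    shutil.move(str(source), str(destination))
    return destination
```

The same idea applies to a shell after job routine: test that the source is a regular file first, and only then call ‘mv’.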

File Stages Checklist

 

File Stages Checklist

Ensure that reject links are output from the sequential file stage which reads the data file to log the records which are rejected.

File Stages Checklist

Check whether datasets are used instead of sequential files for intermediate storage between jobs. This improves performance in a set of linked jobs.

File Stages Checklist

Reject records should be stored as sequential files. This makes it easier to analyze rejected records outside Datastage.

File Stages Checklist

Ensure that a dataset from another job uses the same metadata that is saved in the repository.

File Stages Checklist

Verify that the intermediate files that are used by downstream jobs have unix read permission for all users.

File Stages Checklist

For fixed width files, final delimiter should be set to “none” in the file format property.

File Stages Checklist

Verify that all lookup reference files have unix permission as 744. This will ensure that other users don’t overwrite or delete the reference file.
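
The two permission checkpoints above (744 on lookup reference files, read access for all users on intermediate files) can be verified with a short script. The sketch below is an assumption about how such a check might look; the function names `has_mode` and `others_can_read` are illustrative only.

```python
import os
import stat


def has_mode(path: str, expected: int = 0o744) -> bool:
    """Return True when the file's permission bits equal the expected octal mode."""
    return stat.S_IMODE(os.stat(path).st_mode) == expected


def others_can_read(path: str) -> bool:
    """Return True when users outside the owner and group have read permission."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)
```

For example, `has_mode("/path/to/lookup_file")` would be expected to return True for a reference file set to 744 (the path here is a placeholder).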

Processing Stages Checklist

 

Processing Stages Checklist

Stage variables should be in the correct order. E.g., a stage variable used in a calculation should be placed above the variable that performs the calculation.

Processing Stages Checklist

If any processing stage requires a key (such as Remove Duplicates, Merge, Join, etc.), the stage keys, sorting keys and partitioning keys should be the same and in the same order.

Processing Stages Checklist

Make sure that sparse lookups are not used when large volumes of data are handled.

Processing Stages Checklist

Check that the lookup keys are correct.

Processing Stages Checklist

Do not validate on a null field in a transformer. Use appropriate data types for the Stage variables. Use IsNull(), IsNotNull() or Seq() for doing such validations.

Processing Stages Checklist

In Funnel, all the input links must be hash partitioned on the sort keys.

Processing Stages Checklist

Verify if any Transformer is set to run in sequential mode. It should be in parallel mode.

Processing Stages Checklist

RCP should be enabled in the Copy stage before shared container.

Processing Stages Checklist

Verify whether column generated using column generator is created using ‘part’ and ‘part count’.

Processing Stages Checklist

In Remove Duplicate stage, ensure that the correct record (according to the requirements) is retained.
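
To see why the sort order matters for the Remove Duplicate checkpoint above, the retention logic can be sketched outside Datastage. This is a plain Python illustration under the assumption that records are sorted on the key plus an ordering column and that either the first or the last record per key is kept; the function name and column names are hypothetical.

```python
from typing import Any, Dict, List


def remove_duplicates(
    rows: List[Dict[str, Any]], key: str, order_by: str, retain: str = "last"
) -> List[Dict[str, Any]]:
    """Keep one record per key value after sorting on (key, order_by).

    retain="first" keeps the lowest order_by value per key,
    retain="last" keeps the highest.
    """
    if retain not in ("first", "last"):
        raise ValueError("retain must be 'first' or 'last'")

    sorted_rows = sorted(rows, key=lambda r: (r[key], r[order_by]))
    kept: Dict[Any, Dict[str, Any]] = {}
    for row in sorted_rows:
        if retain == "first":
            kept.setdefault(row[key], row)  # only the first record per key is stored
        else:
            kept[row[key]] = row  # later records overwrite earlier ones
    return list(kept.values())
```

If the requirement is "latest record per customer", sorting on a timestamp and retaining the last record gives a different result from retaining the first, which is what this checkpoint asks the reviewer to confirm.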

Database Stages Checklist

 

Database Stages Checklist

Every database object referenced is accessed through the parameter schema name.

Database Stages Checklist

Always reference database object using the schema name.

Database Stages Checklist

Use Upper Case for column names and table names in SQL queries.
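
The three database checkpoints above can be combined in one small illustration: the schema comes from a parameter, every object is schema-qualified, and table and column names are upper case. The Python helper below is a hypothetical sketch of that convention, not part of Datastage.

```python
from typing import Sequence


def build_select(schema: str, table: str, columns: Sequence[str]) -> str:
    """Build a schema-qualified SELECT with upper-case table and column names.

    The schema is passed in as a parameter rather than hard coded.
    """
    if not schema:
        raise ValueError("schema parameter must not be empty")
    if not columns:
        raise ValueError("at least one column is required")

    column_list = ", ".join(column.upper() for column in columns)
    return f"SELECT {column_list} FROM {schema}.{table.upper()}"
```

Calling `build_select("STG", "customer", ["id", "name"])` yields `SELECT ID, NAME FROM STG.CUSTOMER`; the schema, table and column names here are made up for the example.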

Job Sequencer Checklist

 

Job Sequencer Checklist

Check that the parameter values are assigned to the jobs through the sequencer.

Job Sequencer Checklist

For every Job Activity stage in sequencer, ensure that “Reset if required, then run” is selected where relevant.

 
