Code checklist for DataStage

Are naming conventions followed for all objects (stages, links & datasets)?
Are all required parameter sets attached to the job?
Are all required jobs compiled with the latest parameter set? (in case of projects where existing jobs are being changed)
Are all source column data types set to Varchar, with Nullable set to Yes?
Are all parameters passed correctly as required to the shared container?
Is an annotation/description included within the job?
Are the dataset path, holding area path, etc. HARDCODED?
Does the job produce WARNINGS when run? If so, check whether they can be avoided.
Is the Short Job Description / Full Job Description provided?
Is the job version number provided?
Are dataset names in lower case with a .ds extension?
Is the pFileName parameter missing and the file name hardcoded instead (wherever applicable)?
Ensure that the null handling properties are taken care of for all nullable fields. Do not set the null field value to a value that may also be present in the source.
Ensure that all the character fields are trimmed before any processing. Normally extra spaces in the data may lead to some errors like lookup mismatch which are hard to detect.
If the partition type is to be changed in the immediately following stage, set ‘Preserve partitioning’ to ‘Clear’ in the current stage.
Make sure that appropriate partitioning and sorting are used in the stages wherever possible; this improves performance. Make sure that you understand the partitioning being used; otherwise leave it as Auto.
Check that the parameter values are assigned to the jobs through the sequencer.
Check whether datasets are used instead of sequential files for intermediate storage between jobs. This improves performance in a set of linked jobs.
Verify that the intermediate files used by downstream jobs have Unix read permission for all users.
Stage variables should be defined in the correct order. E.g., a stage variable that is used in a calculation should be placed above the stage variable that holds the calculation.
Do not validate directly on a null field in a transformer; use IsNull(), IsNotNull() or Seq() for such validations. Use appropriate data types for the stage variables.
While reading columns from a Database stage, use Upper Case for column names and table names in SQL queries as a good practice.
While reading from Database or querying, use “for fetch only” instead of “with ur” to avoid table lock.
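
The point above about trimming character fields before processing can be illustrated outside DataStage. The following is a minimal Python sketch, not DataStage code; the reference table, the sample codes, and the lookup_regions helper are all hypothetical and exist only to show how padded keys silently miss a lookup.

```python
# Hypothetical reference data: customer code -> region.
reference = {"C001": "North", "C002": "South"}

# Hypothetical source values, padded the way fixed-width extracts often are.
source_codes = ["C001  ", " C002", "C003"]


def lookup_regions(codes, table, trim):
    """Return the looked-up region for each code, or None on a miss."""
    results = []
    for code in codes:
        # Trimming removes leading and trailing whitespace before the lookup.
        key = code.strip() if trim else code
        results.append(table.get(key))
    return results


untrimmed = lookup_regions(source_codes, reference, trim=False)
trimmed = lookup_regions(source_codes, reference, trim=True)

print(untrimmed)  # [None, None, None]
print(trimmed)    # ['North', 'South', None]
```

Without trimming, every padded key misses even though the matching reference row exists, and nothing raises an error. That is why this kind of mismatch is hard to detect.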

For every Job Activity stage in sequencer, ensure that “Reset if required, then run” is selected where relevant.
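
Two of the checks above, lower-case dataset names with a .ds extension and read permission for all users on intermediate files, could be scripted outside DataStage. The following is a rough Python sketch under stated assumptions: the name pattern (lower-case letters, digits and underscores) and the audit_intermediate_file helper are hypothetical, and the permission check inspects only the file at the given path, not any data files it may refer to.

```python
import os
import re
import stat

# Assumed convention: lower-case letters, digits, underscores, then ".ds".
DATASET_NAME = re.compile(r"^[a-z0-9_]+\.ds$")


def audit_intermediate_file(path):
    """Return a list of checklist problems found for one file path."""
    problems = []
    name = os.path.basename(path)
    if not DATASET_NAME.match(name):
        problems.append("name is not lower case with a .ds extension")
    if not os.path.exists(path):
        problems.append("file does not exist")
        return problems
    mode = os.stat(path).st_mode
    # Read bits for owner, group and others must all be set.
    needed = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH
    if (mode & needed) != needed:
        problems.append("file is not readable by all users")
    return problems
```

An empty list means the file passed both checks. For example, audit_intermediate_file("/some/path/cust_stage.ds") could be called for each intermediate file before the downstream job runs (the path here is only a placeholder).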
