Why do we use AutoSys or another job scheduler?

  1. AutoSys gives you various job-control options, such as JOB_ON_ICE and JOB_ON_HOLD.
  2. Scheduling is simple. If you want a job to run every hour, in DataStage you have to schedule it 24 times, which would create 24 processes (each with a distinct PID). In AutoSys you do not have to go to that much trouble.
  3. If you want to run a job on the first Monday of every month, you just set up a calendar in AutoSys. I could not think of a way to do this in DataStage.
  4. If you want to run a job on the first business day of every month (what counts as a business day may vary by client), you again just set up a calendar in AutoSys. I could not think of a way to do this in DataStage either.
  5. In short, AutoSys gives you more scheduling options with less effort. Reusability and maintenance are also factors.

How do you connect a DataStage job with AutoSys?

Irrespective of the scheduler you use (AutoSys, SeeBeyond, ControlM, at, cron, to name a few), use the command-line interface dsjob to specify what you want DataStage to do.

You will have to write a wrapper shell script that uses the DataStage CLI (command-line interface). Once the wrapper shell script is created, you just execute it through AutoSys.
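
A minimal sketch of such a wrapper is shown here. The function name run_ds_job and the overridable DSJOB variable are hypothetical, introduced only for illustration. It assumes that dsjob -run -jobstatus waits for the job and exits with the job status, where 1 means finished OK and 2 means finished with warnings. Check those values against the documentation for your DataStage version before relying on them.

```shell
#!/bin/sh
# Hypothetical wrapper: run one DataStage job and map its status to an
# exit code that a scheduler can act on (0 = success, 1 = failure).
# DSJOB can be overridden, for example to point at a full path to dsjob.
DSJOB="${DSJOB:-dsjob}"

run_ds_job() {
    project="$1"
    job="$2"
    status=0
    # Assumption: with -jobstatus, dsjob waits for the job to finish and
    # exits with the job status instead of the usual 0/non-zero result.
    "$DSJOB" -run -jobstatus "$project" "$job" || status=$?
    case "$status" in
        1|2) return 0 ;;   # 1 = finished OK, 2 = finished with warnings
        *)   return 1 ;;   # anything else is treated as a failure here
    esac
}
```

A script built around this function would end with a call such as run_ds_job "$1" "$2", and the AutoSys job would simply run that script as its command. Whether warnings should count as success is a site decision; this sketch treats them as success.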

 

Datastage Job scheduler

1. Make sure that scheduling has been set up with a valid user in the Administrator client for the project you are working on.
2. In Director, highlight your job and click the ‘Add schedule’ toolbar button. Enter when you want it to run (‘Every’ with a day of the week, or whatever schedule you want) and set the time you want it to run.
3. If you want to add or remove limits (maximum number of warnings and/or rows), set them at this point as well.
4. Then click ‘Schedule’.

Datastage Scheduler options

For DataStage implementations under UNIX/Linux, AutoSys is the most popular scheduler. The reasons AutoSys is preferred are:

1. AutoSys gives you various job-control options, such as JOB_ON_ICE and JOB_ON_HOLD.
2. Scheduling is simple. If you want a job to run every hour, in DataStage you have to schedule it 24 times, which would create 24 processes (each with a distinct PID). In AutoSys you do not have to go to that much trouble.
3. If you want to run a job on the first Monday of every month, you just set up a calendar in AutoSys. In DataStage there is no obvious way to do this.
4. If you want to run a job on the first business day of every month (what counts as a business day may vary by client), you again just set up a calendar in AutoSys. In DataStage this is tougher.

In short, AutoSys gives you more scheduling options with less effort. Reusability and maintenance are also factors.
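
To illustrate the extra effort involved without a calendar, here is a sketch of the date test a cron-driven wrapper would need in order to run only on the first Monday of the month. The function name is_first_monday is hypothetical. The rule it applies is that the first Monday is the only Monday whose day of month is 7 or less.

```shell
# Succeeds when the given day is the first Monday of its month.
# $1 = ISO day of week (1 = Monday), $2 = day of month.
# Both default to today's values when omitted.
is_first_monday() {
    dow="${1:-$(date +%u)}"
    dom="${2:-$(date +%d)}"
    dom="${dom#0}"    # strip a leading zero, e.g. 07 becomes 7
    [ "$dow" -eq 1 ] && [ "$dom" -le 7 ]
}
```

A wrapper scheduled every Monday could call is_first_monday and exit early when it fails. The first-business-day case cannot be computed this way, because it depends on the client's holiday list, which is the kind of rule an AutoSys calendar is meant to hold.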

Other schedulers include SeeBeyond, ControlM, at, and cron.

Although DataStage offers a scheduling option in Director, it does not include a scheduler of its own and instead leverages the underlying operating system. On UNIX, it uses cron for recurring schedules and at for one-off schedules, so the crontab entries of the scheduling user show what has been scheduled. On Windows, it uses Windows scheduled tasks.

From the operating system command line, logged in as the scheduling user, run “crontab -l” to list the scheduled jobs.
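
As a small sketch, the two places to look on UNIX can be checked together. The function name show_os_schedule and the overridable CRONTAB and AT variables are hypothetical. The commands themselves are the standard crontab -l and at -l, and the output format of at -l varies by platform.

```shell
# CRONTAB and AT can be overridden, for example with full paths.
CRONTAB="${CRONTAB:-crontab}"
AT="${AT:-at}"

# Show recurring (cron) and one-off (at) entries for the current user.
show_os_schedule() {
    echo "Recurring (cron):"
    "$CRONTAB" -l 2>/dev/null || echo "  (no crontab for this user)"
    echo "One-off (at):"
    "$AT" -l 2>/dev/null || echo "  (at queue not available)"
}
```

Run it while logged in as the scheduling user, since both commands report only that user's entries.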

You can schedule a job to run in a number of ways:
  • Once today at a specified time
  • Once tomorrow at a specified time
  • On a specific day and at a particular time
  • Daily at a particular time
  • On the next occurrence of a particular date and time

Short procedure

From Director:

1. Select job
2. Add to Schedule (right-click or clock icon)
3. Select the ‘Every’ radio button
4. Make sure Monday through Friday are highlighted
5. Set time to 9:00 AM
6. Click OK
7. Set parameters
8. Click Schedule 
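
For comparison, the same weekday 9:00 AM schedule maintained by hand in cron would be a single crontab line. The helper weekday_cron_line and the path /path/to/run_job.sh are hypothetical, and the entry that Director itself writes to cron may look different.

```shell
# Print a crontab line that runs a command Monday through Friday.
# $1 = hour (0-23), $2 = minute (0-59), $3 = command to run.
weekday_cron_line() {
    # crontab field order: minute hour day-of-month month day-of-week
    printf '%s %s * * 1-5 %s\n' "$2" "$1" "$3"
}

weekday_cron_line 9 0 /path/to/run_job.sh
# prints: 0 9 * * 1-5 /path/to/run_job.sh
```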

Procedure
  1. Select the job or job invocation you want to schedule in the Job Status or Job Schedule view.
    Note: You cannot schedule a job with a status of Not compiled or a web service-enabled job.
  2. Do one of the following to display the Add to schedule dialog box:
    • Choose Job > Add to Schedule.
    • Choose Add To Schedule from the appropriate shortcut menu.
    • Click the Schedule button on the toolbar.
  3. Choose when to run the job by clicking the appropriate option button:
      Today runs the job today at the specified time (in the future).
      Tomorrow runs the job tomorrow at the specified time.
      Every runs the job on the chosen day or date at the specified time in this month and repeats the run at the same date and time in the following months.
      Next runs the job on the next occurrence of the day or date at the specified time.
      Daily runs the job every day at the specified time.
  4. If you selected Every or Next in step 3, choose the day to run the job by doing one of the following:
    • Choose an appropriate day or days from the Day list.
    • Choose a date from the calendar.
      Note: If you choose an invalid date, for example, 31 September, the behavior of the scheduler depends upon the operating system of the computer that hosts the engine tier, and you might not receive a warning of the invalid date. Refer to your documentation for the engine tier host for further information.
  5. Choose the time to run the job. There are two time formats:
    • 12-hour clock. Click either AM or PM.
    • 24-hour clock. Click 24H Clock.
      Click the arrow buttons to increase or decrease the hours and minutes, or enter the values directly.
  6. Click OK. The Add to schedule dialog box closes and the Job Run Options dialog box appears.
  7. Fill in the job parameter fields and check warning and row limits, as appropriate.
  8. Click Schedule. The job is scheduled to run and is added to the Job Schedule view.

Rename many/all jobs

  • Create an Excel spreadsheet with new and old names. 
  • Export the whole project as a dsx file.
  • Write a Perl program that renames the strings by looking up the old and new names from the Excel file.
  • Then import the new dsx file, preferably into a new project for testing.
  • Recompile all jobs. 

Make sure the job names are also changed in your job control jobs and Sequencer jobs, which refer to other jobs by name. You may have to make the corresponding changes to these Sequencers by hand.
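
The rename step can be sketched as follows, using shell and sed in place of the Perl program mentioned above. The function name rename_jobs_in_dsx is hypothetical. The sketch assumes the spreadsheet has been saved as a two-column CSV file (old name, new name) with no header row, that job names contain only letters, digits, and underscores, and that names appear in the dsx file as complete double-quoted strings.

```shell
# Rewrite job names in a dsx export using an old,new mapping file.
# $1 = mapping CSV, $2 = exported dsx file, $3 = output dsx file.
rename_jobs_in_dsx() {
    map_csv="$1"
    dsx_in="$2"
    dsx_out="$3"
    sed_script=$(mktemp) || return 1
    # Build one sed substitution per mapping row. Carriage returns are
    # stripped first because CSV files saved from Excel often contain them.
    tr -d '\r' < "$map_csv" |
    while IFS=, read -r old new || [ -n "$old" ]; do
        [ -n "$old" ] && [ -n "$new" ] || continue
        # Match only whole quoted names so "Load" does not alter "LoadAll".
        printf 's/"%s"/"%s"/g\n' "$old" "$new"
    done > "$sed_script"
    rc=0
    sed -f "$sed_script" "$dsx_in" > "$dsx_out" || rc=$?
    rm -f "$sed_script"
    return "$rc"
}
```

Because only whole quoted names are replaced, a job name embedded inside a longer string, such as in job control code, is left untouched. Those references still need the manual check described above.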

Schedule Datastage ETL jobs – Basics

What is the utility used to schedule the jobs on a UNIX server?

The dsjob utility (the DataStage command-line interface) is what a UNIX scheduler calls to run DataStage jobs. dsjob commands allow us to:

  • Start a job (-run)
  • Stop a job (-stop)
  • List projects, jobs, stages, links, and parameters
  • Set an alias for a job
  • Retrieve information
  • Access log files (important; see the IBM reference). The form below adds a text message to the job log, while -logsum and -logdetail read entries back:
    dsjob -log [ -info | -warn ] [ -useid ] project job|job_id
  • Generate a report
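
A short sketch of how two of these commands might be combined in a script is given here. The function name ds_job_overview and the overridable DSJOB variable are hypothetical. The sketch assumes the dsjob options -jobinfo and -logsum with -type and -max, so confirm them against the dsjob reference for your version.

```shell
# DSJOB can be overridden, for example with the full path to dsjob.
DSJOB="${DSJOB:-dsjob}"

# Print a job's status information followed by up to 10 warning
# entries from its log.
ds_job_overview() {
    project="$1"
    job="$2"
    "$DSJOB" -jobinfo "$project" "$job" || return 1
    "$DSJOB" -logsum -type WARNING -max 10 "$project" "$job"
}
```

A scheduler wrapper could call this after a failed run so that the job status and recent warnings end up in the scheduler's own output log.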


There are many tools that help with scheduling

  • Ascential Director
  • AutoSys: through AutoSys you can automate a job by invoking the shell script written to run the DataStage job
    • AutoSys is the most popular scheduler
  • SeeBeyond
  • ControlM
  • at
  • cron
