Operator role by default has very few options, check how it is setup in your environment.
You could recheck – extract the job log via command line
- sanity test your environment.
- RowGen->Peek … then compile and run
- RowGen -> Transformer -> Peek … compile and run
$DSHOME/bin/dsjob -domain host:port -server host -user u124443 -password some43@$ -lprojects
The Lookup Stage: It memory maps files. This means that you MUST have enough system memory available to store the entire contents of the file AND you must have enough disk space on the resource disk (defined in the APT_CONFIG_FILE) to shadow the file in memory.
‘Each physical process can address only up to 4 GB of memory because it is a 32-bit application. The Windows version of the InfoSphere DataStage Parallel Engine is available only with 32-bit pointers. ‘
It is not that you dont have enough memory on the system, but that to load the whole map into memory hits the limit. The issue was worked around by using a join stage instead of lookup.
Another option is to Change the job to re-partition both the reference input to the lookup and the primary input to lookup to match on the keys. Because the lookup is running 4 way parallel and because we have explicitly partitioned the data, the lookup will disable memory sharing and the per process memory requirement is reduced on the reference input because of the data distribution. This will enable the job to complete.
Fatal Error: APT_Communicator::shmemInitBuffers: createMapFile (/tmp) failed: Not enough space on node.
This message can also be caused by system-wide limit on the number of mmap’ed shared memory segments. Often this issue occurs when the value for APT_DEFAULT_TRANSPORT_BLOCK_SIZE is set too high. Please check the environmental variable APT_DEFAULT_TRANSPORT_BLOCK_SIZE. The default for this variable is 131072 or 128 KB. The maximum value should be is 1048576 or 1MB. Please see page 79 on the Parallel Job Advanced Developer Guide for additional information regarding this environmental variable.
Resolving the problem
To resolve this issue, ensure that there is adequate space in the assigned TMPDIR (temporary directory) or verify the settings on the environmental variable APT_DEFAULT_TRANSPORT_BLOCK_SIZE.
- Autosys gives u various options, like JOB_ON_ICE, JOB_ON_HOLD
- Scheduling is pretty simple, if u hav a job that u want to schedule every one hr. Then through Datastage you have to schedule it 24 times, which would create 24 processes(Distinct PID). whereas in autosys you dont have to take so much pain.
- if u want to run a job on first monday of every month, u just have to set a Calender in autosys, in datastage couldn’t think of.
- if u want to run a job on first business day(a business day for a client may vary) of every month, u just have to set a Calender in autosys, in datastage couldn’t think of.
- In short I would say, it would give various scheduling options, with less effort. Reusability and maintenance is also a factor.
how will u connect datastage job with autosys
Irrespective of the scheduler you use (AutoSys, SeeBeyond, ControlM, at, cron, to name a few) use the command line interface dsjob to specify what you want DataStage to do.
you will have to write a Wrapper shell scripts, in which you need use Datastage CLI(Command level interface). After creation of the wrapper shell script, you need to just execute that shell script through autosys.
Other Scheduler posts
Looping from V8.5 and up
Aggregation operations make use of a cache that stores input rows.
Two functions, SaveInputRecord() and GetSavedInputRecord(), are used to add input rows to the cache and retrieve them.
- SaveInputRecord() is called when a stage variable is evaluated, and returns the count of rows in the cache (starting at 1 when the first row is added).
- GetSavedInputRecord() is called when a loop variable is evaluated.
In the Transformer Stage settings, these are the various inputs
Define the stage variables:
- can use functions such as
- or use functions with link columns such as
- can use IF THEN ELSE, such as
- IF IsBreak THEN 0 ELSE SummingPrice + inlink.Price
- Loop condition
- Enter the expression as the loop condition:
@ITERATION <= NumRows
The loop continues to iterate for the count specified in the NumRows variable.
Define the loop variable, for instance:
Output link metadata and derivations
Define the output link columns and their derivations, example:
- Col1 – inlink.Col1
- Price – inlink.Price
- Percentage – (inlink.Price * 100)/TotalPrice
- If your Transformer stage calls GetSavedInputRecord before SaveInputRecord, then a fatal error similar to the following example is reported in the job log:
APT_CombinedOperatorController,0: Fatal Error: get_record() called on record 1 but only 0 records saved by save_record()
- If your Transformer stage calls GetSavedInputRecord more times than SaveInputRecord is called, then a fatal error similar to the following example is reported in the job log:
APT_CombinedOperatorController,0: Fatal Error: get_record() called on record 3 but only 2 records saved by save_record()
- If your Transformer stage calls SaveInputRecord but does not call GetSavedInputRecord, then a fatal error similar to the following example is reported in the job log:
APT_CombinedOperatorController,0: Fatal Error: save_record() called on record 3, but only 0 records retrieved by get_record()
- If your Transformer stage does not call GetSavedInputRecord as many times as SaveInputRecord, then a fatal error similar to the following example is reported in the job log:
APT_CombinedOperatorController,0: Fatal Error: save_record() called on record 3, but only 2 records retrieved by get_record()
1. Be sure that scheduling has been defined with a valid user in your ADMINISTRATOR for the Project you are working.
2. In Director – highlight your job – click on the ‘Add schedule’ tool button – enter when you want it to run (every -which day of week or whatever schedule you want and apply the time you want it to run).
3. If you want to add or remove ‘limits’ you should set these at this time as well (max # of warnings and/or rows?)
4. Then ‘schedule’.
What is CLOB
A Character Large OBject (or CLOB) is a collection of character data in a database management system, usually stored in a separate location that is referenced in the table itself.
A CLOB (character large object) value can be up to 2,147,483,647 characters long. A CLOB is used to store unicode character-based data, such as large documents in any character set.
The length is given in number characters for both CLOB, unless one of the suffixes K, M, or G is given, relating to the multiples of 1024, 1024*1024, 1024*1024*1024 respectively.
Length is specified in characters (unicode) for CLOB.
From 8.5 version, The Oracle connector supports reading and writing Oracle LOB data types BFILE,BLOB, CLOB, NCLOB, LONG RAW, RAW, and XMLType.
One of the column in a particular table contains HTML or XML data stored as a CLOB, The length could be large say ‘1048576’.
Use LongVarchar(n) datatype within DataStage.
In ETL job, replace CLOB with LongVarChar of same size (1048576) as CLOB is not defined in data Stage. Where “n” depends on what is in the CLOB field.
For example, DBA/Analyst confirms it will not be more than 1500 chars in the field. Use Varchar(2000) – not LongVarChar.
images stored as CLOBs
BLOBs are generally used to store images. Sometimes though, CLOBs have been used to store images. Sometimes only the Path is stored while the images are physically stored on the disk.
In cases where we can encode the images and store the text and the encoding metadata, we use BLOB/CLOB to store the images.
While doing the View Data shows you “un-American” characters. View data wont give the correct result ,please check the final out put (file or table). Don’t trust what View Data shows you in regard to “un-American” characters. Use a viewer that can display (Eg. the Euro symbol, accented characters, etc.) properly.
Use a viewer that can display the Euro symbol, accented characters, etc. properly.
Use “UNICODE” to handle LATIN source data/Chinese characters .
Apply stringtoUstring(columnvalue,’UTF-8′) and make column datatype as nvarchar
– Server Maps=UTF-8
– Parallel Maps=UTF-8
code page IBM 1388
Error and resolution
APT_CombinedOperatorController(1),0: Invalid character(s) ([xE4]) found converting string (code point(s): [xEF][xBC][x8A][xE3][x80][x80][xE3][x80][x80][xE3][ x80][x80][xE3][x80][x80][xE3][x80][x80][xE3][x80][ x80][xE3][x80][x80][xE3][x80][x80][xE3][x80][x80][ xE3][x80][ x80][xE3][x80][x80][xE3][x80][x80][xE3][x80][x80][ xE3][x80][ x80][xE3][x80][x80][xE3][x80][x80] [xE3][x80][x80][xE3][x80][x80][xE3][x80][x80][xE3][ x80][ x80][xE3][x80][x80][xE3][x80][x80][xE3][x80][x80][ xE3][x80][ x80][xE3][x80][x80][xE3]…) from codepage UTF-8 to Unicode, substituting.
the job (or project) was setup to default to UTF-8 character set, but the stage used to read source uses ISO-8859-1 characters.