DataStage 9.1 – the latest version

Excel Compatibility

Excel is feeling the love at IBM IOD 2012 thanks to DataStage 9.1. The new version can read MS Excel natively from any platform – you can read Excel even when your DataStage server is installed on Unix or Linux – without the pain of Unix to Windows bridges. Excel spreadsheets have never been easier to access as a data source. Previously, DataStage installed on Windows read Excel files easily thanks to all the Windows libraries, but it was damn hard to get working from Unix or Linux because the operating system could not understand MS Office files.

DataStage 9.1 comes with an unstructured text stage that makes Excel easier to read even if it is not a perfectly formatted table. DataStage can go in and find column headings whether they are on row 1 or row 10. It can parse the columns and turn them into relational data and even add on extra text strings such as a single comment field. DataStage 9.1 can overcome a lot of the problems you get from Excel as a source when people muck around with the spreadsheet format.
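To make the "headings on row 1 or row 10" idea concrete, here is a minimal Python sketch of header detection. It is only an illustration of the general technique, not how the DataStage stage is implemented. The function names `find_header_row` and `to_records` and the sample sheet are made up for the example, and the sheet is assumed to be already loaded as a list of rows.

```python
def find_header_row(rows, expected):
    """Return the index of the first row that contains every expected heading."""
    wanted = {name.strip().lower() for name in expected}
    for index, row in enumerate(rows):
        cells = {str(cell).strip().lower() for cell in row}
        if wanted <= cells:
            return index
    raise ValueError("no header row found")


def to_records(rows, expected):
    """Turn the rows below the header into dictionaries keyed by heading."""
    start = find_header_row(rows, expected)
    header = [str(cell).strip() for cell in rows[start]]
    records = []
    for row in rows[start + 1:]:
        if not any(str(cell).strip() for cell in row):
            continue  # skip completely blank rows
        records.append(dict(zip(header, row)))
    return records


# Hypothetical sheet: a title and a blank line sit above the real headings.
sheet = [
    ["Quarterly sales report", "", ""],
    ["", "", ""],
    ["Customer", "Region", "Amount"],
    ["Acme", "East", "100"],
    ["Globex", "West", "250"],
]
records = to_records(sheet, ["Customer", "Amount"])
```

The caller only names the headings it cares about, so the same code works wherever the heading row ends up after someone has mucked around with the layout.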

Support for Big Data

Balanced Optimization for DB2 z/OS and for Hadoop

Faster Database Writes

DataStage 9.1 is faster when writing to DB2 and Oracle thanks to improved big buffering of data. Oracle Bulk Load and DB2 Bulk Load are faster. Oracle Upsert is faster. The Connector stages are better.

Faster Netezza Table Copies

IBM introduces a new interface called Data Click – it promises to load tables from a source Oracle or DB2 table into Netezza with just two clicks. Two clicks! The new console will automatically generate the DataStage job required to move the tables selected into Netezza. Great for fast PoC or agile BI development. It can repeat the data movement later or automatically generate InfoSphere CDC table subscriptions to keep those tables up to date in real time.

ILOG comes back to DataStage

ILOG is now called Operational Decision Management. This must be licensed for the stage to work properly.

Many a year ago there was a JRules plugin for DataStage so you could run the ILOG JRules engine in stream with a DataStage job. IBM has brought this back to DataStage 9.1 Parallel Jobs. What this means for DataStage is that one of the best rules engines on the market is available to ETL processing. What it means for ILOG is massive scalability, running ILOG on a grid or cluster of DataStage parallel servers with automatic partitioning. It also means ILOG can read and write data in ANY format – DataStage can read any database, flat file, mainframe file, InfoSphere CDC subscription, SOA message or hundreds of other data formats and write to or read from ILOG JRules.

Second Generation Java Stage

In the new and improved Java Stage for DataStage you will be able to view and browse Java classes and access a huge range of data sources or targets. It also gives better access to web applications and web data and easier integration of Java code.

Write to Multiple Files

DataStage 9.1 makes it easier to take a data stream and write it to a bunch of output files, such as splitting data up by customer.
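As a rough picture of what "splitting data up by customer" means, here is a small Python sketch that routes each row to a file named after its key value. It illustrates the pattern only and says nothing about how the DataStage stage works internally. The function `split_by_key`, the column names and the sample rows are all hypothetical.

```python
import csv
import os
import tempfile


def split_by_key(rows, key, out_dir):
    """Write each row to a CSV file named after the row's value for `key`."""
    handles = {}
    writers = {}
    try:
        for row in rows:
            value = str(row[key])
            if value not in writers:
                # Assumption: key values are safe to use as file names.
                path = os.path.join(out_dir, f"{value}.csv")
                handle = open(path, "w", newline="")
                writer = csv.DictWriter(handle, fieldnames=list(row.keys()))
                writer.writeheader()
                handles[value] = handle
                writers[value] = writer
            writers[value].writerow(row)
    finally:
        # Close every file even if a row fails part way through.
        for handle in handles.values():
            handle.close()
    return sorted(handles)


# Hypothetical input stream with two customers.
rows = [
    {"customer": "acme", "amount": 10},
    {"customer": "globex", "amount": 20},
    {"customer": "acme", "amount": 30},
]
out_dir = tempfile.mkdtemp()
written = split_by_key(rows, "customer", out_dir)
```

One file is opened per distinct key, so a real job with many thousands of customers would need to think about open file limits.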

New Expressions

A couple of old Server Edition functions finally make their way onto the Parallel canvas – such as the popular EReplace function for replacing values in text strings.
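For anyone who has not used EReplace, here is a Python sketch of its behaviour as I understand it from the Server Edition: it takes an expression, a substring and a replacement, plus an optional number of occurrences to replace and an optional first occurrence to start from. Treat the optional arguments and edge cases below as my assumptions about the semantics, not as a statement of how the Parallel version behaves. The Python name `ereplace` is just for illustration.

```python
def ereplace(expression, substring, replacement, occurrence=0, begin=1):
    """Replace occurrences of `substring` in `expression`.

    Assumed semantics: `occurrence` < 1 means replace all occurrences,
    `begin` is the first occurrence to replace, and an empty `substring`
    prefixes `replacement` to the expression.
    """
    if substring == "":
        return replacement + expression
    begin = max(begin, 1)
    pieces = expression.split(substring)
    out = [pieces[0]]
    replaced = 0
    # The piece at position `number` is preceded by occurrence `number`.
    for number, piece in enumerate(pieces[1:], start=1):
        in_range = number >= begin and (occurrence < 1 or replaced < occurrence)
        if in_range:
            out.append(replacement)
            replaced += 1
        else:
            out.append(substring)
        out.append(piece)
    return "".join(out)


all_replaced = ereplace("a-b-c-d", "-", "+")
first_only = ereplace("a-b-c-d", "-", "+", 1)
second_only = ereplace("a-b-c-d", "-", "+", 1, 2)
```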

Workload Management

The DataStage Operations Console, the web application introduced in DataStage 8.7, has a new tab for workload management. Included in the bundle of goodies is the ability to cap the number of jobs running at once, the amount of CPU being used and the amount of RAM being used by parallel jobs. This should avoid the problems you get when you pass 100% RAM utilisation, and it should stagger the start-up of hundreds of simultaneous job requests. There is also the ability to create queues and apply workload priorities to those queues.
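The idea of capping concurrent jobs and queueing the rest by priority can be sketched in a few lines of Python. This is a toy model of the concept, not the Operations Console's actual scheduler. The class `WorkloadQueue`, the job names and the convention that a lower number means a higher priority are all assumptions made for the example, and it ignores CPU and RAM caps entirely.

```python
import heapq
import itertools


class WorkloadQueue:
    """Toy job queue: at most `max_running` jobs run at once."""

    def __init__(self, max_running):
        self.max_running = max_running
        self.running = set()
        self._waiting = []
        self._order = itertools.count()  # tie-breaker keeps submit order

    def submit(self, job, priority):
        """Queue a job and return the list of jobs started as a result."""
        heapq.heappush(self._waiting, (priority, next(self._order), job))
        return self._dispatch()

    def finish(self, job):
        """Mark a job as done and return the jobs started in its place."""
        self.running.discard(job)
        return self._dispatch()

    def _dispatch(self):
        started = []
        while self._waiting and len(self.running) < self.max_running:
            _, _, job = heapq.heappop(self._waiting)
            self.running.add(job)
            started.append(job)
        return started


queue = WorkloadQueue(max_running=2)
first = queue.submit("load_customers", priority=5)
second = queue.submit("load_orders", priority=5)
third = queue.submit("nightly_report", priority=9)  # waits: cap reached
fourth = queue.submit("urgent_fix", priority=1)     # waits, but jumps the queue
after_finish = queue.finish("load_customers")
```

Requests beyond the cap are held rather than rejected, which is what staggers the start-up when hundreds of jobs arrive at once.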

DataStage on Windows

For DataStage on Windows far more use is made of native Windows processes, and very little of DataStage now uses MKS Toolkit processes (MKS Toolkit still installs, for all those who have created invocations of “UNIX” commands/scripts from DataStage).

What DataStage 9.1 Does Not Have

It is with some disappointment that I report that the following oft requested features will not be in DataStage 9.1:

DataStage will not have a plugin that chops fruit and vegetables. It will not be possible to chop up an entire soup in three seconds using parallel chop-o-matics. I’m afraid we will need to continue to chop using the old fashioned methods.

DataStage 9.1 National Language Support lets you process data in different character sets such as Mandarin and Japanese. Unfortunately there is still no support for Klingon, so it is still not possible to process customer sentiments on Twitter that are written in Klingon, which as we all know has a much richer dialect for insults.

Source: http://it.toolbox.com/blogs/infosphere/ibm-launches-datastage-91-at-ibm-iod2012-53350#4993356
