Information Analyzer

Information Analyzer (IA) produces frequency distributions of values, formats, and so on. It also provides data sampling (random, sequential and Nth). At a fraction of the cost of analyzing every row, sampling can still generate a lot of useful information:
– Is the column unique?
– Are there nulls?
– What are the high frequency values?
– What are the high frequency patterns?
– Are there many outliers?
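
The questions above can be illustrated outside IA with a minimal sketch. This is not how IA is implemented; `profile_column`, `value_pattern` and the sample data are hypothetical names made up for illustration, and the pattern notation (letters become `A`, digits become `9`) is an assumption.

```python
import re
from collections import Counter

def value_pattern(value):
    """Map a value to a coarse format pattern: letters -> A, digits -> 9."""
    text = re.sub(r"[A-Za-z]", "A", str(value))
    return re.sub(r"[0-9]", "9", text)

def profile_column(values, top_n=3):
    """Answer the basic profiling questions for one column of values."""
    non_null = [v for v in values if v is not None]
    value_counts = Counter(non_null)
    pattern_counts = Counter(value_pattern(v) for v in non_null)
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "cardinality": len(value_counts),
        # Unique means every non-null value occurs exactly once.
        "is_unique": len(non_null) > 0 and len(value_counts) == len(non_null),
        "top_values": value_counts.most_common(top_n),
        "top_patterns": pattern_counts.most_common(top_n),
        # Values seen only once are a crude hint at possible outliers.
        "singletons": [v for v, count in value_counts.items() if count == 1],
    }

# Hypothetical sample column with one null and one repeated value.
sample = ["AB-101", "AB-102", "AB-101", None, "ZZ9", "AB-103"]
report = profile_column(sample)
print(report)
```

In this sample the value `ZZ9` is the only one with the pattern `AA9`, which is the kind of low-frequency format that a pattern distribution makes easy to spot.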

Note: Sampling in IA may still incur a full table scan on the source system. All rows may still be read and sent to IA, which then analyzes only every 100th record, or randomly ignores all but 10% of the records, depending on which sampling options you selected.
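
A small sketch of why sampling does not necessarily reduce the load on the source: the filter runs after the rows have been read. `CountingSource`, `nth_sample` and `random_sample` are hypothetical names for illustration only and do not reflect IA internals.

```python
import random

class CountingSource:
    """Stand-in for the source table; counts every row that is read."""
    def __init__(self, total_rows):
        self.total_rows = total_rows
        self.rows_read = 0

    def __iter__(self):
        for row_number in range(1, self.total_rows + 1):
            self.rows_read += 1
            yield row_number

def nth_sample(rows, n):
    """Keep every n-th row; the rows in between are read but discarded."""
    return [row for index, row in enumerate(rows, start=1) if index % n == 0]

def random_sample(rows, fraction, seed=0):
    """Keep each row with probability `fraction`; the rest are read but ignored."""
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < fraction]

nth_source = CountingSource(1000)
every_100th = nth_sample(nth_source, 100)

random_source = CountingSource(1000)
about_ten_percent = random_sample(random_source, 0.10)

print(len(every_100th), nth_source.rows_read)
print(len(about_ten_percent), random_source.rows_read)
```

In both cases `rows_read` ends up equal to the full row count even though only a small subset is kept for analysis.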

“Frequency by Frequency” report under the “column frequency” option

Column Level Summary
Cardinality Count : 1,034,710
Null Count : 0
Actual Row Count : 17,937,920
Frequency Cut Off : 0
Total Rows Covered : 12,633,758
%Rows Covered : 70.4305
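
The figures in this summary are consistent with %Rows Covered being Total Rows Covered divided by Actual Row Count. That reading is an assumption inferred from the numbers, not taken from the product documentation; a quick check:

```python
# Figures copied from the Column Level Summary above.
actual_row_count = 17_937_920
total_rows_covered = 12_633_758

# Assumed relationship: covered rows as a percentage of all rows.
pct_rows_covered = 100 * total_rows_covered / actual_row_count
rows_not_covered = actual_row_count - total_rows_covered

print(f"%Rows Covered: {pct_rows_covered:.4f}")   # 70.4305
print(f"Rows not covered: {rows_not_covered:,}")  # 5,304,162
```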

Check the display options: the default may be limited to 1000 distinct values with their counts.

 

If you use the Where clause in the Analysis Settings (e.g. to get rid of suspended accounts), it applies only to Column Analysis, not to Data Rules.

For Data Rules, create a Virtual Table (which appears under the Column Analysis menu). The Virtual Table lets you build the relevant where clause once and use it in both Column Analysis and any of the data rules.
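
As a loose analogy only, a Virtual Table behaves somewhat like a database view: the filter is defined once and every downstream query sees the same filtered rows. The sketch below uses SQLite rather than IA, and the `accounts` table, its columns and the `active_accounts` view are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER, status TEXT, balance REAL);
    INSERT INTO accounts VALUES
        (1, 'ACTIVE', 120.0),
        (2, 'SUSPENDED', -50.0),
        (3, 'ACTIVE', NULL),
        (4, 'ACTIVE', 15.5);
    -- The where clause is defined once, here.
    CREATE VIEW active_accounts AS
        SELECT * FROM accounts WHERE status <> 'SUSPENDED';
""")

# Column-analysis style query: null count on the filtered rows.
null_balances = conn.execute(
    "SELECT COUNT(*) FROM active_accounts WHERE balance IS NULL"
).fetchone()[0]

# Data-rule style query: violations of "balance >= 0" on the same filtered rows.
rule_violations = conn.execute(
    "SELECT COUNT(*) FROM active_accounts WHERE balance < 0"
).fetchone()[0]

print(null_balances, rule_violations)
conn.close()
```

Because the suspended account is excluded by the view, its negative balance does not count as a rule violation, and neither query has to repeat the filter.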

Scheduling the Column Analysis (CA) Job from Information Analyzer (IA)

  • Go into the ‘Web Console for Information Server’ (https://server:port/ibm/iis/console).
  • Create a scheduling view (Schedule Monitoring > Views of Schedules > New Scheduling View) and select at least an ‘Origin’ of Information Analyzer.
  • You should be able to see the scheduled Information Analyzer processes.
