Indicius Full Page OCR

Postby » Wed Nov 19, 2008 5:18 pm


We are passing a Kofax batch through classification & separation, which classifies the documents and assigns a document type description to each of them. Of these documents, I only need to run one document type through the Recognition module, to carry out full page OCR and then use the anchor-target mechanism to extract specific index values. Even though the whole Kofax batch is sent to the Recognition module, I want recognition to happen only on this specific document type. All the other documents going through recognition add time and also needlessly consume the full page OCR click count. However, the recognition script hooks (start_batch, end_batch and process_doc) only come into the picture after all pages from all documents of the Kofax batch have been OCRed. So the question is: how can we avoid OCR based on the document type description when the batch is sent through the Recognition module?

Postby » Fri Nov 21, 2008 7:16 am

Hi Mayur,

I have a few observations and questions about your setup.

1) How did you create this configuration? Was it created manually, or did you create it through Project Planner (available only in INDICIUS 6.0)?

2) Are you aware that OCR can be imported from the first instance of Recognition to the second? If you are using OCR as part of your classification process in the first instance, you can sidestep this entire issue by simply importing the OCR (almost instant, and no license cost) from the Classification & Separation instance to the Extraction instance of Recognition.

3) Otherwise, how are you doing the OCR in Extraction? Are you linking a definition (.idf) file in the Recognition setup dialog, or are you calling the definition file from script (either using the Engine.Process function or an IDF Locator)?

If you're already calling the OCR from script, then you only have to wrap the call that performs OCR in a conditional such as If Recog.DocumentType = "xyz".
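
To make the logic of that conditional concrete, here is a minimal sketch written in Python rather than Indicius script. Everything in it is hypothetical and only for illustration: Document, ocr_selected and fake_ocr are invented names, and fake_ocr merely stands in for whatever call actually performs OCR in your script. It shows that documents of other types never reach the OCR call, so they cost neither time nor page count.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Set


@dataclass
class Document:
    """Hypothetical stand-in for a classified document in a batch."""
    doc_type: str      # document type description assigned by classification
    pages: List[str]   # placeholder for page images


def ocr_selected(documents: Iterable[Document],
                 ocr_page: Callable[[str], str],
                 ocr_types: Set[str]) -> Dict[int, List[str]]:
    """Call ocr_page only for documents whose type is in ocr_types."""
    results: Dict[int, List[str]] = {}
    for index, doc in enumerate(documents):
        if doc.doc_type not in ocr_types:
            continue  # skipped: the OCR call is never made for this document
        results[index] = [ocr_page(page) for page in doc.pages]
    return results


pages_ocred = 0  # counts how many pages actually went through OCR


def fake_ocr(page: str) -> str:
    """Assumed placeholder for the real OCR call; just uppercases the text."""
    global pages_ocred
    pages_ocred += 1
    return page.upper()


batch = [
    Document("invoice", ["a", "b"]),
    Document("xyz", ["c"]),
    Document("cover", ["d"]),
]
results = ocr_selected(batch, fake_ocr, {"xyz"})
```

With this batch, only the single page of the "xyz" document is passed to fake_ocr; the three pages of the other two documents are never touched.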

If you're currently doing OCR in the definition file, then you have two options. You can add one IDF file per document type, each with exactly the same name as its document type and with the statement DOCUMENTTYPE xyz at the top of the file (please check the help for more information on this parameter), and include OCR only in the definition files for the document types you want to OCR. Alternatively, you can move the OCR into script and then place a conditional statement around it as described above.
Stephen Bottomley
Senior Product Specialist
Tel: +44 (0)1223 226012
