Xtrata 1.7(Sp2) classification issues

Postby arlindo » Fri Oct 10, 2008 10:00 am

Ascent Capture: 7.5 (SP6)
Xtrata: 1.7 (SP2)

According to the Xtrata "Getting Started" guide, Xtrata can do the following:

"Ascent Xtrata performs document separation based on classification results. The first page of a document is detected by matching a loose page with all first sample pages defined for the batch class. Additional loose pages are added to the document if they match a second page or do not match any first page. If the batch contains already separated documents, these will be used for classification."
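
The separation rule in that quote can be sketched roughly as below. This is only an illustration in Python, not Xtrata's actual implementation: `separate_loose_pages` and `toy_first_page_matcher` are hypothetical names, the matcher stands in for Xtrata's classification against first-page samples, and second-page samples are ignored for simplicity.

```python
from typing import Callable, Dict, List, Optional


def separate_loose_pages(
    pages: List[str],
    match_first_page: Callable[[str], Optional[str]],
) -> List[Dict]:
    """Group loose pages into documents using first-page matches only.

    match_first_page is a hypothetical classifier: it returns a form type
    name when the page matches a first-page sample, otherwise None.
    """
    documents: List[Dict] = []
    for page in pages:
        form_type = match_first_page(page)
        # A first-page match starts a new document. So does the very first
        # page of the pile, even if unmatched (an assumption of this sketch).
        if form_type is not None or not documents:
            documents.append({"form_type": form_type, "pages": [page]})
        else:
            # No first-page match: the page joins the current document.
            documents[-1]["pages"].append(page)
    return documents


def toy_first_page_matcher(page: str) -> Optional[str]:
    """Toy stand-in: 'A1' is the first page of form 'A', 'A2' is not."""
    return page[0] if page.endswith("1") else None


pile = ["A1", "A2", "B1", "B2", "B3", "C1"]
separated = separate_loose_pages(pile, toy_first_page_matcher)
```

With this toy pile, the expected result is three documents (forms A, B and C), which is the behaviour I was expecting from the real product.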

I am running some tests with 3 forms. I have defined them in Ascent and loaded a classification sample of the first page of each form. When I run a pile of documents through the scanner, it seems to classify only the very first document within the loose pages and treats the rest of the pages in the pile as belonging to that first form. I would expect the software to detect that a page within the pile matches the first page of another form and to break the loose pages up into separate documents, but that is not what is happening. Any suggestions, recommendations, or configuration tweaks?
arlindo
Participant
 
Posts: 113
Joined: Tue Feb 26, 2008 5:43 pm

Postby russell@centuryc.com » Fri Oct 10, 2008 12:01 pm

First, make sure you are sending loose pages to Xtrata and they have not already been broken into documents.

Second, make sure you've added your additional forms as forms and not as "page 2" or "additional samples".
russell@centuryc.com
Participant
 
Posts: 3374
Joined: Wed May 17, 2006 12:53 pm
Location: USA

Postby arlindo » Fri Oct 10, 2008 12:14 pm

1. Yes, I am sending loose pages to Xtrata. It is smart enough to classify the first document within the batch but can't separate out the further embedded forms.

2. I have defined the different form types in Ascent Capture with 1 sample page each, which is the first page of the form, and have also added "classification sample pages" to each form type through the Xtrata GUI.

Do you have any more suggestions?

Postby russell@centuryc.com » Fri Oct 10, 2008 1:01 pm

You may want to run a collection of first pages through the XtrataDefinition batch class to see how Xtrata "sees" the different forms.

It has also been suggested that at least one sample sheet should be a blank form. I usually take a scanned sample and electronically erase all the filled in information to do that.

Keep in mind that Xtrata is only designed to deal with forms that are exactly alike, not just "looks the same to a human" alike. An exact form would be one done by a professional printer. An example of the latter would be tax forms: many tax preparation packages generate their own version of a form, which appears identical to a human but not to Xtrata.

Lastly, some "forms" just don't work in Xtrata at all. There simply isn't enough form and too much "data" that changes all the time. An example might be a letterhead page. Only the letterhead is the same while the body of the letter changes for each sample. Another example might be some invoice types where the whole page is "data".

Postby arlindo » Wed Oct 15, 2008 9:18 am

Hi Russell,

This is what I found out. Please let me know if this makes any sense to you.

My general assumption was that when the scan module creates documents based on the separator sheet, it creates them with a form type of "1", for example. I thought Xtrata would treat this form type of 1 as loose pages, but I don't think it does. In either case, I am not sure why Xtrata would even attempt to classify the very first page if it thinks it already has a document and not a bunch of loose pages.

Postby russell@centuryc.com » Wed Oct 15, 2008 11:56 am

OK, I need to back up - at what point are documents created? Xtrata will not break down documents into smaller documents.

Document/Form classification is only relevant if there's more than one Form Type defined in the Batch Class.

Typically when scanned, the Document/Form type is "unknown". If it's unknown and there's more than one Form Type, Xtrata will attempt to classify it.

Postby arlindo » Wed Oct 15, 2008 12:21 pm

Well, basically I have a pile of papers that gets loaded into the scanner, and in the batch class I have multiple form types defined, all loaded with sample Xtrata classification pages. I am also using Kofax patch code separator sheets and the advanced feature to treat each document as a batch. What happens is that when the scan module sees a patch code separator sheet, it automatically breaks the pages off into a new batch (which is what I want) and at the same time assigns a form type of "1", which is not what I want. I want loose pages, because I want Xtrata to do the work of classification.

I would think that the scan module should just treat them as loose pages, but that is not what it does. So when the batch flows to Xtrata, it contains not loose pages but a document with a form type of "1". Xtrata still tries to classify the very first page, because it is smart enough to change the form type of 1 to a real form type that I defined, but that is where it stops: it does not check every single page to try to break the document apart into new documents. Hope all this makes sense.

Postby russell@centuryc.com » Wed Oct 15, 2008 12:35 pm

Are you using unattended scanning?

Do your patch code sheets also contain a barcode?

Postby arlindo » Thu Oct 16, 2008 12:16 pm

I can't use unattended scanning in my case because I require batch-level index fields that are set by the scan operator.

No, the patch code sheets don't have a barcode. They are just the simple patch "T" (I believe that is what it is called).

Postby rpapa » Fri Oct 17, 2008 1:55 am

Looks like you are already separating using patch codes, so I'm not sure if you can use Xtrata and the advanced feature to treat each document as a batch.
rpapa
Participant
 
Posts: 3552
Joined: Mon Mar 13, 2006 12:00 pm
Location: Livonia, Michigan

Postby arlindo » Fri Oct 17, 2008 6:57 am

Yes, that was my conclusion as well. But again, I am not sure why Xtrata still tries to classify the very first page (it is smart enough to change the form type of 1 to a real form type that I defined) and then stops there, without checking every single page to try to break the document apart into new documents.
Again, not sure why the product works like this.

Anyway, I found a workaround using a workflow agent that runs after scan and checks the form type. If the form type is nothing, it moves all the pages from this form type of "1" to loose pages before the batch goes into Xtrata. I have tested using this new agent and it works quite well now.
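
For anyone wanting the idea rather than the agent itself, the logic is roughly as follows. This is a simplified Python sketch, not the Ascent Capture workflow agent API: `Document`, `Batch` and `release_to_loose_pages` are hypothetical names, and treating `None`, an empty string and "1" as "no real form type" is an assumption of the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Document:
    form_type: Optional[str]
    pages: List[str]


@dataclass
class Batch:
    documents: List[Document] = field(default_factory=list)
    loose_pages: List[str] = field(default_factory=list)


def release_to_loose_pages(
    batch: Batch,
    placeholders: Tuple[Optional[str], ...] = (None, "", "1"),
) -> Batch:
    """Dissolve documents that only carry a placeholder form type.

    Their pages are appended to the batch's loose pages, in order, so a
    later classification step sees a pile of loose pages instead of one
    pre-separated document. Documents with a real form type are kept.
    """
    kept: List[Document] = []
    for doc in batch.documents:
        if doc.form_type in placeholders:
            batch.loose_pages.extend(doc.pages)
        else:
            kept.append(doc)
    batch.documents = kept
    return batch


# Example: the scan module produced one document with form type "1".
scanned = Batch(documents=[Document("1", ["p1", "p2", "p3"])])
release_to_loose_pages(scanned)
```

After the call in the example, `scanned.documents` is empty and `scanned.loose_pages` holds the three pages in their original order.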

The things we need to do sometimes!

Postby rpapa » Fri Oct 17, 2008 8:19 am

If you already have a document, Xtrata will not break it down any further. It will just use the first page to identify what form type it is.
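
To make that concrete, here is a small Python sketch of the behaviour as described in this thread. It is not Xtrata code: `classify_existing_document` and `toy_first_page_matcher` are hypothetical names, and the matcher is a toy stand-in for matching against first-page samples.

```python
from typing import Callable, Dict, List, Optional


def classify_existing_document(
    pages: List[str],
    match_first_page: Callable[[str], Optional[str]],
) -> Dict:
    """Classify an already separated document by its first page only.

    The remaining pages are never inspected, so the document is never
    split, even if a later page is the first page of another form.
    """
    form_type = match_first_page(pages[0]) if pages else None
    return {"form_type": form_type, "pages": list(pages)}


def toy_first_page_matcher(page: str) -> Optional[str]:
    """Toy stand-in: 'A1' is the first page of form 'A', 'A2' is not."""
    return page[0] if page.endswith("1") else None


# One pre-separated document that actually contains two forms.
result = classify_existing_document(
    ["A1", "A2", "B1", "B2"], toy_first_page_matcher
)
```

Here `result` is a single document of form type "A" containing all four pages; the "B1" page is never looked at.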

Glad that workflow agent solved your problem.

Postby russell@centuryc.com » Mon Oct 20, 2008 5:32 pm

arlindo wrote: I can't use unattended scanning in my case because I require batch-level index fields that are set by the scan operator.


Then what are the patch code sheets used for?

Postby rpapa » Tue Oct 21, 2008 2:49 am

Looks like they're used for document separation.

Postby arlindo » Tue Oct 21, 2008 9:43 am

Hi Russell,

Was that question about patch code sheets for me or for rpapa?
