Xtrata Recognition

Postby » Mon Oct 09, 2006 5:15 am

Hello-

I'm trying to use Xtrata to compensate for the shifts and skews of our form. I have 2 identical forms, except one is a bit larger than the other. The fields on the form are identical.

Is there a way to have Xtrata adjust the image, so the index fields would line up on the document? Or will it shift the document for classification purposes only? I have seen an Auto Registration Demo which seems to confirm the notion that it is for identification of the document only.

Also, is there anything out there that will shift the image, so the index fields will line up?

Jenn
Participant
 
Posts: 206
Joined: Mon Sep 25, 2006 4:00 pm

Postby » Mon Oct 09, 2006 10:14 am

Yes, Xtrata will shift the index fields to match the form. Works great.
Participant
 
Posts: 3374
Joined: Wed May 17, 2006 12:53 pm
Location: USA

Postby » Mon Oct 09, 2006 10:26 am

I'm totally confused. I've added the Xtrata and the images are still not lined up. What am I doing wrong?

-Jenn
Participant
 
Posts: 206
Joined: Mon Sep 25, 2006 4:00 pm

Postby » Mon Oct 09, 2006 12:25 pm

Have you added the sample forms that Xtrata needs to classify the images? Is the "Classification Only" box checked in Xtrata? And the real killer: is it *really* the same form? I get into this all the time with the people I work with. They keep saying "it's the same". Well, it's got the same layout, but things here and there are positioned differently. The most recent example was a Word "form". Depending on how many lines were used for things like the address block, the bottom 2/3 of the form would move.

Oh, and I'd remove any registration points you may have in that Batch Class. They may override what Xtrata has done for you.
Participant
 
Posts: 3374
Joined: Wed May 17, 2006 12:53 pm
Location: USA

Postby » Mon Oct 09, 2006 2:26 pm

I have added the sample forms, but I believe they are slightly different. I think the confidence level for the sample forms is set at 90%. Should the sample pages all be the same form? If I don't have separate images, can I add one image 5 times?

About the classification only box...Is that the only one that should be checked? Or is it okay to have OCR?

I have registration points and form identification points. Should both be removed?

Currently, I have created two sample forms one for each version of the image. Should I delete one of them, since they are the same form?

The one other thing I am concerned about is the amount of shift Xtrata will do. I read somewhere that it was +/- 2%. What happens if the shift is greater? Is that when I would need another form type?

Thanks for your help.

-Jenn
Participant
 
Posts: 206
Joined: Mon Sep 25, 2006 4:00 pm

Postby » Mon Oct 09, 2006 3:06 pm

Sorry, Typo. "Classification Only" should be UNchecked.

For each form type, you need to add 4 sample pages. No, I wouldn't use the same image multiple times. You can however scan the same sheet multiple times.

I would remove the registration points.

See if that helps.
Participant
 
Posts: 3374
Joined: Wed May 17, 2006 12:53 pm
Location: USA

Postby » Tue Oct 10, 2006 4:58 am

In the classification settings I have the confidence level at 80% with a 10% difference. Also, I have the auto rotation, undefined scanning and OCR checked and no default image defined.

I scanned one image multiple times and placed those as the sample pages. I also deleted the registration points, but kept the form identification. Should I have deleted that as well?

Let me give you some background about the image. It's a tax form, so there will always be slight variations, as well as taxpayers using an old form.

It seems to line up better, but some of the fields are still off. Is there anything more that can be done? Would adding more sample pages help in the line up of the image? Or is it only able to go so far?

-Jenn
Participant
 
Posts: 206
Joined: Mon Sep 25, 2006 4:00 pm

Postby » Tue Oct 10, 2006 9:42 am

I think what you need to do is make a separate form type for each variation of the form. I believe that Xtrata is designed to compensate for variation caused by the scanner and laser printer (form stretched or slightly off in registration). In that situation I've found Xtrata locks dead on. But I don't think it's designed to compensate for variations in form layout.

Also, while you might be able to cheat by scanning the same form multiple times, ideally you should have multiple samples of it. It's been recommended that at least one sample be blank. This helps Xtrata figure out what part is the form and what is the data.

You might take out the form identification zones. Because if Xtrata can't recognize the form, then it's not going to register the form. As you find samples that aren't recognized, you can set them aside in QC to be added to help train the system.
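Here is a rough sketch of how I picture the recognize-then-register order, using the 80% confidence / 10% difference settings you mentioned. To be clear, this is only my mental model: the function name, the form type names, the scores, and my reading of "difference" (the best match must beat the runner-up by that margin) are all assumptions, not anything taken from Xtrata itself.

```python
def classify(scores, min_confidence=80, min_difference=10):
    """Pick a form type from {name: confidence percent}, or None.

    Assumed rule (not documented Xtrata behavior): the best score must
    reach min_confidence AND lead the second-best by min_difference.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return None
    best_name, best_score = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0
    if best_score >= min_confidence and best_score - runner_up >= min_difference:
        return best_name
    return None  # unclassified: no form type, so no registration either


# Hypothetical scores for one scanned page against two form types.
clear_match = classify({"TaxForm_A": 92, "TaxForm_B": 70})
too_close = classify({"TaxForm_A": 86, "TaxForm_B": 81})
too_weak = classify({"TaxForm_A": 75, "TaxForm_B": 40})
```

Under these assumptions only the first page gets a form type. The second fails on the margin and the third on the confidence level, and a page with no form type never gets as far as registration.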
Participant
 
Posts: 3374
Joined: Wed May 17, 2006 12:53 pm
Location: USA

Postby » Tue Oct 10, 2006 9:58 am

I understand about the different forms from year to year that need to be separate form types. The form I'm dealing with is the latest form, of which I have seen a few different renderings. All of the fields and text are the same; only the image placement on the page (skewed) and the overall size (enlarged) are different. It just depends on where the tax form came from (the internet, a tax book, some software, etc.), and who knows if it is manipulated further by preparers' scanners and copiers.

I don't have one that is blank and I'll try deleting the form ID zones. However, I do think that there is some limit on how much Xtrata will do. Is there a limit on the amount it will compensate? That +/- 2%?

Is it ok to have a few form types that vary in the range of skewed/enlarged images? Or will that just confuse Xtrata?

Also, every time an image is not classified, the module goes to the classification module so I can classify the unidentified images. Once I define them, should I be doing something else with them? Add them as a sample page? Define them?

Thanks.

-Jenn
Participant
 
Posts: 206
Joined: Mon Sep 25, 2006 4:00 pm

Postby » Tue Oct 10, 2006 12:35 pm

As I said, Xtrata can compensate for the types of problems one typically finds in the scanning process. That is, skew, overall enlargement/shrinkage and overall stretch. The operative word is "overall". It's designed to compensate for mechanical errors. It stretches, shrinks and twists the scanned form until it matches your sample. Then it knows where the fields are. It has quite a bit of latitude to fix those things. I've actually seen it do a passable job on forms that were accidentally scanned at the wrong dpi (300 instead of 200). Now that's quite an error!

What I don't believe it's designed for is to compensate for different renderings caused by software such as one field moving slightly with respect to other fields. That's a specific change, not an overall one. It's simply not designed to stretch and twist each part of the image to match up to the sample.
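To illustrate the difference between "overall" and "specific" changes, here is a minimal sketch. It is not Xtrata's actual algorithm, which I don't know; it only assumes that an overall correction behaves like a single affine transform fitted to the whole page. The anchor points, field coordinates, and function names are made up for the example.

```python
def solve_affine(src, dst):
    """Fit x' = a*x + b*y + c, y' = d*x + e*y + f exactly to three
    anchor pairs (sample point -> scanned point), via Cramer's rule."""
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-12:
        raise ValueError("anchor points are collinear")

    def fit(v1, v2, v3):
        a = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
        b = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
        c = (x1 * (y2 * v3 - y3 * v2) - y1 * (x2 * v3 - x3 * v2)
             + v1 * (x2 * y3 - x3 * y2)) / det
        return a, b, c

    a, b, c = fit(*(p[0] for p in dst))  # coefficients for x'
    d, e, f = fit(*(p[1] for p in dst))  # coefficients for y'
    return a, b, c, d, e, f


def apply_affine(t, point):
    """Map one sample-page point into scanned-page coordinates."""
    a, b, c, d, e, f = t
    x, y = point
    return a * x + b * y + c, d * x + e * y + f


# Hypothetical anchors: sample at 200 dpi, page scanned at 300 dpi
# (scale 1.5) and fed slightly off position (shifted by 30, 45 pixels).
sample_anchors = [(0, 0), (1000, 0), (0, 1000)]
scanned_anchors = [(30, 45), (1530, 45), (30, 1545)]
transform = solve_affine(sample_anchors, scanned_anchors)

# Overall change: the field's zone lands where the transform predicts.
field_in_sample = (400, 600)
predicted = apply_affine(transform, field_in_sample)

# Specific change: a different rendering pushed this one field 40 pixels
# down relative to everything else. The page-wide fit cannot see that.
actual_in_other_rendering = (predicted[0], predicted[1] + 40)
residual = actual_in_other_rendering[1] - predicted[1]
```

With these numbers the fit recovers the 1.5 scale and the offset, so the field at (400, 600) on the sample is found at (630, 945) on the scan. The 40 pixel residual in the second case is the part no page-wide transform can remove without throwing the anchors off, which is why a separate form type is the usual answer.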

So I think you're going to need to create a different form type for each rendering of the form. There is a cab file on the disk you can import that allows Xtrata to dig through a scanned batch and break out which forms it thinks are the same and which are different. That will help you identify all the renderings you're seeing.
Participant
 
Posts: 3374
Joined: Wed May 17, 2006 12:53 pm
Location: USA

Postby » Wed Oct 11, 2006 7:47 am

Would software actually alter the image to that extent? I would think the main adjustments would be for margins.

Is that the Xtrata Definitions cab file? Would I add a sample page to it of what I'm looking for?

-Jenn
Participant
 
Posts: 206
Joined: Mon Sep 25, 2006 4:00 pm

Postby » Wed Oct 11, 2006 9:58 am

I don't know the limits of Xtrata's manipulation, but I can say what I've seen - that it compensates for common image errors but not for variations of a form.

The cab file is an automated tool to help you identify the different forms in your paper stream. You have to build your own job from scratch, but you'll be starting by knowing what Xtrata classifies as "different".
Participant
 
Posts: 3374
Joined: Wed May 17, 2006 12:53 pm
Location: USA

