QAID # 17790 Published
Question / Problem:
Suppress OCR on Invoice Attachments
Answer / Solution:
This example shows how to suppress OCR on certain pages of a document. This improves extraction performance, because OCR is not executed unnecessarily. It may also improve extraction accuracy, because erroneous values cannot be inadvertently extracted from the suppressed pages.
Typically, the following occurs when invoices arrive in a mailroom office:
- The envelope containing the invoice document is opened and the copy is disposed.
- A running number bar code is applied to the first page of the invoice document.
- A bar code with a constant value is applied to the first page of the attachment in the invoice document (the script below uses the value "000000").
This example makes use of the following components:
- Barcode Locator
The Barcode Locator is at Project level and is executed before any OCR on demand is performed.
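The script checks the alternatives of the project level Barcode Locator for an "Attachment Barcode" and suppresses OCR on the attachment pages, from the page carrying that bar code through the last page of the document.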
'# Before processing the document,
'# loop through the alternatives of the project level barcode reader.
'# Look for the attachment barcode and suppress OCR on the attachment pages.
'# This will save processing time on the extraction server.
Private Sub Document_BeforeClassifyXDoc(pXDoc As CASCADELib.CscXDocument, bSkip As Boolean)
   Dim i As Long
   Dim AttachmentStartPage As Long
   Const ATTACHMENT_BARCODE = "000000"

   '# Are there any attachments?
   '# Check the project level barcode reader for the attachment barcode
   If pXDoc.Locators.ItemByName("Barcode").Alternatives.Count > 0 Then
      For i = 0 To pXDoc.Locators.ItemByName("Barcode").Alternatives.Count - 1
         If pXDoc.Locators.ItemByName("Barcode").Alternatives(i).Text = ATTACHMENT_BARCODE Then
            AttachmentStartPage = pXDoc.Locators.ItemByName("Barcode").Alternatives(i).PageIndex
         End If
      Next i
   End If

   '# If there is an attachment barcode in the document,
   '# then suppress the OCR full text on the attachment pages
   '# (from the barcode page through the last page)
   If AttachmentStartPage > 0 Then
      For i = AttachmentStartPage To pXDoc.CDoc.Pages.Count - 1
         pXDoc.CDoc.Pages(i).SuppressOCR = True
      Next i
   End If
End Sub
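The script above keeps the page index of the last matching alternative it visits. If a document could carry more than one attachment bar code, it may be preferable to start suppression at the earliest such page. The following is a minimal sketch of that variant, not part of the example project. It assumes the same locator name "Barcode" and uses only the objects and properties already used in the script above. The function name FindAttachmentStartPage and its BarcodeText parameter are hypothetical.

```vb
'# Hypothetical helper, not part of the example project.
'# Returns the lowest page index on which an alternative of the
'# project level "Barcode" locator has the given text,
'# or -1 if no alternative matches.
Private Function FindAttachmentStartPage(pXDoc As CASCADELib.CscXDocument, ByVal BarcodeText As String) As Long
   Dim i As Long
   Dim StartPage As Long

   StartPage = -1   '# -1 means "no attachment barcode found"

   For i = 0 To pXDoc.Locators.ItemByName("Barcode").Alternatives.Count - 1
      If pXDoc.Locators.ItemByName("Barcode").Alternatives(i).Text = BarcodeText Then
         '# Keep the smallest page index seen so far
         If StartPage = -1 Or pXDoc.Locators.ItemByName("Barcode").Alternatives(i).PageIndex < StartPage Then
            StartPage = pXDoc.Locators.ItemByName("Barcode").Alternatives(i).PageIndex
         End If
      End If
   Next i

   FindAttachmentStartPage = StartPage
End Function
```

In Document_BeforeClassifyXDoc, the search loop could then be replaced by AttachmentStartPage = FindAttachmentStartPage(pXDoc, ATTACHMENT_BARCODE). Keeping the existing AttachmentStartPage > 0 check means the first page of the document is never suppressed, which matches the mailroom scenario where the first page is always the invoice itself.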
Example Project: Extraction - Invoice Attachments (Created with V3.1 SP1)