Suppress OCR on Invoice Attachments


QAID # 17790 Published

Question / Problem:

Suppress OCR on Invoice Attachments

Answer / Solution:

This example shows how to suppress OCR on certain pages of a document. This improves extraction performance because OCR is not executed unnecessarily. It may also improve extraction accuracy, because erroneous values cannot be inadvertently extracted from the suppressed pages.

Typically, the following occurs when invoices arrive in a mailroom office:

  1. The envelope containing the invoice document is opened and the copy is disposed of.
  2. A bar code with a running number is applied to the first page of the invoice document.
  3. A bar code with a constant value, for example '000000', is applied to the first page of the attachment in the invoice document.

This example makes use of the following components:

  • Barcode Locator
    The Barcode Locator is at Project level and is executed before any OCR on demand is performed.
  • Script
    The script checks the project level bar code for an "Attachment Barcode" and suppresses OCR on the attachment pages.
    '# Before processing the document,
    '#  loop through the alternatives of the project level barcode reader.
    '# Look for the attachment barcode and suppress OCR on the attachment pages.
    '# This saves processing time on the extraction server.
    Private Sub Document_BeforeClassifyXDoc(pXDoc As CASCADELib.CscXDocument, bSkip As Boolean)
       Dim i As Long
       Dim AttachmentStartPage As Long
       Const ATTACHMENT_BARCODE = "000000"
       '# Are there any attachments?
       '# Check the project level barcode reader for the attachment barcode
       If pXDoc.Locators.ItemByName("Barcode").Alternatives.Count > 0 Then
          For i = 0 To pXDoc.Locators.ItemByName("Barcode").Alternatives.Count - 1
             If pXDoc.Locators.ItemByName("Barcode").Alternatives(i).Text = ATTACHMENT_BARCODE Then
                AttachmentStartPage = pXDoc.Locators.ItemByName("Barcode").Alternatives(i).PageIndex
             End If
          Next i
       End If
       '# If there is an attachment barcode in the document,
       '#  then suppress the OCR full text on the attachment pages
       If AttachmentStartPage > 0 Then
          For i = AttachmentStartPage To pXDoc.CDoc.Pages.Count - 1
             pXDoc.CDoc.Pages(i).SuppressOCR = True
          Next i
       End If
    End Sub
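
The script above treats a start page of 0 as "no attachment found", because an uninitialized Long is 0. That fits the mailroom workflow, where the first page carries the running number bar code. As a hedged variation, the lookup can be moved into a helper that returns -1 when no attachment bar code is present and otherwise returns the earliest matching page. FindAttachmentStartPage and StartPage are hypothetical names introduced for this sketch. It assumes the locator is named "Barcode" and that PageIndex is zero-based, as the original loop implies. The sketch is meant as a replacement for the event handler above, not an addition to it, and it has not been verified against a particular product version.

```vb
'# Sketch only: FindAttachmentStartPage is a hypothetical helper, not a product API.
'# Returns the index of the earliest page that carries the attachment barcode,
'#  or -1 if the project level "Barcode" locator found no such barcode.
Private Function FindAttachmentStartPage(pXDoc As CASCADELib.CscXDocument, ByVal MarkerText As String) As Long
   Dim i As Long
   Dim Alts As CASCADELib.CscXDocFieldAlternatives
   FindAttachmentStartPage = -1
   Set Alts = pXDoc.Locators.ItemByName("Barcode").Alternatives
   For i = 0 To Alts.Count - 1
      If Alts(i).Text = MarkerText Then
         '# Keep the earliest page on which the marker appears
         If FindAttachmentStartPage < 0 Or Alts(i).PageIndex < FindAttachmentStartPage Then
            FindAttachmentStartPage = Alts(i).PageIndex
         End If
      End If
   Next i
End Function

Private Sub Document_BeforeClassifyXDoc(pXDoc As CASCADELib.CscXDocument, bSkip As Boolean)
   Dim i As Long
   Dim StartPage As Long
   StartPage = FindAttachmentStartPage(pXDoc, "000000")
   '# -1 means no attachment barcode was found, so OCR stays enabled everywhere
   If StartPage >= 0 Then
      For i = StartPage To pXDoc.CDoc.Pages.Count - 1
         pXDoc.CDoc.Pages(i).SuppressOCR = True
      Next i
   End If
End Sub
```

Unlike the original script, this variation would suppress OCR on the whole document if the attachment bar code were found on the first page. Keep the original check if that case should leave OCR enabled.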

Example Project: Extraction - Invoice Attachments (Created with V3.1 SP1)

Applies To: 

Product  Version  Category
AXPRO    4.0      Script
AXPRO    4.5      Script
AXPRO    5.0      Script
AXPRO    5.5      Script