Kofax

Documents are being automatically rotated in the wrong direction by OCR. There is some sideways text, but there is much more text in the correct direction. Why is it choosing this orientation?


QAID # 19794 Published

Question / Problem:

Documents are being automatically rotated in the wrong direction by OCR. There is some sideways text, but there is much more text in the correct direction. Why is it choosing this orientation?

Answer / Solution:

When the "Automatic Rotation" feature is enabled, the OCR engines in KTM perform a "quick" OCR on a few lines in each direction and settle on the first orientation that gives an acceptable result.

On some documents, such as fax sheets, sideways text sits closer to the edge of the image than the main text. In these cases the OCR engine rotates the page to match the sideways text.

This is a side effect of an otherwise reasonable design. The engine is optimized for speed, so it cannot afford to recognize enough text lines to judge the relative proportion of text in each orientation; it analyzes only the lines close to the edge.

Documents with this layout fall into the small proportion of images that are mis-rotated as a result, and should be considered an edge case.

The workaround is to use a script to erase a small border from the image (in memory), which forces the automatic rotation to look deeper into the image, where the correctly oriented text outweighs the sideways text.

The example code below can be added to the Project script.

NOTE: This only works at runtime; it does not work when testing in Project Builder.

Private Sub Document_BeforeClassifyXDoc(ByVal pXDoc As CASCADELib.CscXDocument, ByRef bSkip As Boolean)

    Dim oImage As CscImage
    Dim lMargin As Long

    'Width of the border to erase, in pixels
    lMargin = 100

    'Get the current bitonal image for page 1 (index 0)
    Set oImage = pXDoc.CDoc.Pages(0).GetBitonalImage(Project.ColorConversion)

    'Erase a margin around the edge of the image.
    'Arguments as used here: left, top, width, height
    oImage.EraseRect 0, 0, lMargin, oImage.Height                         'left edge
    oImage.EraseRect oImage.Width - lMargin, 0, lMargin, oImage.Height    'right edge
    oImage.EraseRect 0, 0, oImage.Width, lMargin                          'top edge
    oImage.EraseRect 0, oImage.Height - lMargin, oImage.Width, lMargin    'bottom edge

    'Release the object reference
    Set oImage = Nothing

End Sub
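
The fixed margin of 100 pixels in the script above may be too large for small or low-resolution images. As a minimal sketch, the margin could be limited to a fraction of the shorter side of the image before the EraseRect calls. The helper name ClampMargin is hypothetical (it is not part of the KTM scripting API), and the one-quarter limit is an arbitrary assumption that should be tuned to your documents.

```vb
'Hypothetical helper (not part of the KTM API).
'Returns lMargin limited to the range 0 .. (shorter image side \ 4),
'so that the erased strips can never cover the whole page.
Private Function ClampMargin(ByVal lMargin As Long, ByVal lWidth As Long, ByVal lHeight As Long) As Long

    Dim lMax As Long

    'Find the shorter side of the image
    lMax = lWidth
    If lHeight < lMax Then lMax = lHeight

    'Assumption: never erase more than a quarter of the shorter side per edge
    lMax = lMax \ 4

    If lMargin < 0 Then lMargin = 0
    If lMargin > lMax Then lMargin = lMax

    ClampMargin = lMargin

End Function
```

To use it, the margin would be assigned after the image has been retrieved rather than before, for example: lMargin = ClampMargin(100, oImage.Width, oImage.Height). On an image 200 pixels wide and 1000 pixels high this yields a margin of 50 pixels instead of 100.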

 

Applies to:

Product    Version    Category
AXPRO      5.5        Recognition Engines