File Size Increase When Converting PDF to TIFF
Issue
When electronically produced PDF files containing text are imported and converted into TIFF files, why does the file size of those pages increase significantly?
Cause
PDF (Portable Document Format) places text and other objects within page boundaries. The page itself is empty and is defined only by the coordinates of its four corners. Text is stored as single-byte or, in some cases, two-byte character codes, with either a reference to an external font or an embedded subset of font data. Text placement boundaries are defined by coordinate values. Embedded images are stored along with their placement coordinates. A PDF reader renders each page by filling the page boundaries with a background color and placing the text characters and embedded images at the designated coordinates, resulting in the appearance of a fully realized page.
TIFF is a raster image format. In raster images, the entire area of the image is divided into pixels, and each pixel is represented by 1 bit (bitonal), 8 bits (grayscale), or 24 bits (color) of data.
When Kofax VRS / ImageControls converts a PDF page into a TIFF image, it must translate the rendered page into raster image data, defining each pixel using 1 bit (black and white), 8 bits (grayscale), or 24 bits (color). Therefore, instead of blank space within a mathematical boundary, the image contents are represented by 200-300 of these pixels for every linear inch of image (40,000-90,000 pixels for every square inch). This increases the file size substantially, because every pixel on the page, including blank background, must be represented, and the pixel count grows with the square of the resolution.
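The arithmetic behind these figures can be sketched in a few lines of Python. This is an illustration only, not how Kofax VRS / ImageControls computes anything: the function name `raster_bytes` and the US Letter page size (8.5 x 11 inches) are assumptions made for the example, and the result is the uncompressed raster size. TIFF files are usually written with compression, so actual files will typically be smaller than these values.

```python
def raster_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Return the uncompressed raster size, in bytes, of one page.

    Hypothetical helper for illustration; page dimensions are in inches.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Each pixel row is padded up to a whole number of bytes.
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


if __name__ == "__main__":
    # Assumed page size: US Letter, 8.5 x 11 inches.
    for dpi in (200, 300):
        for label, bits in (("bitonal", 1), ("grayscale", 8), ("color", 24)):
            size = raster_bytes(8.5, 11, dpi, bits)
            print(f"{dpi} dpi {label}: {size:,} bytes")
```

Under these assumptions, a Letter page at 300 dpi requires about 1.05 MB of raster data as bitonal, 8.4 MB as grayscale, and 25.2 MB as color before compression, whereas the text of the same page in a PDF may occupy only a few kilobytes.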
The conversion of PDF document data into raster image data, and the accompanying file size increase, is the expected behavior when converting PDF files to TIFF image files using Kofax VRS / ImageControls or any other PDF-to-TIFF converter.
If the PDF document pages all contain full-page embedded images, and no upscaling is performed during the conversion, the resulting TIFF image file size should be reasonably comparable to the original PDF document file size.
Solution
This behavior is considered to be "as designed".
Level of Complexity
Easy
Applies to
Product | Version | Build | Environment | Hardware |
---|---|---|---|---|
Kofax VRS | 5.2, 5.1.2, 5.1.1, 5.1 | ALL | ALL | N/A |
Kofax Express | 3.3, 3.2 | ALL | ALL | N/A |
Kofax Capture | 11.1, 11.0, 10.2, 10.1, 10.0 | ALL | ALL | N/A |
Kofax Import Connector | 2.10, 2.9, 2.8, 2.7 | ALL | ALL | N/A |
References
N/A