Kofax

Field extraction

Article # 3021256
 

Fields in ReadSoft Online

Fields are extracted using default document type settings or customer-specific learning through Knowledge Processing (KP).

  • Unknown supplier: For first-time recognition, the system default document type settings and system default field formats are used.
  • Known supplier: When a document is received from a known supplier (i.e. the customer has processed documents from that supplier before), a combination of default document type settings and KP is used.

Validations are performed in the interpret engine, in KP, and in ReadSoft Online Office. Some validations are selectable and are found in Extraction service on each document type in the Rules tab. Validations in the interpret engine and KP cannot be adjusted or disabled by users. 

Field extraction with Knowledge Processing (KP)

For known suppliers, customers have dedicated KP databases. When a customer uses Supplier Master Data on buyer level, each buyer has a separate KP database. Field positions and the closest surrounding words are used to find the desired fields. Based on previous learning, values are first searched for in specific locations on every page. When a field is not found in the expected position, the surrounding words are used to search for the value in the entire document.
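The two-stage lookup described above can be sketched as follows. This is only an illustration, not ReadSoft Online code: the `Word` and `LearnedField` types, the coordinate tolerance, and the rule "take the word that follows the label" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    text: str
    page: int
    x: float
    y: float


@dataclass
class LearnedField:
    # Hypothetical learned record: expected position plus one nearby label word.
    x: float
    y: float
    label: str
    tolerance: float = 10.0


def find_field(words: List[Word], learned: LearnedField) -> Optional[str]:
    # Stage 1: look at the learned position on every page.
    for word in words:
        if (abs(word.x - learned.x) <= learned.tolerance
                and abs(word.y - learned.y) <= learned.tolerance):
            return word.text
    # Stage 2: search the whole document for the surrounding word
    # and take the word that follows it on the same page.
    for current, following in zip(words, words[1:]):
        if current.text == learned.label and following.page == current.page:
            return following.text
    return None
```

With a learned position of (90, 10) and the label "No:", a value found at that position is returned directly; if the layout has shifted, the label word is used to locate the value instead.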

Fields are extracted using default field formats. When a customer captures a value that does not match the default field format, a custom field format is calculated and saved in the customer KP database. The custom field format is used for future documents from that supplier.

The customer KP database is constantly updated by operator input (i.e. the system learns continuously), so manual optimization is unnecessary. In other words, optimization happens when operators verify documents correctly.

Common field types

InvoiceCredit

Extracted using system default document type settings and validated using data from the customer KP database. In the customer KP database, every known supplier has separate records for invoices and credit notes. When a document is received, it is matched against both stored records to validate whether it is an invoice or a credit note.

InvoiceNumber

Format validation checks are performed both against the system default field format and against field formats stored in the customer KP database.

  • The system default field format is X(2-20), allowing digits, uppercase letters, and special characters.
  • The last five invoice numbers and the last five credit note numbers for each known supplier are stored in the customer KP database. To pass format validation, the format and at least one digit need to match.
    • Example: If the stored invoice number is 1693682, the captured invoice number needs to be 7 numeric digits and match at least one of the stored digits in the same position. The value 5547400 results in a warning, but 5597400 is a match because the third digit is 9 in both the stored and the captured invoice number.
  • The field position is checked against the customer KP database. If there is a significant difference in position, a warning is generated.
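The invoice number example above can be reproduced with a small sketch. The `format_signature` helper and the way a "format" is represented here (N for a digit, A for a letter, other characters kept as they are) are assumptions for illustration; the article does not describe how KP stores formats internally.

```python
from typing import List


def format_signature(value: str) -> str:
    # Hypothetical format representation: N = digit, A = letter,
    # any other character is kept unchanged.
    parts = []
    for ch in value:
        if ch.isdigit():
            parts.append("N")
        elif ch.isalpha():
            parts.append("A")
        else:
            parts.append(ch)
    return "".join(parts)


def matches_stored(captured: str, stored_numbers: List[str]) -> bool:
    # Passes when some stored number has the same format and shares
    # at least one digit with the captured value in the same position.
    for stored in stored_numbers:
        if format_signature(captured) != format_signature(stored):
            continue
        if any(c == s and c.isdigit() for c, s in zip(captured, stored)):
            return True
    return False
```

With the stored number 1693682, `matches_stored("5597400", ["1693682"])` passes because of the shared 9 in the third position, while `matches_stored("5547400", ["1693682"])` does not and would correspond to the warning case.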

InvoiceDate

The field value is extracted using the field position and field format from the customer KP database. When the field is not found in the expected position, the surrounding words are used to search for the value in the entire document. If a date still cannot be found, the system default date formats are used to search for a date in the document.

InvoiceDueDate

The field value is extracted using the field position and field format from the customer KP database. When the field is not found in the expected position, the surrounding words are used to search for the value in the entire document. If a due date still cannot be found, the system default date formats are used to search for a due date in the document.

InvoiceOrderNumber

The order number field is described in detail here

InvoiceTotalVatRatePercent

The field value is calculated and compared with the allowed VAT rates for the document type. A mismatch generates a validation error.
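A minimal sketch of such a check is shown below. The article does not state how the rate is calculated, so the formula (VAT amount divided by net amount, times 100), the rounding tolerance, and the function names are assumptions for illustration only.

```python
from typing import Iterable


def vat_rate_percent(net_amount: float, vat_amount: float) -> float:
    # Assumed calculation: VAT as a percentage of the net amount.
    return vat_amount / net_amount * 100


def is_allowed_vat_rate(net_amount: float, vat_amount: float,
                        allowed_rates: Iterable[float],
                        tolerance: float = 0.1) -> bool:
    # A zero net amount gives no meaningful rate, so treat it as a mismatch.
    if net_amount == 0:
        return False
    rate = vat_rate_percent(net_amount, vat_amount)
    # The tolerance absorbs rounding differences on the printed amounts.
    return any(abs(rate - allowed) <= tolerance for allowed in allowed_rates)
```

For a document type that allows 25, 12, and 6 percent, a net amount of 100.00 with 25.00 VAT passes, while 19.00 VAT on the same net amount would be reported as a mismatch.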

InvoiceCurrency

The currency is determined using the following steps:

  1. If a currency is present on the document it will always be used.
  2. If no currency is found on the document, the value from customer KP database is used.
  3. If no KP data is found, the default currency specified on the document type is used.
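The order of precedence in the steps above can be expressed as a simple fallback chain. The function and parameter names are hypothetical and chosen only for this sketch.

```python
from typing import Optional


def determine_currency(document_currency: Optional[str],
                       kp_currency: Optional[str],
                       default_currency: str) -> str:
    # 1. A currency found on the document always wins.
    if document_currency:
        return document_currency
    # 2. Otherwise use the value learned in the customer KP database.
    if kp_currency:
        return kp_currency
    # 3. Otherwise fall back to the document type default.
    return default_currency
```

For example, `determine_currency(None, "EUR", "SEK")` returns the KP value because nothing was found on the document itself.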

SupplierBankAccountNumber

Format validation checks are performed both against the system default field format and against field formats stored in customer KP database.

  • System default field format varies between countries but in general it allows numeric digits only. 
  • When a customer is trying to capture a value which does not match the default field format, a custom field format is calculated and saved in customer KP database. The custom field format is used for future documents from that supplier. 
  • The last verified account number for each supplier is stored in customer KP database. In order to pass validation, the captured value must be an exact match to the stored value.

For Swedish document types, there are additional validations for Plusgiro and Bankgiro numbers, which use checksums to validate the number. This validation is selectable in Extraction service in the Rules tab for Swedish document types.
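Swedish Plusgiro and Bankgiro numbers are commonly validated with a modulus 10 (Luhn) check digit. The sketch below shows that general technique; whether ReadSoft Online implements the check in exactly this way is an assumption, and `luhn_valid` is a name invented for this example.

```python
def luhn_valid(number: str) -> bool:
    # Ignore separators such as hyphens and spaces.
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    # The number is valid when the sum is a multiple of 10.
    return total % 10 == 0
```

The generic Luhn test number 79927398713 passes this check, and changing its last digit makes it fail.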

SupplierIBAN

Format validation checks are performed both against the system default field format and against field formats stored in customer KP database.

  • The system default field format is A(2)N(2)X(10-30), digits and capital letters, for all European countries except Belgium and the Netherlands, for which there are stricter field formats.
  • When a customer is trying to capture a value which does not match the default field format, a custom field format is calculated and saved in customer KP database. The custom field format is used for future documents from that supplier. 
  • The last verified IBAN number for each supplier is stored in customer KP database. In order to pass validation, the captured value must be an exact match to the stored value.
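As an illustration, the default format and the stored-value check can be sketched with a regular expression. Reading A(2)N(2)X(10-30) as two capital letters, two digits, and 10 to 30 capital letters or digits is an interpretation of the notation, and the function names are hypothetical.

```python
import re
from typing import Optional

# Assumed reading of A(2)N(2)X(10-30): 2 letters, 2 digits,
# then 10 to 30 capital letters or digits.
DEFAULT_IBAN_FORMAT = re.compile(r"[A-Z]{2}[0-9]{2}[A-Z0-9]{10,30}")


def matches_default_format(value: str) -> bool:
    return DEFAULT_IBAN_FORMAT.fullmatch(value) is not None


def passes_stored_value_check(captured: str, stored: Optional[str]) -> bool:
    # Mirrors the rule above: the captured value must equal the last
    # verified value exactly. With no stored value there is nothing to compare.
    if stored is None:
        return True
    return captured == stored
```

This covers only the generic format. The stricter formats for Belgium and the Netherlands, and any custom formats learned in KP, are not modeled here.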

SupplierVATRegistrationNumber

Format validation checks are performed both against the system default field format and against field formats stored in customer KP database.

  • The field format is country-specific by default, which means that it is primarily meant to match only VAT registration numbers from that country.
  • When a customer is trying to capture a value which does not match the default field format, a custom field format is calculated and saved in customer KP database. The custom field format is used for future documents from that supplier. 
  • The last verified supplier VAT reg number for each supplier is stored in customer KP database. In order to pass validation, the captured value must be an exact match to the stored value.

Level of Complexity 

Moderate

Applies to  

Product: ReadSoft Online
Version: Current

References

Information about Supplier identification
https://knowledge.kofax.com/ReadSoft...identificatio
