Kofax

Field extraction

Article # 3021256

Fields in AP Essentials (formerly ReadSoft Online)

Fields are extracted using default document type settings or customer-specific learning through Knowledge Processing (KP).

  • Unknown supplier: For first-time recognition, the system default document type settings and system default field formats are used.
  • Known supplier: When a document is received from a known supplier (i.e. the customer has processed documents from the supplier before), a combination of default document type settings and KP is used.

Validations are performed in the interpret engine, in KP, and in AP Essentials (formerly ReadSoft Online) Office. Some validations are selectable; these are found in the Extraction service, on the Rules tab of each document type. Validations in the interpret engine and KP cannot be adjusted or disabled by users.

 

Field extraction with Knowledge Processing (KP)

For known suppliers, each customer has one or more dedicated KP databases. When a customer uses Supplier Master Data on buyer level, each buyer has a separate KP database. KP uses the field position(s) and the closest surrounding words to find the desired fields. Values are first searched for in specific locations on every page, based on previous learning. When a field is not found in the presumed position, the surrounding words are used to search for the value in the entire document.
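
The two-stage lookup described above can be sketched as follows. This is a simplified illustration, not the KP implementation: the `Word` and `LearnedField` records, the `tolerance` parameter, and the rule "take the word that follows the anchor word" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Word:
    text: str
    page: int
    x: float
    y: float


@dataclass
class LearnedField:
    # Hypothetical KP record: the learned position plus the closest label word.
    x: float
    y: float
    anchor: str
    tolerance: float = 5.0


def find_field(words: Sequence[Word], learned: LearnedField) -> Optional[str]:
    # Step 1: look at the learned position on every page.
    for w in words:
        if (abs(w.x - learned.x) <= learned.tolerance
                and abs(w.y - learned.y) <= learned.tolerance):
            return w.text
    # Step 2: search the whole document for the anchor word
    # and take the word that follows it.
    for label, value in zip(words, words[1:]):
        if label.text.casefold() == learned.anchor.casefold():
            return value.text
    return None
```

With a learned position of (150, 50) and the anchor word `Number:`, a document whose layout has shifted still yields the value through step 2, as long as the label is present somewhere in the document.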

Fields are extracted using default field formats. When a customer captures a value that does not match the default field format, a custom field format is calculated and saved in the customer KP database. The custom field format is then used for future documents from that supplier.

The customer KP database is constantly updated by operator input (i.e. the system learns continuously), so manual optimization is unnecessary. In other words, optimization happens when operators verify documents correctly.

 

Common field types

InvoiceCredit

Classified using system default document type settings and validated using data from the customer KP database.

If the word "credit" (or its equivalent in the local language) appears on the document, the document is classified as a credit note.

In the customer KP database, every known supplier has separate records for invoices and credit notes. When a document is received, it is matched against both stored records to validate whether it is an invoice or a credit note.

 

InvoiceNumber

Format validation checks are performed both against the system default field format and against field formats stored in customer KP database.

  • The system default field format is X(2-20): 2 to 20 characters, which may be digits, uppercase letters and special characters.
  • The last five invoice numbers and the last five credit note numbers for each known supplier are stored in the customer KP database. To pass format validation, the captured number must have the same format as a stored number and match it in at least one digit.
    • Example: If the stored invoice number is 1693682, the captured invoice number needs to be 7 numeric digits and match at least one of the stored digits in the same position. The value 5547400 results in a warning, but 5597400 is a match, since the third digit is 9 in both the stored and the captured invoice number.
  • The field position is checked against the customer KP database. If there is a significant difference in position, a warning is generated.
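
The format checks above can be illustrated with a short sketch. It rests on assumptions that the article does not spell out: X(2-20) is read as 2 to 20 characters that are not whitespace or lowercase letters, "same format" is read as the same sequence of character classes, and a digit only counts as a match when it sits in the same position. The names `DEFAULT_FORMAT`, `shape` and `matches_history` are hypothetical.

```python
import re
from typing import Iterable

# Assumed reading of X(2-20): 2 to 20 characters, no whitespace, no lowercase letters.
DEFAULT_FORMAT = re.compile(r"[^\sa-z]{2,20}")


def shape(value: str) -> str:
    """Reduce a value to character classes: N for a digit, A for a letter,
    any other character is kept as-is."""
    return "".join(
        "N" if c.isdigit() else "A" if c.isalpha() else c for c in value
    )


def matches_history(captured: str, stored: Iterable[str]) -> bool:
    """True if a stored number has the same shape as the captured value
    and shares at least one digit with it in the same position."""
    for old in stored:
        if shape(old) != shape(captured):
            continue
        if any(a == b and a.isdigit() for a, b in zip(old, captured)):
            return True
    return False
```

Applied to the example above, `matches_history("5597400", ["1693682"])` is true because of the shared 9 in the third position, while `matches_history("5547400", ["1693682"])` is false and would correspond to a warning.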

For documents that are missing an invoice number, the system can learn to combine other values to create a unique invoice number. For the system to add an additional value automatically, the value must first be added manually during verification. It must be either a date (MMYY) or a value of at least 6 characters; in the latter case, the value has to be found on the invoice. Example:

  • Value captured in InvoiceNumber field: 123456
    Added manually: 0321 (March 2021)
    System will learn to include the date automatically: 123456 0321
    More detailed info in Help
  • Value captured in InvoiceNumber field: 123456
    Value from another part of invoice: ABC123
    System will learn to combine the two fields into one: 123456 ABC123
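
The conditions for learning a combined invoice number can be expressed as a small check. This is a sketch only: the function names are hypothetical, MMYY is assumed to mean a two-digit month (01 to 12) followed by a two-digit year, and the single space used as separator is taken from the examples above.

```python
import re


def can_learn_addition(added: str, invoice_text: str) -> bool:
    """Check whether a manually added value qualifies for automatic combination:
    either a date in MMYY form, or at least 6 characters found on the invoice."""
    is_mmyy = re.fullmatch(r"(0[1-9]|1[0-2])\d{2}", added) is not None
    is_on_invoice = len(added) >= 6 and added in invoice_text
    return is_mmyy or is_on_invoice


def combine(invoice_number: str, addition: str) -> str:
    """Join the captured invoice number and the additional value."""
    return f"{invoice_number} {addition}"
```

For the first example above, `combine("123456", "0321")` gives `123456 0321`.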

 

InvoiceDate and InvoiceDueDate

The field value is extracted using the field position and field format from the customer KP database. When a field is not found in the presumed position, the surrounding words are used to search for the value in the entire document. If a date still cannot be found, the system default date formats are used to search for a date in the document.

For dates where the full month name is spelled out, the local language of the document type and English are supported. For example, for a German document type, German and English month names are supported.
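
The language rule for spelled-out month names might look like the sketch below. The month tables, the single "day month year" pattern and the function name are assumptions for illustration; the actual date formats supported by the system are not listed in this article.

```python
import re
from datetime import date
from typing import Optional

# Illustrative month tables; only English and German are filled in here.
MONTHS = {
    "en": ["january", "february", "march", "april", "may", "june",
           "july", "august", "september", "october", "november", "december"],
    "de": ["januar", "februar", "märz", "april", "mai", "juni",
           "juli", "august", "september", "oktober", "november", "dezember"],
}

# Matches "12 March 2021" or "12. März 2021" (day, spelled-out month, year).
SPELLED_DATE = re.compile(r"(\d{1,2})\.?\s+([^\W\d_]+)\s+(\d{4})")


def parse_spelled_date(text: str, local_language: str) -> Optional[date]:
    match = SPELLED_DATE.search(text)
    if match is None:
        return None
    day, month_name, year = match.groups()
    # The local language of the document type is tried first, then English.
    for language in (local_language, "en"):
        names = MONTHS.get(language, [])
        if month_name.lower() in names:
            month = names.index(month_name.lower()) + 1
            try:
                return date(int(year), month, int(day))
            except ValueError:  # impossible date, e.g. 31 February
                return None
    return None
```

With `local_language="de"`, both `12. März 2021` and `12 March 2021` parse to the same date, while a French month name such as `mars` is not recognized.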

 

InvoiceOrderNumber

The order number field is described in detail here.

 

InvoiceTotalVatRatePercent

The field value is calculated and compared with the allowed VAT rates for the document type. A mismatch generates a validation error.
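
The article does not state how the rate is calculated. A plausible sketch, assuming the rate is derived from the net amount and the VAT amount and rounded to one decimal before the comparison, is shown below. The function names and the list of allowed rates are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Iterable, Optional


def vat_rate_percent(net: Decimal, vat: Decimal) -> Optional[Decimal]:
    """Derive the VAT rate in percent, rounded to one decimal (assumed rule)."""
    if net == 0:
        return None
    return (vat / net * 100).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


def vat_rate_is_allowed(net: Decimal, vat: Decimal,
                        allowed_rates: Iterable[Decimal]) -> bool:
    """False corresponds to a validation error on the field."""
    rate = vat_rate_percent(net, vat)
    return rate is not None and any(rate == allowed for allowed in allowed_rates)
```

For example, a net amount of 200.00 with 50.00 VAT gives 25.0, which passes when 25 is among the allowed rates for the document type.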

 

InvoiceCurrency

The currency is determined in the following order:

  1. If a currency is present on the document, it is always used.
  2. If no currency is found on the document, the value from the customer KP database is used.
  3. If no KP data is found, the default currency specified on the document type is used.
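
The order of precedence above amounts to a simple fallback chain. The function and parameter names in this sketch are hypothetical, and a missing value is represented as `None` or an empty string.

```python
from typing import Optional


def determine_currency(on_document: Optional[str],
                       from_kp: Optional[str],
                       document_type_default: str) -> str:
    """Return the first available currency: document, then KP, then default."""
    return on_document or from_kp or document_type_default
```

A currency printed on the document always wins, even when the KP database holds a different value for that supplier.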

 

SupplierBankAccountNumber

Format validation checks are performed both against the system default field format and against field formats stored in customer KP database.

  • The system default field format varies between countries, but in general it allows digits only.
  • When a customer captures a value that does not match the default field format, a custom field format is calculated and saved in the customer KP database. The custom field format is then used for future documents from that supplier.
  • The last verified account number for each supplier is stored in the customer KP database. To pass validation, the captured value must exactly match the stored value.

For Swedish document types, there are additional validations for Plusgiro/Bankgiro numbers, which use checksums to validate the number. This validation is selectable in the Extraction service, on the Rules tab for Swedish document types.
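
The article does not name the checksum algorithm. Bankgiro and Plusgiro numbers are commonly validated with the modulus 10 (Luhn) check, so the sketch below assumes that algorithm; it is not taken from the product. The function name is hypothetical.

```python
def luhn_valid(number: str) -> bool:
    """Modulus 10 (Luhn) check. Non-digit characters such as '-' are ignored."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0
```

Under this assumption, `991-2346` passes the check, while changing the last digit to 7 makes it fail.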

 

SupplierIBAN

Format validation checks are performed both against the system default field format and against field formats stored in customer KP database.

  • The system default field format is A(2)N(2)X(10-30), digits and capital letters, for all European countries except Belgium and the Netherlands, for which there are stricter field formats.
  • When a customer captures a value that does not match the default field format, a custom field format is calculated and saved in the customer KP database. The custom field format is then used for future documents from that supplier.
  • The last verified IBAN for each supplier is stored in the customer KP database. To pass validation, the captured value must exactly match the stored value.
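
As an illustration, the default format and the exact-match rule can be written as below. The regular expression is an assumed reading of A(2)N(2)X(10-30): two capital letters, two digits, then 10 to 30 digits or capital letters. The function name and the returned labels are hypothetical, and the stricter Belgian and Dutch formats and learned custom formats are not covered.

```python
import re
from typing import List, Optional

# Assumed reading of A(2)N(2)X(10-30).
DEFAULT_IBAN_FORMAT = re.compile(r"[A-Z]{2}[0-9]{2}[A-Z0-9]{10,30}")


def check_iban(captured: str, last_verified: Optional[str]) -> List[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    if DEFAULT_IBAN_FORMAT.fullmatch(captured) is None:
        problems.append("format")
    # The stored value only exists for suppliers that have been verified before.
    if last_verified is not None and captured != last_verified:
        problems.append("mismatch")
    return problems
```

A well-formed IBAN that differs from the last verified value for the supplier yields `["mismatch"]`, which corresponds to the exact-match rule in the last bullet above.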

 

SupplierVATRegistrationNumber

Format validation checks are performed both against the system default field format and against field formats stored in customer KP database.

  • The field format is country-specific by default, which means that it is primarily meant to match only VAT registration numbers from that country.
  • When a customer captures a value that does not match the default field format, a custom field format is calculated and saved in the customer KP database. The custom field format is then used for future documents from that supplier.
  • The last verified supplier VAT registration number for each supplier is stored in the customer KP database. To pass validation, the captured value must exactly match the stored value.

 

Level of Complexity 

Moderate

 

Applies to  

Product: AP Essentials (formerly ReadSoft Online)
Version: Current

 

References

Information about Supplier identification
https://knowledge.kofax.com/ReadSoft...identificatio
