Field format

You can specify the field format of custom fields, and some buyer fields, using regular expressions or a "compact format" that is specific to Kofax AP Essentials. When you specify the field format in the Field view, you determine how many characters of a certain type are allowed. When you specify a field format, the field must match the format you specify in order to pass validation.

The compact field format is used during extraction to help find the correct field. Regular expressions are used only for validation, with one exception: the InvoiceOrderNumber field can also use regular expressions for extraction.

When specifying the field format, try to be as strict as possible. For example, if the field contains only numbers, exclude letters from the format specification. Likewise, if the field is always a specific length, specify the length as well.

You can also specify the field format for these predefined fields:

  • BuyerContactPersonName

  • BuyerContactReference

  • InvoiceOrderNumber

  • LIT_OrderNumber

Regular expression

Specifying the field format using regular expressions requires prior knowledge of regular expressions, which this help does not cover. You can use sites such as Regular Expressions 101 to test your regular expressions. The Field view also has a button for testing.

Examples

Regular expression Meaning
^\d{7}$

A numeric field exactly seven digits long.

Regex Meaning
^ Anchors the match to the start of the string.
\d Matches any digit (0-9).
{7} Specifies that exactly 7 digits should appear.
$ Anchors the match to the end of the string.
^\d{5,7}$

A numeric field, five to seven digits long.

Regex Meaning
^ Anchors the match to the start of the string.
\d Matches any digit (0-9).
{5,7} Specifies a quantifier that matches the previous \d digit pattern between 5 and 7 times.
$ Anchors the match to the end of the string.
^[A-Za-z]{2}\d{3,5}$

A field that is five to seven characters long, where the first two characters are letters, and the remaining characters are numerical.

Regex Meaning
^ Anchors the match to the start of the string.
[A-Za-z] Matches any uppercase or lowercase letter.
{2} Specifies that exactly 2 letters should appear.
\d Matches any digit (0-9).
{3,5} Specifies a quantifier that matches the previous \d digit pattern between 3 and 5 times.
$ Anchors the match to the end of the string.
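The three validation patterns above can be tried outside the product before you enter them in the Field view. The following is a minimal sketch in Python, not the product's own validation logic. The helper name passes_validation is hypothetical, and the re.ASCII flag is an assumption added so that \d matches only 0-9, as described in the tables.

```python
import re

# Patterns copied from the examples above.
SEVEN_DIGITS = r"^\d{7}$"
FIVE_TO_SEVEN_DIGITS = r"^\d{5,7}$"
TWO_LETTERS_THEN_DIGITS = r"^[A-Za-z]{2}\d{3,5}$"


def passes_validation(pattern, value):
    """Hypothetical helper: True if the value matches the pattern."""
    # re.ASCII restricts \d to the digits 0-9 (an assumption for this sketch).
    return re.search(pattern, value, flags=re.ASCII) is not None


if __name__ == "__main__":
    for pattern, value in [
        (SEVEN_DIGITS, "1234567"),
        (SEVEN_DIGITS, "123456"),
        (FIVE_TO_SEVEN_DIGITS, "12345"),
        (TWO_LETTERS_THEN_DIGITS, "AB123"),
        (TWO_LETTERS_THEN_DIGITS, "A1234"),
    ]:
        print(pattern, value, passes_validation(pattern, value))
```

Because each pattern is anchored with ^ and $, a value with extra characters before or after the expected content does not pass.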

Named groups for purchase orders

You can use named groups to validate and manipulate purchase order numbers (header and line item). This is helpful, for example, if you want to exclude certain characters from extracted purchase order numbers. To accomplish this, you specify the field type name as a named group in your regular expression. The named group corresponds to the extracted field value. Specify InvoiceOrderNumber as the named group for PO numbers in your regular expression. For line item PO numbers, specify LIT_OrderNumber. If you specify anything other than the field type name as a named group, it is ignored. You can only use named groups for order number fields, InvoiceOrderNumber and LIT_OrderNumber.

Examples

Regular expression Meaning
^(?:.*\/)(?<InvoiceOrderNumber>\d+)$

Extract the digits that follow the last instance of a forward slash.

Regex Meaning
^ Anchors the match to the start of the string.
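(?: ) Groups the enclosed tokens without capturing them (a non-capturing group).
. Matches any character (except for line terminators).
* Matches the previous token any number of times.
\/ Matches a forward slash. Because .* matches as much as possible, this is the last forward slash in the value.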
(?<InvoiceOrderNumber>\d+) Captures one or more digits and assigns them to the named group, InvoiceOrderNumber, which is also the field type name for the order number header field.
$ Anchors the match to the end of the string.

This table shows example inputs and the values returned by the regular expression.

Captured field value Returned by the regular expression
ABC/123/1234567 1234567
AAA/123456789 123456789
AA/0000/123/123456789 123456789
123456 No match.
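The rows in this table can be reproduced with a short script. This is a sketch in Python rather than the product's extraction logic. Python writes named groups as (?P<name>...), so the pattern is adapted from the (?<name>...) form shown above, and the helper name returned_value is hypothetical.

```python
import re

# Same pattern as above, with the named group in Python's (?P<name>...) syntax.
PATTERN = re.compile(r"^(?:.*\/)(?P<InvoiceOrderNumber>\d+)$")


def returned_value(captured):
    """Hypothetical helper: the named group's value, or None if there is no match."""
    match = PATTERN.search(captured)
    return match.group("InvoiceOrderNumber") if match else None


if __name__ == "__main__":
    for captured in ["ABC/123/1234567", "AAA/123456789",
                     "AA/0000/123/123456789", "123456"]:
        print(captured, "->", returned_value(captured))
```

The last input returns None because the pattern requires at least one forward slash before the digits.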

^(\d-)(?<InvoiceOrderNumber>(\d{4}))$

Exclude the initial digit and dash from a field value matching the pattern X-XXXX.

Regex Meaning
^ Anchors the match to the start of the string.
( ) Groups the enclosed tokens. Here, the first group matches the leading digit and dash, which are not part of the returned value.
\d Matches any digit (0-9).
- Matches a dash.
(?<InvoiceOrderNumber>(\d{4})) Captures four digits and assigns them to the named group, InvoiceOrderNumber, which is also the field type name for the order number header field.
$ Anchors the match to the end of the string.

This table shows example inputs and the values returned by the regular expression.

Captured field value Returned by the regular expression
1-2345 2345
9-0001 0001
1-23456 No match.
2345 No match.

^(.[\/-]){0,1}(?<LIT_OrderNumber>(\d{6}))(-\d){0,1}$

Exclude the first character followed by a dash or forward slash if any exist, and exclude the last dash and digit at the end, if any exist, and return the remaining six digits.

Regex Meaning
^ Anchors the match to the start of the string.
. Matches any character (except for line terminators).
[\/-] Matches a forward-slash character or a dash.

{0,1} Matches the previous token zero or one times. In this example, the previous token is (.[\/-]), so a single character followed by a forward slash or dash is matched if it is present.
(?<LIT_OrderNumber>(\d{6})) Captures six digits and assigns them to the named group, LIT_OrderNumber, which is also the field type name for the order number line-item field.
(-\d){0,1} Matches a dash and a digit zero to one times.
$ Anchors the match to the end of the string.

This table shows example inputs and the values returned by the regular expression.

Captured field value Returned by the regular expression
A-123456-7 123456
A/123456-7 123456
123456-7 123456
123456 123456
A-123456 123456
X123456-7 No match.
1234567 No match.
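To check a header or line-item pattern against sample values before using it, a small helper can read the named group that corresponds to the field type name. The sketch below is written in Python and is only an illustration. The function names are hypothetical, and because Python spells named groups as (?P<name>...), the helper rewrites the (?<name>...) form used in this topic before compiling the pattern.

```python
import re


def to_python_syntax(pattern):
    """Rewrite (?<name>...) named groups as Python's (?P<name>...)."""
    return re.sub(r"\(\?<([A-Za-z_]\w*)>", r"(?P<\1>", pattern)


def returned_value(pattern, field_type, captured):
    """Hypothetical helper: value of the group named after the field type, or None."""
    match = re.search(to_python_syntax(pattern), captured)
    return match.group(field_type) if match else None


# Patterns copied from the two examples above.
HEADER = r"^(\d-)(?<InvoiceOrderNumber>(\d{4}))$"
LINE_ITEM = r"^(.[\/-]){0,1}(?<LIT_OrderNumber>(\d{6}))(-\d){0,1}$"

if __name__ == "__main__":
    for value in ["1-2345", "9-0001", "1-23456", "2345"]:
        print(value, "->", returned_value(HEADER, "InvoiceOrderNumber", value))
    for value in ["A-123456-7", "123456", "X123456-7", "1234567"]:
        print(value, "->", returned_value(LINE_ITEM, "LIT_OrderNumber", value))
```

A result of None corresponds to the "No match." rows in the tables.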

Compact format

You can specify the compact field format, for example, using the format C(f-t), where C stands for the type of character, f stands for "length from" and t stands for "length to". Spaces are not allowed in the format specification.

When you specify a compact format in the Field view, the field length you specify cannot exceed 511 characters. For example, you cannot specify N(1-512).

Symbol Meaning Example
N Numeric characters Example: N(7)

Meaning: A seven-digit numeric field.

Field: 1234567

A Alphabetic characters Example: A(2-5)

Meaning: An alphabetic field containing two to five characters.

Field: Abc

X Alphanumeric characters and special characters such as "#", ">", etc. Example: X(5)

Meaning: A field containing five characters, each of which can be a letter, number, or a special character.

Field: Abc3D

W White space Example: N(3)WN(2)

Meaning: A field containing three digits, a space, and two more digits.

Field: 123 45

^ Disallowed character Example: N[^0]

Meaning: Any single digit except 0.

Field: 2

( ) Required Example: A(5)

Meaning: The field must contain five alphabetic characters.

Field: Abcde

[ ] Specified character (case sensitive) Example: N[139]

Meaning: A single digit that can be 1, 3, or 9.

Field: 3

{ } Multiple formats Example: {N(5)|N(10)}

Meaning: Either a five-digit field or a ten-digit field.

Field: 12345

| The "or" operator used to separate multiple formats when more than one is allowed Example: {N(5)|N(10)}

Meaning: Either a five-digit field or a ten-digit field.

Field: 1234567890
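One way to reason about a compact format is to translate it into an equivalent regular expression. The following Python sketch does this for the symbols in the table. It is an interpretation for illustration only, not the product's parser. All function names are hypothetical, and the regex chosen for each type symbol is an assumption, in particular \S for X and \s for W.

```python
import re

# Assumed regex equivalents for each type symbol (illustrative only).
BASE = {
    "N": "[0-9]",     # numeric characters
    "A": "[A-Za-z]",  # alphabetic characters
    "X": r"\S",       # assumption: any non-whitespace character
    "W": r"\s",       # assumption: any white-space character
}

# One element: type symbol, optional [chars] or [^chars], optional (n) or (f-t).
ELEMENT = re.compile(r"([NAXW])(?:\[(\^?)([^\]]+)\])?(?:\((\d+)(?:-(\d+))?\))?")
MAX_LENGTH = 511  # limit stated for the Field view


def _sequence(spec):
    """Translate one format such as N(3)WN(2) into a regex fragment."""
    parts, pos = [], 0
    while pos < len(spec):
        m = ELEMENT.match(spec, pos)
        if m is None:
            raise ValueError(f"cannot parse {spec!r} at position {pos}")
        symbol, caret, chars, low, high = m.groups()
        unit = BASE[symbol]
        if chars:
            # A lookahead narrows the base type to (or away from) the listed characters.
            lookahead = "(?!" if caret else "(?="
            unit = f"(?:{lookahead}[{re.escape(chars)}]){unit})"
        if low:
            if int(high or low) > MAX_LENGTH:
                raise ValueError("length cannot exceed 511")
            unit += "{" + low + ("," + high if high else "") + "}"
        parts.append(unit)
        pos = m.end()
    return "".join(parts)


def compact_to_regex(spec):
    """Translate a compact format, including {a|b} alternatives, into a regex."""
    if spec.startswith("{") and spec.endswith("}"):
        return "(?:" + "|".join(_sequence(s) for s in spec[1:-1].split("|")) + ")"
    return _sequence(spec)


def matches(spec, value):
    """True if the whole value satisfies the compact format."""
    return re.fullmatch(compact_to_regex(spec), value) is not None


if __name__ == "__main__":
    print(compact_to_regex("N(7)"))         # [0-9]{7}
    print(matches("N(3)WN(2)", "123 45"))   # True
    print(matches("{N(5)|N(10)}", "123456"))  # False
```

An element without a length in parentheses is treated as exactly one character, which is consistent with the N[^0] and N[139] rows in the table.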

By using brackets, you can limit the valid characters in a format specification.

Examples

Compact format Meaning
N[1357] One digit: 1, 3, 5, or 7.
N[1357](3) Three digits, each of which can be 1, 3, 5, or 7.
X[#]N[4] A "#" character and then 4.
X[AaBb] One character: A, a, B, or b.

An exclusion symbol, "^", is available to disallow characters. To disallow characters, the "^" symbol must be in the first position inside the brackets. All characters following the "^" are disallowed.

If the "^" is not in the first position inside the brackets, then "^" is a valid character.

Examples

Compact format Meaning
A[^QVZ] Any letter of the alphabet except Q, V, or Z.
X[^#](2-4) Two to four characters, none of which can be a hash ("#").
X[#<^] One of three possible characters: "#", "<", or "^".
X[#<^135] One of six possible characters: "#", "<", "^", 1, 3, or 5.
X[#<^^] An invalid specification.
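The position rule for "^" can be expressed in a few lines. This Python sketch only interprets the text between the brackets. The function names are hypothetical, and the sketch neither checks the type symbol in front of the brackets nor tries to detect invalid specifications.

```python
def parse_brackets(content):
    """Return (excluded, characters) for the text between [ and ]."""
    if content.startswith("^"):
        # A leading "^" disallows every character that follows it.
        return True, set(content[1:])
    # Anywhere else, "^" is an ordinary allowed character.
    return False, set(content)


def is_allowed(content, character):
    """True if the character is permitted by the bracket content."""
    excluded, characters = parse_brackets(content)
    return (character not in characters) if excluded else (character in characters)


if __name__ == "__main__":
    print(is_allowed("^QVZ", "A"))  # True: A is not in the excluded set
    print(is_allowed("^QVZ", "Q"))  # False: Q is excluded
    print(is_allowed("#<^", "^"))   # True: "^" is not first, so it is allowed
```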