Processors

A processor is an extraction template that defines what structured data to pull from your documents. It specifies the fields, their types, and how they map to regions in a PDF.

To create a processor:

  1. Navigate to Configure > Processors and click New Processor.
  2. Fill in:
    • Name — a descriptive name (e.g. “Invoice Processor”, “Service Agreement”).
    • Description — optional context about what documents this processor handles.
    • Tags — optional labels for organization and filtering.
  3. Click Save to create the processor, then begin adding fields.

Each field represents a single piece of data to extract. For every field, configure:

| Property | Description |
| --- | --- |
| Name | Identifier used in expressions and results (e.g. invoice_number). Use snake_case. |
| Type | One of text, number, money, date, or boolean. See Field Types. |
| Required | Whether extraction must produce a value for this field. |
| Controlled Values | Optional link to a Constant that restricts valid values. |
| Group | Optional group name for repeating field sections. |
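
As a rough illustration of the properties above, the sketch below models a field definition in Python and checks the two constraints the text states: snake_case names and one of the five listed types. The `FieldDef` class and its `problems` method are hypothetical names used only for this example; they are not part of the product's API.

```python
import re
from dataclasses import dataclass
from typing import Optional

# The five values listed for the Type property.
FIELD_TYPES = {"text", "number", "money", "date", "boolean"}
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


@dataclass
class FieldDef:
    """Hypothetical in-memory model of one processor field."""
    name: str
    type: str
    required: bool = False
    controlled_values: Optional[str] = None  # name of a linked Constant, if any
    group: Optional[str] = None              # group name for repeating sections

    def problems(self) -> list[str]:
        """Return readable problems; an empty list means the field looks valid."""
        issues = []
        if not SNAKE_CASE.match(self.name):
            issues.append(f"name {self.name!r} is not snake_case")
        if self.type not in FIELD_TYPES:
            issues.append(f"unknown type {self.type!r}")
        return issues


invoice_number = FieldDef(name="invoice_number", type="text", required=True)
bad_field = FieldDef(name="Invoice Total", type="currency")

print(invoice_number.problems())
print(bad_field.problems())
```

A check like this can be run over a planned field list before entering it in the processor editor.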

Fields can be reordered via drag-and-drop.

Groups let you define repeating sections of fields — for example, line items on an invoice.

  • Assign multiple fields to the same group name to create a group.
  • Toggle Multiple on the group to allow multiple instances (e.g. many line items per document).
  • During review, grouped fields appear in an accordion. Reviewers can add or remove instances.
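
To make the grouping idea concrete, here is a minimal sketch of how fields that share a group name could be collected, and what a result with several instances might look like. The field names, the `fields_by_group` helper, and the result shape are all assumptions for illustration; the actual output format may differ.

```python
from collections import defaultdict

# Hypothetical field definitions as (field name, group name or None) pairs.
FIELDS = [
    ("invoice_number", None),
    ("item_description", "line_items"),
    ("item_quantity", "line_items"),
    ("item_amount", "line_items"),
]


def fields_by_group(fields):
    """Map each group name to the field names assigned to it, in order."""
    groups = defaultdict(list)
    for name, group in fields:
        if group is not None:
            groups[group].append(name)
    return dict(groups)


# Assumed result shape: an ungrouped field holds one value, while a group
# with Multiple enabled holds a list of instances, one per line item.
result = {
    "invoice_number": "INV-001",
    "line_items": [
        {"item_description": "Widget", "item_quantity": 2, "item_amount": "10.00"},
        {"item_description": "Gadget", "item_quantity": 1, "item_amount": "24.50"},
    ],
}

print(fields_by_group(FIELDS))
```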

Link a field to a Constant to restrict its extracted values to a predefined list. When a controlled-values constant is assigned:

  • The review UI shows a dropdown instead of a free-text input.
  • Extraction attempts to match each extracted value against an entry in the list.
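
The matching strategy the extraction step uses is not described here. As one plausible approach, the sketch below tries a case-insensitive exact match first and then falls back to fuzzy matching with Python's standard `difflib`. The function name, the `cutoff` default, and the sample list are assumptions made for this example.

```python
import difflib
from typing import Optional


def match_controlled_value(raw: str, allowed: list[str], cutoff: float = 0.8) -> Optional[str]:
    """Return the allowed value that best matches raw, or None if none is close enough."""
    # Index the allowed values by a case-insensitive key.
    by_key = {value.casefold(): value for value in allowed}
    key = raw.strip().casefold()
    if key in by_key:
        return by_key[key]
    # Fall back to the single closest candidate above the similarity cutoff.
    close = difflib.get_close_matches(key, list(by_key), n=1, cutoff=cutoff)
    return by_key[close[0]] if close else None


payment_terms = ["Net 30", "Net 60", "Due on receipt"]

print(match_controlled_value(" net 30 ", payment_terms))
print(match_controlled_value("Due on reciept", payment_terms))
print(match_controlled_value("wire transfer", payment_terms))
```

Returning `None` for values that do not match leaves the decision to a reviewer, who can pick the right entry from the dropdown.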

Improve extraction accuracy by providing labeled examples:

  1. In the processor editor, upload one or more sample PDFs.
  2. For each field, draw a selection on the PDF to indicate where the value appears.
  3. The labeled data helps the extraction model locate fields in similar documents.

Upload multiple samples to cover layout variations.

Link one or more Contracts to a processor so that validation runs automatically after extraction:

  1. In the processor editor, find the Contracts section.
  2. Select the contracts you want to apply.

When a job runs with this processor, all associated contracts are evaluated against each document.

To quickly set up fields from an existing spreadsheet:

  1. Click Import CSV on the processor editor.
  2. Upload a CSV file — each column header becomes a field name.
  3. Review the auto-created fields and adjust types as needed.
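
Because each column header becomes a field name, it can help to check what the headers look like as snake_case before importing. The sketch below is a standalone pre-processing step using Python's standard `csv` module; whether the importer normalizes headers itself is not stated here, so the normalization and both helper names are assumptions.

```python
import csv
import io
import re


def to_snake_case(header: str) -> str:
    """Lowercase a column header and join its alphanumeric runs with underscores."""
    return "_".join(re.findall(r"[a-z0-9]+", header.lower()))


def field_names_from_csv(text: str) -> list[str]:
    """Read only the header row and return one normalized name per column."""
    header = next(csv.reader(io.StringIO(text)), [])
    return [to_snake_case(column) for column in header]


sample = (
    "Invoice Number,Vendor Name,Invoice Date,Total (USD)\n"
    "INV-001,Acme,2024-01-31,100.00\n"
)

print(field_names_from_csv(sample))
# prints ['invoice_number', 'vendor_name', 'invoice_date', 'total_usd']
```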

Tips for designing processors:

  • Naming conventions: Use consistent snake_case names (e.g. vendor_name, invoice_date). Field names are referenced in logic block expressions as $field_name.
  • Iterative approach: Start with a few key fields, run a test job, review the results, then add more fields as needed.
  • Primary date: Mark one date field as the primary_date to use it for document-level date sorting and filtering.
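
Since expressions refer to fields as $field_name, a renamed or misspelled field can silently break an expression. The sketch below shows one way to list references that do not match any known field. Only the `$field_name` reference form comes from the text above; the expression syntax in the sample and the `unknown_references` helper are hypothetical.

```python
import re

# A reference is "$" followed by an identifier.
FIELD_REF = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def unknown_references(expression: str, field_names: set[str]) -> list[str]:
    """Return the sorted $references in expression that are not known field names."""
    referenced = set(FIELD_REF.findall(expression))
    return sorted(referenced - field_names)


fields = {"vendor_name", "invoice_date", "total_amount"}

# The expression syntax here is illustrative only.
print(unknown_references("$total_amount > 1000 and $vendorName != ''", fields))
# prints ['vendorName']
```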