Processors
A processor is an extraction template that defines what structured data to pull from your documents. It specifies the fields, their types, and how they map to regions in a PDF.
Creating a Processor
- Navigate to Configure > Processors and click New Processor.
- Fill in:
  - Name: a descriptive name (e.g. “Invoice Processor”, “Service Agreement”).
  - Description: optional context about what documents this processor handles.
  - Tags: optional labels for organization and filtering.
- Click Save to create the processor, then begin adding fields.
Defining Fields
Each field represents a single piece of data to extract. For every field, configure:
| Property | Description |
|---|---|
| Name | Identifier used in expressions and results (e.g. invoice_number). Use snake_case. |
| Type | One of text, number, money, date, or boolean. See Field Types. |
| Required | Whether extraction must produce a value for this field. |
| Controlled Values | Optional link to a Constant that restricts valid values. |
| Group | Optional group name for repeating field sections. |
Fields can be reordered via drag-and-drop.
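
As an illustration of the properties in the table above, a field definition can be modeled as a small record and checked before you enter it in the editor. This is a minimal sketch, not the product's actual schema: the `FieldDefinition` class, its attribute names, and the `problems` helper are hypothetical. Only the five type names and the snake_case recommendation come from this page.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# The five field types listed on this page.
FIELD_TYPES = {"text", "number", "money", "date", "boolean"}

# Lowercase words separated by single underscores, e.g. invoice_number.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


@dataclass
class FieldDefinition:
    """Hypothetical in-memory model of one processor field."""
    name: str
    type: str
    required: bool = False
    controlled_values: Optional[str] = None  # name of a linked Constant (assumed)
    group: Optional[str] = None              # group name for repeating sections

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means the field looks valid."""
        issues = []
        if not SNAKE_CASE.match(self.name):
            issues.append(f"name {self.name!r} is not snake_case")
        if self.type not in FIELD_TYPES:
            issues.append(f"unknown type {self.type!r}")
        return issues


fields = [
    FieldDefinition("invoice_number", "text", required=True),
    FieldDefinition("Invoice Total", "currency"),  # deliberately invalid
]
for field in fields:
    print(field.name, field.problems())
```

The second field is reported twice: once for its name and once for a type that is not in the list above.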
Field Groups
Groups let you define repeating sections of fields, such as line items on an invoice.
- Assign multiple fields to the same group name to create a group.
- Toggle Multiple on the group to allow multiple instances (e.g. many line items per document).
- During review, grouped fields appear in an accordion. Reviewers can add or remove instances.
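
To make the grouping rule concrete, the sketch below partitions a flat list of fields by group name and shows one plausible shape for a result when Multiple is enabled. The function `group_fields`, the `(name, group)` tuple layout, and the example result dictionary are all assumptions for illustration; this page does not document the actual result format.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def group_fields(
    fields: List[Tuple[str, Optional[str]]]
) -> Tuple[List[str], Dict[str, List[str]]]:
    """Split (field_name, group_name) pairs into ungrouped fields and named groups."""
    ungrouped: List[str] = []
    groups: Dict[str, List[str]] = defaultdict(list)
    for name, group in fields:
        if group is None:
            ungrouped.append(name)
        else:
            # Fields that share a group name end up in the same group.
            groups[group].append(name)
    return ungrouped, dict(groups)


fields = [
    ("invoice_number", None),
    ("description", "line_items"),
    ("quantity", "line_items"),
    ("unit_price", "line_items"),
]
ungrouped, groups = group_fields(fields)
print(ungrouped)  # ['invoice_number']
print(groups)     # {'line_items': ['description', 'quantity', 'unit_price']}

# Assumed (not documented) result shape with Multiple enabled:
# one dictionary per instance of the group.
example_result = {
    "invoice_number": "INV-001",
    "line_items": [
        {"description": "Widget", "quantity": 2, "unit_price": "4.50"},
        {"description": "Gadget", "quantity": 1, "unit_price": "12.00"},
    ],
}
```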
Controlled Values
Link a field to a Constant to restrict its extracted values to a predefined list. When a controlled-values constant is assigned:
- The review UI shows a dropdown instead of a free-text input.
- Extraction attempts to match values against the list.
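
This page does not specify how extracted values are matched against the list. As one possible interpretation, the sketch below matches case-insensitively after collapsing whitespace and returns the canonical entry from the list. The function name `match_controlled_value` and the normalization rule are assumptions, not the product's documented behavior.

```python
from typing import Optional, Sequence


def match_controlled_value(raw: str, allowed: Sequence[str]) -> Optional[str]:
    """Return the canonical allowed value that matches `raw`, or None if none does.

    Assumed rule: ignore letter case and leading, trailing, or repeated whitespace.
    """
    def normalize(value: str) -> str:
        return " ".join(value.split()).casefold()

    lookup = {normalize(value): value for value in allowed}
    return lookup.get(normalize(raw))


CURRENCIES = ["USD", "EUR", "GBP"]  # stands in for a controlled-values Constant

print(match_controlled_value(" usd ", CURRENCIES))  # USD
print(match_controlled_value("JPY", CURRENCIES))    # None
```

A value that returns `None` here corresponds to one a reviewer would need to correct by picking from the dropdown.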
Training with Labeled PDFs
Improve extraction accuracy by providing labeled examples:
- In the processor editor, upload one or more sample PDFs.
- For each field, draw a selection on the PDF to indicate where the value appears.
- The labeled data helps the extraction model locate fields in similar documents.
Upload multiple samples to cover layout variations.
Associating Contracts
Link one or more Contracts to a processor so that validation runs automatically after extraction:
- In the processor editor, find the Contracts section.
- Select the contracts you want to apply.
When a job runs with this processor, all associated contracts are evaluated against each document.
CSV Import
To quickly set up fields from an existing spreadsheet:
- Click Import CSV on the processor editor.
- Upload a CSV file — each column header becomes a field name.
- Review the auto-created fields and adjust types as needed.
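
Because each column header becomes a field name, it can help to normalize the headers to snake_case before uploading. The following is a minimal pre-processing sketch that runs on your own machine. The helper names `header_to_field_name` and `field_names_from_csv` are hypothetical, and whether the importer applies any normalization of its own is not stated on this page.

```python
import csv
import io
import re
from typing import List


def header_to_field_name(header: str) -> str:
    """Convert a spreadsheet header such as 'Invoice Number' to 'invoice_number'."""
    # Replace every run of non-alphanumeric characters with one underscore.
    name = re.sub(r"[^0-9a-zA-Z]+", "_", header.strip()).strip("_")
    return name.lower()


def field_names_from_csv(text: str) -> List[str]:
    """Read the first row of CSV text and return normalized field names."""
    reader = csv.reader(io.StringIO(text))
    headers = next(reader, [])
    return [header_to_field_name(h) for h in headers if h.strip()]


sample = (
    "Invoice Number,Vendor Name,Invoice Date,Total ($)\n"
    "INV-001,Acme,2024-01-31,100.00\n"
)
print(field_names_from_csv(sample))
# ['invoice_number', 'vendor_name', 'invoice_date', 'total']
```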
- Naming conventions: Use consistent `snake_case` names (e.g. `vendor_name`, `invoice_date`). Field names are used in logic block expressions as `$field_name`.
- Iterative approach: Start with a few key fields, run a test job, review results, then add more fields as needed.
- Primary date: Mark one date field as the `primary_date` to use it for document-level date sorting and filtering.