It starts with a content object.
Every piece of content in Kodexa is one object the workflow can act on. The documents, the structure parsed out of them, the attributes linked across them, and the audit trail of everything that touched them are held together as a single, addressable unit.
One object. Four things the workflow can read.
Open a content object and you find the documents, the structure parsed out of them, the attributes extracted from them and tied back to where they came from, and the audit trail of everything any actor (AI, agent, rule or human) did along the way.
Documents
Originals, attachments, variants. Every page kept in its native form, addressable by coordinate.
- PDF, Word, email, scan
- Pages and bounding boxes
- Versioned through the lifecycle
Content structure
The document broken into a tree the workflow can reason about: sections, headings, tables, cells.
- Sections, headings, paragraphs
- Tables with rows and cells
- Coordinates back to source
Linked attributes
The data extracted, validated, taxonomy-aligned, and tied back to the place it came from.
- Vendor, total, tax code
- Taxonomy and types
- Provenance for every value
Audit trail
Every read, edit, decision and approval — by AI, agents and people — is appended to the content object itself.
- Who or what acted, and when
- Prompt, model and version
- Reviewer, decision and reason
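To make the four parts concrete, here is a minimal sketch of how such an object could be modelled. It is an illustration only, not the Kodexa data model or SDK: every class, field and method name (ContentObject, LinkedAttribute, record and so on), along with the sample values, is a hypothetical chosen for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class BoundingBox:
    """A coordinate on a page (hypothetical shape)."""
    page: int
    x0: float
    y0: float
    x1: float
    y1: float


@dataclass
class SourceDocument:
    """An original, attachment or variant, kept with a version number."""
    name: str
    media_type: str
    version: int = 1


@dataclass
class ContentNode:
    """One node of the structure tree: a section, table, cell and so on."""
    kind: str
    text: str = ""
    box: Optional[BoundingBox] = None  # coordinates back to source
    children: List["ContentNode"] = field(default_factory=list)


@dataclass
class LinkedAttribute:
    """An extracted value, aligned to a taxonomy and tied to its source."""
    name: str
    value: str
    taxonomy_path: str
    source: BoundingBox  # provenance for the value


@dataclass
class AuditEvent:
    """Who or what acted, when, and any detail such as model or reason."""
    actor: str
    action: str
    at: datetime
    detail: Dict[str, str] = field(default_factory=dict)


@dataclass
class ContentObject:
    """The four parts held together as one unit."""
    documents: List[SourceDocument] = field(default_factory=list)
    structure: Optional[ContentNode] = None
    attributes: List[LinkedAttribute] = field(default_factory=list)
    audit_trail: List[AuditEvent] = field(default_factory=list)

    def record(self, actor: str, action: str, **detail: str) -> AuditEvent:
        """Append an event to the trail; existing events are never rewritten."""
        event = AuditEvent(actor, action, datetime.now(timezone.utc), dict(detail))
        self.audit_trail.append(event)
        return event


# Usage: one invoice, one extracted total, two audited actions.
box = BoundingBox(page=1, x0=72.0, y0=540.0, x1=180.0, y1=556.0)
obj = ContentObject(documents=[SourceDocument("invoice.pdf", "application/pdf")])
obj.structure = ContentNode("table", children=[ContentNode("cell", "1,250.00", box)])
obj.attributes.append(LinkedAttribute("total", "1,250.00", "invoice/total", box))
obj.record("llm", "extract", model="example-model", version="1")
obj.record("reviewer", "approve", reason="matches purchase order")
```

The choice worth noting is that the attribute and the table cell share the same bounding box, so a value can always be traced to the place on the page it came from, and that the audit trail lives on the object rather than in a separate log.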
The content object is what the workflow passes from step to step. Everything below is built on it.
The content object moves through steps. Each step acts on it.
A Kodexa activity plan is a sequence of typed steps. Some run Kodexa modules (often LLM-backed), some run custom scripts, some call external services, some create tasks for people. Every action lands in the content object's audit trail.
Receive
Mark the document as received in the downstream system of record.
Preprocess
OCR the pages, classify document types, segment and deskew.
AI Labeling
Extract structured attributes against a taxonomy using an LLM.
Enrich
Build shipment data, look up references, normalise codes.
Review
A person approves the attributes that carry risk, or rejects them.
Post
Persist the approved content object to the system of record.
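The six steps above can be sketched as a sequence of typed steps that each act on the same object and leave an entry in its audit trail. This is a simplified illustration under stated assumptions, not Kodexa's activity plan format: Step, run_plan, the stand-in ContentObject, the kind assigned to each step and all sample values are hypothetical, and the step bodies are stubs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ContentObject:
    """Stand-in for the content object: attributes plus an audit trail."""
    attributes: Dict[str, str] = field(default_factory=dict)
    audit_trail: List[Dict[str, str]] = field(default_factory=list)


@dataclass
class Step:
    """A typed step; kind is one of module, script, service or task."""
    name: str
    kind: str
    run: Callable[[ContentObject], None]


# Stub step bodies. Real steps would call OCR, an LLM, an external system
# or wait for a person; these only set placeholder values.
def receive(obj: ContentObject) -> None:
    obj.attributes["status"] = "received"

def preprocess(obj: ContentObject) -> None:
    obj.attributes["doc_type"] = "invoice"

def ai_labeling(obj: ContentObject) -> None:
    obj.attributes["total"] = "1250.00"

def enrich(obj: ContentObject) -> None:
    obj.attributes["tax_code"] = "VAT-20"

def review(obj: ContentObject) -> None:
    # A real review step would create a task and pause for a person.
    obj.attributes["decision"] = "approved"

def post(obj: ContentObject) -> None:
    obj.attributes["status"] = "posted"


def run_plan(steps: List[Step], obj: ContentObject) -> ContentObject:
    """Run each step in order and append its action to the audit trail."""
    for step in steps:
        step.run(obj)
        obj.audit_trail.append({"step": step.name, "kind": step.kind})
    return obj


plan = [
    Step("Receive", "service", receive),
    Step("Preprocess", "module", preprocess),
    Step("AI Labeling", "module", ai_labeling),
    Step("Enrich", "script", enrich),
    Step("Review", "task", review),
    Step("Post", "service", post),
]

result = run_plan(plan, ContentObject())
```

Because every step receives and returns the same object, the audit trail ends up holding one entry per step in the order the plan ran them.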
The content object holds the work.
The activity plan is how the work gets done.