Kodexa
The Kodexa content object

It starts with a content object.

Every piece of content in Kodexa is one object the workflow can act on. The documents, the structure parsed out of them, the attributes linked across them, and the audit trail of everything that touched them are held together as a single, addressable unit.

Inside the content object

One object. Four things the workflow can read.

Open a content object and you find the documents, the structure parsed out of them, the attributes that link both back to where they came from, and the audit trail of everything any actor — AI, agent, rule or human — did along the way.

Documents

Originals, attachments, variants. Every page kept in its native form, addressable by coordinate.

  • PDF, Word, email, scan
  • Pages and bounding boxes
  • Versioned through the lifecycle

Content structure

The document broken into a tree the workflow can reason about: sections, headings, tables, cells.

  • Sections, headings, paragraphs
  • Tables with rows and cells
  • Coordinates back to source

Linked attributes

The data extracted, validated, taxonomy-aligned, and tied back to the place it came from.

  • Vendor, total, tax code
  • Taxonomy and types
  • Provenance for every value
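To make "provenance for every value" concrete, here is a minimal sketch of one linked attribute that carries its taxonomy path, its type, and the page region it was read from. The class names, field names, taxonomy path and sample vendor are all illustrative assumptions, not Kodexa's actual data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """A rectangle on a page (hypothetical coordinate units)."""
    page: int
    x0: float
    y0: float
    x1: float
    y1: float


@dataclass(frozen=True)
class LinkedAttribute:
    """One extracted value, its taxonomy alignment, and where it came from."""
    taxonomy_path: str   # illustrative path such as "invoice/vendor"
    value_type: str      # illustrative type name such as "string"
    value: str
    document_id: str     # which document in the object the value came from
    source: BoundingBox  # region of the page the value was read from


def describe_provenance(attr: LinkedAttribute) -> str:
    """Render a one-line provenance description for an attribute."""
    box = attr.source
    return (
        f"{attr.taxonomy_path} = {attr.value!r} "
        f"from {attr.document_id} page {box.page} "
        f"at ({box.x0}, {box.y0})-({box.x1}, {box.y1})"
    )


# Made-up sample values, for illustration only.
vendor = LinkedAttribute(
    taxonomy_path="invoice/vendor",
    value_type="string",
    value="Acme Freight",
    document_id="doc-1",
    source=BoundingBox(page=1, x0=72.0, y0=96.0, x1=240.0, y1=112.0),
)
print(describe_provenance(vendor))
```

Keeping the source region on the attribute itself, rather than in a separate lookup, is one way a reviewer could jump from a value straight to the place on the page it came from.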

Audit trail

Every read, edit, decision and approval — by AI, agents and people — is appended to the content object itself.

  • Who or what acted, and when
  • Prompt, model and version
  • Reviewer, decision and reason

The content object is what the workflow passes from step to step. Everything below is built on it.
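As a rough mental model, the four parts described above can be pictured as fields on one object, with the audit trail growing every time something acts on it. This is a sketch under stated assumptions: `ContentObject`, `AuditEntry`, `record` and `set_attribute` are hypothetical names chosen for illustration and do not reflect Kodexa's real schema or SDK.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass(frozen=True)
class AuditEntry:
    """One immutable record of who or what acted, and when."""
    actor_kind: str  # e.g. "AI", "AGENT", "RULE" or "HUMAN"
    actor: str
    action: str
    at: datetime


@dataclass
class ContentObject:
    """Hypothetical container for the four parts of a content object."""
    object_id: str
    documents: List[str] = field(default_factory=list)         # originals, attachments, variants
    structure: Dict[str, list] = field(default_factory=dict)   # parsed tree, simplified to a dict
    attributes: Dict[str, str] = field(default_factory=dict)   # extracted values by name
    audit_trail: List[AuditEntry] = field(default_factory=list)

    def record(self, actor_kind: str, actor: str, action: str) -> AuditEntry:
        """Append an entry to the trail; existing entries are never edited."""
        entry = AuditEntry(actor_kind, actor, action, datetime.now(timezone.utc))
        self.audit_trail.append(entry)
        return entry

    def set_attribute(self, name: str, value: str, actor_kind: str, actor: str) -> None:
        """Change an attribute and log the change in the same call."""
        self.attributes[name] = value
        self.record(actor_kind, actor, f"set {name}")


obj = ContentObject(object_id="co-1", documents=["invoice.pdf"])
obj.set_attribute("tax_code", "GST 10%", actor_kind="HUMAN", actor="reviewer")
print(len(obj.audit_trail), obj.audit_trail[0].action)
```

Routing every attribute change through a method that also writes the audit entry is one simple way to keep the data and its history from drifting apart.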

Activity plan

The content object moves through steps. Each step acts on it.

A Kodexa activity plan is a sequence of typed steps. Some run Kodexa modules (often LLM-backed), some run custom scripts, some call external services, some create tasks for people. Every action lands in the content object's audit trail.

01 BRIDGE_CALL

Receive

Mark the document as received in the downstream system of record.

External service

02 EXECUTION

Preprocess

OCR the pages, classify document types, segment and deskew.

kodexa/preprocessor

03 EXECUTION

AI Labeling

Extract structured attributes against a taxonomy using an LLM.

kodexa/llm-taxonomy-model · LLM

04 SCRIPT

Enrich

Build shipment data, look up references, normalise codes.

Custom JS · SCRIPT

05 CREATE_TASK

Review

A person approves the attributes that carry risk, or rejects them.

Human review

06 BRIDGE_CALL

Post

Persist the approved content object to the system of record.

External service

Audit trail · live

08:14:02  BRIDGE  Status set: DOCUMENT RECEIVED
08:14:05  MODULE  kodexa/preprocessor · OCR + classify complete
08:14:21  LLM     24 attributes labelled · claude-sonnet-4-5
08:14:24  SCRIPT  enrichShipmentFromDocuments() · vendor matched
08:14:31  HUMAN   Reviewer accepted tax code · GST 10%
08:14:33  BRIDGE  Status set: POSTED
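
The idea of typed steps that each act on the object and leave an audit entry can be sketched in a few lines. This is an illustration only: `Step`, `run_plan` and the step actions are hypothetical stand-ins, the plan is cut down to three steps, and nothing here calls a real Kodexa module or external service.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ContentObject:
    """Simplified stand-in: just attributes and a text audit trail."""
    attributes: Dict[str, str] = field(default_factory=dict)
    audit_trail: List[str] = field(default_factory=list)


@dataclass(frozen=True)
class Step:
    """One typed step; the action mutates the object and returns a summary."""
    name: str
    step_type: str  # BRIDGE_CALL, EXECUTION, SCRIPT or CREATE_TASK
    action: Callable[[ContentObject], str]


def run_plan(steps: List[Step], obj: ContentObject) -> ContentObject:
    """Run the steps in order and log each one on the object itself."""
    for step in steps:
        summary = step.action(obj)
        obj.audit_trail.append(f"{step.step_type} {step.name}: {summary}")
    return obj


def set_status(status: str) -> Callable[[ContentObject], str]:
    """Build an action that stands in for a call to a system of record."""
    def action(obj: ContentObject) -> str:
        obj.attributes["status"] = status
        return f"status set to {status}"
    return action


def label_attributes(obj: ContentObject) -> str:
    """Stand-in for an LLM-backed labelling module; the value is hard-coded."""
    obj.attributes["tax_code"] = "GST 10%"
    return "1 attribute labelled"


plan = [
    Step("Receive", "BRIDGE_CALL", set_status("DOCUMENT RECEIVED")),
    Step("AI Labeling", "EXECUTION", label_attributes),
    Step("Post", "BRIDGE_CALL", set_status("POSTED")),
]

result = run_plan(plan, ContentObject())
for line in result.audit_trail:
    print(line)
```

Because the runner, not the individual step, writes the audit entry, a step cannot act on the object without leaving a trace.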

The content object holds the work.
The activity plan is how the work gets done.