What Provenance is

The problem, the thesis, the five modules, and the two channels — what Provenance actually is. · 5 min read

Welcome to Provenance — a system of record for claims, built around one idea: outreach that can't say what it can't prove. This article is the front door. It explains the problem we're solving, the thesis behind the design, and the pieces you'll meet as you explore.

The problem

Ultra-personalization breaks human review. You can't realistically read 100,000 unique messages, so in claims-heavy domains (health-tech, finance, anything regulated) you're forced to choose: generic-but-reviewed, or personal-but-unverified. The real bottleneck isn't writing; it's verification at scale. Once every message is unique, no human can stand behind what each one says.

The thesis

Provenance flips the trade-off: instead of reviewing output after the fact, it makes the system unable to assert anything it can't ground in an approved source. Personalization is then free to optimize, but only inside the truth boundary. A generated variant can't win by saying something it can't prove, because an unprovable variant is never admitted to the pool in the first place.

Optimizing inside the truth boundary is the whole trick: the bandit still hunts for the best-performing variant per segment, but the choices it's allowed to make are pre-constrained to verified, permissible ones. The constraint is structural, not a downstream filter you hope catches the lie.
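
To make "out of the pool by construction" concrete, here is a minimal sketch, not the Optimizer's actual code. It assumes a variant lists the claim ids it asserts and that only a "green" verdict admits a claim. It uses a simple epsilon-greedy rule per segment as a stand-in for whatever contextual bandit the product runs. `Variant`, `admissible`, and `SegmentBandit` are hypothetical names.

```python
import random
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Variant:
    variant_id: str
    claim_ids: tuple  # ids of the claims this variant asserts (hypothetical shape)


def admissible(variants, verdicts):
    """Keep only variants whose every claim has a "green" verdict.

    Assumption: anything other than "green" (including a missing verdict)
    keeps the variant out of the pool.
    """
    return [v for v in variants
            if all(verdicts.get(c) == "green" for c in v.claim_ids)]


@dataclass
class SegmentBandit:
    """Epsilon-greedy over a pre-filtered pool, with statistics kept per segment."""
    pool: list
    epsilon: float = 0.1
    rng: random.Random = field(default_factory=lambda: random.Random(0))
    stats: dict = field(default_factory=dict)  # (segment, variant_id) -> [sends, wins]

    def choose(self, segment):
        if not self.pool:
            raise ValueError("no verified variants to choose from")
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.pool)  # explore, still inside the pool
        return max(self.pool, key=lambda v: self._rate(segment, v))  # exploit

    def update(self, segment, variant, success):
        entry = self.stats.setdefault((segment, variant.variant_id), [0, 0])
        entry[0] += 1
        entry[1] += int(success)

    def _rate(self, segment, variant):
        sends, wins = self.stats.get((segment, variant.variant_id), (0, 0))
        # Untried variants get an optimistic score so each is tried at least once.
        return wins / sends if sends else 1.0


# Made-up data: variant "b" asserts a blocked claim, so it never enters the pool.
variants = [Variant("a", ("c1",)), Variant("b", ("c1", "c2")), Variant("c", ("c3",))]
verdicts = {"c1": "green", "c2": "red", "c3": "green"}
bandit = SegmentBandit(admissible(variants, verdicts))
```

The bandit only ever receives the filtered list, so no exploration step or reward signal can reach an unverified variant. The filter runs before optimization rather than after it.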

The five modules

Provenance is built from five modules. Each one has its own help article — follow the link to go deeper.

The system, end to end
  • Claims Library: Approved source documents become atomic claims, each bound to a verbatim source span, forming a versioned claim–evidence graph. Read more →
  • The Gate: Decompose → retrieve → NLI (natural language inference) → calibrated ensemble → compliance rules → a green/amber/red ledger. A blocked claim never becomes a sendable variant. Read more →
  • The Optimizer: A contextual bandit that learns the best verified variant per segment. The ungated arm is out of the pool by construction. Read more →
  • Drift Monitor: On a source change or legal-hold flip, it surgically re-verifies only the affected claims and pauses the dependent variants. Read more →
  • Assurance Lab: An adversarial harness that proves the Gate works. It runs trap claims through the real Gate and beats a single-judge baseline. Read more →
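
As a rough illustration of how a claim bound to a verbatim span makes drift detection surgical, consider the sketch below. It is an assumption-laden toy, not the Claims Library or Drift Monitor implementation. It uses a plain substring check where the real system would re-run the Gate, and it ignores legal holds. `Claim`, `bind_claim`, `affected_claims`, `variants_to_pause`, and the sample text are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    claim_id: str
    doc_id: str
    quote: str  # the verbatim source span the claim is bound to


def bind_claim(claim_id, doc_id, doc_text, quote):
    """Create a claim only if its quote appears verbatim in the source."""
    if quote not in doc_text:
        raise ValueError(f"{claim_id}: quote is not a verbatim span of {doc_id}")
    return Claim(claim_id, doc_id, quote)


def affected_claims(claims, doc_id, new_text):
    """Ids of claims on doc_id whose quote no longer appears in the new text."""
    return {c.claim_id for c in claims
            if c.doc_id == doc_id and c.quote not in new_text}


def variants_to_pause(variant_claims, affected):
    """Variants that depend on at least one affected claim."""
    return sorted(v for v, cids in variant_claims.items() if affected & set(cids))


# Fabricated source text, for illustration only.
v1 = "Device X reduced readmissions by 12% in the pilot. Setup takes 10 minutes."
v2 = "Device X reduced readmissions by 9% in the pilot. Setup takes 10 minutes."

claims = [
    bind_claim("c1", "doc", v1, "reduced readmissions by 12%"),
    bind_claim("c2", "doc", v1, "Setup takes 10 minutes"),
]
stale = affected_claims(claims, "doc", v2)
paused = variants_to_pause({"v1": ["c1"], "v2": ["c2"], "v3": ["c1", "c2"]}, stale)
```

Each claim points at an exact span, so a source edit touches only the claims whose span changed. Everything else stays verified and keeps sending.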

The two channels

The same Gate governs outreach across two channels: the email campaign (the bandit optimizes verified variants per segment) and the website (a personalized page that renders only Gate-passed claims). The same claim_id gets the same verdict on both channels — the truth boundary doesn't change when the surface does.
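
One way to picture "same claim_id, same verdict" is a single ledger lookup that both renderers share. This is a sketch under stated assumptions: the ledger is modeled as a plain dict, only "green" is treated as Gate-passed (the article does not spell out how amber is handled), and `passed` and `render` are hypothetical names.

```python
from typing import List, Mapping, Sequence


def passed(claim_id: str, ledger: Mapping[str, str]) -> bool:
    # Assumption: only "green" is renderable; amber, red, and unknown ids are not.
    return ledger.get(claim_id) == "green"


def render(channel: str, claim_ids: Sequence[str],
           texts: Mapping[str, str], ledger: Mapping[str, str]) -> List[str]:
    """Render claims for a channel; the channel changes markup, never the verdict."""
    kept = [texts[c] for c in claim_ids if passed(c, ledger)]
    if channel == "email":
        return [f"- {t}" for t in kept]
    if channel == "web":
        return [f"<li>{t}</li>" for t in kept]
    raise ValueError(f"unknown channel: {channel}")


# Fabricated ledger and claim texts.
ledger = {"c1": "green", "c2": "amber", "c3": "red"}
texts = {"c1": "Claim one", "c2": "Claim two", "c3": "Claim three"}

email_lines = render("email", ["c1", "c2", "c3"], texts, ledger)
web_lines = render("web", ["c1", "c2", "c3"], texts, ledger)
```

Both calls filter through the same `passed` check before any channel-specific formatting happens, so a claim cannot be blocked in email and still appear on the page.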

What kind of demo this is

Everything here is a deterministic, offline-by-default, 100%-synthetic demo. "Live" means a seed-locked replay — given the global seed, every run is byte-identical. It runs with no API key and sends no real email. There is no real PII: the recipients, profiles, and facts are synthetic.

  • Deterministic — seed-locked; re-running the pipeline produces identical artifacts.
  • Offline by default — no network and no API key required (an optional "rich" profile can use real models, and an optional live news connector exists, both off by default).
  • Synthetic — no real PII; the demo tenant and its data are fabricated for the showcase.
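
A seed-locked replay can be checked by hashing the artifacts of two runs. The sketch below is illustrative only: `GLOBAL_SEED`, `run_pipeline`, and the toy recipient records are assumptions, not the demo's real pipeline. The idea is to draw all randomness from one seeded generator, serialize canonically, and compare digests.

```python
import hashlib
import json
import random

GLOBAL_SEED = 1234  # hypothetical value; the demo's real seed is not stated here


def run_pipeline(seed: int) -> bytes:
    """Produce a synthetic artifact using only a locally seeded RNG.

    No wall clock, no network, and no global random state, so the same
    seed yields the same bytes on every run.
    """
    rng = random.Random(seed)
    recipients = [
        {"id": f"r{i:03d}", "segment": rng.choice(["a", "b", "c"])}
        for i in range(5)
    ]
    # Canonical JSON: sorted keys and fixed separators keep the bytes stable.
    return json.dumps(recipients, sort_keys=True, separators=(",", ":")).encode("utf-8")


def digest(artifact: bytes) -> str:
    return hashlib.sha256(artifact).hexdigest()


first = digest(run_pipeline(GLOBAL_SEED))
second = digest(run_pipeline(GLOBAL_SEED))
identical = first == second
```

Comparing digests rather than eyeballing output is what makes "byte-identical" a testable property. Any hidden source of nondeterminism, such as a timestamp or unordered iteration, would show up as a digest mismatch.
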
Ready to look around? Head to Take the tour for a guided first pass, or jump straight to the live demo.