Glossary

Precise, code-grounded definitions of the key terms used throughout Provenance.

These are the load-bearing terms in Provenance, defined the way the system actually uses them. Skim them once and the rest of the app reads more clearly.

Core concepts

claim	An atomic, single-fact assertion extracted from an approved source — the smallest unit the Gate verifies and the unit a verdict attaches to.
evidence	The verbatim source span (character offsets into a source document) that a claim is bound to — what the Gate retrieves and checks the claim against.
the ledger	The Gate's output: the per-claim record of each claim's verdict, confidence, source, and reasons — idempotent and cached per `(claim_id, source_version, rules_version)`.
verdict	The Gate's ruling on a claim: green = entailed by source and permissible (cited), amber = uncertain / needs a disclaimer (repaired), red = unsupported or impermissible (blocked, never a sendable variant).
provenance class	Where a fact came from (declared, behavioral, modeled, broker, OAuth, etc.) — the source dimension that, together with the surface policy, governs how a fact may be used.
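
As a rough illustration of what "idempotent and cached per `(claim_id, source_version, rules_version)`" could look like, here is a minimal sketch. The `Ledger`, `LedgerEntry`, and `Verdict` names, the field layout, and the `verify` callback are assumptions made for this example, not Provenance's actual API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Tuple


class Verdict(Enum):
    GREEN = "green"  # entailed by source and permissible
    AMBER = "amber"  # uncertain / needs a disclaimer
    RED = "red"      # unsupported or impermissible


@dataclass(frozen=True)
class LedgerEntry:
    """Hypothetical per-claim record: verdict, confidence, source, reasons."""
    claim_id: str
    verdict: Verdict
    confidence: float
    source: str
    reasons: Tuple[str, ...]


# The cache key named in the glossary entry for the ledger.
CacheKey = Tuple[str, str, str]  # (claim_id, source_version, rules_version)


class Ledger:
    """Caches one entry per key, so repeating a lookup never re-verifies."""

    def __init__(self, verify: Callable[[str, str, str], LedgerEntry]) -> None:
        self._verify = verify  # assumed stand-in for the Gate's verification step
        self._entries: Dict[CacheKey, LedgerEntry] = {}
        self.verify_calls = 0  # counts how often real verification ran

    def record(self, claim_id: str, source_version: str, rules_version: str) -> LedgerEntry:
        key = (claim_id, source_version, rules_version)
        if key not in self._entries:
            self.verify_calls += 1
            self._entries[key] = self._verify(claim_id, source_version, rules_version)
        return self._entries[key]
```

Under these assumptions, asking for the same claim twice returns the same entry, while bumping `source_version` or `rules_version` produces a new key and therefore a fresh verification.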

Surface policy

say / allude / hold	The surface policy on a fact: say = may appear verbatim in copy (e.g. a name they typed), allude = may shape the message but not be recited (e.g. de-anonymized firmographics), hold = may steer selection but can never appear in copy (e.g. modeled income).
the truth boundary	The set of moves the system is allowed to make — a variant is admitted only if it has no red claim and respects surface policy; enforced at pool construction, not by a downstream filter.
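
The admission rule above can be sketched as a check applied while the pool is being built. Everything named here (`FactUse`, `Variant`, `ALLOWED_USES`, and the three usage labels `recite` / `shape` / `select`) is hypothetical, and the mapping of policies to allowed uses is one reading of the definitions, not the system's actual data model.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

# Assumed reading of the surface policy: which uses each policy permits.
ALLOWED_USES: Dict[str, Set[str]] = {
    "say": {"recite", "shape", "select"},  # may appear verbatim in copy
    "allude": {"shape", "select"},         # may shape the message, never recited
    "hold": {"select"},                    # may only steer selection
}


@dataclass(frozen=True)
class FactUse:
    policy: str  # "say" | "allude" | "hold"
    use: str     # "recite" | "shape" | "select"


@dataclass(frozen=True)
class Variant:
    name: str
    verdicts: Tuple[str, ...]       # one verdict per claim in the variant
    fact_uses: Tuple[FactUse, ...]  # how each fact is used by the variant


def admitted(variant: Variant) -> bool:
    """A variant is admitted only with no red claim and no policy violation."""
    if "red" in variant.verdicts:
        return False
    return all(fu.use in ALLOWED_USES[fu.policy] for fu in variant.fact_uses)


def build_pool(candidates: List[Variant]) -> List[Variant]:
    # Enforcement happens here, at pool construction, so nothing downstream
    # ever sees an inadmissible variant.
    return [v for v in candidates if admitted(v)]


CANDIDATES = [
    Variant("uses_typed_name", ("green",), (FactUse("say", "recite"),)),
    Variant("recites_firmographics", ("green",), (FactUse("allude", "recite"),)),
    Variant("income_steers_selection", ("green", "amber"), (FactUse("hold", "select"),)),
    Variant("unsupported_superlative", ("red",), ()),
]
POOL = build_pool(CANDIDATES)
```

With these example candidates, the variant that recites an `allude` fact and the variant carrying a red claim never enter `POOL`, so no later stage needs to filter them out.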

The Gate

the Gate	The verification module: decompose → retrieve → NLI → calibrated ensemble → compliance rules → ledger; idempotent and claim-level cached, and a compliance rule can only make a verdict more restrictive.
NLI ensemble	A diverse set of entailment judges ("is this claim entailed by its evidence?") whose agreement is a better-calibrated signal than any single judge, because different lenses fail differently — the LLM judge is one ensemble member, not the verifier.
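
The property that a compliance rule can only make a verdict more restrictive can be illustrated with a severity ordering. This is a minimal sketch: the `SEVERITY` table, the `Rule` signature, and both example rules are assumptions for illustration, not the Gate's real rule engine.

```python
from typing import Callable, Iterable, Optional

# Assumed ordering: a higher number is more restrictive.
SEVERITY = {"green": 0, "amber": 1, "red": 2}

# A hypothetical rule inspects claim text and may propose a verdict (or None).
Rule = Callable[[str], Optional[str]]


def apply_compliance(base_verdict: str, claim_text: str, rules: Iterable[Rule]) -> str:
    """Apply rules monotonically: a proposal is used only if it is stricter."""
    verdict = base_verdict
    for rule in rules:
        proposed = rule(claim_text)
        if proposed is not None and SEVERITY[proposed] > SEVERITY[verdict]:
            verdict = proposed
    return verdict


def no_guarantees(text: str) -> Optional[str]:
    # Example rule: any guarantee language is impermissible.
    return "red" if "guarantee" in text.lower() else None


def always_green(text: str) -> Optional[str]:
    # A deliberately permissive rule, included to show it cannot loosen a verdict.
    return "green"
```

Because a proposal is accepted only when it is strictly more severe, a permissive rule such as `always_green` has no effect on an amber or red verdict, regardless of the order in which rules run.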

The Optimizer

the Optimizer	A contextual-bandit campaign that serves verified variants to recipients, segment by segment, learning from a simulated CTA oracle. It can only ever pull arms in its pool, and the blocked-lie arm is never in it.
bandit	The Thompson-sampling policy: it samples each active arm's Beta posterior and pulls the best, updating from observed reward.
arm	One verified variant the bandit can choose for a segment; an ungated or red-claim variant is structurally excluded from the arm pool, so it can be selected 0×.
regret	The cumulative shortfall versus always playing the best honest arm. Per-round regret trends toward 0 as the constrained bandit converges, so the cumulative curve flattens, while the unconstrained twin chases the planted lie.
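
The bandit, arm, and regret entries fit together roughly as in the sketch below. It uses only Python's standard library. The arm names, click rates, and the `simulate` helper are invented for illustration, and the real Optimizer is contextual (per segment), which this single-segment toy leaves out.

```python
import random
from typing import Dict, List, Tuple


class ThompsonBandit:
    """Thompson sampling over Bernoulli arms with Beta(1, 1) priors."""

    def __init__(self, arm_names: List[str], seed: int = 0) -> None:
        # posteriors[name] = [alpha, beta]
        self.posteriors: Dict[str, List[float]] = {n: [1.0, 1.0] for n in arm_names}
        self.rng = random.Random(seed)

    def pull(self) -> str:
        # Sample each arm's Beta posterior and play the arm with the best draw.
        samples = {n: self.rng.betavariate(a, b) for n, (a, b) in self.posteriors.items()}
        return max(samples, key=lambda n: samples[n])

    def update(self, name: str, clicked: bool) -> None:
        # A click increments alpha, a miss increments beta.
        self.posteriors[name][0 if clicked else 1] += 1.0


def simulate(true_rates: Dict[str, float], rounds: int = 2000, seed: int = 0) -> Tuple[Dict[str, int], float]:
    """Run the bandit against a simulated click oracle; return pulls and regret."""
    bandit = ThompsonBandit(list(true_rates), seed=seed)
    oracle = random.Random(seed + 1)  # stand-in for the simulated CTA oracle
    best = max(true_rates.values())   # best arm in the (honest) pool
    pulls = {n: 0 for n in true_rates}
    regret = 0.0
    for _ in range(rounds):
        arm = bandit.pull()
        bandit.update(arm, oracle.random() < true_rates[arm])
        pulls[arm] += 1
        regret += best - true_rates[arm]  # expected shortfall for this round
    return pulls, regret


# Hypothetical candidates: (verdict, true click rate). The red one never becomes an arm.
CANDIDATES = {"honest_a": ("green", 0.05), "honest_b": ("amber", 0.12), "blocked_lie": ("red", 0.30)}
ARM_POOL = {name: rate for name, (verdict, rate) in CANDIDATES.items() if verdict != "red"}
PULLS, REGRET = simulate(ARM_POOL)
```

Since `blocked_lie` is filtered out before the bandit is constructed, it has no posterior to sample and cannot be pulled, however attractive its click rate. Regret here is measured against the best arm in the pool, matching the "best honest arm" baseline.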

Drift & Assurance

drift	A change to a source or a legal-hold flip that invalidates affected claims — Drift re-verifies exactly those claims (no under/over-invalidation) and pauses the dependent variants.
trap	An adversarially mutated claim (number drift, unsupported superlative, false equivalence, or a true-but-unsayable guarantee) used to test whether the Gate catches what a shallow judge would miss.
catch-rate	The fraction of bad (trap) claims the Gate correctly catches — the headline number the Assurance Lab reports against a single-judge baseline.
false-reject	The fraction of clean, approved claims that get wrongly blocked. It is measured alongside catch-rate so the comparison stays fair: a high catch-rate is meaningful only when it is achieved at zero false-reject.
ECE / calibration	Expected Calibration Error — how far the Gate's stated confidence is from observed accuracy across bins; calibration (an isotonic PAV calibrator) turns a raw entailment score into a probability you can trust, so "0.9" means roughly 90%.
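
To make the three Assurance metrics concrete, here is a small sketch of how they are commonly computed. It uses the standard equal-width-bin definition of ECE. The function names, the ten-bin default, and the boolean input format are assumptions, not the Assurance Lab's actual code.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float], correct: Sequence[bool], n_bins: int = 10
) -> float:
    """Weighted average gap between stated confidence and observed accuracy."""
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("need two non-empty sequences of equal length")
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # a confidence of 1.0 goes in the last bin
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece


def catch_rate(trap_was_caught: Sequence[bool]) -> float:
    """Fraction of trap claims that were caught."""
    return sum(trap_was_caught) / len(trap_was_caught)


def false_reject_rate(clean_was_blocked: Sequence[bool]) -> float:
    """Fraction of clean, approved claims that were wrongly blocked."""
    return sum(clean_was_blocked) / len(clean_was_blocked)
```

For instance, ten predictions all stated at 0.8 confidence with only five correct give an ECE of about 0.3, while ten stated at 0.9 with nine correct give an ECE of about 0, which is the sense in which "0.9" should mean roughly 90%.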
