Glossary

Precise, code-grounded definitions of the key terms used throughout Provenance.

These are the load-bearing terms in Provenance, defined the way the system actually uses them. Skim them once and the rest of the app reads more clearly.

Core concepts
claim: An atomic, single-fact assertion extracted from an approved source. It is the smallest unit the Gate verifies and the unit a verdict attaches to.
evidence: The verbatim source span (character offsets into a source document) that a claim is bound to. It is what the Gate retrieves and checks the claim against.
the ledger: The Gate's output, a per-claim record of each claim's verdict, confidence, source, and reasons. It is idempotent and cached per (claim_id, source_version, rules_version).
verdict: The Gate's ruling on a claim. green = entailed by source and permissible (cited); amber = uncertain or needs a disclaimer (repaired); red = unsupported or impermissible (blocked, never a sendable variant).
provenance class: Where a fact came from (declared, behavioral, modeled, broker, OAuth, etc.). It is the source dimension that, together with the surface policy, governs how a fact may be used.
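The evidence and ledger definitions above can be sketched in a few lines. This is a minimal illustration, not the system's actual code: the class names `Evidence`, `LedgerEntry`, and `Ledger`, the `verify` callback, and the integer version numbers are all assumptions made for the example. Only the cache key (claim_id, source_version, rules_version) and the character-offset span come from the definitions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Evidence:
    """A verbatim span, stored as character offsets into a source document."""
    source_id: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset

    def text(self, document: str) -> str:
        # Slicing by offsets recovers the exact span the claim is bound to.
        return document[self.start:self.end]


@dataclass(frozen=True)
class LedgerEntry:
    claim_id: str
    verdict: str              # "green", "amber", or "red"
    confidence: float
    reasons: Tuple[str, ...]


# (claim_id, source_version, rules_version); integer versions are an assumption.
CacheKey = Tuple[str, int, int]


class Ledger:
    """Hypothetical per-claim cache: the same key always yields the same entry."""

    def __init__(self, verify: Callable[[str], LedgerEntry]) -> None:
        self._verify = verify          # stand-in for the real verification call
        self._entries: Dict[CacheKey, LedgerEntry] = {}
        self.verify_calls = 0

    def record(self, claim_id: str, source_version: int, rules_version: int) -> LedgerEntry:
        key = (claim_id, source_version, rules_version)
        if key not in self._entries:   # verify only on a cache miss
            self.verify_calls += 1
            self._entries[key] = self._verify(claim_id)
        return self._entries[key]
```

Under these assumptions, asking for the same claim twice at the same versions verifies once, while bumping either version number produces a new key and forces a fresh verification.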
Surface policy
say / allude / hold: The surface policy on a fact. say = may appear verbatim in copy (e.g. a name they typed); allude = may shape the message but not be recited (e.g. de-anonymized firmographics); hold = may steer selection but can never appear in copy (e.g. modeled income).
the truth boundary: The set of moves the system is allowed to make. A variant is admitted only if it has no red claim and respects surface policy; this is enforced at pool construction, not by a downstream filter.
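The admission rule above can be illustrated with a small sketch. Everything named here (`Surface`, `Fact`, `Variant`, `build_pool`) is hypothetical, and the case-insensitive substring check is only a crude stand-in for detecting that a fact is recited in copy. The sketch also treats allude and hold identically at the copy level; the difference between shaping a message and only steering selection is not modeled.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Surface(Enum):
    SAY = "say"        # may appear verbatim in copy
    ALLUDE = "allude"  # may shape the message, never recited
    HOLD = "hold"      # may steer selection, never appears in copy


@dataclass(frozen=True)
class Fact:
    value: str
    surface: Surface


@dataclass(frozen=True)
class Variant:
    copy: str
    verdicts: Tuple[str, ...]  # one verdict per claim in the copy
    facts: Tuple[Fact, ...]    # facts used to write or select this variant


def respects_surface_policy(variant: Variant) -> bool:
    # Only SAY facts may show up verbatim; substring matching is an assumption.
    text = variant.copy.lower()
    return all(
        fact.surface is Surface.SAY or fact.value.lower() not in text
        for fact in variant.facts
    )


def admitted(variant: Variant) -> bool:
    return "red" not in variant.verdicts and respects_surface_policy(variant)


def build_pool(candidates: List[Variant]) -> List[Variant]:
    # The boundary is applied while the pool is built, so a rejected
    # variant never exists as a choice for anything downstream.
    return [v for v in candidates if admitted(v)]
```

Because the check runs at pool construction, later stages do not need a filter of their own: they can only pick from variants that already passed.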
The Gate
the Gate: The verification module, which runs decompose → retrieve → NLI → calibrated ensemble → compliance rules → ledger. It is idempotent and claim-level cached, and a compliance rule can only make a verdict more restrictive.
NLI ensemble: A diverse set of entailment judges ("is this claim entailed by its evidence?") whose agreement is a better-calibrated signal than any single judge, because different lenses fail differently. The LLM judge is one ensemble member, not the verifier.
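Two properties from the definitions above lend themselves to a short sketch: several judges are combined into one score, and a compliance rule can only move a verdict toward red. The function names, the plain mean used to combine judges, and the 0.8 / 0.4 thresholds are all assumptions chosen for illustration; the calibration step is omitted here.

```python
from statistics import mean
from typing import Callable, Iterable, Sequence

# Higher number = more restrictive verdict.
SEVERITY = {"green": 0, "amber": 1, "red": 2}


def ensemble_score(judge_scores: Sequence[float]) -> float:
    # A plain mean is a stand-in; the real combination method is not specified.
    return mean(judge_scores)


def verdict_from_score(score: float, green_at: float = 0.8, red_below: float = 0.4) -> str:
    # Hypothetical thresholds: confident entailment is green,
    # confident non-entailment is red, everything between is amber.
    if score >= green_at:
        return "green"
    if score < red_below:
        return "red"
    return "amber"


# A rule inspects the claim text and returns the most permissive verdict it allows.
Rule = Callable[[str], str]


def apply_rules(base: str, claim_text: str, rules: Iterable[Rule]) -> str:
    final = base
    for rule in rules:
        proposed = rule(claim_text)
        # Monotone: a rule's output is adopted only if it is stricter.
        if SEVERITY[proposed] > SEVERITY[final]:
            final = proposed
    return final
```

With this shape, a rule that returns green for a claim already rated amber leaves it amber, so no rule can rescue a claim the ensemble doubted.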
The Optimizer
the Optimizer: A contextual bandit campaign that drives verified variants over recipients per segment, learning from a simulated CTA oracle. It can only ever pull arms in its pool, and the blocked-lie arm is never in it.
bandit: The Thompson-sampling policy. It draws one sample from each active arm's Beta posterior, pulls the arm with the highest sample, and updates that arm's posterior from the observed reward.
arm: One verified variant the bandit can choose for a segment. An ungated or red-claim variant is structurally excluded from the arm pool, so it is selected exactly 0 times.
regret: The cumulative shortfall versus always playing the best honest arm. Its per-round increment trends toward 0 as the constrained bandit converges, so the cumulative curve flattens, while the unconstrained twin chases the planted lie.
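The bandit, arm, and regret definitions above fit together as in the following sketch of Thompson sampling over Beta posteriors. It is an illustration under stated assumptions: the arm names and click rates are invented, the reward is a simulated Bernoulli draw standing in for the CTA oracle, and the per-segment context is left out (one such loop would run per segment).

```python
import random
from typing import Dict, Tuple


def thompson_campaign(
    true_rates: Dict[str, float], rounds: int, seed: int = 0
) -> Tuple[Dict[str, int], float]:
    """Run Thompson sampling over the given arm pool; return pulls and regret."""
    rng = random.Random(seed)
    # Beta(1, 1) prior per arm, stored as [alpha, beta].
    posterior = {arm: [1.0, 1.0] for arm in true_rates}
    pulls = {arm: 0 for arm in true_rates}
    best_rate = max(true_rates.values())  # best arm within the pool
    regret = 0.0
    for _ in range(rounds):
        # Sample each arm's posterior and pull the arm with the highest sample.
        samples = {arm: rng.betavariate(a, b) for arm, (a, b) in posterior.items()}
        chosen = max(samples, key=samples.get)
        # Simulated oracle: a click happens with the arm's hidden rate.
        reward = 1 if rng.random() < true_rates[chosen] else 0
        posterior[chosen][0] += reward        # successes
        posterior[chosen][1] += 1 - reward    # failures
        pulls[chosen] += 1
        regret += best_rate - true_rates[chosen]
    return pulls, regret


# Hypothetical candidates; the planted lie has the highest raw click rate.
candidates = {"honest_a": 0.05, "honest_b": 0.12, "planted_lie": 0.30}
blocked = {"planted_lie"}

# The pool is built before the bandit exists, so the lie is not an arm at all.
pool = {arm: rate for arm, rate in candidates.items() if arm not in blocked}
pulls, regret = thompson_campaign(pool, rounds=2000)
```

Since `planted_lie` is removed before the posteriors are created, no amount of sampling can select it, and regret is measured against the best arm that remains in the pool.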
Drift & Assurance
drift: A change to a source or a legal-hold flip that invalidates affected claims. Drift re-verifies exactly those claims (no under/over-invalidation) and pauses the dependent variants.
trap: An adversarially mutated claim (number drift, unsupported superlative, false equivalence, or a true-but-unsayable guarantee) used to test whether the Gate catches what a shallow judge would miss.
catch-rate: The fraction of bad (trap) claims the Gate correctly catches. It is the headline number the Assurance Lab reports against a single-judge baseline.
false-reject: The fraction of clean, approved claims that get wrongly blocked. It is measured alongside catch-rate because blocking everything would catch every trap; a high catch-rate is a fair comparison only when false-reject is also reported, ideally at zero.
ECE / calibration: Expected Calibration Error, which measures how far the Gate's stated confidence is from observed accuracy across bins. Calibration (an isotonic PAV calibrator) turns a raw entailment score into a probability you can trust, so "0.9" means correct roughly 90% of the time.
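The three metrics above are simple to compute once per-claim outcomes are available. The sketch below uses the standard binned form of ECE: group predictions by confidence, then sum each bin's share of the data times the gap between its mean confidence and its accuracy. The function names, the choice of 10 equal-width bins, and the boolean input format are assumptions for the example.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float], correct: Sequence[bool], n_bins: int = 10
) -> float:
    """Binned ECE: weighted average of |mean confidence - accuracy| per bin."""
    total = len(confidences)
    if total == 0:
        return 0.0
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Equal-width bins over [0, 1]; a confidence of 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece


def catch_rate(trap_blocked: Sequence[bool]) -> float:
    # Fraction of trap claims that were blocked.
    return sum(trap_blocked) / len(trap_blocked)


def false_reject_rate(clean_blocked: Sequence[bool]) -> float:
    # Fraction of clean, approved claims that were wrongly blocked.
    return sum(clean_blocked) / len(clean_blocked)
```

As a sanity check, ten predictions at confidence 0.9 with nine correct give an ECE of about 0, while the same ten with only five correct give about 0.4.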