How this was built
Provenance is built to the same bar it holds outbound copy to: a Constitution, a locked PRD, deterministic offline replay, and an immutable pytest harness that serves as the authoritative gate.
Provenance is a demo whose entire thesis is “can't say what it can't prove.” The build holds itself to that same standard. Three files at the repository root govern every change: CONSTITUTION.md (inviolable guardrails), docs/01-intake/PRD.md (the locked specification), and BUILD-AUTONOMY.md (the standing-approval charter). Together they let the build run to completion without per-step approvals, while the Constitution keeps it honest.
The Constitution — ten articles, checked before merge
Every agent (human or AI) is checked against CONSTITUTION.md before a merge. Articles I–V are inviolable; VI–X are overridable only with a logged decision in docs/05-build/DECISIONS.md. The four load-bearing articles for this build:
| Article | Requirement |
| --- | --- |
| I — Truthfulness | No fabricated results, metrics, or capabilities. Demo numbers (catch-rate, regret, ECE) are computed from a real run, never hand-typed. A claim the code can't substantiate is a bug, not a feature — a CRITICAL violation that halts the build. |
| V — Scope discipline | Build the end-to-end thin slice the PRD describes — real where it matters (Gate, NLI, Assurance math, the form/website), simulated where it is theatre (the CTA oracle). No production auth, no real sending. |
| VII — No test theatre | The eval harness in tests/ is immutable once written. Tests are never weakened, skipped, or rewritten to make code pass. They hit the real Gate (no mocks of the decision logic) and assert inequalities and floors, not hardcoded numbers. |
| VIII — Honest reporting | Report what actually ran vs. what was only inspected, per layer. “Compiles” is not “works.” A layer that couldn't be exercised is surfaced as a gap, never reported green. |
The locked PRD and the autonomy charter
docs/01-intake/PRD.md is the authority and is immutable from build start: new requirements land as numbered, dated amendments rather than silent edits, and a deck-vs-PRD conflict resolves to the PRD. With the Constitution, the PRD, and BUILD-AUTONOMY.md all present at the root, the build carries standing approval to create files, set local config, and install the declared dependencies without per-step gates. It still halts and escalates on four fixed stop conditions: anything destructive or irreversible, anything outward-facing (a push, a deploy, a real send), anything that spends money beyond budget, or a genuinely undecidable high-stakes fork. Everything the PRD is silent on is decided and logged as a numbered Rn entry in docs/05-build/DECISIONS.md, flagged basis=spec or basis=interpretation.
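The charter's two mechanics, the four stop conditions and the basis-flagged decision log, can be sketched in a few lines. This is an illustration only: the names `StopCondition`, `Decision`, `must_halt`, and the rendered line format are hypothetical, and the actual layout of docs/05-build/DECISIONS.md may differ.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class StopCondition(Enum):
    """The four fixed conditions that halt the build and escalate."""
    DESTRUCTIVE = "destructive or irreversible"
    OUTWARD_FACING = "outward-facing (push, deploy, real send)"
    OVER_BUDGET = "spends money beyond budget"
    UNDECIDABLE_FORK = "undecidable high-stakes fork"


# The only two basis flags the charter names.
ALLOWED_BASES = frozenset({"spec", "interpretation"})


@dataclass(frozen=True)
class Decision:
    """One numbered Rn entry (hypothetical in-memory shape)."""
    number: int
    summary: str
    basis: str

    def __post_init__(self) -> None:
        if self.number < 1:
            raise ValueError("decision numbers start at 1")
        if self.basis not in ALLOWED_BASES:
            raise ValueError(f"basis must be one of {sorted(ALLOWED_BASES)}")

    def render(self) -> str:
        # Assumed one-line format; not the repository's actual layout.
        return f"R{self.number} | basis={self.basis} | {self.summary}"


def must_halt(triggered: Iterable[StopCondition]) -> bool:
    """Standing approval holds only while no stop condition is triggered."""
    return bool(list(triggered))


if __name__ == "__main__":
    entry = Decision(number=1, summary="illustrative entry", basis="interpretation")
    print(entry.render())
    print(must_halt([]))
    print(must_halt([StopCondition.OUTWARD_FACING]))
```

Rejecting any basis other than the two named flags at construction time keeps an unflagged decision from entering the log silently.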
Determinism — seed-locked, offline, $0, byte-identical
Per decision R5, “live” means deterministic seeded replay, not live inference. The whole demo runs offline by default with no API key, at $0, and re-running python -m scripts.pipeline produces byte-identical artifacts. Every artifact — the 1000 synthetic recipients, the claim ledgers, the campaign runs, the metrics — is reproducible from scripts/ plus the global seed (guarded across processes; see R9, which replaced Python's salted built-in hash() with a stable hash). An optional rich profile can upgrade the verifier with real models, but those outputs are cached so the run still replays deterministically afterward.
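The problem behind R9 is easy to reproduce: Python salts the built-in `hash()` of `str` and `bytes` per process unless `PYTHONHASHSEED` is set, so a seed derived from it changes between runs. The sketch below shows one common remedy, a digest-based hash from `hashlib`. The helper names `stable_hash` and `rng_for` and the `GLOBAL_SEED` value are assumptions for illustration, not the repository's actual code.

```python
import hashlib
import random

GLOBAL_SEED = 1234  # hypothetical value; the real seed lives in the repository


def stable_hash(key: str) -> int:
    """Return a 64-bit integer that is identical in every process.

    Unlike built-in hash(), SHA-256 is not salted per interpreter run.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def rng_for(name: str, seed: int = GLOBAL_SEED) -> random.Random:
    """Build an independent, reproducible RNG for one named artifact."""
    return random.Random(stable_hash(f"{seed}:{name}"))


if __name__ == "__main__":
    # Two RNGs built from the same name yield the same stream.
    a = rng_for("recipients")
    b = rng_for("recipients")
    first = [a.randint(0, 999) for _ in range(3)]
    second = [b.randint(0, 999) for _ in range(3)]
    print(first == second)  # prints True
```

Giving each artifact its own named RNG, rather than sharing one global stream, means that adding or reordering a pipeline step does not shift the random draws of the others.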
Pytest is the authoritative gate
The five headline properties aren't claims in a slide — they live in tests/ and are run with python -m pytest -q. They hit the real Gate with no mocks of the decision logic: P1 a Gate-blocked lie can never be selected (test_optimizer.py); P2 the Gate blocks a legal-hold claim the instant the hold flips, attributable to rules_version alone (test_gate.py); P3 a drift event re-verifies exactly the affected claims (test_gate.py, test_drift.py); P4 the website renders only Gate-passed claims, with the same verdict per claim_id on both channels (test_website.py); P5 Assurance catch-rate beats the single-judge baseline at fixed false-reject (test_assurance.py).
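To show the shape of such a test, here is a toy version of P1 written in the style Article VII asks for: it calls the decision logic directly and asserts a property rather than a hardcoded score. Everything in it (`Claim`, `gate_passes`, `select_claim`, the evidence flag) is a hypothetical stand-in; the real Gate and optimizer are far richer, and the actual tests live in test_optimizer.py.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Claim:
    claim_id: str
    expected_lift: float  # how attractive the claim looks to the optimizer
    has_evidence: bool    # stand-in for the real Gate's verification


def gate_passes(claim: Claim) -> bool:
    """Toy Gate: a claim passes only if it is backed by evidence."""
    return claim.has_evidence


def select_claim(claims: Sequence[Claim]) -> Optional[Claim]:
    """Toy optimizer: pick the highest-lift claim among Gate-passed ones."""
    allowed = [c for c in claims if gate_passes(c)]
    return max(allowed, key=lambda c: c.expected_lift, default=None)


def test_blocked_lie_is_never_selected() -> None:
    lie = Claim("c-lie", expected_lift=0.99, has_evidence=False)
    truth = Claim("c-true", expected_lift=0.10, has_evidence=True)
    chosen = select_claim([lie, truth])
    # Assert the property, not a hardcoded score: whatever is chosen
    # must have passed the Gate, even though the lie scores higher.
    assert chosen is not None and gate_passes(chosen)
    assert chosen.claim_id != lie.claim_id


def test_nothing_selected_when_all_blocked() -> None:
    only_lies = [Claim("c-lie", expected_lift=0.99, has_evidence=False)]
    assert select_claim(only_lies) is None
```

Filtering before ranking is what makes the property structural: a blocked claim never enters the candidate set, so no lift value can bring it back.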