The Claims Library

The versioned claim-evidence graph: four approved sources, ten atomic claims each bound to a source span, with dependency edges and human-owned compliance rules.

The Claims Library is the substrate behind the promise "can't say what it can't prove." It holds the approved source documents for the demo tenant (Helix Analytics, a regulated health-tech firm), the atomic claims drawn from them, and the edges that tie each claim back to the exact text it came from. It lives in pipeline/library/library.py, seeded from pipeline/library/seed_data.py.

What a source is

A source (SourceDoc) is an approved document: an id, a title, and its text. Its version is the SHA-256 hash of that text (content_version in pipeline/common/schemas.py) — so when the text changes, the version changes, and that is exactly how Drift is detected. The demo ships four approved sources:
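As a minimal sketch of the idea, a content-addressed version can be computed with Python's hashlib. The function below is an illustration only: the UTF-8 encoding and the hex digest format are assumptions, and the real content_version in pipeline/common/schemas.py may differ in both.

```python
import hashlib


def content_version(text: str) -> str:
    """Return the hex SHA-256 digest of the text (UTF-8 encoding is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Stand-in source texts for illustration, not the real case study.
before = content_version("Northwind Health cut total cost of ownership by 47%.")
after = content_version("Northwind Health cut total cost of ownership by 45%.")

# Any edit to the text, even one character, yields a different version.
text_changed = before != after
```

Because the version is derived from the text itself, nobody has to remember to bump it: an edit cannot go unnoticed.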

The four approved sources

s_casestudy	Northwind Health — deployed case study
s_datasheet	Helix Analytics — product data sheet
s_pricing	Helix Analytics — pricing fact sheet
s_security	Helix Analytics — security & compliance overview

What a claim is

A claim (ClaimNode) is one atomic, citable assertion bound to a source span — the exact character offsets of the evidence substring inside its source. The span isn't hand-counted: at load, the library finds the evidence text inside the source and computes the offsets (and raises if the evidence isn't actually there). Each claim also records the source_version it was verified against, any asserted numeric value, the role segments it's relevant to, and the compliance rule_tags the rules engine keys on. There are ten atomic claims across the four sources — for example, c_tco ("cuts total cost of ownership by 47%") is bound to the substring "Northwind Health cut total cost of ownership by 47%" in the case study.
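The load-time binding described above can be sketched as follows. Span and bind_span are hypothetical names introduced for illustration, the half-open offset convention is an assumption, and the source text is a stand-in rather than the actual case study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # offset of the first evidence character
    end: int    # offset one past the last evidence character (exclusive)


def bind_span(source_text: str, evidence: str) -> Span:
    """Locate the evidence inside the source and return its character offsets."""
    start = source_text.find(evidence)
    if start == -1:
        # Refuse to load a claim whose evidence is not literally in the source.
        raise ValueError(f"evidence not found in source: {evidence!r}")
    return Span(start, start + len(evidence))


# Stand-in text; only the evidence substring comes from the documentation.
case_study = "After deployment, Northwind Health cut total cost of ownership by 47%."
evidence = "Northwind Health cut total cost of ownership by 47%"
span = bind_span(case_study, evidence)
```

Computing offsets instead of hand-counting them means a typo in the seed data fails loudly at load rather than silently pointing at the wrong words.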

Evidence is that bound substring. bound_evidence(claim_id) returns it by slicing the live source text at the claim's span, so a claim can always show the words it stands on. The library can also rebuild a sentence-level retrieval corpus from the current source text (evidence_sentences) for the Gate's retriever.
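A rough sketch of both operations, under stated assumptions: slice_evidence and split_sentences are hypothetical stand-ins (the real bound_evidence takes a claim_id, and the real evidence_sentences may segment text differently), and the regex splitter here is deliberately naive.

```python
import re


def slice_evidence(source_text: str, start: int, end: int) -> str:
    """Return the evidence by slicing the live source text at the stored span."""
    return source_text[start:end]


def split_sentences(source_text: str) -> list[str]:
    """Naive sentence split: break on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", source_text.strip())
    return [p for p in parts if p]


# Stand-in source text for illustration only.
text = (
    "Helix Analytics was deployed at Northwind Health. "
    "Northwind Health cut total cost of ownership by 47%."
)
evidence = "Northwind Health cut total cost of ownership by 47%"
start = text.find(evidence)

shown = slice_evidence(text, start, start + len(evidence))
corpus = split_sentences(text)
```

Slicing the live text, rather than storing a copy of the evidence, is what keeps the displayed words tied to the current source.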

A versioned graph with dependency edges

The library is a versioned claim-evidence graph. The dependency edge is claim → source: claims_for_source(source_id) returns every claim that depends on a given source, which is the surgical set Drift re-verifies when that source changes. apply_source_change updates a source's text (yielding a new version) and returns exactly those dependent claim ids. A claim counts as drifted (is_drifted) when its stored source_version no longer matches the source's current version, and it stays drifted until the Gate re-verifies it. The whole library also has a library_version: a hash over its sources' versions and its claim ids.
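The dependency edge and drift check can be sketched in a few lines. This is a toy model, not the code in pipeline/library/library.py: the Source, Claim, and Library classes are simplified assumptions, the claim id c_price_example is hypothetical, the source texts are stand-ins, and the way library_version combines its inputs is a guess.

```python
import hashlib
from dataclasses import dataclass


def content_version(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class Source:
    id: str
    text: str

    @property
    def version(self) -> str:
        return content_version(self.text)  # always derived from the live text


@dataclass
class Claim:
    id: str
    source_id: str
    source_version: str  # version the claim was last verified against


class Library:
    def __init__(self, sources: list[Source], claims: list[Claim]) -> None:
        self.sources = {s.id: s for s in sources}
        self.claims = {c.id: c for c in claims}

    def claims_for_source(self, source_id: str) -> list[str]:
        """Follow the claim -> source edge backwards."""
        return sorted(c.id for c in self.claims.values() if c.source_id == source_id)

    def apply_source_change(self, source_id: str, new_text: str) -> list[str]:
        """Update the text and return only the claims that need re-verification."""
        self.sources[source_id].text = new_text
        return self.claims_for_source(source_id)

    def is_drifted(self, claim_id: str) -> bool:
        claim = self.claims[claim_id]
        return claim.source_version != self.sources[claim.source_id].version

    def library_version(self) -> str:
        """Hash over source versions and claim ids (the combination is assumed)."""
        parts = sorted(s.version for s in self.sources.values()) + sorted(self.claims)
        return content_version("\n".join(parts))


case = Source("s_casestudy", "Northwind Health cut total cost of ownership by 47%.")
pricing = Source("s_pricing", "Stand-in pricing text.")
lib = Library(
    [case, pricing],
    [
        Claim("c_tco", "s_casestudy", case.version),
        Claim("c_price_example", "s_pricing", pricing.version),  # hypothetical id
    ],
)

version_before = lib.library_version()
to_reverify = lib.apply_source_change(
    "s_casestudy", "Northwind Health cut total cost of ownership by 45%."
)
```

Only the claims hanging off the edited source come back for re-verification; the claim bound to the untouched pricing sheet keeps its verdict.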

There is also a planted lie ("guarantees a 60% reduction in hospital readmissions") defined in seed_data.py. It is deliberately never added to the Library — generation can attempt to inline it, and the Gate must block it. It is a trap, not a claim.

Compliance rules are also versioned. They live in the human-owned rules/helix_tenant.yaml and can veto a claim (for example, a legal hold flips c_tco to red). Because the rules carry a version that feeds the verdict-cache key, changing a rule forces a re-Gate, which is what the legal-hold demo shows.
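To illustrate why a versioned rule set forces a re-Gate, here is a hedged sketch of a verdict cache keyed on the rules version. The key composition, the gate function, and the version strings are all assumptions; the documentation only states that the rules version feeds the cache key.

```python
import hashlib
from typing import Callable


def verdict_cache_key(claim_id: str, source_version: str, rules_version: str) -> str:
    """Hypothetical cache key: any change to an input produces a new key."""
    payload = "|".join((claim_id, source_version, rules_version))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


cache: dict[str, str] = {}


def gate(
    claim_id: str,
    source_version: str,
    rules_version: str,
    verify: Callable[[str], str],
) -> str:
    """Return a cached verdict, or run verification when the key is new."""
    key = verdict_cache_key(claim_id, source_version, rules_version)
    if key not in cache:
        cache[key] = verify(claim_id)
    return cache[key]


calls: list[str] = []


def fake_verify(claim_id: str) -> str:
    calls.append(claim_id)  # record each real verification run
    return "green"


gate("c_tco", "src-v1", "rules-v1", fake_verify)  # cache miss: verifies
gate("c_tco", "src-v1", "rules-v1", fake_verify)  # cache hit: no new run
gate("c_tco", "src-v1", "rules-v2", fake_verify)  # rule changed: verifies again
```

With the rules version in the key, a stale verdict cannot be served after a rule edit, because the old cache entry is simply never looked up.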
