What we can enrich a lead with — and the receipt each fact must carry
Between the form submission and a personalized email or return visit, we can learn a lot
about a lead from outside sources. In Provenance, every enriched fact is gated like a claim:
it must come from a human-allow-listed source, carry a lawful basis, be fresh, and pass the
Enrichment Gate before it can appear in a message. This page catalogs the sources honestly:
what they give, what they cost, and the lawful-basis question each one raises. In the demo,
sources marked "used" are simulated unless they are also marked "live"; nothing here calls
a paid API.
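
The four checks named above can be sketched as a single function. This is a minimal illustration, not the project's actual gate: the `Fact` class, the `gate` function, the freshness windows, and the reason strings are all assumptions made for the example. Only the source and basis identifiers are taken from the live example further down this page.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumed policy values for illustration; the real allow-list is human-owned
# and lives in the tenant policy file, not in code.
ALLOWED_SOURCES = {"email_domain", "company_news_rss", "firmographic_sim", "intent_sim"}
ALLOWED_BASES = {"legitimate_interest_b2b", "public_record", "contract_vendor_dpa"}
MAX_AGE = {  # hypothetical freshness windows per source
    "email_domain": timedelta(days=365),
    "company_news_rss": timedelta(days=1),
    "firmographic_sim": timedelta(weeks=4),
    "intent_sim": timedelta(days=7),
}


@dataclass(frozen=True)
class Fact:
    name: str
    value: str
    source: str
    basis: str
    fetched_at: datetime


def gate(fact: Fact, now: datetime) -> tuple[str, str]:
    """Return (verdict, reason); a fact is usable only if every check passes."""
    if fact.source not in ALLOWED_SOURCES:
        return "blocked", "source not allow-listed"
    if fact.basis not in ALLOWED_BASES:
        return "blocked", "missing or unknown lawful basis"
    max_age = MAX_AGE.get(fact.source)
    if max_age is None or now - fact.fetched_at > max_age:
        return "blocked", "stale"
    return "usable", "ok"


now = datetime(2024, 1, 10, tzinfo=timezone.utc)
fact = Fact("company_domain", "northwindhealth.org", "email_domain",
            "legitimate_interest_b2b", now - timedelta(hours=1))
print(gate(fact, now))
```

The sketch fails closed: a source with no freshness window is blocked rather than waved through. It omits the `disclaimer` verdict that appears in the live example, since this page does not describe the rule that produces it.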
Free / low-cost — between form → email
| Source | Gives us | Cost | Lawful basis / caution | Freshness | In demo |
| Email domain parse | company domain, B2B vs personal | $0 | user-provided (low risk) | instant | used · live |
| DNS / MX / WHOIS | mail provider, domain age, registrar | $0 | public record | days | used · live |
| Company website scrape | products, locations, tech hints | $0 | ToS varies; respect robots.txt | weeks | cataloged |
| News / RSS / Google News | recent events, funding, initiatives | $0 | public; attribute the source | hours | used · live |
| SEC EDGAR | revenue, risk factors (public cos) | $0 | public record | quarterly | cataloged |
| GitHub / job boards | tech stack, hiring signals | $0 | public; ToS | days | cataloged |
| LinkedIn public profile | title, seniority | $0 | ToS-restricted; be careful | weeks | cataloged |
| BuiltWith (free tier) | website tech stack | free tier | vendor ToS | weeks | cataloged |
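
As an illustration of the cheapest source in the table, the email domain parse can be sketched in a few lines. The function name, the returned keys, and the short list of personal mailbox providers are assumptions for this example; a real deployment would need a much longer provider list.

```python
# Assumed, deliberately incomplete list of consumer mailbox providers.
PERSONAL_DOMAINS = frozenset(
    {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com", "icloud.com"}
)


def parse_email_domain(email: str) -> dict:
    """Split an address at its last '@' and classify the domain as B2B or personal."""
    local, sep, domain = email.strip().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {email!r}")
    domain = domain.lower().rstrip(".")
    is_personal = domain in PERSONAL_DOMAINS
    return {
        "domain": domain,
        # A personal mailbox tells us nothing about the employer.
        "company_domain": None if is_personal else domain,
        "is_b2b": not is_personal,
    }


print(parse_email_domain("jane@NorthwindHealth.org"))
```

Because the lead typed the address into the form, this fact needs no outside lookup, which is why the table rates it low risk.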
Paid — between form → email, and return-visit refresh
| Source | Gives us | Cost | Lawful basis / caution | Freshness | In demo |
| Clearbit / HubSpot Enrichment | firmographics, role, seniority | per-enrichment / seat | vendor AUP + your basis | weeks | used |
| ZoomInfo | contact + company, org chart | seat + credits ($$$) | AUP is strict; record basis | weeks | cataloged |
| Apollo.io | contact + intent | seat + credits | AUP | weeks | cataloged |
| People Data Labs | person/company graph | per-record API | basis required | weeks | cataloged |
| Cognism / Lusha | EU-compliant B2B contacts | seat + credits | GDPR-positioned | weeks | cataloged |
| 6sense / Demandbase | account intent, buying stage | platform ($$$$) | account-level, lower PII risk | days | used |
| Bombora | topic surge / intent | subscription | account-level | weekly | cataloged |
Engagement — email sent → click → website
| Source | Gives us | Cost | Lawful basis / caution | Freshness | In demo |
| Email open pixel | opened? when? client? | $0 | disclose tracking; some clients block | realtime | used |
| Click tracking (wrapped links) | which CTA, when | $0 | first-party, low risk | realtime | used |
| First-party site analytics | pages, dwell, return | $0 | first-party cookie + notice | realtime | used |
| Reverse-IP (Clearbit Reveal / KickFire) | company of an anonymous visit | per-lookup | account-level, no PII | realtime | cataloged |
Where the data lives
| Store | Path | Holds |
| provenance.sqlite | /data/provenance.sqlite | recipients · form_events · cta_events · verdict_cache · llm_cache |
| profiles.sqlite | /data/profiles.sqlite | synthesized profiles + every fact receipt (source · basis · verdict) |
| claims/library.json | /app/data/demo/claims/library.json | versioned claim-evidence graph |
| observe/*.jsonl | /app/data/demo/observe | append-only observability event ledger (one per lane) |
| helix_tenant.yaml | /app/rules/helix_tenant.yaml | human-owned claim policy + enrichment-source policy |
All local, synthetic, and gitignored — the demo's $0 / offline / no-PII guarantee holds.
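
To make "append-only ledger, one per lane" concrete, here is a minimal sketch of writing and reading a JSONL ledger file. The helper names, the lane name, and the event fields are hypothetical; this page does not document the actual event schema.

```python
import json
from pathlib import Path


def append_event(ledger_dir: str, lane: str, event: dict) -> Path:
    """Append one event as a single JSON line to <ledger_dir>/<lane>.jsonl."""
    path = Path(ledger_dir) / f"{lane}.jsonl"
    path.parent.mkdir(parents=True, exist_ok=True)
    line = json.dumps(event, sort_keys=True, separators=(",", ":"))
    # Mode "a" only ever adds to the end of the file; earlier lines are never rewritten.
    with path.open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")
    return path


def read_events(ledger_dir: str, lane: str) -> list[dict]:
    """Read every event back in the order it was written."""
    path = Path(ledger_dir) / f"{lane}.jsonl"
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

One JSON object per line keeps each lane's file readable with ordinary line-oriented tools, and appending avoids rewriting history when a later event corrects an earlier one.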
Live example — the profile synthesized for Northwind Health (synthetic mode)
| Verdict | Fact | Value | Source | Basis |
| usable | company_domain | northwindhealth.org | email_domain | legitimate_interest_b2b |
| usable | recent_news | Northwind Health reported a push to cut administrative cost | company_news_rss | public_record |
| usable | num_facilities | 19 | firmographic_sim | contract_vendor_dpa |
| usable | ehr_vendor | Allscripts/Veradigm | firmographic_sim | contract_vendor_dpa |
| usable | size_band | 9+ hospital IDN | firmographic_sim | contract_vendor_dpa |
| disclaimer | intent_topic | reducing length of stay | intent_sim | contract_vendor_dpa |
| disclaimer | in_market | true | intent_sim | contract_vendor_dpa |
7 usable · 0 blocked · fact-audit caught 100.0% of un-shippable traps at 0.0% false-block.
These facts are readable on the Observatory and via /api/observe/profiles.
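
The two audit percentages can be read as a catch rate and a false-block rate. The sketch below assumes the usual definitions, which this page does not spell out: the catch rate is the share of planted un-shippable facts that the gate blocked, and the false-block rate is the share of legitimate facts that it blocked by mistake. The function name and input shape are hypothetical.

```python
from typing import Iterable, Optional


def audit_rates(results: Iterable[tuple[bool, bool]]) -> tuple[Optional[float], Optional[float]]:
    """Compute (catch_rate_pct, false_block_pct) from (is_trap, was_blocked) pairs.

    A rate is None when its denominator is empty, so that an audit with no
    traps cannot be mistaken for one that caught nothing.
    """
    trap_outcomes = []
    clean_outcomes = []
    for is_trap, was_blocked in results:
        (trap_outcomes if is_trap else clean_outcomes).append(was_blocked)

    def pct(outcomes: list) -> Optional[float]:
        if not outcomes:
            return None
        return round(100.0 * sum(outcomes) / len(outcomes), 1)

    return pct(trap_outcomes), pct(clean_outcomes)
```

Under these assumed definitions, an audit in which every trap is blocked and no legitimate fact is blocked yields `(100.0, 0.0)`, matching the figures reported above.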