
Does personalization actually move the metrics?

The agent-swarm research on personalization → reply and booking, graded by an adversarial fact-check. These are external benchmarks (sourced, by confidence), not Provenance's own measured numbers.

We ran the question through a 15-agent research swarm — does personalizing outreach actually move the numbers, and by how much? Seven researchers gathered the evidence; an adversarial verifier then graded every figure and threw out the inflated ones. What follows is what survived, by confidence. These are external research benchmarks (sourced) — not Provenance's own metrics; the product enforces the principles via the Gate, it doesn't claim the numbers as its own.

What held up — verified across sources

Personalization → the metric (graded)
Lever | Effect on the metric | Confidence | Source
Signal-based opener vs a generic blast | ≈ 2–2.5× reply rate | cross-source | multiple datasets
Soft interest CTA vs a hard meeting ask | ≈ 3× reply (4.2% vs 1.4%); 15% cold-stage booking | robust | Gong · Puzzle Inbox
One contact per account vs 10+ | 7.8% vs 3.8% reply | supported | Belkins
Personal / career value vs company ROI | ≈ 2× the impact; ≈ 50% more likely to buy | supported | Google / CEB (B2B)
Not pitching in the first email | up to −57% reply when you do | supported | Gong · 28M emails
Following up vs a single touch | 42–55% of replies come from follow-ups; +50–66% from the first | cross-source | multiple
Loss framing vs upside-only | a loss is felt ≈ 2× an equal gain | supported | prospect theory
Voice: rapport + a stated reason | “did I catch you at a bad time?” → 0.9% success | supported | Gong · 90,380 calls
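
To make the arithmetic behind the "≈ 3×" style figures explicit, here is a minimal Python sketch that recomputes the lift for the two rows that quote both rates. The rate pairs are copied from the table; the function names, and the reading of each rate as replies per email sent, are assumptions made for illustration.

```python
# Rate pairs copied from the table above, as (treatment %, baseline %).
RATE_PAIRS = {
    "soft interest CTA vs hard meeting ask": (4.2, 1.4),
    "one contact per account vs 10+": (7.8, 3.8),
}


def lift(treatment_pct: float, baseline_pct: float) -> float:
    """Return the multiplicative lift of the treatment rate over the baseline."""
    if baseline_pct <= 0:
        raise ValueError("baseline rate must be positive")
    return treatment_pct / baseline_pct


def extra_replies_per_1000(treatment_pct: float, baseline_pct: float) -> float:
    """Absolute difference, expressed as extra replies per 1,000 emails.

    Assumes both rates are replies per email sent (not stated in the source).
    """
    return (treatment_pct - baseline_pct) * 10


for label, (treatment, baseline) in RATE_PAIRS.items():
    print(
        f"{label}: {lift(treatment, baseline):.2f}x, "
        f"{extra_replies_per_1000(treatment, baseline):+.0f} replies per 1,000 emails"
    )
```

The two rows come out at roughly 3.00× and 2.05×. Printing the absolute difference alongside the multiplier is a deliberate choice: a large multiplier on a small base rate can still mean only a few dozen extra replies per thousand emails.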

Calibration — what “good” looks like

Cold-outbound reply benchmarks
Benchmark | Value
Average cold reply rate | ≈ 3.4%
Good | 5–10%
Elite | 10.7%+
Optimize on | reply / positive-reply, not opens (Apple MPP inflates opens)
Deliverability ceilings | spam complaints < 0.3% · bounce < 2%

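
As a rough illustration of how these benchmarks could be applied to a campaign, the sketch below places a reply rate into a tier and flags breaches of the deliverability ceilings. The thresholds come from the table; everything else is an assumption: the `CampaignStats` type, the function names, using emails sent as the denominator for every rate, and treating the gaps the table leaves open (3.4–5% and 10–10.7%) as belonging to the lower tier.

```python
from dataclasses import dataclass

# Thresholds copied from the benchmark table above (all in percent).
AVERAGE_REPLY_PCT = 3.4
GOOD_REPLY_PCT = 5.0
ELITE_REPLY_PCT = 10.7
MAX_SPAM_COMPLAINT_PCT = 0.3
MAX_BOUNCE_PCT = 2.0


@dataclass
class CampaignStats:
    """Hypothetical container for raw campaign counts."""
    sent: int
    replies: int
    spam_complaints: int
    bounces: int


def pct(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


def reply_tier(stats: CampaignStats) -> str:
    """Place the reply rate into a tier; gaps in the table fall to the lower tier."""
    rate = pct(stats.replies, stats.sent)
    if rate >= ELITE_REPLY_PCT:
        return "elite"
    if rate >= GOOD_REPLY_PCT:
        return "good"
    if rate >= AVERAGE_REPLY_PCT:
        return "average"
    return "below average"


def deliverability_warnings(stats: CampaignStats) -> list[str]:
    """List every ceiling the campaign meets or exceeds (the table says strictly below)."""
    warnings = []
    if pct(stats.spam_complaints, stats.sent) >= MAX_SPAM_COMPLAINT_PCT:
        warnings.append("spam complaints at or above 0.3%")
    if pct(stats.bounces, stats.sent) >= MAX_BOUNCE_PCT:
        warnings.append("bounce rate at or above 2%")
    return warnings


# Made-up example numbers, not data from the research.
example = CampaignStats(sent=2000, replies=124, spam_complaints=2, bounces=50)
print(reply_tier(example), deliverability_warnings(example))
```

Opens are deliberately absent from the inputs, in line with the table's advice to optimize on replies.
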
The takeaway isn't a single magic number — it's the direction and the order of operations: relevance first (it gates everything), then a soft ask, match the peer, frame the loss, don't pitch first, and follow up. Those are exactly the rules the Gate now enforces.
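
The order of operations can be pictured as an ordered checklist in which relevance gates the rest. The sketch below is purely illustrative and is not the Gate's actual implementation: the `DraftReview` fields, the rule names, and the choice to report only the relevance failure when relevance fails are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class DraftReview:
    """Hypothetical reviewer judgments about one outreach draft."""
    has_relevant_signal: bool     # opener is tied to something specific to the recipient
    uses_soft_ask: bool           # interest-based CTA rather than a hard meeting ask
    matches_peer: bool            # "match the peer", as named in the takeaway
    frames_loss: bool             # states what is lost, not only the upside
    pitches_in_first_email: bool  # True if the first email pitches
    has_follow_up_planned: bool   # at least one follow-up is scheduled


# The order mirrors the takeaway: relevance first, then the remaining rules.
CHECKS = [
    ("relevance", lambda d: d.has_relevant_signal),
    ("soft ask", lambda d: d.uses_soft_ask),
    ("match the peer", lambda d: d.matches_peer),
    ("frame the loss", lambda d: d.frames_loss),
    ("don't pitch first", lambda d: not d.pitches_in_first_email),
    ("follow up", lambda d: d.has_follow_up_planned),
]


def failed_rules(draft: DraftReview) -> list[str]:
    """Return failed rule names in priority order.

    If relevance fails, only that rule is reported, since it gates everything else.
    """
    gate_name, gate_check = CHECKS[0]
    if not gate_check(draft):
        return [gate_name]
    return [name for name, check in CHECKS[1:] if not check(draft)]


# Made-up example: relevant opener, but a hard ask and a pitch in the first email.
draft = DraftReview(
    has_relevant_signal=True,
    uses_soft_ask=False,
    matches_peer=True,
    frames_loss=True,
    pitches_in_first_email=True,
    has_follow_up_planned=True,
)
print(failed_rules(draft))
```

Keeping the rules in a single ordered list makes the priority explicit and easy to change, which fits the point that the order matters more than any single number.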

What we threw out — inflated or untraceable

Dropped by the fact-check
Claim we found | Verdict | Why
“5.2× from signal personalization” | dropped | a cross-stitch of two unrelated studies; the real lift is ≈ 2–2.5×
“287% from multichannel” | dropped | misattributed; untraceable to a primary source
“25–40% reply” (vendor tiers) | dropped | single-vendor marketing, well above the ≈ 7.8% real-world ceiling
“44% time-ask penalty” | dropped | untraceable

The full swarm playbook (all 7 dimensions, every source and verdict) is persisted in the repo at docs/research/copy-personalization-research.md. For how these findings became enforced code, see Copy that drives action and the engineering writeup.