Provenance
100% synthetic persona — no real PII

How personalized can a website get — and where does each fact come from?

One visitor, Maya Chen. Climb the tiers to see how much a site can know — from an anonymous landing page, to a Google sign-in, to a purchased data append. Every fact is tagged with its provenance: collected by us, declared, granted via Google, or bought. Then flip tasteful → creepy to see the difference a surface policy makes.

0 · none · Anonymous landing
1 · cookie · Returning visitor
2 · email · Identified — gave email
3 · google · Signed in with Google
4 · google · + Purchased append
5 · google · + Existing customer
6 · google · Full identity graph

They're already in our database from a past purchase — so we layer our own CRM on top.

Tasteful / Creepy 👁 · surface policy switched OFF: every available fact, printed, tagged with where it came from
47 of 52 facts knowable at this tier
47 printed on the page
47 that a policy would have hidden
13 bought from third parties

We've been expecting you, Maya. 👁

▼ Anonymous landing
You're in Austin, Texas. Collected ourselves (passive)
MaxMind GeoIP2 / IP2Location · $0 (free GeoIP DB)
…within about two miles of the 78722 ZIP — the Mueller neighborhood. Collected ourselves (passive)
IP geolocation (ZIP-level) · $0
On a residential Spectrum line — so you're at home, not the office. Collected ourselves (passive)
IP → ISP / ASN lookup · $0
If this were an office IP we'd already know your employer — yours is residential, so: working from home. Collected ourselves (passive)
Clearbit Reveal / KickFire · ~$0.01 / lookup
On an iPhone 13 running iOS 17.4, in Safari. Collected ourselves (passive)
User-Agent + Client Hints · $0
A two-year-old, non-Pro phone — we can guess your budget from your hardware. Collected ourselves (passive)
Model → release-date + price tier · $0
Dark mode, battery-saver on, one tab open. Collected ourselves (passive)
JS: screen, prefers-color-scheme · $0
Your battery's at 18% and falling — better make this quick. Collected ourselves (passive)
Battery Status API · $0
It's 11:47 PM on a Tuesday where you are — a late-night browse. Collected ourselves (passive)
JS Date + IANA timezone · $0
Your browser lists Chinese as a second language — a hint about who you are. Collected ourselves (passive)
Accept-Language header · $0
You came from our Instagram ad — the 'from barista to AI engineer' creative. Collected ourselves (passive)
Referrer + UTM parameters · $0
Even with cookies off, this exact device hashes to fp_9f3a — we'll recognize you on your next 'anonymous' visit. Collected ourselves (passive)
FingerprintJS · ~$0.005 / match (pro tier)
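A minimal sketch of the fingerprint idea in the line above: serialize a handful of device signals in a fixed order and hash them, so the same device produces the same ID with no cookie involved. This is not FingerprintJS's actual algorithm. The signal names, the sample values and the `fp_` plus four hex characters format are assumptions made for illustration.

```python
import hashlib
import json


def device_fingerprint(signals: dict) -> str:
    """Hash a dict of device signals into a short, stable identifier."""
    # Canonical JSON (sorted keys, fixed separators) so the same signals
    # always serialize to the same bytes, whatever order they arrived in.
    canonical = json.dumps(signals, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # Hypothetical short form, mirroring the "fp_9f3a" style used on this page.
    return "fp_" + digest[:4]


# Hypothetical signal values; a real collector would read these in the browser.
signals = {
    "canvas": "a1b2c3",  # stand-in for a hash of a rendered canvas image
    "fonts": ["Helvetica Neue", "PingFang SC"],
    "gpu": "Apple GPU",
    "screen": "390x844",
}

first_visit = device_fingerprint(signals)
# Same signals in a different key order: still the same ID.
second_visit = device_fingerprint(dict(reversed(list(signals.items()))))
print(first_visit, first_visit == second_visit)
```

Nothing is stored on the device, which is why clearing cookies does not reset the ID.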
▼ Returning visitor
This is your 4th visit in 6 days. Our own database
First-party cookie / fingerprint · $0
You keep returning to the night cohort and the pricing page. Our own database
Site analytics · $0
You read 90% of the financing FAQ and lingered two minutes. Our own database
Scroll-depth + dwell tracking · $0
You started the application Sunday and didn't finish it. Our own database
Form analytics · $0
You're in our Meta retargeting pool — that's why we've chased you across Instagram seven times. Purchased (data broker append)
Meta Pixel / Google Ads tag · ad spend
We can see you comparison-shopped two competitors this week. Purchased (data broker append)
DMP / cookie-sync (Lotame, Oracle BlueKai) · $$ subscription
▼ Identified — gave email
Hi Maya, Declared by the visitor
Newsletter / lead form · $0
…at maya.chen@gmail.com. Declared by the visitor
Form field · $0
You told us you want to switch into AI without going broke. Declared by the visitor
Free-text form field · $0
Your email hash pulls a profile photo and nine other accounts tied to it. Collected ourselves (passive)
Gravatar / hash lookup · $0
That email shows up in three known data breaches. Collected ourselves (passive)
Have I Been Pwned · $0
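The Gravatar lookup above needs nothing more than the email itself. A sketch, assuming Gravatar's documented convention of hashing the trimmed, lowercased address with SHA-256 and requesting the avatar with `d=404`, so that an address with no profile returns a 404 instead of a default image. The function name is ours and no request is actually sent here.

```python
import hashlib


def gravatar_url(email: str) -> str:
    """Build the avatar URL for an email without contacting any service."""
    # Normalize first: surrounding whitespace and letter case must not
    # change the hash, or the lookup would miss.
    normalized = email.strip().lower()
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    # d=404 asks for a 404 response when no avatar exists for this hash.
    return f"https://gravatar.com/avatar/{digest}?d=404"


url = gravatar_url("  Maya.Chen@gmail.com ")
print(url)
```

Because the hash is deterministic, any site holding the same address can compute the same URL.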
▼ Signed in with Google
Welcome, Maya Chen 👋 Granted via Google sign-in
Google OAuth · profile scope · $0 (consented)
…verified, with a recovery phone on file. Granted via Google sign-in
Google OAuth · email scope · $0 (consented)
You've had this Google account since 2009 — a heavy user. Granted via Google sign-in
Profile metadata · $0
Your calendar has an OB checkup Thursday and a daycare tour Saturday. Granted via Google sign-in
Google OAuth · calendar.readonly · $0 (one extra checkbox)
Your inbox has receipts from Pampers and a fertility clinic. Granted via Google sign-in
Google OAuth · gmail.metadata · $0
You've been watching newborn-sleep videos and career-switch talks. Granted via Google sign-in
Google OAuth · YouTube Data API · $0
Your location history shows home, work, and two hospital visits this month. Granted via Google sign-in
Google OAuth · Maps Timeline · $0
Your contacts include Jordan, your mom, and an OB-GYN. Granted via Google sign-in
Google OAuth · People API · $0
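How much a sign-in reveals is decided by the scope list the site puts in its consent request. A sketch of building that request URL against Google's OAuth 2.0 authorization endpoint, with the basic profile scopes and then with the extra read-only scopes added. The client ID and redirect URI are placeholders. Location history is left out because we are not aware of a public OAuth scope that exposes it.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"

# Name, photo and verified email: the ordinary one-click sign-in.
BASIC_SCOPES = ["openid", "email", "profile"]

# Read-only scopes a site could request on the same consent screen.
EXTRA_SCOPES = [
    "https://www.googleapis.com/auth/calendar.readonly",
    "https://www.googleapis.com/auth/gmail.metadata",
    "https://www.googleapis.com/auth/youtube.readonly",
    "https://www.googleapis.com/auth/contacts.readonly",
]


def consent_url(client_id: str, redirect_uri: str, scopes: list) -> str:
    """Return the URL that sends the visitor to Google's consent screen."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "scope": " ".join(scopes),  # scopes are space-separated
    }
    return AUTH_ENDPOINT + "?" + urlencode(params)


# Placeholder credentials, for illustration only.
tasteful = consent_url("CLIENT_ID", "https://example.com/callback", BASIC_SCOPES)
creepy = consent_url(
    "CLIENT_ID", "https://example.com/callback", BASIC_SCOPES + EXTRA_SCOPES
)
print(len(parse_qs(urlsplit(creepy).query)["scope"][0].split()))  # 7
```

The two URLs differ only in the `scope` parameter.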
▼ + Purchased append
Your household income is modeled at $115–135K. Purchased (data broker append)
Experian / Acxiom income model · ~$0.05–0.25 / record
You own a ~$540K home and have lived there four years. Purchased (data broker append)
Property records / Acxiom · bundled in append
Estimated net worth: a quarter to half a million. Purchased (data broker append)
Experian wealth model · bundled
A life-event trigger flags you as recently separated. Purchased (data broker append)
Epsilon / Experian life-event triggers · premium trigger
Another flags a brand-new baby, zero to six months old. Purchased (data broker append)
Epsilon life-event triggers · premium trigger
You drive a 2021 Subaru Outback. Purchased (data broker append)
Oracle Data Cloud / Polk auto · bundled
Bachelor's degree, working in software. Purchased (data broker append)
Acxiom InfoBase · bundled
Voter-file modeling pegs you as a likely-Democrat, likely-donor. Purchased (data broker append)
L2 / Aristotle voter file · voter-file license
You're in two health audiences: new-parent and allergy-sufferer. Purchased (data broker append)
Health-adjacent ad audiences · audience license
A model assigns you a 'Chinese' ethnic affinity. Purchased (data broker append)
Acxiom ethnic-affinity model · bundled
You're modeled as in-market for a minivan, baby gear and life insurance. Purchased (data broker append)
Bombora / Oracle propensity · subscription
▼ + Existing customer
Good to have you back — you took Intro to Python with us in 2023. Our own database
Our CRM · $0 (we own it)
Lifetime spend $349, and you've opened three support tickets. Our own database
CRM / billing · $0
Our model tags you 'price-sensitive', churn-risk 0.31 — so we'd quietly lead with a discount. Our own database
Internal model · $0
We've still got your Visa ending 4242 on file. Our own database
Billing system · $0
Thanks again for the 5-star review. Our own database
Survey + reviews · $0
😬 The persona is synthetic, but every line above is a real kind of data with a real way to obtain it. The only thing separating this from the tasteful page is the surface policy — none of this is *new* collection.
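To make "a real way to obtain it" concrete for two of the passive facts, here is a sketch that reads the ranked language list out of an `Accept-Language` header and the campaign tags out of a landing URL. The header value, the URL and the function names are illustrative assumptions. Only the campaign name `career_switch_q3` comes from the ledger.

```python
from urllib.parse import parse_qs, urlsplit


def ranked_languages(header: str) -> list:
    """Return language tags from an Accept-Language header, best first."""
    ranked = []
    for position, part in enumerate(header.split(",")):
        fields = part.strip().split(";")
        tag = fields[0].strip()
        if not tag:
            continue
        quality = 1.0  # a tag with no q-value counts as q=1
        for field in fields[1:]:
            name, _, value = field.strip().partition("=")
            if name == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat an unparsable weight as lowest
        # Sort by descending quality, then by original position.
        ranked.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(ranked)]


def campaign_params(url: str) -> dict:
    """Return only the utm_* query parameters from a landing URL."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items() if key.startswith("utm_")}


languages = ranked_languages("en-US,zh-CN;q=0.8")
campaign = campaign_params(
    "https://example.com/?utm_source=instagram&utm_campaign=career_switch_q3"
)
print(languages)  # ['en-US', 'zh-CN']
print(campaign)   # {'utm_source': 'instagram', 'utm_campaign': 'career_switch_q3'}
```

Both inputs arrive with the first request, before any script runs or cookie is set.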

Where the data comes from

Six provenance classes. The count is how many facts each one yields at this tier. See the full paid/free source list on the enrichment catalog.

Collected ourselves (passive) 14
The HTTP request + a few lines of JavaScript, the instant the page loads. No login, no form, often no cookie. You give it to every site you visit.
💵 $0 (optionally a few ¢ for a reverse-IP or fingerprint lookup)
Our own database 9
Behaviour we logged on previous visits (cookies / device match) and records in our CRM. We already own it — no third party involved.
💵 $0 (we collected it)
Declared by the visitor 3
A field they filled in — a newsletter box, a lead form, an account signup. The cleanest data there is: they chose to tell us.
💵 $0
Granted via Google sign-in 8
"Sign in with Google" hands us OAuth scopes. Profile is one click; but the consent screen can also grant calendar, Gmail metadata, YouTube, contacts and location history — an enormous jump for one tap.
💵 $0 (the price is the permission)
Purchased (data broker append) 13
Match a name+email+address against a data broker (Acxiom, Experian, Epsilon, Oracle Data Cloud) and append what they've compiled: income, life events, demographics, propensities. Billions of attributes, sold per record.
💵 ~$0.05–0.25 per record matched
Identity graph (cross-device + offline) 0
An identity-resolution vendor (LiveRamp, Tapad) ties this browser to your other devices, your household, and offline data sold by retailers — loyalty cards, credit-card panels, smart-TV viewing. The complete picture.
💵 $$ platform subscription
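The counts beside each class can be reproduced by adding up what every tier contributes, up to the tier currently selected. A sketch using the per-tier numbers from this page; the short class keys are our own labels.

```python
from collections import Counter

# (tier name, facts that tier adds, keyed by provenance class)
TIERS = [
    ("Anonymous landing", {"passive": 12}),
    ("Returning visitor", {"own_db": 4, "purchased": 2}),
    ("Identified — gave email", {"declared": 3, "passive": 2}),
    ("Signed in with Google", {"google": 8}),
    ("+ Purchased append", {"purchased": 11}),
    ("+ Existing customer", {"own_db": 5}),
    ("Full identity graph", {"identity_graph": 5}),
]


def knowable(tier_index: int) -> Counter:
    """Facts knowable at a tier: everything from tier 0 up to and including it."""
    total = Counter()
    for _, added in TIERS[: tier_index + 1]:
        total.update(added)
    return total


at_existing_customer = knowable(5)
everything = knowable(len(TIERS) - 1)

print(dict(at_existing_customer))
print(sum(at_existing_customer.values()), "of", sum(everything.values()))  # 47 of 52
```

At "+ Existing customer" the identity-graph class is still at 0, because its five facts only unlock at the last tier.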

Unlocked by reaching “+ Existing customer”

The provenance ledger — every fact, where it's from, what we did with it

said: stated literally · steer: shapes selection, not stated · withheld: known, held back · shown: printed (creepy) · locked: needs a higher tier
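A sketch of how a ledger row could get its status from the legend above. The mapping is an assumption on our part: the Policy column in the ledger uses say / allude / hold, and we read those as leading to said / steer / withheld when the surface policy is on. With the policy off, every unlocked fact is simply shown.

```python
# Assumed mapping from a fact's policy to its status on the tasteful page.
TASTEFUL_STATUS = {"say": "said", "allude": "steer", "hold": "withheld"}


def ledger_status(policy: str, fact_tier: int, current_tier: int, creepy: bool) -> str:
    """Return the status for one fact under the current tier and mode."""
    if fact_tier > current_tier:
        return "locked"  # needs a higher tier, whatever the mode
    if creepy:
        return "shown"  # surface policy off: print everything available
    return TASTEFUL_STATUS[policy]


# Rows taken from the ledger, evaluated at tier 5 ("+ Existing customer").
print(ledger_status("say", 2, 5, creepy=False))    # said     (Name)
print(ledger_status("allude", 0, 5, creepy=False)) # steer    (Approximate location)
print(ledger_status("hold", 0, 5, creepy=False))   # withheld (Battery level)
print(ledger_status("hold", 0, 5, creepy=True))    # shown    (Battery level)
print(ledger_status("hold", 6, 5, creepy=True))    # locked   (Cross-device)
```

The facts and their sources are identical in both modes; only the last branch differs.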

Data point · Value (synthetic) · Where from · Cost · Creepy · Policy · Here
Approximate location
Resolve the visitor's IP address to a city.
Austin, Texas (metro) Collected ourselves (passive)
MaxMind GeoIP2 / IP2Location
$0 (free GeoIP DB) ●● allude shown
Neighborhood
Same IP lookup, at neighborhood resolution.
~2 mi of ZIP 78722 (Mueller) Collected ourselves (passive)
IP geolocation (ZIP-level)
$0 ●●● hold shown
Connection
Map the IP to its owning network and connection type.
Spectrum cable · residential Collected ourselves (passive)
IP → ISP / ASN lookup
$0 ●● allude shown
Company (reverse-IP)
Match the IP to a company's network to de-anonymize B2B visits.
(residential — no company match) Collected ourselves (passive)
Clearbit Reveal / KickFire
~$0.01 / lookup ●● allude shown
Device & OS
The browser announces device, OS and version on every request.
Apple iPhone 13 · iOS 17.4 · Safari Collected ourselves (passive)
User-Agent + Client Hints
$0 allude shown
Device economics
Infer age and price bracket from the device model.
≈2-yr-old, non-Pro model Collected ourselves (passive)
Model → release-date + price tier
$0 ●●● hold shown
Screen & theme
A few JS properties read on load.
390×844 · dark mode · battery-saver on Collected ourselves (passive)
JS: screen, prefers-color-scheme
$0 allude shown
Battery level
JS reads the device battery level and charging state.
18% and dropping (not charging) Collected ourselves (passive)
Battery Status API
$0 ●●●● hold shown
Local time
The browser's clock and timezone, read in JS.
11:47 PM, Tuesday Collected ourselves (passive)
JS Date + IANA timezone
$0 ●● allude shown
Languages
The ranked language list your browser sends.
en-US, then zh-CN Collected ourselves (passive)
Accept-Language header
$0 ●● hold shown
Where you came from
The link that sent you carries the campaign + creative.
Instagram ad · campaign 'career_switch_q3' Collected ourselves (passive)
Referrer + UTM parameters
$0 ●● allude shown
Device fingerprint
Hash your canvas, fonts and GPU into an ID that survives clearing cookies.
fp_9f3a… (canvas+fonts+GPU hash) Collected ourselves (passive)
FingerprintJS
~$0.005 / match (pro tier) ●●●●● hold shown
Visit history
Tie this session to prior ones we logged.
4th visit in 6 days Our own database
First-party cookie / fingerprint
$0 ●● allude shown
What you looked at
Every page, in order, time-stamped.
'Night cohort' ×3, pricing ×2 Our own database
Site analytics
$0 ●●● allude shown
How you read
JS records scroll position and time on each block.
Read 90% of the financing FAQ, 2m11s Our own database
Scroll-depth + dwell tracking
$0 ●●● allude shown
Unfinished actions
Partial form state captured field-by-field, before submit.
Started the application, didn't submit Our own database
Form analytics
$0 ●●● allude shown
Ad retargeting pool
A 3rd-party pixel adds you to ad audiences across the web.
Meta audience 'career-switch-warm' · shown 7 ads Purchased (data broker append)
Meta Pixel / Google Ads tag
ad spend ●●●● hold shown
Comparison shopping
Cross-site browsing bought from a data-management platform.
Visited 2 competitor bootcamps this week Purchased (data broker append)
DMP / cookie-sync (Lotame, Oracle BlueKai)
$$ subscription ●●●●● hold shown
Name
They typed it into a field.
Maya Declared by the visitor
Newsletter / lead form
$0 say shown
Email
They typed it.
maya.chen@gmail.com Declared by the visitor
Form field
$0 say shown
Stated goal
They wrote it in their own words.
"switch into AI without going broke" Declared by the visitor
Free-text form field
$0 say shown
Linked photo & accounts
Hash the email and look it up across services.
Gravatar photo + 9 sites tied to this email Collected ourselves (passive)
Gravatar / hash lookup
$0 ●●●● hold shown
Breach exposure
Check the email against breach corpora.
Appears in 3 known breaches Collected ourselves (passive)
Have I Been Pwned
$0 ●●●● hold shown
Verified name & photo
One click grants name, photo, locale.
Maya Chen + verified profile photo Granted via Google sign-in
Google OAuth · profile scope
$0 (consented) say shown
Verified email + recovery
Verified address, and that a recovery phone exists.
maya.chen@gmail.com (verified) · recovery phone on file Granted via Google sign-in
Google OAuth · email scope
$0 (consented) ●● allude shown
Account maturity
Account age and activity hints.
Google account since 2009 · 'power user' Granted via Google sign-in
Profile metadata
$0 ●● allude shown
Your calendar
The consent screen can include calendar read — most people don't notice.
'OB checkup Thu 2pm', 'daycare tour Sat' Granted via Google sign-in
Google OAuth · calendar.readonly
$0 (one extra checkbox) ●●●●● hold shown
Inbox metadata
Senders + subjects reveal purchases without reading bodies.
Receipts from Pampers, BuyBuyBaby, a fertility clinic Granted via Google sign-in
Google OAuth · gmail.metadata
$0 ●●●●● hold shown
Watch history
Watch + search history as interest signals.
Recently: 'newborn sleep', 'career switch at 34' Granted via Google sign-in
Google OAuth · YouTube Data API
$0 ●●●● hold shown
Location history
Months of timestamped places.
Home: Mueller · work: downtown · 2 hospital visits this month Granted via Google sign-in
Google OAuth · Maps Timeline
$0 ●●●●● hold shown
Contacts graph
Your whole address book, with labels.
1,840 contacts · partner 'Jordan', an OB-GYN, your mom Granted via Google sign-in
Google OAuth · People API
$0 ●●●● hold shown
Household income
Append modeled income to a name+address match.
Modeled $115–135K band Purchased (data broker append)
Experian / Acxiom income model
~$0.05–0.25 / record ●●●● hold shown
Home & residence
Public deeds + broker compilation.
Homeowner · ~$540K home · 4 yrs in residence Purchased (data broker append)
Property records / Acxiom
bundled in append ●●● allude shown
Net worth band
Modeled from assets, home, credit.
$250–500K Purchased (data broker append)
Experian wealth model
bundled ●●●● hold shown
Life event: separation
Brokers sell change-of-status triggers as they happen.
Trigger: 'recently separated' Purchased (data broker append)
Epsilon / Experian life-event triggers
premium trigger ●●●●● hold shown
Life event: new baby
New-parent is one of the most-traded triggers.
Trigger: 'new parent', infant 0–6mo Purchased (data broker append)
Epsilon life-event triggers
premium trigger ●●●●● hold shown
Vehicle
Registration + service data compiled and sold.
Drives a 2021 Subaru Outback Purchased (data broker append)
Oracle Data Cloud / Polk auto
bundled ●●● allude shown
Education & occupation
Compiled demographics.
BS · occupation: software/IT Purchased (data broker append)
Acxiom InfoBase
bundled ●● allude shown
Political profile
Voter rolls + modeling, sold for targeting.
Leans Democrat · high turnout · past donor Purchased (data broker append)
L2 / Aristotle voter file
voter-file license ●●●●● hold shown
Health ad audiences
Inferred condition audiences sold for ad targeting.
'Expectant/new parent', 'seasonal allergy sufferer' Purchased (data broker append)
Health-adjacent ad audiences
audience license ●●●●● hold shown
Ethnic affinity
Name + geography modeled into an 'affinity'.
Modeled: Chinese Purchased (data broker append)
Acxiom ethnic-affinity model
bundled ●●●●● hold shown
In-market signals
Predicted near-term purchases.
Minivan, baby gear, term life insurance Purchased (data broker append)
Bombora / Oracle propensity
subscription ●●● hold shown
Customer history
We already have an account record.
Took 'Intro to Python' with us, 2023 Our own database
Our CRM
$0 (we own it) say shown
Spend & support
Lifetime value and support load.
LTV $349 · 1 purchase · 3 support tickets Our own database
CRM / billing
$0 ●● allude shown
Risk segment
A score we computed on our own data.
Churn-risk 0.31 · 'price-sensitive' Our own database
Internal model
$0 ●●● hold shown
Payment on file
Stored from the last purchase.
Visa •••• 4242, exp 11/26 Our own database
Billing system
$0 ●●● hold shown
Advocacy
Your own feedback to us.
NPS 9 · left a public 5★ review Our own database
Survey + reviews
$0 say shown
Cross-device
Deterministic + probabilistic linking of all your devices.
This iPhone + a work MacBook + a home iPad = one you Identity graph (cross-device + offline)
LiveRamp / Tapad
$$ subscription ●●●●● hold locked
Household
Devices + addresses clustered into a household.
2 adults (Maya, Jordan) + 1 infant Identity graph (cross-device + offline)
LiveRamp household graph
$$ ●●●●● hold locked
Offline purchases
Loyalty-card baskets sold and matched to your identity.
Target loyalty: diapers + formula weekly; Whole Foods 3×/wk Identity graph (cross-device + offline)
Retail loyalty data resold to brokers
$$ ●●●●● hold locked
Card-spend panel
Anonymized-then-rematched card data sold to marketers.
Baby-gear spike; $0 restaurants since April Identity graph (cross-device + offline)
Credit-card transaction panels
$$ ●●●●● hold locked
Smart-TV viewing
Your TV reports what's on screen, frame-matched.
Heavy HGTV + late-night cartoons Identity graph (cross-device + offline)
Smart-TV ACR (Samba, Vizio Inscape)
$$ ●●●●● hold locked

Text only, for now — the same signals would drive the layout next: reorder sections, swap hero imagery, change the offer. The provenance ledger is the point — nothing reaches the page without a receipt for where it came from.