Provenance
100% synthetic persona — no real PII

How personalized can a website get — and where does each fact come from?

One visitor, Maya Chen. Climb the tiers to see how much a site can know — from an anonymous landing page, to a Google sign-in, to a purchased data append. Every fact is tagged with its provenance: collected by us, declared, granted via Google, or bought. Then flip tasteful → creepy to see the difference a surface policy makes.

0 · none: Anonymous landing
1 · cookie: Returning visitor
2 · email: Identified (gave email)
3 · google: Signed in with Google
4 · google +: Purchased append
5 · google +: Existing customer
6 · google: Full identity graph

We've seen this browser before (via a cookie, or via a device fingerprint that survives clearing cookies).
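A minimal sketch of how that recognition could work, assuming a cookie lookup first and a fingerprint lookup as the fallback. The function names, the hashing scheme and the two lookup tables are hypothetical, not the demo's actual implementation.

```python
import hashlib
from typing import Dict, List, Optional


def fingerprint_id(canvas: str, fonts: List[str], gpu: str) -> str:
    """Hash a few stable browser traits into a short ID (illustrative scheme only)."""
    # Sort the fonts so the same set always hashes to the same ID.
    material = "|".join([canvas, ",".join(sorted(fonts)), gpu])
    return "fp_" + hashlib.sha256(material.encode("utf-8")).hexdigest()[:12]


def recognize(
    cookie_id: Optional[str],
    fp: str,
    seen_cookies: Dict[str, str],
    seen_fps: Dict[str, str],
) -> Optional[str]:
    """Return a visitor key if this browser has been seen before, else None."""
    # The first-party cookie is the cheap, exact match.
    if cookie_id and cookie_id in seen_cookies:
        return seen_cookies[cookie_id]
    # If the cookie was cleared, the fingerprint can still match.
    return seen_fps.get(fp)
```

Clearing cookies only empties the first lookup; the second still resolves, which is why the ledger rates the fingerprint as far creepier than the cookie.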

Tasteful ↔ Creepy 👁 · Tasteful honours each fact's surface policy (say / allude / hold)
18 of 52 facts knowable at this tier
0 stated (say)
11 used to steer (allude)
7 known but withheld (hold)
2 bought from third parties

Build a career in AI — nights and weekends.

Live classes that fit around a full-time job.

Here's the night cohort, with the financing options up front. [allude (steer)]

Classes run after work, from home, with no commute. [allude (steer)]

→ See the next night cohort
🛟 What the visitor never sees: We *know* far more (see the ledger) — schedule pressure, comparison shopping, a new baby, income. A surface policy keeps it out of the copy: we let it quietly shape what we feature, but we don't say it.

Where the data comes from

Six provenance classes. The count is how many facts each one yields at this tier. See the full paid/free source list on the enrichment catalog.

Collected ourselves (passive) 12
The HTTP request + a few lines of JavaScript, the instant the page loads. No login, no form, often no cookie. You give it to every site you visit.
💵 $0 (optionally a few ¢ for a reverse-IP or fingerprint lookup)
Our own database 4
Behaviour we logged on previous visits (cookies / device match) and records in our CRM. We already own it — no third party involved.
💵 $0 (we collected it)
Declared by the visitor 0
A field they filled in — a newsletter box, a lead form, an account signup. The cleanest data there is: they chose to tell us.
💵 $0
Granted via Google sign-in 0
"Sign in with Google" hands us OAuth scopes. The basic profile is one click, but the consent screen can also grant calendar, Gmail metadata, YouTube, contacts and location history, which is an enormous jump for one tap.
💵 $0 (the price is the permission)
Purchased (data broker append) 2
Match a name+email+address against a data broker (Acxiom, Experian, Epsilon, Oracle Data Cloud) and append what they've compiled: income, life events, demographics, propensities. Billions of attributes, sold per record.
💵 ~$0.05–0.25 per record matched
Identity graph (cross-device + offline) 0
An identity-resolution vendor (LiveRamp, Tapad) ties this browser to your other devices, your household, and offline data sold by retailers — loyalty cards, credit-card panels, smart-TV viewing. The complete picture.
💵 $$ platform subscription
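To put the per-record price in context, here is a rough arithmetic sketch. Only the ~$0.05–0.25 range comes from the list above; the list size and the match rate are made-up inputs, and the function name is hypothetical.

```python
def append_cost(records: int, match_pct: int, low_cents: int = 5, high_cents: int = 25):
    """Return (matched, low_usd, high_usd) for an append billed per matched record."""
    # Brokers bill only for records they could match, so apply the match rate first.
    matched = records * match_pct // 100
    # Work in cents, then convert, to keep the arithmetic exact.
    return matched, matched * low_cents / 100, matched * high_cents / 100


# Assumed inputs: a 10,000-contact list and a 60% match rate.
matched, low_usd, high_usd = append_cost(records=10_000, match_pct=60)
print(f"{matched} matched: ${low_usd:,.2f} to ${high_usd:,.2f}")
```

Under those assumptions, 6,000 records match and the append costs between $300 and $1,500.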

Unlocked by reaching “Returning visitor”

The provenance ledger — every fact, where it's from, what we did with it

said: stated literally · steer: shapes selection, not stated · withheld: known, held back · shown: printed (creepy) · locked: needs a higher tier
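The five labels in the legend can be read as a function of a fact's policy, the current tier and the tasteful/creepy switch. The sketch below assumes creepy mode prints every knowable fact. The `Fact` class, its `min_tier` field and the tier numbers given to the sample facts are illustrative guesses, not the page's real data model.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Fact:
    name: str
    policy: str    # "say", "allude" or "hold"
    min_tier: int  # lowest tier at which the fact is knowable (assumed)


def status(fact: Fact, tier: int, creepy: bool = False) -> str:
    """Map one fact to the legend label it would carry in the ledger."""
    if tier < fact.min_tier:
        return "locked"  # not knowable yet at this tier
    if creepy:
        return "shown"   # assumption: creepy mode ignores the surface policy
    return {"say": "said", "allude": "steer", "hold": "withheld"}[fact.policy]


def tally(facts: Iterable[Fact], tier: int, creepy: bool = False) -> Counter:
    """Count how many facts land under each label."""
    return Counter(status(f, tier, creepy) for f in facts)


sample = [
    Fact("Approximate location", "allude", 0),
    Fact("Battery level", "hold", 0),
    Fact("Visit history", "allude", 1),
    Fact("Name", "say", 2),
    Fact("Your calendar", "hold", 3),
]
print(tally(sample, tier=1))
print(tally(sample, tier=1, creepy=True))
```

Running the same tally over all 52 ledger rows would give the counter shown near the top of the page.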

Data point | Value (synthetic) | Where from | Cost | Creepy | Policy | Here
Approximate location
Resolve the visitor's IP address to a city.
Austin, Texas (metro) Collected ourselves (passive)
MaxMind GeoIP2 / IP2Location
$0 (free GeoIP DB) ●● allude steer
Neighborhood
Same IP lookup, at neighborhood resolution.
~2 mi of ZIP 78722 (Mueller) Collected ourselves (passive)
IP geolocation (ZIP-level)
$0 ●●● hold withheld
Connection
Map the IP to its owning network and connection type.
Spectrum cable · residential Collected ourselves (passive)
IP → ISP / ASN lookup
$0 ●● allude steer
Company (reverse-IP)
Match the IP to a company's network to de-anonymize B2B visits.
(residential — no company match) Collected ourselves (passive)
Clearbit Reveal / KickFire
~$0.01 / lookup ●● allude steer
Device & OS
The browser announces device, OS and version on every request.
Apple iPhone 13 · iOS 17.4 · Safari Collected ourselves (passive)
User-Agent + Client Hints
$0 allude steer
Device economics
Infer age and price bracket from the device model.
≈2-yr-old, non-Pro model Collected ourselves (passive)
Model → release-date + price tier
$0 ●●● hold withheld
Screen & theme
A few JS properties read on load.
390×844 · dark mode · battery-saver on Collected ourselves (passive)
JS: screen, prefers-color-scheme
$0 allude steer
Battery level
JS reads the device battery level and charging state, where the browser exposes them (Chromium-based browsers do; Safari and Firefox do not).
18% and dropping (not charging) Collected ourselves (passive)
Battery Status API
$0 ●●●● hold withheld
Local time
The browser's clock and timezone, read in JS.
11:47 PM, Tuesday Collected ourselves (passive)
JS Date + IANA timezone
$0 ●● allude steer
Languages
The ranked language list your browser sends.
en-US, then zh-CN Collected ourselves (passive)
Accept-Language header
$0 ●● hold withheld
Where you came from
The link that sent you carries the campaign + creative.
Instagram ad · campaign 'career_switch_q3' Collected ourselves (passive)
Referrer + UTM parameters
$0 ●● allude steer
Device fingerprint
Hash your canvas, fonts and GPU into an ID that survives clearing cookies.
fp_9f3a… (canvas+fonts+GPU hash) Collected ourselves (passive)
FingerprintJS
~$0.005 / match (pro tier) ●●●●● hold withheld
Visit history
Tie this session to prior ones we logged.
4th visit in 6 days Our own database
First-party cookie / fingerprint
$0 ●● allude steer
What you looked at
Every page, in order, time-stamped.
'Night cohort' ×3, pricing ×2 Our own database
Site analytics
$0 ●●● allude steer
How you read
JS records scroll position and time on each block.
Read 90% of the financing FAQ, 2m11s Our own database
Scroll-depth + dwell tracking
$0 ●●● allude steer
Unfinished actions
Partial form state captured field-by-field, before submit.
Started the application, didn't submit Our own database
Form analytics
$0 ●●● allude steer
Ad retargeting pool
A 3rd-party pixel adds you to ad audiences across the web.
Meta audience 'career-switch-warm' · shown 7 ads Purchased (data broker append)
Meta Pixel / Google Ads tag
ad spend ●●●● hold withheld
Comparison shopping
Cross-site browsing bought from a data-management platform.
Visited 2 competitor bootcamps this week Purchased (data broker append)
DMP / cookie-sync (Lotame, Oracle BlueKai)
$$ subscription ●●●●● hold withheld
Name
They typed it into a field.
Maya Declared by the visitor
Newsletter / lead form
$0 say locked
Email
They typed it.
maya.chen@gmail.com Declared by the visitor
Form field
$0 say locked
Stated goal
They wrote it in their own words.
"switch into AI without going broke" Declared by the visitor
Free-text form field
$0 say locked
Linked photo & accounts
Hash the email and look it up across services.
Gravatar photo + 9 sites tied to this email Collected ourselves (passive)
Gravatar / hash lookup
$0 ●●●● hold locked
Breach exposure
Check the email against breach corpora.
Appears in 3 known breaches Collected ourselves (passive)
Have I Been Pwned
$0 ●●●● hold locked
Verified name & photo
One click grants name, photo, locale.
Maya Chen + verified profile photo Granted via Google sign-in
Google OAuth · profile scope
$0 (consented) say locked
Verified email + recovery
Verified address, and that a recovery phone exists.
maya.chen@gmail.com (verified) · recovery phone on file Granted via Google sign-in
Google OAuth · email scope
$0 (consented) ●● allude locked
Account maturity
Account age and activity hints.
Google account since 2009 · 'power user' Granted via Google sign-in
Profile metadata
$0 ●● allude locked
Your calendar
The consent screen can include calendar read — most people don't notice.
'OB checkup Thu 2pm', 'daycare tour Sat' Granted via Google sign-in
Google OAuth · calendar.readonly
$0 (one extra checkbox) ●●●●● hold locked
Inbox metadata
Senders + subjects reveal purchases without reading bodies.
Receipts from Pampers, BuyBuyBaby, a fertility clinic Granted via Google sign-in
Google OAuth · gmail.metadata
$0 ●●●●● hold locked
Watch history
Watch + search history as interest signals.
Recently: 'newborn sleep', 'career switch at 34' Granted via Google sign-in
Google OAuth · YouTube Data API
$0 ●●●● hold locked
Location history
Months of timestamped places.
Home: Mueller · work: downtown · 2 hospital visits this month Granted via Google sign-in
Google OAuth · Maps Timeline
$0 ●●●●● hold locked
Contacts graph
Your whole address book, with labels.
1,840 contacts · partner 'Jordan', an OB-GYN, your mom Granted via Google sign-in
Google OAuth · People API
$0 ●●●● hold locked
Household income
Append modeled income to a name+address match.
Modeled $115–135K band Purchased (data broker append)
Experian / Acxiom income model
~$0.05–0.25 / record ●●●● hold locked
Home & residence
Public deeds + broker compilation.
Homeowner · ~$540K home · 4 yrs in residence Purchased (data broker append)
Property records / Acxiom
bundled in append ●●● allude locked
Net worth band
Modeled from assets, home, credit.
$250–500K Purchased (data broker append)
Experian wealth model
bundled ●●●● hold locked
Life event: separation
Brokers sell change-of-status triggers as they happen.
Trigger: 'recently separated' Purchased (data broker append)
Epsilon / Experian life-event triggers
premium trigger ●●●●● hold locked
Life event: new baby
New-parent is one of the most-traded triggers.
Trigger: 'new parent', infant 0–6mo Purchased (data broker append)
Epsilon life-event triggers
premium trigger ●●●●● hold locked
Vehicle
Registration + service data compiled and sold.
Drives a 2021 Subaru Outback Purchased (data broker append)
Oracle Data Cloud / Polk auto
bundled ●●● allude locked
Education & occupation
Compiled demographics.
BS · occupation: software/IT Purchased (data broker append)
Acxiom InfoBase
bundled ●● allude locked
Political profile
Voter rolls + modeling, sold for targeting.
Leans Democrat · high turnout · past donor Purchased (data broker append)
L2 / Aristotle voter file
voter-file license ●●●●● hold locked
Health ad audiences
Inferred condition audiences sold for ad targeting.
'Expectant/new parent', 'seasonal allergy sufferer' Purchased (data broker append)
Health-adjacent ad audiences
audience license ●●●●● hold locked
Ethnic affinity
Name + geography modeled into an 'affinity'.
Modeled: Chinese Purchased (data broker append)
Acxiom ethnic-affinity model
bundled ●●●●● hold locked
In-market signals
Predicted near-term purchases.
Minivan, baby gear, term life insurance Purchased (data broker append)
Bombora / Oracle propensity
subscription ●●● hold locked
Customer history
We already have an account record.
Took 'Intro to Python' with us, 2023 Our own database
Our CRM
$0 (we own it) say locked
Spend & support
Lifetime value and support load.
LTV $349 · 1 purchase · 3 support tickets Our own database
CRM / billing
$0 ●● allude locked
Risk segment
A score we computed on our own data.
Churn-risk 0.31 · 'price-sensitive' Our own database
Internal model
$0 ●●● hold locked
Payment on file
Stored from the last purchase.
Visa •••• 4242, exp 11/26 Our own database
Billing system
$0 ●●● hold locked
Advocacy
Your own feedback to us.
NPS 9 · left a public 5★ review Our own database
Survey + reviews
$0 say locked
Cross-device
Deterministic + probabilistic linking of all your devices.
This iPhone + a work MacBook + a home iPad = one you Identity graph (cross-device + offline)
LiveRamp / Tapad
$$ subscription ●●●●● hold locked
Household
Devices + addresses clustered into a household.
2 adults (Maya, Jordan) + 1 infant Identity graph (cross-device + offline)
LiveRamp household graph
$$ ●●●●● hold locked
Offline purchases
Loyalty-card baskets sold and matched to your identity.
Target loyalty: diapers + formula weekly; Whole Foods 3×/wk Identity graph (cross-device + offline)
Retail loyalty data resold to brokers
$$ ●●●●● hold locked
Card-spend panel
Anonymized-then-rematched card data sold to marketers.
Baby-gear spike; $0 restaurants since April Identity graph (cross-device + offline)
Credit-card transaction panels
$$ ●●●●● hold locked
Smart-TV viewing
Your TV reports what's on screen, frame-matched.
Heavy HGTV + late-night cartoons Identity graph (cross-device + offline)
Smart-TV ACR (Samba, Vizio Inscape)
$$ ●●●●● hold locked
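Two of the passive rows in the ledger, "Languages" and "Where you came from", need nothing beyond the request itself. The sketch below shows one way a server could read them using only the Python standard library. The function names and the example URL are hypothetical, and the header parsing is deliberately simplified (it only understands a `q=` parameter).

```python
from typing import Dict, List
from urllib.parse import parse_qs, urlsplit


def ranked_languages(header: str) -> List[str]:
    """Order Accept-Language tags by q-value, then by position in the header."""
    ranked = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        q = 1.0  # a tag with no q-value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed weight as lowest priority
        ranked.append((-q, position, tag.strip()))
    return [tag for _, _, tag in sorted(ranked)]


def utm_params(url: str) -> Dict[str, str]:
    """Pull the utm_* campaign parameters out of a landing URL."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items() if key.startswith("utm_")}


print(ranked_languages("en-US,en;q=0.9,zh-CN;q=0.8"))
print(utm_params("https://example.com/night?utm_source=instagram&utm_campaign=career_switch_q3"))
```

Neither call needs a cookie, a login or a third party, which is why the ledger prices both at $0.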

Text only, for now — the same signals would drive the layout next: reorder sections, swap hero imagery, change the offer. The provenance ledger is the point — nothing reaches the page without a receipt for where it came from.
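One way to enforce the "no receipt, no copy" rule is to make the receipt a required part of every personalized line, and to refuse to publish without it. This is a sketch under that assumption. `Receipt`, `CopyLine`, `MissingReceipt` and `publish_personalized` are hypothetical names, and the bracketed tag format is invented for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Receipt:
    fact: str        # the ledger row the copy relies on
    provenance: str  # one of the six provenance classes
    policy: str      # "say", "allude" or "hold"


@dataclass(frozen=True)
class CopyLine:
    text: str
    receipt: Optional[Receipt]


class MissingReceipt(Exception):
    """Raised when personalized copy has no provenance attached."""


def publish_personalized(lines: Iterable[CopyLine]) -> List[str]:
    """Return publishable lines, each tagged with where its fact came from."""
    out = []
    for line in lines:
        if line.receipt is None:
            raise MissingReceipt(f"no receipt for: {line.text!r}")
        if line.receipt.policy == "hold":
            continue  # known but withheld: it never reaches the page
        r = line.receipt
        out.append(f"{line.text} [{r.fact}: {r.provenance}, {r.policy}]")
    return out
```

The same gate would apply unchanged once the signals drive layout instead of text, since a reordered section or a swapped hero image can carry a receipt just as a sentence does.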