
	# CTR Calibration with SGO

	## The Question

	> Can SGO be calibrated to predict CTR (click-through rate)?

	Short answer: yes, with a thin calibration layer on top of what SGO already provides.

	## What SGO Already Gives You

	Each SGO evaluator produces:

	```json
	{
	  "score": 6,
	  "action": "positive", // ← click/no-click proxy
	  "attractions": [...],
	  "concerns": [...],
	  "dealbreakers": [...]
	}
	```

	The `action` field is essentially a discrete intent signal:
	- positive → would engage (click, sign up, buy)
	- neutral → might engage with the right nudge
	- negative → would not engage

	The score (1-10) provides a finer-grained propensity signal.
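As a rough illustration, the per-evaluator outputs can be reduced to two summary numbers for a creative: the share of `positive` actions and the mean score. The sketch below assumes the evaluator results arrive as a list of dicts shaped like the JSON above; the `summarize` helper and the sample values are hypothetical.

```python
from statistics import mean

def summarize(evaluations):
    """Return (share of 'positive' actions, mean score) for one creative."""
    if not evaluations:
        raise ValueError("need at least one evaluator result")
    positive_share = sum(e["action"] == "positive" for e in evaluations) / len(evaluations)
    mean_score = mean(e["score"] for e in evaluations)
    return positive_share, mean_score

# Hypothetical evaluator outputs, for illustration only.
evaluations = [
    {"score": 6, "action": "positive"},
    {"score": 4, "action": "neutral"},
    {"score": 7, "action": "positive"},
    {"score": 3, "action": "negative"},
]

positive_share, mean_score = summarize(evaluations)
print(f"positive: {positive_share:.0%}, mean score: {mean_score:.1f}")
```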

	SGO also already has:
	- Bias calibration (CoBRA-inspired) to match human cognitive biases
	- Counterfactual gradient (Jacobian) to estimate which changes move scores
	- Goal-weighted aggregation (VJP) to focus on goal-relevant evaluators

	## The Gap: Scores → Calibrated Probability

	SGO produces ordinal scores and discrete actions, not calibrated probabilities.
	To get "this ad will get 2.3% CTR", you need a calibration function that maps
	SGO's output distribution to observed rates.

	## Approach: Anchor + Scale

	### Step 1: Collect anchors

	Run SGO on a small set of creatives (ads, landing pages, emails) where you
	already know the real CTR from production data:

	| Creative | Real CTR | SGO positive % | SGO mean score |
	|----------|----------|----------------|----------------|
	| Ad A | 1.2% | 24% | 4.1 |
	| Ad B | 3.8% | 52% | 6.3 |
	| Ad C | 0.6% | 12% | 3.2 |
	| Ad D | 2.1% | 38% | 5.5 |

	### Step 2: Fit calibration function

	Fit the mapping with Platt scaling, a one-feature logistic regression (isotonic regression is a non-parametric alternative that assumes only monotonicity):

	```
	P(click) = σ(a · score_sgo + b)
	```

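	where `σ` is the sigmoid function, `score_sgo` is the creative's SGO score (for example, the mean across evaluators), and `a` and `b` are fit on the anchor data.

	Because only two parameters are fit, as few as 5-10 anchors can give a usable mapping. This works because:
	- The ranking from SGO is already meaningful (higher score → higher CTR)
	- You only need to learn the scale and offset, not the ranking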
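A minimal sketch of the fitting step, using the four anchors from the Step 1 table. Platt scaling is normally fit by maximum likelihood on per-impression click outcomes. Since only aggregate CTRs are listed here, this sketch swaps in an ordinary least-squares fit on the logit scale, which learns the same two parameters. The function names are illustrative and are not taken from `scripts/ctr_calibrate.py`.

```python
import math

def logit(p):
    """Log-odds of a probability in (0, 1)."""
    return math.log(p / (1.0 - p))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_calibration(scores, ctrs):
    """Least-squares fit of logit(ctr) = a * score + b. Returns (a, b)."""
    n = len(scores)
    logits = [logit(c) for c in ctrs]
    mean_x = sum(scores) / n
    mean_y = sum(logits) / n
    sxx = sum((x - mean_x) ** 2 for x in scores)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, logits))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

# Anchors from the Step 1 table: SGO mean score and real CTR (as a fraction).
anchor_scores = [4.1, 6.3, 3.2, 5.5]
anchor_ctrs = [0.012, 0.038, 0.006, 0.021]

a, b = fit_calibration(anchor_scores, anchor_ctrs)
predicted = sigmoid(a * 5.0 + b)  # predicted CTR for a creative scoring 5.0
print(f"a={a:.3f}, b={b:.3f}, predicted CTR at score 5.0: {predicted:.2%}")
```

With impression and click counts per anchor, a binomial likelihood fit would weight high-traffic anchors more heavily and is likely the better choice.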

	### Step 3: Predict new creatives

	Run SGO on a new creative → get score distribution → apply calibration function
	→ get predicted CTR with confidence interval.
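The text does not say how the confidence interval is produced. One possible approach, sketched below, is a percentile bootstrap over the evaluator scores: resample the scores, apply the calibration to each resampled mean, and read off the interval. The coefficients `A` and `B` and the score list are hypothetical placeholders, not fitted results.

```python
import math
import random
from statistics import mean

# Hypothetical calibration coefficients; in practice these come from Step 2.
A, B = 0.57, -6.88

def calibrate(score):
    """Map an SGO score to a predicted CTR via the fitted sigmoid."""
    return 1.0 / (1.0 + math.exp(-(A * score + B)))

def predict_with_interval(scores, n_boot=2000, level=0.90, seed=0):
    """Point prediction plus a percentile-bootstrap interval over evaluators."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    point = calibrate(mean(scores))
    boots = sorted(
        calibrate(mean(rng.choices(scores, k=len(scores))))
        for _ in range(n_boot)
    )
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return point, boots[lo_idx], boots[hi_idx]

# Hypothetical evaluator scores for one new creative.
scores = [6, 4, 7, 3, 5, 6, 8, 5, 4, 6]
point, low, high = predict_with_interval(scores)
print(f"predicted CTR: {point:.2%} (90% interval: {low:.2%} to {high:.2%})")
```

This interval reflects only the spread across evaluators. It does not include uncertainty in the calibration coefficients themselves, which dominates when the anchor set is small.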

	### Step 4: Use the gradient

	The real power isn't just predicting CTR; it's knowing what to change:

	```
	SGO gradient for Ad E (predicted CTR: 1.4%):
	+1.8 Simplify headline to one benefit → predicted CTR: 2.9%
	+1.2 Add social proof (customer count) → predicted CTR: 2.3%
	-0.4 Remove pricing from ad → predicted CTR: 1.1%
	```

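	The counterfactual deltas can be converted to CTR deltas using the local
	derivative of the same calibration function. For the sigmoid above, that
	derivative is a · p · (1 − p), where p is the current predicted CTR. This is
	a first-order approximation; for large score deltas, re-apply the calibration
	function to the shifted score instead.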
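The conversion can be sketched as follows, comparing the first-order estimate with direct re-application of the calibration function. The coefficients `A` and `B` are hypothetical placeholders rather than fitted values, and both helper functions are illustrative names.

```python
import math

# Hypothetical calibration coefficients, for illustration only.
A, B = 0.57, -6.88

def calibrate(score):
    """Predicted CTR for an SGO score: sigmoid(A * score + B)."""
    return 1.0 / (1.0 + math.exp(-(A * score + B)))

def ctr_delta_linear(score, delta_score):
    """First-order estimate: d(CTR)/d(score) = A * p * (1 - p)."""
    p = calibrate(score)
    return A * p * (1.0 - p) * delta_score

def ctr_delta_exact(score, delta_score):
    """Exact change: re-apply the calibration to the shifted score."""
    return calibrate(score + delta_score) - calibrate(score)

for delta in (0.1, 1.2, 1.8, -0.4):
    linear = ctr_delta_linear(5.0, delta)
    exact = ctr_delta_exact(5.0, delta)
    print(f"delta score {delta:+.1f}: linear {linear:+.3%}, exact {exact:+.3%}")
```

At low CTRs the sigmoid is convex, so the linear estimate understates gains from large positive deltas. For score changes of a point or more, the exact form is the safer choice.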

	## When This Works Well

	- Relative ranking is the main value. Even without calibration, SGO reliably
	ranks creatives by appeal. If you just need "which of these 5 ads will perform
	best?", raw SGO scores suffice.

	- Calibration shines for absolute prediction. When you need "will this hit our
	2% CTR target?", the anchor-based calibration gives you a number.

	- The gradient is unique to SGO. A conventional CTR model predicts the rate but
	does not tell you why the CTR is what it is or which specific change would
	improve it most. This is SGO's core value even when paired with an existing
	CTR model.

	## When to Be Careful

	- Population mismatch. SGO evaluators are census-grounded but synthetic. If your
	real audience is highly specialized (e.g., only DevOps engineers at Fortune 500),
	use targeted cohort generation and more anchors.

	- Context effects. Real CTR depends on placement, competition, time of day, etc.
	SGO evaluates the creative in isolation. Calibration anchors should come from
	similar contexts.

	- Small calibration sets. With <5 anchors, the calibration function is fragile.
	Use confidence intervals and treat predictions as directional.
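One way to gauge how fragile a small calibration set is: leave each anchor out in turn, refit on the rest, and compare the held-out prediction with the real CTR. The sketch below does this for the four anchors in the Step 1 table, assuming a least-squares fit on the logit scale as a simple stand-in for Platt scaling. All function names are illustrative.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_calibration(scores, ctrs):
    """Least-squares fit of logit(ctr) = a * score + b. Returns (a, b)."""
    n = len(scores)
    logits = [logit(c) for c in ctrs]
    mean_x = sum(scores) / n
    mean_y = sum(logits) / n
    sxx = sum((x - mean_x) ** 2 for x in scores)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, logits))
    a = sxy / sxx
    return a, mean_y - a * mean_x

def leave_one_out_errors(scores, ctrs):
    """Absolute CTR error for each anchor when it is held out of the fit."""
    errors = []
    for i in range(len(scores)):
        train_scores = scores[:i] + scores[i + 1:]
        train_ctrs = ctrs[:i] + ctrs[i + 1:]
        a, b = fit_calibration(train_scores, train_ctrs)
        errors.append(abs(sigmoid(a * scores[i] + b) - ctrs[i]))
    return errors

# Anchors from the Step 1 table: SGO mean score and real CTR (as a fraction).
anchor_scores = [4.1, 6.3, 3.2, 5.5]
anchor_ctrs = [0.012, 0.038, 0.006, 0.021]

errors = leave_one_out_errors(anchor_scores, anchor_ctrs)
for name, err in zip("ABCD", errors):
    print(f"Ad {name}: held-out error {err:.2%} CTR")
```

Held-out errors that are large relative to the CTR differences you care about suggest the predictions should be treated as directional only.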

	## Integration with Existing CTR Models

	If you already have a recommender system or CTR prediction model:

	```
	Existing model: "This ad will get 1.8% CTR"
	SGO: "Here's WHY, and changing the headline would get +0.7%"
	```

	The two are complementary:
	- CTR model → accurate predictions from behavioral data, fast inference
	- SGO → causal explanations and counterfactual optimization, no traffic needed

	You can also use SGO as a pre-filter: generate 20 ad variants, rank them with
	SGO, then only A/B test the top 3. This reduces the exploration cost of your
	CTR model's feedback loop.
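The pre-filter step amounts to a top-k selection on SGO scores. A minimal sketch, with hypothetical variant names and scores:

```python
def shortlist(variant_scores, k=3):
    """Return the names of the k highest-scoring variants, best first."""
    ranked = sorted(variant_scores.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]

# Hypothetical SGO mean scores for generated ad variants.
variant_scores = {
    "variant_01": 4.8,
    "variant_02": 6.1,
    "variant_03": 5.3,
    "variant_04": 3.9,
    "variant_05": 6.7,
    "variant_06": 5.9,
}

to_ab_test = shortlist(variant_scores, k=3)
print(to_ab_test)
```

Only ranking matters here, so this step needs no calibration.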

	## Script

	See `scripts/ctr_calibrate.py` for a reference implementation that:
	1. Takes SGO results from multiple runs with known CTRs
	2. Fits a Platt scaling calibration function
	3. Predicts CTR for new SGO runs
	4. Converts counterfactual deltas to CTR deltas