Semantic Gradient Optimization — Agent Instructions
You are executing the Semantic Gradient Optimization pipeline. This file tells you how to run it end-to-end, interacting with the user at each decision point.
Read README.md first for the full framework. This file is the execution guide.
Phase 0 — Setup
Check dependencies
cd <project_dir>
uv sync
If uv is not installed or pyproject.toml is missing, install dependencies manually:
pip install datasets huggingface_hub openai python-dotenv
Check API key
The user needs an OpenAI-compatible LLM API key in .env:
LLM_API_KEY=...
LLM_BASE_URL=...
LLM_MODEL_NAME=...
If .env doesn't exist, copy .env.example and ask the user to fill it in. Do NOT read the .env file — ask the user to confirm it's configured.
Check data
If data/nemotron/dataset_info.json exists (relative to the project root), the persona dataset is ready. If not, run:
uv run python scripts/setup_data.py
This downloads the 1M-persona dataset (~2GB); it only needs to run once.
Phase 1 — Define the Entity (θ)
Ask the user:
- "What are you optimizing? (product, resume, pitch, policy, profile, or describe your own)"
- "Describe it — or paste/point me to the document. I need what an evaluator would see."
- "Is there anything an evaluator should NOT see? (internal metrics, private details, etc.)"
Then:
- Write the entity to entities/<name>.md
- Confirm with the user: "Here's what I'll show evaluators. Anything to add or remove?"
If the user doesn't have a document ready, use the appropriate template from templates/ as a starting point and fill it in together.
Phase 2 — Define the Evaluator Population
Ask the user:
- "Who evaluates this? Describe your target audience."
- Examples: "startup CTOs", "hiring managers at FAANG", "homeowners in the Bay Area"
- "What dimensions matter most for segmentation?"
- Suggest defaults based on the domain (see table below)
- "Do you have a persona dataset, or should I use Nemotron-Personas-USA?"
Default stratification dimensions by domain
| Domain | Suggested dimensions |
|---|---|
| Product | Company size, role, budget, tech stack, geography |
| Resume | Company type, seniority, technical depth, industry |
| Pitch | Investment stage, sector focus, check size |
| Policy | Stakeholder role, income bracket, geography, property ownership |
| Profile | Age bracket, life stage, education, occupation, geography |
| Custom | Ask the user to name 3-4 dimensions |
Build the cohort
Run the stratified sampler with the user's parameters:
uv run python scripts/stratified_sampler.py \
--input data/filtered.json \
--entity entities/<entity>.md \
--total 50 \
--output data/cohort.json
If Nemotron doesn't fit the domain (e.g., evaluating a B2B product where you need CTO personas, not general population), generate personas using scripts/generate_cohort.py instead. But warn the user about the seeding quality difference (see README.md § The Seeding Problem).
Confirm: "Here's the cohort: N evaluators across M strata. [show distribution table]. Look right?"
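For context, proportional stratified sampling fits in a few lines. This is an illustrative sketch, not stratified_sampler.py itself; it assumes each persona is a dict carrying the stratification field:

```python
import random
from collections import defaultdict

def stratified_sample(personas, field, total, seed=0):
    """Sample `total` personas, allocating slots proportionally per stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for p in personas:
        strata[p[field]].append(p)
    cohort = []
    for value, group in strata.items():
        # Proportional allocation, with at least one slot per stratum.
        k = max(1, round(total * len(group) / len(personas)))
        cohort.extend(rng.sample(group, min(k, len(group))))
    return cohort[:total]
```

The `max(1, ...)` floor keeps small strata represented at all, which matters more than exact proportionality for a 50-evaluator cohort.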
Phase 3 — Evaluate: f(θ, xᵢ)
Run the evaluation:
uv run python scripts/evaluate.py \
--entity entities/<name>.md \
--cohort data/cohort.json \
--tag <run_tag> \
--parallel 5
Add --bias-calibration to inject CoBRA-inspired bias calibration instructions that reduce framing, authority, and order artifacts for more realistic evaluations.
Present results to the user:
- Overall score distribution (avg, positive %, negative %)
- Breakdown by each stratification dimension
- Top 5 attractions (aggregated)
- Top 5 concerns (aggregated)
- Any dealbreakers
- Most and least interested evaluators (with quotes)
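The headline numbers are simple aggregates over the raw results. An illustrative sketch, assuming each record carries a 1-10 "score" (the actual raw_results.json schema may differ):

```python
def summarize(results, positive=7, negative=4):
    """Avg score plus positive/negative percentages over result records."""
    scores = [r["score"] for r in results]
    n = len(scores)
    return {
        "avg": sum(scores) / n,
        "positive_pct": 100 * sum(s >= positive for s in scores) / n,
        "negative_pct": 100 * sum(s <= negative for s in scores) / n,
    }
```

The 7/4 thresholds here are assumed cutoffs for "positive" and "negative", not values taken from evaluate.py.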
Ask: "Any of these results surprising? Want to dig into a specific segment before we move to optimization?"
Phase 4 — Counterfactual Probe (Semantic Gradient)
Generate candidate changes
Ask the user:
- "What changes are you considering? List anything — I'll categorize them."
- "What will you NOT change? (boundaries/non-negotiables)"
If the user isn't sure, propose changes based on the top concerns from Phase 3:
- For each top concern, generate 1-2 changes that would address it
- Categorize each as: presentation (free), actionable (has cost), fixed, or boundary
- Filter out fixed and boundary — only probe the first two
Write changes to data/changes.json or use defaults.
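If you write data/changes.json by hand, a hypothetical shape might look like the following (the exact schema counterfactual.py expects is not specified here; field names are illustrative). Categories mirror the Phase 4 taxonomy, since only "presentation" and "actionable" changes get probed:

```python
import json

# Hypothetical entries; ids and texts are made up for illustration.
changes = [
    {"id": "upfront_pricing", "category": "presentation",
     "text": "State pricing tiers up front instead of 'contact us'."},
    {"id": "add_sso", "category": "actionable",
     "text": "Add single sign-on support."},
]
print(json.dumps(changes, indent=2))
```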
Run the probe
uv run python scripts/counterfactual.py \
--tag <run_tag> \
--changes data/changes.json \
--min-score 4 --max-score 7 \
--parallel 5
Present the semantic gradient:
- Priority-ranked table: change, avg Δ, % helped, % hurt
- Top 3 changes with per-evaluator reasoning
- Demographic sensitivity: which changes help which segments
- Any changes that hurt certain segments (tradeoffs)
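The priority table reduces to a per-change aggregation over probe records. An illustrative sketch, assuming each record carries the change name and a score delta (raw_probes.json's real schema may differ):

```python
from collections import defaultdict

def gradient_table(probes):
    """Rank changes by average delta; report % of evaluators helped/hurt."""
    by_change = defaultdict(list)
    for p in probes:
        by_change[p["change"]].append(p["delta"])
    rows = []
    for change, deltas in by_change.items():
        n = len(deltas)
        rows.append({
            "change": change,
            "avg_delta": sum(deltas) / n,
            "helped_pct": 100 * sum(d > 0 for d in deltas) / n,
            "hurt_pct": 100 * sum(d < 0 for d in deltas) / n,
        })
    return sorted(rows, key=lambda r: r["avg_delta"], reverse=True)
```

Keeping helped/hurt percentages alongside the average is what surfaces tradeoffs: a change with a positive average can still hurt a sizable segment.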
Ask: "Based on this gradient, which change do you want to make first? Or should we test a compound change?"
Phase 5 — Iterate
Once the user makes a change:
- Update the entity document: entities/<name>_v2.md
- Re-run evaluation with the same cohort, under a new --tag <new_tag>
- Run comparison:
uv run python scripts/compare.py --runs <old_tag> <new_tag>
- Present the delta: what improved, what regressed, concerns resolved, new concerns
- Ask: "Want to probe the next round of changes, or are we good?"
Repeat until the user is satisfied or diminishing returns are clear.
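The cross-run delta boils down to pairing evaluators across runs. A minimal sketch (assuming result records share an evaluator "id" and "score"; compare.py's actual logic may differ):

```python
def run_delta(old, new):
    """Pair evaluators by id across two runs and summarize score movement."""
    old_by_id = {r["id"]: r["score"] for r in old}
    deltas = {r["id"]: r["score"] - old_by_id[r["id"]]
              for r in new if r["id"] in old_by_id}
    return {
        "improved": sum(d > 0 for d in deltas.values()),
        "regressed": sum(d < 0 for d in deltas.values()),
        "avg_delta": sum(deltas.values()) / len(deltas),
    }
```

Pairing by evaluator is why Phase 5 insists on re-using the same cohort: with a fresh cohort the deltas would confound the entity change with sampling noise.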
Phase 6 — Bias Audit (Optional)
Run when the user questions evaluation fidelity, or proactively after the first evaluation to establish a baseline.
uv run python scripts/bias_audit.py \
--entity entities/<name>.md \
--cohort data/cohort.json \
--probes framing authority order \
--sample 10 \
--parallel 5
This runs CoBRA-inspired experiments (arXiv:2509.13588) through SGO's pipeline:
- Framing probe: Same entity rewritten with gain vs. loss framing → measures if LLM evaluators are over/under-sensitive vs. the ~30% human baseline (Tversky & Kahneman, 1981)
- Authority probe: Entity with/without credibility signals → measures authority bias vs. ~20% human baseline
- Order probe: Sections reordered → measures anchoring effects (should be ~0%)
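A probe's shift can be summarized as the fraction of paired evaluators whose verdict flips between the two variants. A sketch of one plausible metric, not necessarily what bias_audit.py computes (the score threshold is an assumption):

```python
def shift_pct(scores_a, scores_b, threshold=6):
    """% of paired evaluators whose verdict (score >= threshold) flips."""
    flips = sum((a >= threshold) != (b >= threshold)
                for a, b in zip(scores_a, scores_b))
    return 100 * flips / len(scores_a)

def assess(shift, human_baseline, tolerance=10):
    """Compare a measured shift to the human baseline, within a tolerance band."""
    if shift > human_baseline + tolerance:
        return "over-biased"
    if shift < human_baseline - tolerance:
        return "under-biased"
    return "well-calibrated"
```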
Present: Per-probe shift %, comparison to human baselines, overall assessment (over-biased / under-biased / well-calibrated).
If over-biased: Suggest re-running evaluation with --bias-calibration flag.
If under-biased: Note that the panel may be more rational than real humans; whether that is acceptable depends on the domain.
Ask: "Your panel shows [X]% framing sensitivity (human baseline: ~30%). Want to run with bias calibration enabled?"
Decision Tree
Start
│
▼
Has entity document?
├─ Yes → Phase 2
└─ No → Phase 1: build it together
│
▼
Has evaluator cohort?
├─ Yes (from prior run) → reuse, go to Phase 3
└─ No → Phase 2: define audience, build cohort
│
▼
Has evaluation results?
├─ Yes (from prior run) → show summary, ask if re-run needed
└─ No → Phase 3: run evaluation
│
▼
User wants optimization?
├─ Yes → Phase 4: counterfactual probe
└─ No → done, save results
│
▼
User made changes?
├─ Yes → Phase 5: re-evaluate, compare
└─ No → done
│
▼
User questions fidelity / wants validation?
├─ Yes → Phase 6: bias audit
│ └─ Over-biased? → re-run with --bias-calibration
└─ No → done
File Layout
<project_dir>/
├── README.md # Framework (for humans)
├── AGENT.md # This file (for agents)
├── LICENSE
├── pyproject.toml
├── .env.example
├── scripts/
│ ├── setup_data.py # Download Nemotron dataset
│ ├── persona_loader.py # Load + filter personas
│ ├── stratified_sampler.py
│ ├── generate_cohort.py # LLM-generate personas when no dataset fits
│ ├── evaluate.py # f(θ, x) scorer (supports --bias-calibration)
│ ├── counterfactual.py # Semantic gradient probe
│ ├── bias_audit.py # CoBRA-inspired cognitive bias measurement
│ └── compare.py # Cross-run diff
├── templates/
│ ├── entity_product.md
│ ├── entity_resume.md
│ ├── entity_pitch.md
│ └── changes.json # Default counterfactual template
├── entities/ # User's entity documents (θ)
├── data/ # Cohorts, filtered datasets
└── results/ # One subdir per run tag
└── <tag>/
├── meta.json
├── raw_results.json
├── analysis.md
└── counterfactual/
├── raw_probes.json
└── gradient.md