Eric Xu committed on
Commit · 3ad352e
1 Parent(s): 85d7c12
Add 4-step path from Nemotron to any domain
Explains how to go from census-grounded personas to specialized evaluators:
1. Filter by existing fields
2. Reframe the evaluation prompt
3. Enrich with situational overlay
4. Generate from scratch using Nemotron as quality bar
README.md CHANGED
@@ -111,13 +111,27 @@ The column averages tell you what to fix first. "Case studies" has the highest a
### What makes the panel realistic?

-SGO uses [NVIDIA Nemotron-Personas-USA](https://huggingface.co/datasets/nvidia/Nemotron-Personas-USA) —

This matters because when you ask an LLM to "generate 50 diverse personas," you get 5–6 archetypes with surface variation — mostly coastal, college-educated, and tech-adjacent. You can't audit what's missing. Census-grounded personas give you the construction worker in suburban Illinois and the quilter in rural Texas, because census data says those people exist.

-The principle: **define the population before the measurement, not after.**

-

---
### What makes the panel realistic?

+SGO uses [NVIDIA Nemotron-Personas-USA](https://huggingface.co/datasets/nvidia/Nemotron-Personas-USA) — 1 million synthetic Americans whose demographics match real US census distributions. Each persona includes detailed narratives: professional background, skills, career goals, hobbies, cultural background, and personality.

This matters because when you ask an LLM to "generate 50 diverse personas," you get 5–6 archetypes with surface variation — mostly coastal, college-educated, and tech-adjacent. You can't audit what's missing. Census-grounded personas give you the construction worker in suburban Illinois and the quilter in rural Texas, because census data says those people exist.

+The principle: **define the population before the measurement, not after.**

+### From general population to any domain
+
+Nemotron covers age, sex, education, occupation, geography, and marital status as structured fields — plus rich narratives about each person's career, skills, values, and lifestyle. That's enough to directly evaluate anything consumer-facing: products, profiles, content, policy.
+
+But what about domains the dataset doesn't explicitly cover — like "enterprise CTOs" or "Series B investors"? There are four ways to get there, from most grounded to most flexible:
+
+**1. Filter by what's already there.** A Nemotron persona with `occupation: software_developer`, `education: graduate`, `age: 38` and a professional narrative describing team leadership *is* a plausible engineering manager evaluating your developer tool. You just filter and let the narrative do the work.
+
+**2. Reframe the evaluation prompt.** Same persona, different lens. Instead of *"would you buy this?"*, ask *"you're evaluating this tool for your team — would you champion it internally?"* The persona's professional context, skills, and decision-making style naturally shape the answer.
+
+**3. Enrich with a situational overlay.** Add context that the persona doesn't have: *"You are [full Nemotron persona]. You work at a 50-person Series A startup. Your team's tooling budget is $2k/month. You've been burned by vendor lock-in before."* The demographic grounding stays real; the professional situation is augmented.
+
+**4. Generate from scratch, using Nemotron as a quality bar.** For truly specialized roles (VC partners, procurement officers, regulatory lawyers), generate personas via LLM — but use Nemotron personas as few-shot examples so the output matches the depth and internal consistency of the dataset. SGO's `generate_cohort.py` does this with an explicit warning about the quality tradeoff.
+
+Each step trades some census grounding for more domain specificity. For most use cases, steps 1–2 are enough.
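
The four steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: personas are plain dicts using the field names from the examples above (`occupation`, `education`, `age`) plus a hypothetical `narrative` key, and every helper name here is invented for the sketch. It is not SGO's code, it does not reflect what `generate_cohort.py` actually does, and the real Nemotron-Personas-USA column names may differ.

```python
# Hypothetical persona records; real Nemotron rows may use different column names.
PERSONAS = [
    {"occupation": "software_developer", "education": "graduate", "age": 38,
     "narrative": "Leads a six-person backend team and mentors junior engineers."},
    {"occupation": "construction_worker", "education": "high_school", "age": 45,
     "narrative": "Frames houses in suburban Illinois and coaches youth baseball."},
    {"occupation": "software_developer", "education": "bachelors", "age": 24,
     "narrative": "Builds mobile apps at an agency and is learning Rust."},
]


def filter_personas(personas, occupation, min_age=0, narrative_keyword=None):
    """Step 1: keep personas whose existing fields already fit the target role."""
    selected = []
    for p in personas:
        if p["occupation"] != occupation or p["age"] < min_age:
            continue
        if narrative_keyword and narrative_keyword.lower() not in p["narrative"].lower():
            continue
        selected.append(p)
    return selected


def describe(persona):
    """Render one persona dict as a short second-person description."""
    role = persona["occupation"].replace("_", " ")
    return (f"You are a {persona['age']}-year-old {role} "
            f"({persona['education']} education). {persona['narrative']}")


def build_prompt(persona, question, overlay=None):
    """Steps 2 and 3: the question carries the reframing; the optional
    overlay adds situational context the persona does not have."""
    parts = [describe(persona)]
    if overlay:
        parts.append(overlay)
    parts.append(question)
    return "\n\n".join(parts)


def few_shot_generation_prompt(examples, target_role, n=5):
    """Step 4: build an LLM prompt that uses existing personas as few-shot examples."""
    shots = "\n\n".join(
        f"Example {i}:\n{describe(p)}" for i, p in enumerate(examples, 1)
    )
    return (f"{shots}\n\nWrite {n} new personas for the role '{target_role}'. "
            "Match the depth and internal consistency of the examples.")


# Usage: filter to plausible engineering managers, then reframe and overlay.
managers = filter_personas(PERSONAS, "software_developer",
                           min_age=30, narrative_keyword="team")
reframed = "You're evaluating this tool for your team. Would you champion it internally?"
overlay = ("You work at a 50-person Series A startup. "
           "Your team's tooling budget is $2k/month. "
           "You've been burned by vendor lock-in before.")
prompt = build_prompt(managers[0], reframed, overlay=overlay)
```

Keeping the overlay as a separate argument leaves the census-grounded description untouched, so it stays visible which parts of a prompt come from the dataset and which were added for the domain.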
---