schemashift/eval_data/seeds.json
Phase 9: baseline eval harness (heuristic + LLM agents) + tests
4d2f869
{
  "default_seeds": [0, 1, 2, 3, 4],
  "extended_seeds": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
  "notes": "Use the 5 default seeds per task for dev eval and the 10 extended seeds for the final pitch eval. Each seed is passed to env.reset(task_id, seed=X). The env does not currently use the seed for randomness (scenarios are deterministic), but seeds are preserved for reproducibility and for future procedural runs."
}
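The notes above describe passing each seed to `env.reset(task_id, seed=X)`. A minimal sketch of how a harness might consume this file follows. The `select_seeds` and `run_grid` helpers, the `StubEnv` class, and the task ids `task_a` and `task_b` are all hypothetical names introduced for illustration; only the two seed lists and the `env.reset(task_id, seed=...)` call shape come from the file. The config is parsed from an inline string so the sketch runs without the repository checked out.

```python
import json
from itertools import product

# Inline copy of the two seed lists from seeds.json, so the sketch is self-contained.
CONFIG_TEXT = """
{
  "default_seeds": [0, 1, 2, 3, 4],
  "extended_seeds": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
}
"""


def select_seeds(config, extended=False):
    """Return the extended seed list if requested, else the default one."""
    key = "extended_seeds" if extended else "default_seeds"
    return list(config[key])


def run_grid(env, task_ids, seeds):
    """Call env.reset(task_id, seed=seed) once per (task, seed) pair."""
    results = []
    for task_id, seed in product(task_ids, seeds):
        observation = env.reset(task_id, seed=seed)
        results.append((task_id, seed, observation))
    return results


class StubEnv:
    """Hypothetical stand-in for the real env: it only echoes its arguments."""

    def reset(self, task_id, seed=None):
        return {"task_id": task_id, "seed": seed}


config = json.loads(CONFIG_TEXT)
# "task_a" and "task_b" are placeholder task ids, not names from the repository.
runs = run_grid(StubEnv(), ["task_a", "task_b"], select_seeds(config))
print(len(runs))  # 2 tasks x 5 default seeds = 10
```

Passing `extended=True` to `select_seeds` would switch the same loop to the 10-seed list for the final eval. Because the scenarios are described as deterministic, repeated seeds for one task would presumably yield identical episodes for now.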