israaaML
Claude Sonnet 4.6
add compare_agents.py: 4-way benchmark (Random/Heuristic/SFT/GRPO)
2968ead