Commit History

add compare_agents.py: 4-way benchmark (Random/Heuristic/SFT/GRPO)
2968ead

israaaML Claude Sonnet 4.6 committed on

fix: sanitize numpy/pandas types in submit_solution JSON serialization
3ce0714

israaaML Claude Sonnet 4.6 committed on

v3: benchmark results, final report, agent/eval improvements, smoke test fixes
b3fc5ee

israaaML Claude Sonnet 4.6 committed on

v2: curriculum scheduling, SFT pipeline, reward redesign, agent guide
16038fc

israaaML Claude Sonnet 4.6 committed on

Upload folder using huggingface_hub
1779f34
verified

israaaML committed on

initial commit
86b817d
verified

israaaML committed on