feat: add episode trace, refresh training dataset, and update eval metrics a422c8d Mohammed-Altaf committed 30 days ago
refactor: move training code to scripts/, add train/eval split, tune GRPO hyperparams fad16c9 Mohammed-Altaf committed 30 days ago
refactor: harden imports, add training extras, and rewrite README 5dd60b9 Mohammed-Altaf committed 30 days ago