fix: add training plots and clean up duplicate README headings 5905982 Mohammed-Altaf Claude Sonnet 4.6 committed on 20 days ago
chore: refresh training artifacts and rename consume_reward_components to private 7909885 Mohammed-Altaf committed on 20 days ago
feat: add episode trace, refresh training dataset, and update eval metrics a422c8d Mohammed-Altaf committed on 20 days ago
feat: add OpenEnv TRL wrapper, expand dataset, and add W&B eval tracking 6fa4fbd Mohammed-Altaf committed on 21 days ago
feat: add structured pruning action and random baseline policy d064b19 Mohammed-Altaf committed on 21 days ago
refactor: harden imports, add training extras, and rewrite README 5dd60b9 Mohammed-Altaf committed on 21 days ago