chore: refresh training artifacts and rename consume_reward_components to private 7909885 Mohammed-Altaf committed about 1 month ago
feat: add episode trace, refresh training dataset, and update eval metrics a422c8d Mohammed-Altaf committed on Apr 25
feat: add OpenEnv TRL wrapper, expand dataset, and add W&B eval tracking 6fa4fbd Mohammed-Altaf committed on Apr 25
feat: add structured pruning action and random baseline policy d064b19 Mohammed-Altaf committed on Apr 25
refactor: harden imports, add training extras, and rewrite README 5dd60b9 Mohammed-Altaf committed on Apr 25