Clean up: remove junk files (opus_prompt, mermaid_render, EVALUATION, unused UI, train_ppo) a999428 Uddiii committed on Apr 26
Fix Mermaid diagrams for GitHub rendering - simplified all three diagrams, added agentic interaction flow a0f62f1 Uddiii committed on Apr 26
Submission-ready: README, blog, training pipeline, baseline evidence, OpenEnv compliance 7a90355 Uddiii committed on Apr 26
fix(grpo): Unsloth inference-mode swap + smaller LoRA + KV cleanup (T4 OOM #2) a3804d9 Uddiii committed on Apr 26
fix(grpo): per-step backward to bound VRAM during update (T4 OOM fix) 8f20926 Uddiii committed on Apr 26
feat(kaggle): add clean_launch.py + shrink budget to 20/25/30 = 75 eps cd923aa Uddiii committed on Apr 26
fix(grpo): skip reference model when kl_beta=0 to save 5GB VRAM on T4 0566783 Uddiii committed on Apr 25
fix(kaggle): unpin torch and loosen trl floor to prevent bnb/unsloth break 71a0a91 Uddiii committed on Apr 25