Clean up: remove junk files (opus_prompt, mermaid_render, EVALUATION, unused UI, train_ppo) a999428 Uddiii committed 14 days ago
Fix Mermaid diagrams for GitHub rendering - simplified all three diagrams, added agentic interaction flow a0f62f1 Uddiii committed 14 days ago
Submission-ready: README, blog, training pipeline, baseline evidence, OpenEnv compliance 7a90355 Uddiii committed 14 days ago
fix(grpo): Unsloth inference-mode swap + smaller LoRA + KV cleanup (T4 OOM #2) a3804d9 Uddiii committed 14 days ago
fix(grpo): per-step backward to bound VRAM during update (T4 OOM fix) 8f20926 Uddiii committed 14 days ago
feat(kaggle): add clean_launch.py + shrink budget to 20/25/30 = 75 eps cd923aa Uddiii committed 14 days ago
feat(kaggle): default to fixed-budget curriculum 20/30/50 episodes 69f89ec Uddiii committed 14 days ago
fix(grpo): skip reference model when kl_beta=0 to save 5GB VRAM on T4 0566783 Uddiii committed 14 days ago
fix(kaggle): align pip-managed numpy with kernel's loaded numpy 27cf9cd Uddiii committed 14 days ago
fix(kaggle): escape backslash-n in REPAIR cell separator print 04688c1 Uddiii committed 14 days ago
fix(kaggle): unpin torch and loosen trl floor to prevent bnb/unsloth break 71a0a91 Uddiii committed 14 days ago
kaggle: refresh cell-8 promotion timing for per-phase early-stop c64ec55 Uddiii committed 14 days ago
kaggle: lower convergence bar to +1.2 reward (3.1x baseline P3) 13ae8dd Uddiii committed 14 days ago
LLM Emotion Adapter for TTS + LLM Empathy Judge + reward rebalance 28e9d71 Uddiii committed 14 days ago