view article Article The Engineering Handbook for GRPO + LoRA with Verl: Training Qwen2.5 on Multi-GPU 24 days ago • 12
VAR RL Done Right: Tackling Asynchronous Policy Conflicts in Visual Autoregressive Generation Paper • 2601.02256 • Published 21 days ago • 33
Awesome SFT datasets Collection A curated list of interesting datasets to fine-tune language models with. • 43 items • Updated Apr 12, 2024 • 147
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published Nov 23, 2025 • 294