Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries • aminediroHF, qgallouedec, kashif, lewtun, edbeeching, albertvillanova, nouamanetazi, lvwerra, sergiopaniego • Mar 10 • 153
The Engineering Handbook for GRPO + LoRA with Verl: Training Qwen2.5 on Multi-GPU • Weyaxi • Jan 2 • 22
Training and Finetuning Reranker Models with Sentence Transformers • tomaarsen • Mar 26, 2025 • 194
dstack: Your LLM Launchpad - From Fine-Tuning to Serving, Simplified • chansung • Aug 22, 2024 • 13
Topic 27: What are Chain-of-Agents and Chain-of-RAG? • Kseniase • Feb 13, 2025 • 18