Article Ulysses Sequence Parallelism: Training with Million-Token Contexts kashif, stas • Mar 9 • 28
Article Mixture of Experts Explained osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.13k
Article DualPipe Explained: A Comprehensive Guide to DualPipe That Anyone Can Understand, Even Without a Distributed Training Background NormalUhr • Feb 28, 2025 • 19
Article Improving Prompt Consistency with Structured Generations willkurt, remi, clefourrier • Apr 30, 2024 • 68
Article No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL toslali-ibm, mirinflim, qgallouedec, esnible, rganti, mudhakar • Jun 3, 2025 • 101
Article Training and Finetuning Sparse Embedding Models with Sentence Transformers tomaarsen, arthurbresnu • Jul 1, 2025 • 138
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6, 2025 • 192
Article From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning NormalUhr • Feb 4, 2025 • 17
Article Efficient LLM Pretraining: Packed Sequences and Masked Attention sirluk • Oct 7, 2024 • 71
JudgeBench: A Benchmark for Evaluating LLM-based Judges Paper • 2410.12784 • Published Oct 16, 2024 • 47