No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL Article • Jun 3 • 96
Training and Finetuning Sparse Embedding Models with Sentence Transformers v5 Article • Jul 1 • 132
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6 • 188
From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning Article • Feb 4 • 16
Efficient LLM Pretraining: Packed Sequences and Masked Attention Article • Oct 7, 2024 • 64
JudgeBench: A Benchmark for Evaluating LLM-based Judges Paper • 2410.12784 • Published Oct 16, 2024 • 47