RetroAgent: From Solving to Evolving via Retrospective Dual Intrinsic Feedback Paper • 2603.08561 • Published 4 days ago • 10
💧 LFM2.5 Collection Collection of Instruct, Base, and Japanese LFM2.5-1.2B models. • 22 items • Updated 17 days ago • 102
RLFR: Extending Reinforcement Learning for LLMs with Flow Environment Paper • 2510.10201 • Published Oct 11, 2025 • 36
Article nanoVLM: The simplest repository to train your VLM in pure PyTorch +5 May 21, 2025 • 252
Article From DeepSpeed to FSDP and Back Again with Hugging Face Accelerate +2 Jun 13, 2024 • 62
Article From PyTorch DDP to Accelerate to Trainer, mastery of distributed training with ease Oct 21, 2022 • 43
CoRe^2: Collect, Reflect and Refine to Generate Better and Faster Paper • 2503.09662 • Published Mar 12, 2025 • 33
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated May 5, 2025 • 305