DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22, 2025 • 449
Hugging Face Changelog Connect Your MCP Client to the Hugging Face Hub Jun 6, 2025 • 114
Article No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL toslali-ibm, mirinflim, qgallouedec, esnible, rganti, mudhakar • Jun 3, 2025 • 101
sarvam-m Collection Collection of all variations of the sarvam-m model • 3 items • Updated May 24, 2025 • 28
Article A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes ybelkada, timdettmers • Aug 17, 2022 • 132
Article Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA ybelkada, timdettmers, artidoro, sgugger, smangrul • May 24, 2023 • 180
Article Llama 3.1 - 405B, 70B & 8B with multilinguality and long context philschmid, osanseviero, alvarobartt, lvwerra, dvilasuero, reach-vb, marcsun13, pcuenq • Jul 23, 2024 • 241