view article Article SyGra: The One-Stop Framework for Building Data for LLMs and SLMs ServiceNow-AI • Sep 22, 2025 • 14
view article Article Mixture of Experts (MoEs) in Transformers +5 ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap • Feb 26 • 159
view article Article TRL v1.0: Post-Training Library Built to Move with the Field +2 qgallouedec, stevhliu, pcuenq, sergiopaniego • Mar 31 • 51
view article Article Multimodal Embedding & Reranker Models with Sentence Transformers tomaarsen • Apr 9 • 59
view article Article Efficient LLM Pretraining: Packed Sequences and Masked Attention sirluk • Oct 7, 2024 • 71
view article Article You could have designed state of the art positional encoding FL33TW00D-HF • Nov 25, 2024 • 478
view article Article SmolLM3: smol, multilingual, long-context reasoner +21 eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 775
view article Article 🦸🏻#14: What Is MCP, and Why Is Everyone – Suddenly!– Talking About It? Kseniase • Mar 17, 2025 • 357
Reasoning Datasets Collection Datasets with reasoning traces across various domains released by the community. • 15 items • Updated Jun 30, 2025 • 3
🧠 Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19, 2025 • 189
Code Conversations Collection List of evol instruct based code dataset • 7 items • Updated Apr 30, 2024 • 3
view article Article SmolLM - blazingly fast and remarkably powerful +1 loubnabnl, anton-l, eliebak • Jul 16, 2024 • 455
view article Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation +7 yuxiang630, cassanof, ganler, YifengDing, StringChaos, harmdevries, lvwerra, arjunguha, lingming • Apr 29, 2024 • 79
view article Article StarCoder: A State-of-the-Art LLM for Code lvwerra, loubnabnl • May 4, 2023 • 73