FineWeb: decanting the web for the finest text data at scale 🍷 Explore and download the FineWeb web‑scale text dataset • Running • Featured • 1.38k
Article How We Built a Semantic Highlight Model To Save Token Cost for RAG zilliz • Jan 15 • 67
💧 LFM2 Collection LFM2 is a new generation of hybrid models, designed for on-device deployment. • 28 items • Updated 5 days ago • 154
Article Continuous batching from first principles +1 ror, ArthurZ, mcpotato • Nov 25, 2025 • 411
madhurjindal/autonlp-Gibberish-Detector-492513457 Text Classification • 67M • Updated May 14, 2025 • 82.2k • 69
Article SmolLM - blazingly fast and remarkably powerful +1 loubnabnl, anton-l, eliebak • Jul 16, 2024 • 460
Article Open-source DeepResearch – Freeing our search agents +3 m-ric, albertvillanova, merve, thomwolf, clefourrier • Feb 4, 2025 • 1.32k
Notebooks Collection A collection of notebooks demonstrating PEFT methods applied to various tasks • 9 items • Updated Jan 9, 2024 • 16