How NuminaMath Won the 1st AIMO Progress Prize Article • yfleureau, liyongsea, edbeeching, lewtun, benlipkin, romansoletskyi, vwxyzjn, kashif • Jul 11, 2024 • 128
SmolLM - blazingly fast and remarkably powerful Article • loubnabnl, anton-l, eliebak • Jul 16, 2024 • 455
Introducing RWKV - An RNN with the advantages of a transformer Article • BlinkDL, Hazzzardous, sgugger, ybelkada • May 15, 2023 • 25
GaLore: Advancing Large Model Training on Consumer-grade Hardware Article • Titus-von-Koeller, jiaweizhao, mdouglas, hiyouga, ybelkada, muellerzr, amyeroberts, smangrul, BenjaminB • Mar 20, 2024 • 32
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27, 2024 • 630
StableIdentity: Inserting Anybody into Anywhere at First Sight Paper • 2401.15975 • Published Jan 29, 2024 • 18