Article Welcome Llama 4 Maverick & Scout on Hugging Face +5 burtenshaw, reach-vb, pcuenq, clem, rajatarya, jsulz, lysandre • Apr 5, 2025 • 149
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model Paper • 2402.07827 • Published Feb 12, 2024 • 48
Article Google releases Gemma 2 2B, ShieldGemma and Gemma Scope +2 Xenova, pcuenq, reach-vb, joaogante • Jul 31, 2024 • 60
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated May 5, 2025 • 251
Article SmolLM - blazingly fast and remarkably powerful +1 loubnabnl, anton-l, eliebak • Jul 16, 2024 • 455
Evaluating Open Language Models Across Task Types, Application Domains, and Reasoning Types: An In-Depth Experimental Analysis Paper • 2406.11402 • Published Jun 17, 2024 • 6
Article Extracting Concepts from LLMs: Anthropic’s recent discoveries 📖 m-ric • Jun 20, 2024 • 26
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training Paper • 2405.15319 • Published May 24, 2024 • 28
Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints Paper • 2212.05055 • Published Dec 9, 2022 • 6