view article Article How to generate text: using different decoding methods for language generation with Transformers patrickvonplaten • Mar 1, 2020 • 297
view article Article nanoJAXGPT: A pedagogical introduction to JAX/Equinox sachithgunasekara • Oct 23, 2024 • 7
view article Article Llama can now see and run on your device - welcome Llama 3.2 +5 merve, philschmid, osanseviero, reach-vb, lewtun, ariG23498, pcuenq • Sep 25, 2024 • 191
view article Article 🌟 Easy Fine-Tuning with Hugging Face SQL Console, Notebook Creator, and SFT asoria • Sep 24, 2024 • 14
How FaR Are Large Language Models From Agents with Theory-of-Mind? Paper • 2310.03051 • Published Oct 4, 2023 • 35
Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published Sep 19, 2024 • 140
view article Article Mixture of Experts Explained +4 osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.13k
view article Article Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA +3 ybelkada, timdettmers, artidoro, sgugger, smangrul • May 24, 2023 • 180
Open-Bezoar Collection Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data • 7 items • Updated Apr 19, 2024 • 6