Article Fine-tuning LLMs to 1.58bit: extreme quantization made easy +4 medmekk, marcsun13, lvwerra, pcuenq, osanseviero, thomwolf • Sep 18, 2024 • 281
Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs Paper • 2403.20041 • Published Mar 29, 2024 • 34