TokenButler

Predict token importance for all heads across the transformer in the first layer itself. Enable fine-grained token sparsity!

akhauriyash/DeepSeek-R1-Distill-Llama-8B-Butler • Feature Extraction • 8B • Updated Mar 31, 2025 • 4
akhauriyash/Llama-3.1-8B-Butler • Text Generation • 8B • Updated May 13, 2025 • 7
akhauriyash/Llama-2-7b-hf-Butler • Text Generation • 7B • Updated May 13, 2025 • 8
akhauriyash/Llama-3.2-3B-Butler • Text Generation • Updated Mar 31, 2025 • 28