TokenButler

TokenButler -- Predict token importance for all heads across the transformer in the first layer itself. Enable fine-grained token sparsity!

akhauriyash/DeepSeek-R1-Distill-Llama-8B-Butler • Image Feature Extraction • 8B • Updated Mar 31, 2025 • 6
akhauriyash/Llama-3.1-8B-Butler • Text Generation • 8B • Updated May 13, 2025 • 3
akhauriyash/Llama-2-7b-hf-Butler • Text Generation • 7B • Updated May 13, 2025 • 6
akhauriyash/Llama-3.2-3B-Butler • Text Generation • Updated Mar 31, 2025 • 7