EXL3 quants of gemma-4-31B-it-DFlash

2.50 bits per weight
3.00 bits per weight
3.50 bits per weight
4.00 bits per weight
5.00 bits per weight
6.00 bits per weight

Quant      Mean accepted tokens¹
2.50 bpw   4.00
3.00 bpw   4.07
3.50 bpw   4.08
4.00 bpw   4.10
5.00 bpw   4.12
6.00 bpw   4.10
BF16       4.07

¹ Mean number of draft tokens verified (accepted) per 15-token draft, measured on CatBench at temperature 0 with the 4.00 bpw target model, on the current exllamav3 dev branch (upcoming v0.0.33).
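
For readers who want to reproduce this kind of figure from their own logs, the metric in the footnote can be sketched as a plain average of per-draft accepted counts. This is only an illustration of the arithmetic, not the exllamav3 benchmark code: the function name `mean_accepted_tokens` and the sample counts are hypothetical, and the only value taken from this card is the 15-token draft length.

```python
from statistics import fmean

# Draft length stated in the footnote above.
DRAFT_LEN = 15


def mean_accepted_tokens(accepted_per_draft, draft_len=DRAFT_LEN):
    """Average number of draft tokens the target model verified per draft.

    `accepted_per_draft` is a sequence with one integer per draft:
    how many of that draft's tokens were accepted (0..draft_len).
    Hypothetical helper, not part of exllamav3.
    """
    if not accepted_per_draft:
        raise ValueError("need at least one draft")
    for n in accepted_per_draft:
        if not 0 <= n <= draft_len:
            raise ValueError(f"accepted count {n} outside 0..{draft_len}")
    return fmean(accepted_per_draft)


if __name__ == "__main__":
    # Made-up counts for illustration only, not measured data.
    example_counts = [4, 6, 2, 5, 3]
    print(f"{mean_accepted_tokens(example_counts):.2f}")
```

Dividing the result by the draft length gives the fraction of each draft that is accepted on average, which can be easier to compare across different draft lengths.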
