---
license: apache-2.0
base_model: stepfun-ai/Step-3.5-Flash
base_model_relation: quantized
quantized_by: turboderp
tags:
- exl3
---
EXL3 quants of [Step-3.5-Flash](https://huggingface.co/stepfun-ai/Step-3.5-Flash)

⚠️ Requires ExLlamaV3 v0.0.23 (or v0.0.22 `dev` branch)

Base bitrates:

- [2.00 bits per weight](https://huggingface.co/turboderp/Step-3.5-Flash-exl3/tree/2.00bpw)
- [3.00 bits per weight](https://huggingface.co/turboderp/Step-3.5-Flash-exl3/tree/3.00bpw)
- [4.00 bits per weight](https://huggingface.co/turboderp/Step-3.5-Flash-exl3/tree/4.00bpw)

Optimized:

- [2.08 bits per weight](https://huggingface.co/turboderp/Step-3.5-Flash-exl3/tree/2.08bpw)
- [3.05 bits per weight](https://huggingface.co/turboderp/Step-3.5-Flash-exl3/tree/3.05bpw)

*(more coming soon)*

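
Each bitrate lives on its own branch of this repo (`2.00bpw`, `3.05bpw`, and so on, as in the links above), so a single quant can be fetched by passing the branch name as the revision. The following is a minimal sketch using `snapshot_download` from `huggingface_hub`. The helper names `branch_for` and `download_quant` and the target directory are illustrative assumptions, not part of this repo or of ExLlamaV3.

```python
REPO_ID = "turboderp/Step-3.5-Flash-exl3"

# Branches listed in this README at the time of writing.
BRANCHES = ("2.00bpw", "2.08bpw", "3.00bpw", "3.05bpw", "4.00bpw")


def branch_for(bpw: float) -> str:
    """Format a bitrate such as 3.05 the way the repo names its branches."""
    return f"{bpw:.2f}bpw"


def download_quant(bpw: float, local_dir: str) -> str:
    """Download one quant (hypothetical helper); returns the local path."""
    branch = branch_for(bpw)
    if branch not in BRANCHES:
        raise ValueError(f"no branch {branch!r}; choose one of {BRANCHES}")
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, revision=branch, local_dir=local_dir)


if __name__ == "__main__":
    # Example: fetch the 3.05 bpw quant into a local folder (path is arbitrary).
    print(download_quant(3.05, "./Step-3.5-Flash-exl3-3.05bpw"))
```

The downloaded directory can then be pointed to as the model directory in an ExLlamaV3-based loader.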

| Quant    | Perplexity¹ | KL divergence |
|----------|-------------|---------------|
| 2.00 bpw | 2.629       | 0.653         |
| 2.08 bpw | 2.154       | 0.466         |
| 3.00 bpw | 1.521       | 0.142         |
| 3.05 bpw | 1.478       | 0.118         |
| 4.00 bpw | 1.379       | 0.053         |
| Original | 1.336       |               |

¹ 10 rows of wikitext2
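
For readers unfamiliar with the two columns, here is a rough sketch of the standard definitions: perplexity is the exponential of the mean negative log-likelihood of the reference tokens, and KL divergence compares the quantized model's next-token distribution with the original's. This is not the script used to produce the table. The function names, the toy logits, and the direction KL(original ‖ quantized) are all assumptions made for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized list."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def perplexity(logits_per_pos, targets):
    """exp(mean negative log-likelihood) of the target token ids."""
    nll = -sum(log_softmax(l)[t] for l, t in zip(logits_per_pos, targets))
    return math.exp(nll / len(targets))


def mean_kl(ref_logits, quant_logits):
    """Mean per-position KL(ref || quant); direction is an assumption."""
    total = 0.0
    for r, q in zip(ref_logits, quant_logits):
        lp, lq = log_softmax(r), log_softmax(q)
        total += sum(math.exp(a) * (a - b) for a, b in zip(lp, lq))
    return total / len(ref_logits)


if __name__ == "__main__":
    # Toy data: 2 positions, vocabulary of 3 tokens.
    ref = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]]
    quant = [[1.8, 0.7, -0.9], [0.0, 1.2, 0.6]]
    print("ppl:", perplexity(quant, targets=[0, 1]))
    print("kl :", mean_kl(ref, quant))
```

Under these definitions, a KL divergence of zero means the quantized model reproduces the original's output distribution exactly, which is why the "Original" row has no KL value.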