BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation
Paper: [BitDistiller](https://arxiv.org/abs/2402.10631) (arXiv:2402.10631)
| PPL | arc_easy | arc_challenge | piqa | winogrande | hellaswag | mmlu | QA Avg |
|---|---|---|---|---|---|---|---|
| 7.71 | 61.45 ± 1.00 | 30.89 ± 1.35 | 72.63 ± 1.04 | 59.59 ± 1.38 | 46.70 ± 0.50 | - | 54.25 |
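The QA Avg column matches the unweighted mean of the five reported QA accuracies (mmlu is listed as "-" and is excluded). A quick sanity check:

```python
# Reported QA accuracies (mean values) from the table above; mmlu is
# excluded because it was not evaluated ("-").
scores = {
    "arc_easy": 61.45,
    "arc_challenge": 30.89,
    "piqa": 72.63,
    "winogrande": 59.59,
    "hellaswag": 46.70,
}

qa_avg = sum(scores.values()) / len(scores)
print(round(qa_avg, 2))  # 54.25, matching the QA Avg column
```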
- Training method: based on the BitDistiller paper (arXiv:2402.10631)
- Base model: TinyLlama/TinyLlama_v1.1