BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation
Paper: arXiv:2402.10631
| PPL | arc_easy | arc_challenge | piqa | winogrande | hellaswag | mmlu | QA Avg |
|---|---|---|---|---|---|---|---|
| 11.52 | 43.60 ± 1.02 | 23.46 ± 1.24 | 66.38 ± 1.10 | 52.01 ± 1.40 | 38.80 ± 0.49 | - | 44.85 |

PPL is perplexity (lower is better); the QA columns report accuracy (%) ± standard error. The QA average is the mean over the five reported tasks (no mmlu result is reported).
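The QA column names match task identifiers from EleutherAI's lm-evaluation-harness, so comparable numbers can typically be obtained with a run like the sketch below. This is a minimal sketch under assumptions the card does not state: lm-eval >= 0.4, a zero-shot setting, and a placeholder checkpoint path standing in for this model.

```python
# Minimal sketch: evaluating the QA tasks with EleutherAI's lm-evaluation-harness.
# Assumptions: lm-eval >= 0.4 installed, zero-shot setting, and a placeholder
# checkpoint path ("path/to/quantized-model") standing in for this model.
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=path/to/quantized-model,dtype=float16",
    tasks=["arc_easy", "arc_challenge", "piqa", "winogrande", "hellaswag"],
    num_fewshot=0,
)

# Each task reports accuracy and its standard error, matching the table format.
for task, metrics in results["results"].items():
    print(task, metrics.get("acc,none"), metrics.get("acc_stderr,none"))
```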
The training method follows the BitDistiller paper: quantization-aware training in which the full-precision model acts as its own teacher for a sub-4-bit student (see the sketch after the base-model note below).
Base model: TinyLlama/TinyLlama_v1.1
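For context on the training method, the core BitDistiller idea is self-distillation under quantization-aware training: a low-bit copy of the model is trained to match the token distributions of its own full-precision version. The sketch below only illustrates that loop; it is not the paper's exact recipe. It swaps in a plain symmetric fake quantizer and a standard KL loss in place of the paper's asymmetric clipped quantization and confidence-aware KL objective, and the bit width, hyperparameters, and helper names are all assumptions.

```python
# Minimal sketch of BitDistiller-style self-distillation QAT.
# Assumptions (not taken from this model card): TinyLlama loads via transformers,
# a symmetric per-row fake quantizer stands in for the paper's asymmetric clipped
# quantizer, and a plain KL loss stands in for the paper's confidence-aware KL.
import copy
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer


class FakeQuantLinear(torch.nn.Module):
    """Wraps an nn.Linear and fake-quantizes its weight on every forward pass
    (straight-through estimator), so full-precision master weights are kept."""

    def __init__(self, linear: torch.nn.Linear, n_bits: int = 2):
        super().__init__()
        self.linear = linear
        self.qmax = 2 ** (n_bits - 1) - 1

    def forward(self, x):
        w = self.linear.weight
        scale = w.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / self.qmax
        w_q = (w / scale).round().clamp(-self.qmax - 1, self.qmax) * scale
        w_ste = w + (w_q - w).detach()  # forward sees w_q, gradients flow to w
        return F.linear(x, w_ste, self.linear.bias)


def wrap_linears(model, n_bits: int = 2):
    """Replace every nn.Linear (except the LM head) with a fake-quantized wrapper."""
    for name, module in model.named_children():
        if isinstance(module, torch.nn.Linear) and name != "lm_head":
            setattr(model, name, FakeQuantLinear(module, n_bits))
        else:
            wrap_linears(module, n_bits)
    return model


def distillation_step(student, teacher, batch, optimizer, temperature: float = 1.0):
    """One self-distillation step: the frozen full-precision teacher supervises
    the low-bit student on the teacher's own token distributions."""
    with torch.no_grad():
        t_logits = teacher(**batch).logits
    s_logits = student(**batch).logits
    loss = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=-1),
        F.softmax(t_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    name = "TinyLlama/TinyLlama_v1.1"
    tokenizer = AutoTokenizer.from_pretrained(name)
    teacher = AutoModelForCausalLM.from_pretrained(name).eval()   # full-precision teacher
    student = wrap_linears(copy.deepcopy(teacher)).train()        # low-bit student of itself
    optimizer = torch.optim.AdamW(student.parameters(), lr=2e-5)  # assumed hyperparameters

    batch = tokenizer("BitDistiller trains a low-bit student against its own "
                      "full-precision teacher.", return_tensors="pt")
    print(distillation_step(student, teacher, batch, optimizer))
```

The straight-through estimator keeps full-precision master weights while the forward pass sees quantized weights, which is what makes gradient-based distillation workable at sub-4-bit widths.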