Upload README.md with huggingface_hub
README.md
@@ -75,6 +75,16 @@ Evaluated on a 128-sample held-out split during quantization, measured every 6 l

The final 2-bit quantized model retains perplexity within 0.015 of the dense baseline (< 1% relative degradation).

+### Benchmark Results (lm-evaluation-harness)
+
+| Benchmark | Metric | Score |
+|---|---|---|
+| GSM8K | exact_match (strict) | **92.57** |
+| ARC-Challenge | acc_norm | **62.97** |
+| ARC-Easy | acc_norm | **85.10** |
+| PIQA | acc_norm | **82.37** |
+| WinoGrande | acc | **76.95** |
+
## Usage

This model requires **vLLM** for inference. Because Kimi-K2.5 uses a custom model architecture (`kimi_k25`), you must pass `--trust-remote-code`.
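
The usage note above could be exercised roughly as follows. This is a minimal sketch, not the README's own example: `MODEL_ID` is a hypothetical placeholder (the actual repository id is not shown in this diff), and `build_serve_command` and `load_offline` are illustrative helper names. The only vLLM pieces assumed are the `vllm serve` command with its `--trust-remote-code` flag and the `LLM(..., trust_remote_code=True)` constructor argument.

```python
import shlex
from typing import List, Optional

# Hypothetical placeholder: the actual repository id is not shown in this diff.
MODEL_ID = "your-org/your-kimi-k2.5-2bit-model"


def build_serve_command(model_id: str, extra_args: Optional[List[str]] = None) -> List[str]:
    """Return the argv for `vllm serve` with remote code enabled."""
    cmd = ["vllm", "serve", model_id, "--trust-remote-code"]
    if extra_args:
        cmd.extend(extra_args)
    return cmd


def load_offline(model_id: str):
    """Load the model through vLLM's offline Python API (needs vLLM and suitable GPUs)."""
    # Imported lazily so build_serve_command works without vLLM installed.
    from vllm import LLM
    return LLM(model=model_id, trust_remote_code=True)


if __name__ == "__main__":
    # Print the shell command rather than launching the server.
    print(shlex.join(build_serve_command(MODEL_ID)))
```

Running the script only prints the server command, so it can be inspected before launching anything. Any additional serving options (parallelism, context length, and so on) depend on the hardware and are left to `extra_args`.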