soroushtabesh committed
Commit 7aaaf63 · verified · 1 parent: 842da8d

Upload README.md with huggingface_hub

  1. README.md +10 -0
README.md CHANGED
@@ -75,6 +75,16 @@ Evaluated on a 128-sample held-out split during quantization, measured every 6 l
 
  The final 2-bit quantized model retains perplexity within 0.015 of the dense baseline (< 1% relative degradation).
 
+ ### Benchmark Results (lm-evaluation-harness)
+
+ | Benchmark | Metric | Score |
+ |---|---|---|
+ | GSM8K | exact_match (strict) | **92.57** |
+ | ARC-Challenge | acc_norm | **62.97** |
+ | ARC-Easy | acc_norm | **85.10** |
+ | PIQA | acc_norm | **82.37** |
+ | WinoGrande | acc | **76.95** |
+
  ## Usage
 
  This model requires **vLLM** for inference. Because Kimi-K2.5 uses a custom model architecture (`kimi_k25`), you must pass `--trust-remote-code`.
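
As a rough illustration of the usage note in the hunk above, the sketch below assembles a `vllm serve` command line that includes `--trust-remote-code`. The diff does not show the repository ID, so `MODEL_ID` is a hypothetical placeholder, and `build_vllm_serve_command` is an illustrative helper name, not part of vLLM. Any further flags the README may recommend are not visible in this hunk and are left out.

```python
import shlex
from typing import List

# Hypothetical placeholder: the repository ID is not shown in this diff.
MODEL_ID = "<org>/<quantized-kimi-k2.5-repo>"


def build_vllm_serve_command(model_id: str) -> List[str]:
    """Return an argv list for `vllm serve` with remote code enabled.

    `--trust-remote-code` allows vLLM to load the custom `kimi_k25`
    architecture code shipped with the model repository.
    """
    return ["vllm", "serve", model_id, "--trust-remote-code"]


if __name__ == "__main__":
    # Print a shell-quoted command that can be copied into a terminal.
    print(shlex.join(build_vllm_serve_command(MODEL_ID)))
```

Building the command as a list keeps it safe to hand to `subprocess.run` without shell quoting problems. Replace the placeholder with the actual repository ID before running it.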