eerrr9 committed on
Commit
cacfb26
·
verified ·
1 Parent(s): 70e2688

Update README.md

  1. README.md +48 -3
README.md CHANGED
@@ -1,3 +1,48 @@
1
- ---
2
- license: mit
3
- ---
1
+ ---
2
+ license: mit
3
+ ---
4
+
5
+ ## Model Overview
6
+
7
+ **Kimi-K2-Instruct-eagle3** is a draft model that accelerates inference for Kimi-K2-Instruct using the **EAGLE3 (Extrapolation Algorithm for Greater Language-model Efficiency)** speculative decoding framework.
8
+
9
+ Built on the **Llama architecture**, the drafter was trained on **1.4 million high-quality samples** from the **Open-PerfectBlend** dataset to align its predictions with the teacher model's output distribution.
10
+
11
+ The draft model targets general-purpose English instruction following, with a focus on:
12
+ * **Conversation**
13
+ * **Mathematical Reasoning**
14
+ * **Code Generation**
15
+
16
+ ## Performance & Acceleration
17
+
18
+ The draft model predicts several future tokens at once, and the base model then verifies them. The higher the average acceptance length, the more tokens each verification pass of the base model yields, and the lower the decoding latency.
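
The draft-then-verify step described above can be illustrated with a minimal sketch. It assumes greedy verification (a draft token is accepted only if it equals the base model's own choice at that position) and counts the base model's token at the first mismatch as part of the acceptance length. The README does not say how its numbers were counted, so both points are assumptions, and the function names are hypothetical.

```python
from typing import Sequence, Tuple


def accepted_length(draft: Sequence[int], target: Sequence[int]) -> int:
    """Tokens emitted by one verification step under greedy matching.

    `draft` holds the drafter's proposed token ids. `target` holds the
    base model's own choice at each of those positions, plus one extra
    position after the draft (assumed layout, for illustration only).
    """
    matched = 0
    for d, t in zip(draft, target):
        if d != t:
            break
        matched += 1
    # The base model's token at the first mismatch (or after a fully
    # accepted draft) is always kept, hence the +1.
    return matched + 1


def mean_accepted_length(steps: Sequence[Tuple[Sequence[int], Sequence[int]]]) -> float:
    """Average acceptance length over a list of (draft, target) steps."""
    lengths = [accepted_length(d, t) for d, t in steps]
    return sum(lengths) / len(lengths)
```

For example, a three-token draft whose first two tokens match the base model gives `accepted_length([5, 7, 9], [5, 7, 2, 4]) == 3`: two accepted draft tokens plus the base model's correction.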
19
+
20
+ **Average Token Acceptance Lengths (MLA):**
21
+
22
+ | Benchmark | Average Acceptance Length |
23
+ | :--- | :--- |
24
+ | **HumanEval** (Code) | **3.372** |
25
+ | **GSM8K** (Math) | **3.165** |
26
+ | **Math500** (Complex Math) | **3.490** |
27
+
28
+ The average acceptance length stays above 3 on all three benchmarks, covering both code and math workloads.
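
As a rough reading of the table, the sketch below computes the unweighted mean of the three reported values and an idealized count of base-model verification passes needed to emit a fixed number of tokens. The estimate assumes each pass yields the average acceptance length and ignores the drafter's own cost, so it is an upper bound on the benefit and not a measured speedup. The 1000-token budget and the function names are illustrative.

```python
import math

# Values copied from the table above.
ACCEPTANCE_LENGTH = {
    "HumanEval": 3.372,
    "GSM8K": 3.165,
    "Math500": 3.490,
}


def macro_average(values: dict) -> float:
    """Unweighted mean across benchmarks."""
    return sum(values.values()) / len(values)


def estimated_base_passes(num_tokens: int, acceptance_length: float) -> int:
    """Idealized number of base-model passes to emit `num_tokens` tokens.

    Assumes every pass yields `acceptance_length` tokens on average and
    ignores draft-model overhead (a simplification).
    """
    return math.ceil(num_tokens / acceptance_length)


if __name__ == "__main__":
    print(f"macro average: {macro_average(ACCEPTANCE_LENGTH):.3f}")
    for name, length in ACCEPTANCE_LENGTH.items():
        passes = estimated_base_passes(1000, length)
        print(f"{name}: ~{passes} base-model passes per 1000 tokens")
```

Without a drafter, plain autoregressive decoding needs one base-model pass per token, i.e. 1000 passes for the same budget.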
29
+
30
+ ![1](https://hackmd.io/_uploads/ryP6cBLXbg.png)
31
+ ![2](https://hackmd.io/_uploads/S1Da5BLmbl.png)
32
+ ![3](https://hackmd.io/_uploads/S1v65HIm-e.png)
33
+
34
+
35
+ ## Training Data
36
+
37
+ The model was trained on **1.4 million samples** sourced from the **Open-PerfectBlend** dataset. The selection favors high-quality instruction-following examples so that the draft model's predictions match the base model's outputs as often as possible.
38
+
39
+ ## Citation
40
+
41
+ If you use this model in your research or application, please cite the following:
42
+
43
+ ```bibtex
44
+ @misc{kimik2eagle3,
45
+ title={Kimi-K2-Instruct-eagle3: Accelerating Instruction Following with EAGLE},
46
+ author={Ant AQ Team},
47
+ year={2025},
48
+ }