ruipeterpan nielsr HF Staff committed on
Commit fe26b26 · verified · 1 Parent(s): 431c02d

Add pipeline tag, library name and link paper (#1)

- Add pipeline tag, library name and link paper (ca0363607f6c0e095b051de788d93b54a33a044a)


Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +13 -8
README.md CHANGED
@@ -1,14 +1,21 @@
  ---
- license: apache-2.0
  base_model:
  - Qwen/Qwen2.5-14B-Instruct
  datasets:
  - HuggingFaceH4/ultrachat_200k
  ---
  # Qwen2.5-14B-Instruct_EAGLE3_UltraChat

  ### Introduction
- **Qwen2.5-14B-Instruct_EAGLE3_UltraChat** is trained based on the open-source Qwen2.5-32B-Instruct model using the [SpecForge](https://github.com/sgl-project/SpecForge) framework,
  and can be used for the Eagle-3 speculative decoding algorithm to speed up the inference of large language models during the decoding stage.


@@ -48,10 +55,8 @@ We run our evaluations on two NVIDIA A6000-48GB GPUs connected via PCIe 4.0 x16.
  | **Qwen2.5-7B-Instruct** | 2.19x | 2.05x | 2.02x | 1.78x | 2.25x | **2.06x** |


- ### Relevant Link
-
- Qwen2.5-14B-Instruct Open-source Weights: https://huggingface.co/Qwen/Qwen2.5-14B-Instruct
-
- "Fail Fast, Win Big: Rethinking the Drafting Strategy in Speculative Decoding via Diffusion LLMs" [arXiv '25]: https://arxiv.org/pdf/2512.20573

- Artifact of FailFast: https://github.com/ruipeterpan/failfast
 
 
 
  ---
  base_model:
  - Qwen/Qwen2.5-14B-Instruct
  datasets:
  - HuggingFaceH4/ultrachat_200k
+ license: apache-2.0
+ pipeline_tag: text-generation
+ library_name: vllm
  ---
+
  # Qwen2.5-14B-Instruct_EAGLE3_UltraChat

+ This repository contains the EAGLE-3 draft model presented in the paper [Fail Fast, Win Big: Rethinking the Drafting Strategy in Speculative Decoding via Diffusion LLMs](https://huggingface.co/papers/2512.20573).
+
+ Code: [GitHub - FailFast](https://github.com/ruipeterpan/failfast)
+
  ### Introduction
+ **Qwen2.5-14B-Instruct_EAGLE3_UltraChat** is trained based on the open-source Qwen2.5-14B-Instruct model using the [SpecForge](https://github.com/sgl-project/SpecForge) framework,
  and can be used for the Eagle-3 speculative decoding algorithm to speed up the inference of large language models during the decoding stage.



  | **Qwen2.5-7B-Instruct** | 2.19x | 2.05x | 2.02x | 1.78x | 2.25x | **2.06x** |


+ ### Relevant Links

+ - Qwen2.5-14B-Instruct Open-source Weights: https://huggingface.co/Qwen/Qwen2.5-14B-Instruct
+ - "Fail Fast, Win Big: Rethinking the Drafting Strategy in Speculative Decoding via Diffusion LLMs" [arXiv '25]: https://arxiv.org/pdf/2512.20573
+ - Artifact of FailFast: https://github.com/ruipeterpan/failfast