Add pipeline tag, library name and link paper (#1)
- Add pipeline tag, library name and link paper (ca0363607f6c0e095b051de788d93b54a33a044a)
Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>
README.md CHANGED

```diff
@@ -1,14 +1,21 @@
 ---
-license: apache-2.0
 base_model:
 - Qwen/Qwen2.5-14B-Instruct
 datasets:
 - HuggingFaceH4/ultrachat_200k
+license: apache-2.0
+pipeline_tag: text-generation
+library_name: vllm
 ---
+
 # Qwen2.5-14B-Instruct_EAGLE3_UltraChat
 
+This repository contains the EAGLE-3 draft model presented in the paper [Fail Fast, Win Big: Rethinking the Drafting Strategy in Speculative Decoding via Diffusion LLMs](https://huggingface.co/papers/2512.20573).
+
+Code: [GitHub - FailFast](https://github.com/ruipeterpan/failfast)
+
 ### Introduction
-**Qwen2.5-14B-Instruct_EAGLE3_UltraChat** is trained based on the open-source Qwen2.5-
+**Qwen2.5-14B-Instruct_EAGLE3_UltraChat** is trained based on the open-source Qwen2.5-14B-Instruct model using the [SpecForge](https://github.com/sgl-project/SpecForge) framework,
 and can be used for the Eagle-3 speculative decoding algorithm to speed up the inference of large language models during the decoding stage.
 
 
@@ -48,10 +55,8 @@ We run our evaluations on two NVIDIA A6000-48GB GPUs connected via PCIe 4.0 x16.
 | **Qwen2.5-7B-Instruct** | 2.19x | 2.05x | 2.02x | 1.78x | 2.25x | **2.06x** |
 
 
-### Relevant
-
-Qwen2.5-14B-Instruct Open-source Weights: https://huggingface.co/Qwen/Qwen2.5-14B-Instruct
-
-"Fail Fast, Win Big: Rethinking the Drafting Strategy in Speculative Decoding via Diffusion LLMs" [arXiv '25]: https://arxiv.org/pdf/2512.20573
+### Relevant Links
 
-
+- Qwen2.5-14B-Instruct Open-source Weights: https://huggingface.co/Qwen/Qwen2.5-14B-Instruct
+- "Fail Fast, Win Big: Rethinking the Drafting Strategy in Speculative Decoding via Diffusion LLMs" [arXiv '25]: https://arxiv.org/pdf/2512.20573
+- Artifact of FailFast: https://github.com/ruipeterpan/failfast
```