Add model card with metadata and description
#1 by nielsr (HF Staff) - opened

README.md CHANGED
@@ -1,6 +1,18 @@
-
+---
+pipeline_tag: text-generation
+library_name: transformers
+license: apache-2.0
+---
 
-
+# PosS: Position Specialist Generates Better Draft for Speculative Decoding
 
-
-
+This model, presented in [PosS: Position Specialist Generates Better Draft for Speculative Decoding](https://huggingface.co/papers/2506.03566), improves speculative decoding in Large Language Models (LLMs). PosS uses multiple position-specialized draft layers to generate draft tokens, mitigating error accumulation across draft positions and improving the acceptance rate of later-position tokens.
+
+**Key Features:**
+
+* Position Specialists for improved token prediction accuracy at all draft positions.
+* Higher average acceptance length and speed-up ratio than baseline methods.
+
+**Code:** [https://github.com/shrango/PosS](https://github.com/shrango/PosS)
+
+For detailed usage instructions, evaluation methods, and training details, please refer to the GitHub repository. Pre-trained weights are available for Llama-3-8B-Instruct and Llama-2-13B-chat.