HINT-lab/PosS3-Llama3-8B-Instruct

Add model card with metadata and sample usage

by nielsr HF Staff - opened Jun 6, 2025

Files changed (1)

README.md

@@ -1,6 +1,36 @@
-This is the PosS-3 model of the paper **PosS:Position Specialist Generates Better Draft for Speculative Decoding**
-If the code fails to auto-download the models, you may mannually download the following files.
 - `pytorch_model.bin`: Model weights
 - `config.json`: Model config

+---
+pipeline_tag: text-generation
+library_name: transformers
+license: apache-2.0
+---
+# POSS: Position Specialist Generates Better Draft for Speculative Decoding
+This repository contains the PosS-3 model described in the paper [POSS: Position Specialist Generates Better Draft for Speculative Decoding](https://arxiv.org/abs/2506.03566).
+**Authors:** [Langlin Huang](https://shrango.github.io/), [Chengsong Huang](https://chengsong-huang.github.io/), [Jixuan Leng](https://jixuanleng.com/), Di Huang, [Jiaxin Huang](https://teapot123.github.io/)
+The PosS model improves speculative decoding by using multiple position-specialized draft layers. This approach mitigates error accumulation in draft model-generated features, leading to improved token acceptance rates, especially at later positions.
+For code and further details, please refer to the GitHub repository: [https://github.com/shrango/PosS](https://github.com/shrango/PosS)
+If the code fails to auto-download the models, you may manually download the following files:
 - `pytorch_model.bin`: Model weights
 - `config.json`: Model config
+**Sample Usage (Inference):**
+The following command demonstrates how to use the model for inference (replace placeholders with actual paths):
+```bash
+python spec_decode.py \
+    --device-num 0 \
+    --target-model llama3-8b \
+    --method poss-3 \
+    --temperature 0 \
+    --total-token 60 \
+    --depth 6 \
+    --repeat-time 3 \
+    --dataset mt_bench
+```
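
The README's manual-download fallback names two files, `pytorch_model.bin` and `config.json`, but does not show how to fetch them. A minimal sketch follows, assuming the files sit on the `main` branch of `HINT-lab/PosS3-Llama3-8B-Instruct` and are reachable through the Hub's standard `resolve` URLs. The helper names `poss_file_url` and `download_poss_files` and the `DEST_DIR` default are hypothetical. The directory where the PosS code expects to find the files is not stated here, so check the GitHub repository before choosing a destination.

```bash
#!/usr/bin/env bash
# Sketch only: manual download of the two files listed in the README.
# Assumptions: files are on the "main" branch; DEST_DIR is a hypothetical
# location, not a path the PosS code is known to read from.

REPO_ID="HINT-lab/PosS3-Llama3-8B-Instruct"
DEST_DIR="${DEST_DIR:-./PosS3-Llama3-8B-Instruct}"

# Build the Hub "resolve" URL for one file in the repository.
poss_file_url() {
    printf 'https://huggingface.co/%s/resolve/main/%s\n' "$REPO_ID" "$1"
}

# Download the model weights and config into DEST_DIR.
download_poss_files() {
    mkdir -p "$DEST_DIR" || return 1
    local f
    for f in pytorch_model.bin config.json; do
        # -f: fail on HTTP errors, -L: follow redirects to the file store
        curl -fL --retry 3 -o "$DEST_DIR/$f" "$(poss_file_url "$f")" || return 1
    done
}
```

After sourcing the script, calling `download_poss_files` fetches both files. Set `DEST_DIR` beforehand to change the target directory.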