Add model card and metadata

#1 by nielsr (HF Staff) - opened
Files changed (1)
  1. README.md +32 -1
README.md CHANGED
@@ -1 +1,32 @@
- arxiv.org/abs/2603.27027
+ ---
+ license: apache-2.0
+ library_name: transformers
+ pipeline_tag: text-generation
+ ---
+
+ # TAPS: Task-Aware Proposal Distributions for Speculative Sampling
+
+ [**Paper**](https://arxiv.org/abs/2603.27027) | [**Code**](https://github.com/Moe-Zbeeb/TAPS)
+
+ TAPS is a research framework investigating how draft training distributions shape speculative decoding quality. This repository contains a lightweight, single-layer Llama-style drafter (~0.8B parameters) used as a proposal model to accelerate inference for larger verifier models such as Meta-Llama-3-8B-Instruct.
+
+ ## Overview
+ Speculative decoding speeds up autoregressive generation by letting a lightweight drafter propose tokens that a larger verifier checks in parallel. TAPS demonstrates that speculative decoding performance depends significantly on the alignment between the drafter's training data and the downstream workload.
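The draft-and-verify loop described above can be sketched in plain Python. This is a greedy toy with stand-in `draft_next`/`verify_next` callables, not the TAPS implementation: the drafter proposes a block of `k` tokens, the verifier checks them, and the longest agreeing prefix is kept plus one verifier token, so every round makes progress.

```python
def speculative_decode(draft_next, verify_next, prompt, n_tokens, k=4):
    """Greedy speculative decoding sketch (illustrative, not the TAPS code).

    draft_next / verify_next: callables mapping a token context (list) to
    the next token under the drafter / verifier, respectively.
    """
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        # Drafter proposes a block of k tokens autoregressively.
        proposal = []
        ctx = list(out)
        for _ in range(k):
            t = draft_next(ctx)
            proposal.append(t)
            ctx.append(t)
        # Verifier checks the proposed positions; keep the longest agreeing
        # prefix. On the first disagreement, take the verifier's token
        # instead, so each round always commits at least one token.
        accepted = []
        ctx = list(out)
        for t in proposal:
            v = verify_next(ctx)
            if v == t:
                accepted.append(t)
                ctx.append(t)
            else:
                accepted.append(v)  # verifier's correction
                break
        else:
            # All k draft tokens accepted: append one bonus verifier token.
            accepted.append(verify_next(ctx))
        out.extend(accepted)
    return out[len(prompt):][:n_tokens]
```

Because the verifier has the final say at every position, the output matches verifier-only greedy decoding regardless of drafter quality; a better-aligned drafter only changes how many tokens are accepted per round.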
+
+ Key findings include:
+ - **Task Specialization:** Drafters trained on a given domain (e.g., ShareGPT vs. MathInstruct) perform best on the matching benchmarks (MT-Bench vs. reasoning tasks, respectively).
+ - **Specialist Composition:** Specialists are better combined at inference time (via confidence routing or merged-tree verification) than through naive weight-space averaging.
+ - **Routing Signals:** Drafter confidence is a robust signal for routing requests to the most appropriate specialized drafter.
+
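As a minimal illustration of the confidence-routing idea (hypothetical helper names, not the TAPS API): each specialist drafter reports its next-token distribution for the incoming request, and the request is routed to the specialist with the highest top-1 probability.

```python
def route_by_confidence(context, drafters):
    """Toy confidence router (illustrative sketch, not the TAPS API).

    drafters: dict mapping a specialist name to a callable that returns
    a next-token distribution (dict of token -> probability) for context.
    Returns the name of the most confident specialist and its confidence.
    """
    best_name, best_conf = None, -1.0
    for name, next_dist in drafters.items():
        dist = next_dist(context)
        conf = max(dist.values())  # top-1 probability as the confidence signal
        if conf > best_conf:
            best_name, best_conf = name, conf
    return best_name, best_conf
```

A math-specialized drafter is typically far more peaked on arithmetic prompts than a chat drafter, so routing on top-1 probability tends to send each request to the matching specialist.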
+ ## Usage
+ For scripts to train, evaluate, or run inference with these drafters (including HASS and EAGLE-2 variants), please see the official [TAPS GitHub repository](https://github.com/Moe-Zbeeb/TAPS).
+
+ ## Citation
+ ```bibtex
+ @article{zbib2026taps,
+   title={TAPS: Task-Aware Proposal Distributions for Speculative Sampling},
+   author={Zbib, Mohamad and Bazzi, Mohamad and Mohanna, Ammar and Ghanem, Bernard and Hammoud, Hasan Abed Al Kader},
+   year={2026},
+   note={Technical report}
+ }
+ ```