Update README: benchmark positioning and citations
README.md CHANGED
@@ -6,20 +6,23 @@ base_model:
 - Qwen/Qwen2.5-VL-7B-Instruct
 ---
 
-# ViSpec
+# ViSpec-Qwen2.5-VL-7B-Instruct (Benchmark Release)
 
-
+This model repo is part of a **multimodal speculative decoding benchmark suite**.
 
-
-<a href="https://github.com/KangJialiang/ViSpec"><img src="https://img.shields.io/static/v1?label=GitHub&message=Code&color=blue&logo=github"></a>
+## Why this repo exists
 
-
+We maintain a unified benchmark codebase that includes multiple methods (Baseline, EAGLE, EAGLE2, Lookahead, MSD, ViSpec) so users can run training and evaluation more easily under one setup.
 
-
+- The methods are aggregated here for **user convenience** (shared dataset format, scripts, and metrics).
+- The original ideas and implementations belong to their respective authors.
+- This specific Hugging Face repo hosts the **ViSpec-Qwen2.5-VL-7B-Instruct checkpoint** used in our benchmark runs.
 
 ## Citation
 
-If you find our work useful, please consider citing:
+If you use this checkpoint and benchmark, please cite ViSpec and the original methods you compare against.
+
+### ViSpec
 
 ```bibtex
 @inproceedings{vispec,
@@ -28,4 +31,40 @@ If you find our work useful, please consider citing:
   booktitle={Annual Conference on Neural Information Processing Systems},
   year={2025}
 }
-```
+```
+
+### EAGLE / EAGLE2 / EAGLE3
+
+```bibtex
+@inproceedings{li2024eagle,
+  author = {Yuhui Li and Fangyun Wei and Chao Zhang and Hongyang Zhang},
+  title = {{EAGLE}: Speculative Sampling Requires Rethinking Feature Uncertainty},
+  booktitle = {International Conference on Machine Learning},
+  year = {2024}
+}
+
+@inproceedings{li2024eagle2,
+  author = {Yuhui Li and Fangyun Wei and Chao Zhang and Hongyang Zhang},
+  title = {{EAGLE-2}: Faster Inference of Language Models with Dynamic Draft Trees},
+  booktitle = {Empirical Methods in Natural Language Processing},
+  year = {2024}
+}
+
+@inproceedings{li2025eagle3,
+  author = {Yuhui Li and Fangyun Wei and Chao Zhang and Hongyang Zhang},
+  title = {{EAGLE-3}: Scaling up Inference Acceleration of Large Language Models via Training-Time Test},
+  booktitle = {Annual Conference on Neural Information Processing Systems},
+  year = {2025}
+}
+```
+
+### Other integrated baselines (links)
+
+- Lookahead Decoding: https://lmsys.org/blog/2023-11-21-lookahead-decoding/
+- MSD-LLaVA1.5-7B: https://huggingface.co/lucylyn/MSD-LLaVA1.5-7B
+- Medusa: https://github.com/FasterDecoding/Medusa
+
+## Notes
+
+- This model card focuses on benchmark usage and attribution.
+- For full benchmark code and scripts, please refer to the benchmark repository used in your experiment setup.