Add model card README for MSD-Qwen2.5-VL-7B-Instruct

---
license: apache-2.0
library_name: transformers
pipeline_tag: image-text-to-text
base_model:
- Qwen/Qwen2.5-VL-7B-Instruct
tags:
- speculative-decoding
- multimodal
- qwen2-vl
- mmspec
---

# MSD-Qwen2.5-VL-7B-Instruct (Benchmark Release)

This model repo is part of a **multimodal speculative decoding benchmark suite**.

## Why this repo exists

We maintain a unified benchmark codebase that covers multiple methods (Baseline, EAGLE, EAGLE2, Lookahead, MSD, ViSpec) so that users can run training and evaluation under a single, consistent setup. For orientation, a minimal sketch of the draft-and-verify loop these methods share follows the list below.

- The methods are aggregated here for **user convenience** (shared dataset format, scripts, and metrics).
- The original ideas and implementations belong to their respective authors.
- This specific Hugging Face repo hosts the **MSD-Qwen2.5-VL-7B-Instruct checkpoint** used in our benchmark runs.

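All of the methods above instantiate the same high-level idea: a cheap draft proposes several tokens, and the full model verifies them. The toy sketch below is illustrative only (it is not the benchmark's API, and the two stub "models" are placeholders); it shows the greedy accept-longest-matching-prefix loop:

```python
# Toy illustration of greedy speculative decoding. Real methods differ in how
# drafts are produced (extra heads, draft trees, a separate small model), but
# the verify-and-accept loop is the shared core.

def draft_next(tokens, k):
    """Stub draft model: cheaply propose k candidate next tokens."""
    return [(tokens[-1] + i + 1) % 100 for i in range(k)]

def target_next(tokens):
    """Stub target model: the one token the full model would emit next.
    A real implementation scores all drafted positions in a single forward
    pass; we call the stub once per position only for clarity."""
    return (tokens[-1] + 1) % 100

def speculative_generate(prompt, steps=20, k=4):
    tokens = list(prompt)
    while len(tokens) < len(prompt) + steps:
        proposal = draft_next(tokens, k)
        accepted = []
        for tok in proposal:
            expected = target_next(tokens + accepted)
            if tok == expected:
                accepted.append(tok)       # draft agreed: token comes for free
            else:
                accepted.append(expected)  # mismatch: keep target's token, stop
                break
        else:
            # Every draft token was accepted: the verify pass also yields
            # one "bonus" token from the target model.
            accepted.append(target_next(tokens + accepted))
        tokens.extend(accepted)
    return tokens

print(speculative_generate([1, 2, 3]))
```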

## Upstream / Base Model

- Base model: `Qwen/Qwen2.5-VL-7B-Instruct`
- Original MSD Qwen checkpoint: `lucylyn/MSD-Qwen2VL-7B-Instruct`

## What is in this repo

- `config.json`
- `pytorch_model.bin`

This checkpoint is intended to be loaded as the MSD speculative (draft) model alongside the base model above; it is not a standalone replacement for the base model or its processor/tokenizer assets.
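
A minimal loading sketch under stated assumptions: the base model and processor load through standard `transformers` classes, while this repo contributes only the MSD draft weights. How those weights are wrapped into a draft model is defined by the benchmark codebase; the commented `build_msd_model` line is a hypothetical placeholder, not a real API.

```python
import torch
from huggingface_hub import hf_hub_download
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

# Base model and processor come from the upstream Qwen repo.
base = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2.5-VL-7B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-7B-Instruct")

# This repo ships only the MSD draft weights; fetch and load them explicitly.
msd_weights_path = hf_hub_download(
    repo_id="Cloudriver/MSD-Qwen2.5-VL-7B-Instruct",
    filename="pytorch_model.bin",
)
msd_state_dict = torch.load(msd_weights_path, map_location="cpu")

# The benchmark codebase consumes this state dict when it constructs the MSD
# draft model from config.json; the call below is a placeholder for that step.
# msd_model = build_msd_model(config_path=..., state_dict=msd_state_dict)
```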

## Example usage (benchmark codebase)

```bash
python -m evaluation.eval_msd_mmspec \
    --base-model-path Qwen/Qwen2.5-VL-7B-Instruct \
    --msd-model-path Cloudriver/MSD-Qwen2.5-VL-7B-Instruct \
    --data-folder dataset/MMSpec/testmini \
    --answer-file results/mmspec_testmini/msd-temperature-0.jsonl \
    --model-id msd-qwen2.5-vl-7b \
    --temperature 0 \
    --use-msd \
    --total-token -1 \
    --depth 5 \
    --top-k 10
```
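
As a rough reading of the flags (their exact semantics are defined by the benchmark code, so treat this as orientation rather than documentation): `--temperature 0` requests greedy decoding, `--depth` and `--top-k` appear to shape the draft tree (tree depth and candidates per node), and `--total-token -1` appears to let the method choose its draft-token budget automatically.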

## Method references

- MSD-LLaVA checkpoint: https://huggingface.co/lucylyn/MSD-LLaVA1.5-7B
- MSD-Qwen checkpoint: https://huggingface.co/lucylyn/MSD-Qwen2VL-7B-Instruct
- ViSpec: https://arxiv.org/abs/2509.15235
- Lookahead Decoding: https://lmsys.org/blog/2023-11-21-lookahead-decoding/
- Medusa: https://github.com/FasterDecoding/Medusa

## Citation

If you use this checkpoint and benchmark, please cite the original MSD method/checkpoint and the baseline methods you compare against.

### EAGLE / EAGLE2 / EAGLE3

```bibtex
@inproceedings{li2024eagle,
  author    = {Yuhui Li and Fangyun Wei and Chao Zhang and Hongyang Zhang},
  title     = {{EAGLE}: Speculative Sampling Requires Rethinking Feature Uncertainty},
  booktitle = {International Conference on Machine Learning},
  year      = {2024}
}

@inproceedings{li2024eagle2,
  author    = {Yuhui Li and Fangyun Wei and Chao Zhang and Hongyang Zhang},
  title     = {{EAGLE-2}: Faster Inference of Language Models with Dynamic Draft Trees},
  booktitle = {Empirical Methods in Natural Language Processing},
  year      = {2024}
}

@inproceedings{li2025eagle3,
  author    = {Yuhui Li and Fangyun Wei and Chao Zhang and Hongyang Zhang},
  title     = {{EAGLE-3}: Scaling up Inference Acceleration of Large Language Models via Training-Time Test},
  booktitle = {Annual Conference on Neural Information Processing Systems},
  year      = {2025}
}
```

## Notes

- This model card focuses on benchmark usage and attribution.
- For full benchmark code and scripts, please refer to the benchmark repository used in your experiment setup.