Cloudriver committed 20715f9 (verified) · 1 parent: 7b35b7c

Add model card README for MSD-Qwen2.5-VL-7B-Instruct

Files changed (1): README.md (+94, -0)
---
license: apache-2.0
library_name: transformers
pipeline_tag: image-text-to-text
base_model:
- Qwen/Qwen2.5-VL-7B-Instruct
tags:
- speculative-decoding
- multimodal
- qwen2-vl
- mmspec
---

# MSD-Qwen2.5-VL-7B-Instruct (Benchmark Release)

This model repo is part of a **multimodal speculative decoding benchmark suite**.

## Why this repo exists

We maintain a unified benchmark codebase that includes multiple methods (Baseline, EAGLE, EAGLE2, Lookahead, MSD, ViSpec) so users can run training/evaluation more easily under one setup.

- The methods are aggregated here for **user convenience** (shared dataset format, scripts, and metrics).
- The original ideas and implementations belong to their respective authors.
- This specific Hugging Face repo hosts the **MSD-Qwen2.5-VL-7B-Instruct checkpoint** used in our benchmark runs.
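
All of these methods share the same draft-and-verify skeleton: a cheap draft model proposes several tokens, the target model checks them, and only the agreeing prefix is kept. A minimal greedy sketch in plain Python (the `draft_next`/`target_next` callables are toy stand-ins for illustration, not the benchmark's API):

```python
def speculate(prefix, draft_next, target_next, k=5):
    """One draft-and-verify round of greedy speculative decoding.

    draft_next / target_next: callables mapping a token sequence to the
    next token each model would emit (toy stand-ins for real models).
    Returns the tokens accepted this round.
    """
    # 1. Draft model proposes k tokens autoregressively (cheap).
    proposal = []
    seq = list(prefix)
    for _ in range(k):
        tok = draft_next(seq)
        proposal.append(tok)
        seq.append(tok)

    # 2. Target model verifies: keep the longest agreeing prefix.
    # (A real implementation scores all k positions in one forward pass.)
    accepted = []
    seq = list(prefix)
    for tok in proposal:
        expected = target_next(seq)
        if expected != tok:
            accepted.append(expected)  # target's token replaces the miss
            break
        accepted.append(tok)
        seq.append(tok)
    return accepted


# Toy example: both "models" count upward, so every draft is accepted.
print(speculate([0], lambda s: s[-1] + 1, lambda s: s[-1] + 1, k=3))  # [1, 2, 3]
```

When draft and target agree on all `k` positions, one verification step yields `k` tokens instead of one, which is where the speedup comes from.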

## Upstream / Base Model

- Base model: `Qwen/Qwen2.5-VL-7B-Instruct`
- Original MSD Qwen checkpoint: `lucylyn/MSD-Qwen2VL-7B-Instruct`

## What is in this repo

- `config.json`
- `pytorch_model.bin`

This checkpoint is intended to be loaded as the MSD speculative model alongside the base model above; it is not a standalone replacement for the base model and its processor/tokenizer assets.
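
Conceptually, loading attaches the MSD draft weights next to the separately loaded base model rather than overwriting it. A schematic sketch with plain dicts standing in for state dicts (the `draft.` prefix and all key names here are hypothetical, not the checkpoint's actual keys):

```python
def attach_draft_weights(base_state, draft_state, prefix="draft."):
    """Merge draft-model weights into a combined state dict under their
    own namespace, leaving every base-model weight untouched.

    The real benchmark code builds its own MSD wrapper module; this only
    illustrates that the checkpoint supplements the base model instead of
    replacing it.
    """
    combined = dict(base_state)  # base weights kept verbatim
    for name, tensor in draft_state.items():
        combined[prefix + name] = tensor
    return combined


# Dummy stand-ins for the two state dicts.
base = {"model.embed_tokens.weight": "...", "lm_head.weight": "..."}
draft = {"fc.weight": "...", "fc.bias": "..."}
merged = attach_draft_weights(base, draft)
print(sorted(merged))
```

For real use, follow the loading path in the benchmark codebase rather than this sketch.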

## Example usage (benchmark codebase)

```bash
python -m evaluation.eval_msd_mmspec \
    --base-model-path Qwen/Qwen2.5-VL-7B-Instruct \
    --msd-model-path Cloudriver/MSD-Qwen2.5-VL-7B-Instruct \
    --data-folder dataset/MMSpec/testmini \
    --answer-file results/mmspec_testmini/msd-temperature-0.jsonl \
    --model-id msd-qwen2.5-vl-7b \
    --temperature 0 \
    --use-msd \
    --total-token -1 \
    --depth 5 \
    --top-k 10
```
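
The run writes one JSON record per example to the `--answer-file`. A small stdlib sketch for summarizing such a file (the field names `new_tokens` and `wall_time` are assumptions about the record schema; adapt them to the benchmark's real keys):

```python
import json


def summarize(answer_file):
    """Compute aggregate tokens/second from a .jsonl answer file.

    Assumes each line is a JSON object with `new_tokens` (count) and
    `wall_time` (seconds) fields.
    """
    total_tokens = 0.0
    total_time = 0.0
    with open(answer_file) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            rec = json.loads(line)
            total_tokens += rec["new_tokens"]
            total_time += rec["wall_time"]
    return total_tokens / total_time if total_time else 0.0
```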

## Method references

- MSD-LLaVA checkpoint: https://huggingface.co/lucylyn/MSD-LLaVA1.5-7B
- MSD-Qwen checkpoint: https://huggingface.co/lucylyn/MSD-Qwen2VL-7B-Instruct
- ViSpec: https://arxiv.org/abs/2509.15235
- Lookahead Decoding: https://lmsys.org/blog/2023-11-21-lookahead-decoding/
- Medusa: https://github.com/FasterDecoding/Medusa

## Citation

If you use this checkpoint and benchmark, please cite the original MSD method/checkpoint and the baseline methods you compare against.

### EAGLE / EAGLE2 / EAGLE3

```bibtex
@inproceedings{li2024eagle,
  author    = {Yuhui Li and Fangyun Wei and Chao Zhang and Hongyang Zhang},
  title     = {{EAGLE}: Speculative Sampling Requires Rethinking Feature Uncertainty},
  booktitle = {International Conference on Machine Learning},
  year      = {2024}
}

@inproceedings{li2024eagle2,
  author    = {Yuhui Li and Fangyun Wei and Chao Zhang and Hongyang Zhang},
  title     = {{EAGLE-2}: Faster Inference of Language Models with Dynamic Draft Trees},
  booktitle = {Empirical Methods in Natural Language Processing},
  year      = {2024}
}

@inproceedings{li2025eagle3,
  author    = {Yuhui Li and Fangyun Wei and Chao Zhang and Hongyang Zhang},
  title     = {{EAGLE-3}: Scaling up Inference Acceleration of Large Language Models via Training-Time Test},
  booktitle = {Annual Conference on Neural Information Processing Systems},
  year      = {2025}
}
```

## Notes

- This model card focuses on benchmark usage and attribution.
- For full benchmark code and scripts, please refer to the benchmark repository used in your experiment setup.