Cloudriver

Update model card to benchmark-release style

cd45729 verified about 2 months ago

metadata

license: apache-2.0
library_name: transformers
pipeline_tag: image-text-to-text
base_model:
  - llava-hf/llava-1.5-7b-hf

MSD-LLaVA1.5-7B (Benchmark Release)

This model repo is part of a multimodal speculative decoding benchmark suite.

Why this repo exists

We maintain a unified benchmark codebase covering several decoding methods (Baseline, EAGLE, EAGLE2, Lookahead, MSD, ViSpec), so that users can train and evaluate all of them under a single setup.

The methods are aggregated here for convenience, so they share a dataset format, scripts, and metrics.
The original ideas and implementations belong to their respective authors.
This specific Hugging Face repo hosts the MSD-LLaVA1.5-7B checkpoint used in our benchmark runs.
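
This card does not list the shared metrics. As a minimal sketch, assuming the two quantities commonly reported in speculative decoding work (throughput speedup over plain autoregressive decoding, and mean tokens produced per target-model forward pass), a per-method comparison might look like the following. The `RunStats` record, the function names, and the numbers are hypothetical and are not taken from the benchmark code or its results.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    """Counts for one decoding run (hypothetical record layout)."""
    new_tokens: int       # tokens generated during the run
    seconds: float        # wall-clock decoding time
    target_forwards: int  # forward passes of the target model


def tokens_per_second(run: RunStats) -> float:
    return run.new_tokens / run.seconds


def speedup(method: RunStats, baseline: RunStats) -> float:
    """Throughput of a method relative to the autoregressive baseline."""
    return tokens_per_second(method) / tokens_per_second(baseline)


def mean_accepted_length(run: RunStats) -> float:
    """Average number of tokens produced per target-model forward pass."""
    return run.new_tokens / run.target_forwards


# Illustrative numbers only, not measured results.
baseline = RunStats(new_tokens=256, seconds=8.0, target_forwards=256)
msd = RunStats(new_tokens=256, seconds=4.0, target_forwards=100)

print(f"speedup: {speedup(msd, baseline):.2f}x")
print(f"mean accepted length: {mean_accepted_length(msd):.2f}")
```

Under these definitions the autoregressive baseline always has a mean accepted length of 1, which gives a quick sanity check when comparing runs.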

Upstream / Base Model

Base model: llava-hf/llava-1.5-7b-hf
Original MSD checkpoint: lucylyn/MSD-LLaVA1.5-7B
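
For orientation, the sketch below shows one way to fetch the two upstream artifacts named above with `transformers` and `huggingface_hub`. It is an assumption about a typical setup, not the benchmark's own loading path: how the MSD checkpoint is attached to the base model is defined by the benchmark code and is not shown here. The helper names `load_base_model` and `fetch_msd_checkpoint` are hypothetical, and the choice of `float16` is illustrative.

```python
BASE_MODEL_ID = "llava-hf/llava-1.5-7b-hf"
MSD_CHECKPOINT_ID = "lucylyn/MSD-LLaVA1.5-7B"


def load_base_model(model_id: str = BASE_MODEL_ID):
    """Load the LLaVA-1.5 base model and its processor (downloads the weights)."""
    # Imported lazily so the module can be imported without torch/transformers.
    import torch
    from transformers import AutoProcessor, LlavaForConditionalGeneration

    processor = AutoProcessor.from_pretrained(model_id)
    model = LlavaForConditionalGeneration.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # assumption: half precision to fit a single GPU
    )
    return processor, model


def fetch_msd_checkpoint(repo_id: str = MSD_CHECKPOINT_ID) -> str:
    """Download the MSD checkpoint files and return the local directory path."""
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)


if __name__ == "__main__":
    # Both calls need network access and several GB of disk space.
    checkpoint_dir = fetch_msd_checkpoint()
    processor, model = load_base_model()
    print("MSD checkpoint files in:", checkpoint_dir)
```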

Method References

MSD-LLaVA checkpoint: https://huggingface.co/lucylyn/MSD-LLaVA1.5-7B
ViSpec: https://arxiv.org/abs/2509.15235
Lookahead Decoding: https://lmsys.org/blog/2023-11-21-lookahead-decoding/
Medusa: https://github.com/FasterDecoding/Medusa

Citation

If you use this checkpoint or the benchmark, please cite the original MSD method and checkpoint, along with each baseline method you compare against.

EAGLE / EAGLE2 / EAGLE3

@inproceedings{li2024eagle,
  author = {Yuhui Li and Fangyun Wei and Chao Zhang and Hongyang Zhang},
  title = {{EAGLE}: Speculative Sampling Requires Rethinking Feature Uncertainty},
  booktitle = {International Conference on Machine Learning},
  year = {2024}
}

@inproceedings{li2024eagle2,
  author = {Yuhui Li and Fangyun Wei and Chao Zhang and Hongyang Zhang},
  title = {{EAGLE-2}: Faster Inference of Language Models with Dynamic Draft Trees},
  booktitle = {Empirical Methods in Natural Language Processing},
  year = {2024}
}

@inproceedings{li2025eagle3,
  author = {Yuhui Li and Fangyun Wei and Chao Zhang and Hongyang Zhang},
  title = {{EAGLE-3}: Scaling up Inference Acceleration of Large Language Models via Training-Time Test},
  booktitle = {Annual Conference on Neural Information Processing Systems},
  year = {2025}
}

Notes

This model card focuses on benchmark usage and attribution.
For full benchmark code and scripts, please refer to the benchmark repository used in your experiment setup.