---
license: apache-2.0
language:
- zh
- en
base_model:
- lingshu-medical-mllm/Lingshu-7B
pipeline_tag: image-text-to-text
metrics:
- bertscore
- bleu
library_name: transformers
tags:
- medical
---
# EchoVLM (paper implementation)
Official PyTorch implementation of the model described in
**"[EchoVLM: Dynamic Mixture-of-Experts Vision-Language Model for Universal Ultrasound Intelligence](https://arxiv.org/abs/2509.14977)"**.
## 🤖 Model Details
| Item | Value |
|-------------|-------------------------------------------------|
| Paper | [arXiv:2509.14977](https://arxiv.org/abs/2509.14977) |
| Authors     | Chaoyin She, Ruifang Lu, Lida Chen, Wei Wang, Qinghua Huang |
| Code | [GitHub repo](https://github.com/Asunatan/EchoVLM) |
| Model Hub | [Hugging Face](https://huggingface.co/chaoyinshe/EchoVLM) |
## 🔄 Updates
- **Coming soon**: V2 with Chain-of-Thought reasoning and reinforcement-learning enhancements. The full training and inference code, along with the benchmark test set, will be fully open-sourced.
- **Dec 1, 2025**: To better promote development in this field, we have open-sourced our latest instruction-tuned model based on Lingshu-7B. Because Lingshu-7B is itself built on Qwen2.5-VL, it benefits from that ecosystem; for example, it can seamlessly use vLLM for accelerated inference. Model weights are released on [Hugging Face](https://huggingface.co/chaoyinshe/EchoVLM_V2_lingshu_base_7b_instruct_preview).
- **Sep 21, 2025**: The full, uncleaned model codebase is now open-sourced on GitHub!
- **Sep 19, 2025**: Model weights released on [Hugging Face](https://huggingface.co/chaoyinshe/EchoVLM).
- **Sep 17, 2025**: Paper published on [arXiv](https://arxiv.org/abs/2509.14977).
## 🚀 Quick Start
For usage instructions, refer to the [Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct) model card; EchoVLM follows the same interface.
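As a concrete starting point, below is a minimal inference sketch that assumes EchoVLM loads through the standard Qwen2.5-VL `transformers` classes (per the reference above). The model ID matches the Hub link in the Model Details table; the image path, question, and `max_new_tokens` value are placeholders to adapt to your setup.

```python
def build_messages(image_path: str, question: str) -> list:
    """Assemble a single-turn chat in the Qwen2.5-VL message format."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image_path},
                {"type": "text", "text": question},
            ],
        }
    ]


def run_echovlm(image_path: str, question: str) -> str:
    """Load EchoVLM and generate a response for one ultrasound image.

    Downloads the ~7B weights on first call; a GPU is needed for
    practical inference speed.
    """
    from PIL import Image
    from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

    model_id = "chaoyinshe/EchoVLM"
    model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    processor = AutoProcessor.from_pretrained(model_id)

    # Render the chat template, then tokenize text + image together.
    messages = build_messages(image_path, question)
    prompt = processor.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    image = Image.open(image_path)
    inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(
        model.device
    )

    # Generate and decode only the newly produced tokens.
    output_ids = model.generate(**inputs, max_new_tokens=512)
    new_tokens = output_ids[:, inputs["input_ids"].shape[1]:]
    return processor.batch_decode(new_tokens, skip_special_tokens=True)[0]
```

Example usage: `run_echovlm("scan.png", "Describe the findings in this ultrasound image.")`. For higher throughput, the same weights can reportedly be served with vLLM, per the Dec 1 update above.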
## 📌 Citation
If you use this model or code in your research, please cite:
```bibtex
@misc{she2025echovlmdynamicmixtureofexpertsvisionlanguage,
title={EchoVLM: Dynamic Mixture-of-Experts Vision-Language Model for Universal Ultrasound Intelligence},
author={Chaoyin She and Ruifang Lu and Lida Chen and Wei Wang and Qinghua Huang},
year={2025},
eprint={2509.14977},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2509.14977},
}
```