cnxup/Qwen2.5-VL-7B-MLA-stage1-rope32
8B
This is the MHA2MLA-VLM model published with the paper "MHA2MLA-VLM: Enabling DeepSeek's Economical Multi-Head Latent Attention across Vision-Language Models".