Qwen/Qwen3-VL-30B-A3B-Instruct-FP8 Image-Text-to-Text • 31B • Updated 16 days ago • 204k • 90
Qwen/Qwen3-VL-30B-A3B-Instruct Image-Text-to-Text • 31B • Updated 16 days ago • 1.27M • 437
Qwen/Qwen3-VL-30B-A3B-Thinking Image-Text-to-Text • 31B • Updated 16 days ago • 57.4k • 163
Qwen/Qwen3-VL-235B-A22B-Instruct-FP8 Image-Text-to-Text • 236B • Updated 16 days ago • 280k • 32
Qwen/Qwen3-VL-235B-A22B-Thinking-FP8 Image-Text-to-Text • 236B • Updated 16 days ago • 12.4k • 24
Qwen/Qwen3-VL-235B-A22B-Instruct Image-Text-to-Text • 236B • Updated 16 days ago • 110k • 330
Qwen/Qwen3-VL-235B-A22B-Thinking Image-Text-to-Text • 236B • Updated 16 days ago • 6.23k • 341
Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning Paper • 2506.07044 • Published Jun 8 • 114
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14 • 303
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs Paper • 2406.07476 • Published Jun 11, 2024 • 37
LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization Paper • 2502.13922 • Published Feb 19 • 28