Qwen/Qwen3-VL-30B-A3B-Instruct-FP8 Image-Text-to-Text • 31B • Updated Nov 26, 2025 • 265k • 97
Qwen/Qwen3-VL-30B-A3B-Instruct Image-Text-to-Text • 31B • Updated Nov 26, 2025 • 887k • 524
Qwen/Qwen3-VL-30B-A3B-Thinking Image-Text-to-Text • 31B • Updated Nov 26, 2025 • 410k • 186
Qwen/Qwen3-VL-235B-A22B-Instruct-FP8 Image-Text-to-Text • 236B • Updated Nov 26, 2025 • 166k • 37
Qwen/Qwen3-VL-235B-A22B-Thinking-FP8 Image-Text-to-Text • 236B • Updated Nov 26, 2025 • 19.2k • 24
Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning Paper • 2506.07044 • Published Jun 8, 2025 • 113
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14, 2025 • 306
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs Paper • 2406.07476 • Published Jun 11, 2024 • 36
LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization Paper • 2502.13922 • Published Feb 19, 2025 • 27