yuxinlu1/gemma-4-12B-agentic-fable5-composer2.5-v2-3.5x-tau2-GGUF Text Generation • 12B • Updated 7 days ago • 187k • 672
Loc3R-VLM: Language-based Localization and 3D Reasoning with Vision-Language Models Paper • 2603.18002 • Published Mar 18 • 15 • 3
Qwen/Qwen3-VL-30B-A3B-Instruct Image-Text-to-Text • 31B • Updated Nov 26, 2025 • 593k • 580
meta-llama/Llama-3.2-11B-Vision-Instruct Image-Text-to-Text • 11B • Updated Dec 4, 2024 • 46.9k • 1.61k