Vision LLM - a SamoXXX Collection

Vision LLM

updated 24 days ago

A collection of the best vision LLMs, gathered to study and learn from.

rhymes-ai/Aria

Image-Text-to-Text • 25B • Updated Apr 23, 2025 • 120k • 638
microsoft/OmniParser

Image-Text-to-Text • Updated Dec 2, 2024 • 268 • 1.71k
jadechoghari/Ferret-UI-Gemma2b

Image-Text-to-Text • 3B • Updated Oct 18, 2024 • 139 • 52
jadechoghari/Ferret-UI-Llama8b

Image-Text-to-Text • 8B • Updated Jan 8, 2025 • 25 • 68
gpt-omni/mini-omni2

Any-to-Any • Updated Oct 24, 2024 • 114 • 285
mPLUG/DocOwl2

Image-Text-to-Text • 9B • Updated Sep 27, 2024 • 151 • 116
google/siglip-so400m-patch16-256-i18n

Zero-Shot Image Classification • 1B • Updated Nov 18, 2024 • 1.79k • 31
openvla/openvla-7b

Robotics • 8B • Updated Feb 17 • 1.75M • 234
NexaAI/OmniVLM-968M

0.5B • Updated Aug 20, 2025 • 995 • 531
Qwen/Qwen2.5-VL-7B-Instruct

Image-Text-to-Text • 8B • Updated Apr 6, 2025 • 9.59M • 1.6k
ByteDance-Seed/UI-TARS-7B-SFT

Image-Text-to-Text • 8B • Updated Jan 25, 2025 • 1.58k • 180
moonshotai/Kimi-VL-A3B-Instruct

Image-Text-to-Text • 16B • Updated Jan 30 • 322k • 270
reducto/RolmOCR

Image-Text-to-Text • 8B • Updated Apr 2, 2025 • 218k • 587
nvidia/LocateAnything-3B

Image-Text-to-Text • 4B • Updated 15 days ago • 570k • 2.4k
Hcompany/Holo-3.1-4B

Image-Text-to-Text • 5B • Updated 1 day ago • 4.84k • 83