vision - a shobbs Collection

vision

updated 5 days ago

google/paligemma2-28b-pt-896

Image-Text-to-Text • 28B • Updated Dec 5, 2024 • 451 • 52
lmstudio-community/olmOCR-7B-0225-preview-GGUF

Image-Text-to-Text • 8B • Updated Feb 25, 2025 • 431 • 13
vidore/colqwen2.5-v0.2

Visual Document Retrieval • Updated Jun 16, 2025 • 40.6k • 99
vidore/colpali-v1.3

Visual Document Retrieval • Updated Mar 14, 2025 • 17.6k • 99
vidore/colSmol-500M

Visual Document Retrieval • Updated Mar 14, 2025 • 1.5k • 22
deepseek-ai/deepseek-vl2

Image-Text-to-Text • 27B • Updated Dec 18, 2024 • 2.75k • 388
gen2seg: Generative Models Enable Generalizable Instance Segmentation

Sleeping • Agents • 6
A demo of our gen2seg SD and MAE-H models.
nvidia/NitroGen

Reinforcement Learning • Updated Feb 5 • 562
naver-hyperclovax/HyperCLOVAX-SEED-Omni-8B

Text Generation • 11B • Updated Jan 6 • 389 • 192
microsoft/Phi-4-reasoning-vision-15B

Image-Text-to-Text • 15B • Updated Mar 18 • 6.31k • 175
nvidia/asset-harvester

Image-to-3D • Updated 13 days ago • 600 • 44
facebook/sam3.1

Mask Generation • Updated Mar 27 • 84.7k • 495
MeshWeaver: Sparse-Voxel-Guided Surface Weaving for Autoregressive Mesh Generation

Paper • 2606.04688 • Published Jun 3 • 5
microsoft/Mage-VL

Image-Text-to-Text • 5B • Updated 5 days ago • 431k • 211