Models: 289

Active filters: VLM

NemoStation/Marlin-2B

Video-Text-to-Text • 2B • Updated about 15 hours ago • 15.8k • 454

numind/NuMarkdown-8B-Thinking

Image-to-Text • 8B • Updated 11 days ago • 38.5k • 472

lunahr/Marlin-2B-ungated

Video-Text-to-Text • 2B • Updated 9 days ago • 1.21k • 5

nvidia/NVIDIA-Nemotron-Parse-v1.2

Image-Text-to-Text • 0.9B • Updated 25 days ago • 141k • 37

mPLUG/ToolCUA-8B

Image-Text-to-Text • 9B • Updated 18 days ago • 112 • 5

nvidia/Eagle2-2B

Image-Text-to-Text • 2B • Updated Apr 27, 2025 • 563 • 34

nvidia/Eagle2-1B

Image-Text-to-Text • 1B • Updated Apr 27, 2025 • 2.38k • 30

nvidia/VILA-HD-8B-PS3-1.5K-SigLIP

Image-Text-to-Text • Updated Jul 30, 2025 • 58 • 4

nvidia/VILA-HD-8B-PS3-4K-SigLIP

Image-Text-to-Text • Updated Jul 30, 2025 • 62 • 2

nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1

Image-Text-to-Text • 9B • Updated Dec 4, 2025 • 1.21M • 179

nvidia/VILA-HD-8B-PS3-1.5K-SigLIP2

Image-Text-to-Text • Updated Jul 30, 2025 • 487 • 1

nvidia/VILA-HD-8B-PS3-4K-SigLIP2

Image-Text-to-Text • Updated Jul 30, 2025 • 55 • 3

nvidia/VILA-HD-8B-PS3-1.5K-C-RADIOv2

Image-Text-to-Text • Updated Jul 30, 2025 • 58 • 1

nvidia/VILA-HD-8B-PS3-4K-C-RADIOv2

Image-Text-to-Text • Updated Jul 30, 2025 • 60 • 1

nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16

Image-Text-to-Text • 13B • Updated Dec 2, 2025 • 161k • 83

nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-NVFP4-QAD

Image-Text-to-Text • 8B • Updated 25 days ago • 4.17k • 27

HongxinLi/GoClick-Large

Image-Text-to-Text • 0.8B • Updated 21 days ago • 179 • 1

mradermacher/ToolCUA-8B-GGUF

8B • Updated 17 days ago • 779 • 2

adnankhan-11/VisionNav-3B

4B • Updated 15 days ago • 123 • 1

mradermacher/VisionNav-3B-GGUF

3B • Updated 12 days ago • 547 • 1

rohanshad/cmr_c0.1

Updated Mar 25 • 3

Efficient-Large-Model/VILA-13b

Text Generation • 13B • Updated Mar 4, 2024 • 24 • 20

Efficient-Large-Model/VILA-7b

Text Generation • 7B • Updated Mar 4, 2024 • 602 • 27

Efficient-Large-Model/VILA-7b-4bit-awq

Text Generation • Updated Mar 4, 2024 • 14 • 2

Efficient-Large-Model/VILA-13b-4bit-awq

Text Generation • Updated Mar 4, 2024 • 13 • 2

Efficient-Large-Model/VILA-2.7b

Text Generation • 3B • Updated Mar 4, 2024 • 139 • 15

TIGER-Lab/Mantis-bakllava-7b

Image-Text-to-Text • 8B • Updated May 18, 2024 • 50 • 5

TIGER-Lab/Mantis-llava-7b

Image-Text-to-Text • 7B • Updated May 18, 2024 • 23 • 16

Efficient-Large-Model/VILA1.5-3b

Text Generation • Updated Jul 18, 2024 • 1.58k • 34

Efficient-Large-Model/VILA1.5-13b

Text Generation • Updated Jul 18, 2024 • 257 • 5