Models: 1,916

Active filters: nvidia

alexcovo/NVIDIA-Nemotron-Nano-12B-v2-Q4_K_M-GGUF

Text Generation • 12B • Updated Aug 31, 2025 • 6

cyankiwi/NVIDIA-Nemotron-Nano-9B-v2-AWQ-4bit

Text Generation • 2B • Updated Aug 31, 2025 • 649 • 3

cyankiwi/NVIDIA-Nemotron-Nano-12B-v2-AWQ-4bit

Text Generation • 3B • Updated Sep 13, 2025 • 2.64k • 4

cyankiwi/NVIDIA-Nemotron-Nano-12B-v2-AWQ-8bit

Text Generation • 4B • Updated Sep 13, 2025 • 1.97k • 1

cyankiwi/NVIDIA-Nemotron-Nano-9B-v2-AWQ-8bit

Text Generation • 3B • Updated Aug 31, 2025 • 142

QuantFactory/OpenReasoning-Nemotron-7B-GGUF

Text Generation • 8B • Updated Sep 1, 2025 • 58 • 2

nvidia/Qwen3-235B-A22B-Thinking-2507-Eagle3

Text Generation • 0.3B • Updated Mar 10 • 144 • 2

nvidia/Qwen3-30B-A3B-Thinking-2507-Eagle3

Text Generation • 0.1B • Updated Mar 10 • 42 • 4

Lumia101/NVIDIA-Nemotron-Nano-9B-v2-Q4_K_M-GGUF

Text Generation • 9B • Updated Nov 28, 2025 • 78

mlx-community/NVIDIA-Nemotron-Nano-9B-v2-6bit

Text Generation • Updated Sep 5, 2025 • 300 • 1

Mungert/NVIDIA-Nemotron-Nano-12B-v2-GGUF

Text Generation • 12B • Updated Sep 24, 2025 • 213 • 2

nvidia/Phi-4-multimodal-instruct-NVFP4

4B • Updated Sep 5, 2025 • 1.8k • 12

nvidia/Phi-4-multimodal-instruct-FP8

6B • Updated Sep 5, 2025 • 230 • 7

nvidia/Phi-4-reasoning-plus-FP8

15B • Updated Sep 15, 2025 • 129 • 7

nvidia/Phi-4-reasoning-plus-NVFP4

8B • Updated Sep 5, 2025 • 614k • 11

nvidia/Cosmos-Predict2.5-14B

Updated 10 days ago • 11.4k • 31

DBMe/Llama-3_1-Nemotron-Ultra-253B-v1-exl3-2.7bpw

Text Generation • 45B • Updated Sep 6, 2025 • 2

maxrubin629/Nemotron-H-8B-Reasoning-128K-6bit

Text Generation • 8B • Updated Sep 6, 2025 • 12

straino/NVIDIA-Nemotron-Nano-9B-v2-Base-Q4_K_M-GGUF

Text Generation • 9B • Updated Sep 8, 2025 • 16 • 1

NexVeridian/NVIDIA-Nemotron-Nano-9B-v2-3bit

Text Generation • 1B • Updated Sep 8, 2025 • 94

NexVeridian/NVIDIA-Nemotron-Nano-12B-v2-3bit

Text Generation • 12B • Updated Sep 8, 2025 • 50

nvidia/Qwen3-8B-NVFP4

Text Generation • 5B • Updated Sep 9, 2025 • 184k • 19

nvidia/Qwen3-8B-FP8

Text Generation • 8B • Updated Sep 9, 2025 • 4.34k • 6

nvidia/Qwen3-14B-NVFP4

Text Generation • 8B • Updated Sep 9, 2025 • 48.7k • 12

nvidia/Qwen3-14B-FP8

Text Generation • 15B • Updated Sep 9, 2025 • 6.04k • 6

nvidia/Qwen3-32B-NVFP4

Text Generation • 17B • Updated Sep 9, 2025 • 254k • 17

nvidia/Qwen2.5-VL-7B-Instruct-FP8

Text Generation • 8B • Updated Dec 12, 2025 • 823 • 8

nvidia/Qwen2.5-VL-7B-Instruct-NVFP4

Text Generation • 5B • Updated Dec 6, 2025 • 8.26k • 16

SandLogicTechnologies/OpenReasoning-Nemotron-1.5B-GGUF

Text Generation • 2B • Updated Sep 11, 2025 • 39

SandLogicTechnologies/OpenReasoning-Nemotron-7B-GGUF

Text Generation • 8B • Updated Sep 11, 2025 • 11