Models: 1,914

Active filters: nvidia

mlx-community/NVIDIA-Nemotron-3-Nano-30B-A3B-MLX-BF16

Text Generation • 32B • Updated Dec 17, 2025 • 230

mlx-community/NVIDIA-Nemotron-3-Nano-30B-A3B-MLX-8Bit

Text Generation • 32B • Updated Dec 17, 2025 • 207 • 2

mlx-community/NVIDIA-Nemotron-3-Nano-30B-A3B-MLX-6Bit

Text Generation • 32B • Updated Dec 17, 2025 • 97 • 2

mlx-community/NVIDIA-Nemotron-3-Nano-30B-A3B-MLX-5Bit

Text Generation • 32B • Updated Dec 17, 2025 • 185

mlx-community/NVIDIA-Nemotron-3-Nano-30B-A3B-MLX-4Bit

Text Generation • 32B • Updated Dec 17, 2025 • 217 • 1

NikolayKozloff/Nemotron-Cascade-8B-Q8_0-GGUF

Text Generation • 8B • Updated Dec 17, 2025 • 2 • 1

NikolayKozloff/Nemotron-Cascade-8B-Thinking-Q8_0-GGUF

Text Generation • 8B • Updated Dec 17, 2025 • 7 • 1

NikolayKozloff/Nemotron-Cascade-14B-Thinking-Q5_K_S-GGUF

Text Generation • 15B • Updated Dec 17, 2025 • 11 • 1

NikolayKozloff/Nemotron-Cascade-14B-Thinking-Q4_K_M-GGUF

Text Generation • 15B • Updated Dec 17, 2025 • 9 • 1

NikolayKozloff/Nemotron-Cascade-14B-Thinking-Q4_K_S-GGUF

Text Generation • 15B • Updated Dec 17, 2025 • 10 • 1

smcleod/Nemotron-Cascade-14B-Thinking-mlx-6Bit

Text Generation • 15B • Updated Dec 17, 2025 • 24 • 1

smcleod/Nemotron-Cascade-8B-mlx-6Bit

Text Generation • 8B • Updated Dec 17, 2025 • 5

yueqis/NVIDIA-Nemotron-Nano-9B-v2

Text Generation • 9B • Updated Dec 17, 2025 • 146

SiddhJagani/NVIDIA-Nemotron-Nano-12B-v2-mlx-Q8

Text Generation • 12B • Updated Dec 17, 2025 • 27

SiddhJagani/NVIDIA-Nemotron-Nano-12B-v2-mlx-Q6

Text Generation • 12B • Updated Dec 26, 2025 • 132

cybermotaz/nemotron3-nano-nvfp4-w4a16

Text Generation • 18B • Updated Dec 18, 2025 • 3.33k • 13

cyankiwi/Nemotron-Cascade-14B-Thinking-AWQ-4bit

Text Generation • 4B • Updated Dec 18, 2025 • 856 • 1

cyankiwi/Nemotron-Cascade-14B-Thinking-AWQ-8bit

Text Generation • 5B • Updated Dec 18, 2025 • 28

cyankiwi/Nemotron-Cascade-8B-Thinking-AWQ-4bit

Text Generation • 2B • Updated Dec 18, 2025 • 212

cyankiwi/Nemotron-Cascade-8B-Thinking-AWQ-8bit

Text Generation • 3B • Updated Dec 18, 2025 • 9

cybermotaz/qwen3-vl-2b-thinking-nvfp4-w4a16

Image-Text-to-Text • 2B • Updated Dec 18, 2025 • 7 • 1

cybermotaz/qwen3-vl-4b-thinking-nvfp4-w4a16

Image-Text-to-Text • 3B • Updated Dec 18, 2025 • 138 • 1

cybermotaz/qwen3-vl-8b-thinking-nvfp4-w4a16

Image-Text-to-Text • 5B • Updated Dec 18, 2025 • 51 • 2

cyankiwi/Nemotron-Cascade-8B-AWQ-4bit

Text Generation • 2B • Updated Dec 18, 2025 • 114 • 1

cyankiwi/Nemotron-Cascade-8B-AWQ-8bit

Text Generation • 3B • Updated Dec 18, 2025 • 6 • 1

Ex0bit/Elbaz-NVIDIA-Nemotron-3-Nano-30B-A3B-PRISM

Text Generation • 32B • Updated Jan 23 • 610 • 30

Ex0bit/Elbaz-NVIDIA-Nemotron-3-Nano-30B-A3B-PRISM-NVFP4

Text Generation • 16B • Updated Dec 18, 2025 • 337 • 6

nvidia/Nemotron-Cascade-8B-Intermediate-ckpts

Text Generation • Updated Dec 19, 2025 • 13

introvoyz041/Nemotron-Cascade-8B-mlx-4Bit

Text Generation • 1B • Updated Dec 19, 2025 • 10

introvoyz041/Nemotron-Cascade-14B-Thinking-mlx-4Bit

Text Generation • 15B • Updated Dec 19, 2025 • 4