Models: 1,916

Active filters: nvidia

etsien/Llama-3.1-Nemotron-70B-Instruct-HF-GPTQ-W4A8

Text Generation • 71B • Updated Aug 19, 2025 • 5

jet-ai/Jet-Nemotron-2B

Text Generation • Updated Sep 28, 2025 • 170 • 17

jet-ai/Jet-Nemotron-4B

Text Generation • 4B • Updated Sep 28, 2025 • 131 • 30

tensorblock/nvidia_OpenReasoning-Nemotron-14B-GGUF

Text Generation • 15B • Updated Jan 27 • 19

nvidia/NVIDIA-Nemotron-Nano-12B-v2

Text Generation • 12B • Updated Nov 25, 2025 • 183k • 165

weathermanj/Nemotron-nano-9b-fp8

Text Generation • 9B • Updated Aug 29, 2025 • 14 • 6

LogicBombaklot/Llama-3_3-Nemotron-Super-49B-v1_5-mlx-8Bit

Text Generation • 50B • Updated Aug 21, 2025 • 49 • 1

JaHyoun/Llama-3.1-Nemotron-Nano-8B-v1-Q4_K_M-GGUF

Text Generation • 8B • Updated Aug 22, 2025 • 1

Zaixi/STELLA-VLM-JoVE-7B

Image-Text-to-Text • 8B • Updated Aug 24, 2025 • 3

mradermacher/STELLA-VLM-JoVE-7B-GGUF

8B • Updated Aug 25, 2025 • 79 • 1

mradermacher/STELLA-VLM-JoVE-7B-i1-GGUF

8B • Updated Dec 23, 2025 • 264 • 2

etsien/Llama-3.1-Nemotron-70B-Instruct-HF-GPTQ-w4a8-v2

Text Generation • 71B • Updated Aug 27, 2025 • 5

jonlizardo/affine-gpt-oss-120b-light

Text Generation • 0.2B • Updated Aug 27, 2025 • 3

Zaixi/STELLA-VLM-FineBio-7B

Image-Text-to-Text • 8B • Updated Aug 28, 2025 • 7 • 4

etsien/Llama-3_3-Nemotron-Super-49B-v1_5-GPTQ-w4a8

Text Generation • 50B • Updated Aug 28, 2025 • 172

dominguesm/NVIDIA-Nemotron-Nano-9B-v2-GGUF

Text Generation • 9B • Updated Aug 30, 2025 • 315 • 1

weathermanj/NVIDIA-Nemotron-Nano-9B-v2-gguf

Text Generation • 9B • Updated Aug 29, 2025 • 333 • 1

mlx-community/NVIDIA-Nemotron-Nano-9B-v2-4bits

Text Generation • Updated Aug 28, 2025 • 573 • 2

mradermacher/STELLA-VLM-FineBio-7B-GGUF

8B • Updated Aug 29, 2025 • 75 • 1

nvidia/gliner-PII

Token Classification • Updated Dec 7, 2025 • 5.73k • 139

Triago/NVIDIA-Nemotron-Nano-12B-v2-Q8_0-GGUF

Text Generation • 12B • Updated Aug 30, 2025 • 2

gabriellarson/NVIDIA-Nemotron-Nano-12B-v2-GGUF

Text Generation • 12B • Updated Aug 30, 2025 • 237

QuantFactory/NVIDIA-Nemotron-Nano-9B-v2-GGUF

Text Generation • Updated Aug 30, 2025 • 119 • 4

jesusoctavioas/Llama-3.1-Nemotron-Nano-8B-v1-mlx-4Bit

Text Generation • 1B • Updated Aug 30, 2025 • 36

NikolayKozloff/NVIDIA-Nemotron-Nano-12B-v2-Q6_K-GGUF

Text Generation • 12B • Updated Aug 30, 2025 • 1 • 2

QuantFactory/NVIDIA-Nemotron-Nano-12B-v2-GGUF

Text Generation • Updated Aug 30, 2025 • 204 • 2

NikolayKozloff/NVIDIA-Nemotron-Nano-12B-v2-Q5_K_M-GGUF

Text Generation • 12B • Updated Aug 30, 2025 • 16 • 1

NikolayKozloff/NVIDIA-Nemotron-Nano-12B-v2-Q5_K_S-GGUF

Text Generation • 12B • Updated Aug 30, 2025 • 5 • 1

NikolayKozloff/NVIDIA-Nemotron-Nano-12B-v2-Q4_K_M-GGUF

Text Generation • 12B • Updated Aug 30, 2025 • 6 • 1

Neural-Hacker/Qwen3-Math-Reasoning-LoRA

Text Generation • Updated 25 days ago • 2