Models: 1,915

Active filters: nvidia

introvoyz041/Nemotron-Cascade-14B-Thinking-mlx-4Bit

Text Generation • 15B • Updated Dec 19, 2025 • 4

introvoyz041/Nemotron-Cascade-8B-Thinking-mlx-4Bit

Text Generation • 1B • Updated Dec 19, 2025 • 4

Stan31/quantumflow-prototypes

Updated Dec 19, 2025

introvoyz041/Cosmos-Reason1-7B-mlx-4Bit

Image-Text-to-Text • 1B • Updated Dec 19, 2025 • 4

Jong-Seong/qwen3-next-gb10-guide

mradermacher/Nemotron-Cascade-8B-GGUF

8B • Updated Dec 21, 2025 • 47

mradermacher/Nemotron-Cascade-8B-Thinking-GGUF

8B • Updated Dec 21, 2025 • 62

mradermacher/Nemotron-Cascade-14B-Thinking-GGUF

15B • Updated Dec 21, 2025 • 70

mradermacher/Qwen3-Nemotron-235B-A22B-GenRM-i1-GGUF

235B • Updated Dec 22, 2025 • 151

mradermacher/Nemotron-Cascade-8B-i1-GGUF

8B • Updated Dec 21, 2025 • 2.22k

mradermacher/Nemotron-Cascade-8B-Thinking-i1-GGUF

8B • Updated Dec 21, 2025 • 121

mradermacher/Nemotron-Cascade-14B-Thinking-i1-GGUF

15B • Updated Dec 21, 2025 • 165 • 2

Mungert/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16-GGUF

Text Generation • 32B • Updated Dec 22, 2025 • 958 • 1

Edge-Quant/OpenReasoning-Nemotron-1.5B-Q4_K_M-GGUF

Text Generation • 2B • Updated Dec 22, 2025 • 25

Edge-Quant/AceReason-Nemotron-1.1-7B-Q4_K_M-GGUF

Text Generation • 8B • Updated Dec 22, 2025 • 3

RedHatAI/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8

Text Generation • 32B • Updated Apr 28 • 1.06k • 4

nvidia/Qwen3-VL-235B-A22B-Instruct-NVFP4

119B • Updated Mar 4 • 4.23k • 5

TevunahAi/Nemotron-3-Nano-30B-A3B-GPTQ

Text Generation • 6B • Updated Dec 25, 2025 • 59 • 2

mradermacher/Cascade-Droidz-GGUF

15B • Updated Dec 25, 2025 • 55 • 1

mradermacher/Cascade-Droidz-i1-GGUF

15B • Updated Dec 25, 2025 • 55 • 1

SiddhJagani/NVIDIA-Nemotron-Nano-12B-v2-mlx-Q4

Text Generation • 12B • Updated Dec 30, 2025 • 137

huihui-ai/Huihui-NVIDIA-Nemotron-Nano-9B-v2-abliterated

Text Generation • 9B • Updated Dec 29, 2025 • 205 • 2

nasedkinpv/parakeet-tdt-0.6b-v3-onnx-int8

Automatic Speech Recognition • Updated Dec 29, 2025 • 3

mradermacher/Huihui-NVIDIA-Nemotron-Nano-9B-v2-abliterated-GGUF

9B • Updated Jan 5 • 266 • 2

devisri050/OpenMath-Nemotron-1.5B-Q8_0-GGUF

Text Generation • 2B • Updated Dec 30, 2025 • 4

devisri050/OpenMath-Nemotron-1.5B-Q4_0-GGUF

Text Generation • 2B • Updated Dec 30, 2025 • 1

devisri050/OpenMath-Nemotron-1.5B-Q4_K_S-GGUF

Text Generation • 2B • Updated Dec 30, 2025 • 7

devisri050/OpenMath-Nemotron-1.5B-Q4_K_M-GGUF

Text Generation • 2B • Updated Dec 30, 2025 • 1

SiddhJagani/Nemotron-Cascade-14B-Thinking-Claude-4.5-Opus-High-Reasoning-Distill-mlx-Q6

Text Generation • 15B • Updated Dec 30, 2025 • 9

SiddhJagani/Nemotron-Cascade-14B-Thinking-Claude-4.5-Opus-High-Reasoning-Distill-mlx-Q3

Text Generation • 15B • Updated Dec 30, 2025 • 15