Active filters: modelopt
baseten-admin/glm-4.6-fp8
353B • Updated • 1
baseten-admin/glm-4.6-fp4-mlp
183B • Updated • 22
shinedays1993/Qwen3-30B-A3B-nvfp4
16B • Updated • 4
shinedays1993/Qwen3-32B-nvfp4
17B • Updated • 3
Beambutbetter/Deepseek-V2-Lite-16B-NVFP4
Text Generation • 8B • Updated • 15 • 3
ramblingpolymath/Qwen3-4B-Instruct-2507
2B • Updated • 2
literid/Qwen3-Coder-480B-A35B-Instruct_nvfp4_kv_fp8
241B • Updated • 5
DevQuasar/DeepSeek-R1-Distill-Llama-8B_nvfp4
Text Generation • 5B • Updated • 13
DevQuasar/Qwen.Qwen3-4B-Thinking-2507_nvfp4
Text Generation • 2B • Updated • 5
nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-NVFP4-QAD
Image-Text-to-Text • 8B • Updated • 10.9k • 28
177B • Updated • 17 • 6
guerilla7/Foundation-Sec-8B-Instruct-NVFP4-quantized
5B • Updated
jiangchengchengNLP/L3.3-MS-Nevoria-70b-NVFP4-ONLY-MLP
42B • Updated • 2
jiangchengchengNLP/L3.3-MS-Nevoria-70b-NVFP4_A8
36B • Updated • 10
johnnyeric/Huihui-Qwen3-30B-A3B-Instruct-2507-abliterated-fp4
16B • Updated • 4
jiangchengchengNLP/L3.3-MS-Nevoria-70b-NVFP4_AWQ
36B • Updated • 5
Ex0bit/Qwen3-VLTO-32B-Instruct-NVFP4
Text Generation • 17B • Updated • 1.69k • 1
mdavidson83/Qwen3-Embedding-4B_nvfp4_hf
Updated • 135
Ex0bit/Qwen3-VLTO-32B-Instruct-NVFP4-256K
Text Generation • 17B • Updated • 159 • 1
Image-Text-to-Text • 13B • Updated • 25
lukealonso/MiniMax-M2-NVFP4
115B • Updated • 7 • 14
Text Generation • 7B • Updated • 5 • 1
leatan95/Tongyi-DeepResearch-30B-A3B-NVFP4
16B • Updated • 4
DataSnake/Wayfarer-12B-NVFP4
Text Generation • 7B • Updated • 7 • 1
DataSnake/Wayfarer-2-12B-NVFP4
Text Generation • 7B • Updated • 5 • 2
Ex0bit/OLMo-3-7B-Instruct-NVFP4-1M
Text Generation • 4B • Updated • 8 • 2
wangqia0309/Captain-Eris_Violet-V0.420-12B-FP8-KV-modelopt
12B • Updated • 4
rahtml/Qwen3-Coder-30B-A3B-Instruct-NVFP4
16B • Updated • 7
nvidia/Kimi-K2-Thinking-NVFP4
Text Generation • Updated • 32.5k • 29
zhuyksir/qwen3_30b_a3b_nvfp4_baseline
16B • Updated • 3