Active filters: modelopt
zhuyksir/qwen3_30b_a3b_nvfp4_qat • 16B • Updated • 5
alphatozeta/sglang_glm_4_6_fp4_modelopt • 177B • Updated • 3
ericlewis/Nemotron-Orchestrator-8B-NVFP4 • Text Generation • 5B • Updated • 5
nvidia/Qwen3-Next-80B-A3B-Instruct-NVFP4 • Text Generation • Updated • 44.7k • 41
trithemius/Velvet-14B-nvfp4 • 8B • Updated • 3
OPENZEKA/Qwen3-4B-Instruct-2507-NVFP4 • 2B • Updated • 158
Z841973620/Qwen3-30B-A3B-NVFP4 • Text Generation • 16B • Updated • 84
Z841973620/Qwen3-30B-A3B-FP8 • Text Generation • 31B • Updated • 3
OPENZEKA/Qwen3-Coder-30B-A3B-Instruct-NVFP4 • Text Generation • 16B • Updated • 162
josephdowling10/Mixtral-8x7B-Instruct-v0.1-NVFP4 • Text Generation • 23B • Updated • 58
taharmasmaliyev07/Llama-2-7b-hf-fp8 • 7B • Updated • 2
OPENZEKA/Qwen3-Coder-480B-A35B-Instruct-NVFP4 • 241B • Updated • 16
Shifusen/Llama-3.3-70B-Instruct-abliterated-NVFP4-modelopt • 36B • Updated • 29
taharmasmaliyev07/Mistral-7B-v0.1-fp8 • 7B • Updated • 2
taharmasmaliyev07/Llama-3.1-8B-fp8 • 8B • Updated • 1
taharmasmaliyev07/gemma-2-9b-it-fp8 • 9B • Updated • 2
cybermotaz/qwen3-vl-2b-thinking-nvfp4-w4a16 • Image-Text-to-Text • 2B • Updated • 6 • 1
cybermotaz/qwen3-vl-4b-thinking-nvfp4-w4a16 • Image-Text-to-Text • 3B • Updated • 343 • 1
cybermotaz/qwen3-vl-8b-thinking-nvfp4-w4a16 • Image-Text-to-Text • 5B • Updated • 35 • 2
CedricHwang/qwen2.5-0.5b-modelopt-fp8-pc-pt • Text Generation • 0.5B • Updated • 41
CedricHwang/qwen2.5-0.5b-modelopt-fp8-pb-wo • 0.5B • Updated • 28
stepnoy/gpt-oss-120b-NVFP4 • 117B • Updated • 54
baseten-admin/glm-4.7-fp4 • 183B • Updated • 951
Text Generation • 177B • Updated • 497 • 16
ericlewis/functiongemma-270m-it-nvfp4 • 0.2B • Updated • 3
cybermotaz/Qwen3-VL-32B-Instruct-NVFP4 • Image-Text-to-Text • 18B • Updated • 566
baseten-admin/glm-4.7-fp4-fp4kv • 177B • Updated • 2
Text Generation • 177B • Updated • 75 • 6
lukealonso/MiniMax-M2.1-NVFP4 • 115B • Updated • 258 • 24
nvidia/Qwen3-235B-A22B-Thinking-2507-NVFP4 • Text Generation • 120B • Updated • 1.37k • 8