Models: 741

Active filters: modelopt

alphatozeta/sglang_glm_4_6_fp4_modelopt

177B • Updated Dec 5, 2025 • 3

ericlewis/Nemotron-Orchestrator-8B-NVFP4

Text Generation • 5B • Updated Dec 9, 2025 • 5

nvidia/Qwen3-Next-80B-A3B-Instruct-NVFP4

Text Generation • Updated Feb 9 • 44.7k • 41

trithemius/Velvet-14B-nvfp4

8B • Updated Dec 11, 2025 • 3

OPENZEKA/Qwen3-4B-Instruct-2507-NVFP4

2B • Updated Dec 25, 2025 • 158

Z841973620/Qwen3-30B-A3B-NVFP4

Text Generation • 16B • Updated Dec 15, 2025 • 84

Z841973620/Qwen3-30B-A3B-FP8

Text Generation • 31B • Updated Dec 15, 2025 • 3

OPENZEKA/Qwen3-Coder-30B-A3B-Instruct-NVFP4

Text Generation • 16B • Updated Jan 13 • 162

josephdowling10/Mixtral-8x7B-Instruct-v0.1-NVFP4

Text Generation • 23B • Updated Dec 16, 2025 • 58

taharmasmaliyev07/Llama-2-7b-hf-fp8

7B • Updated Dec 17, 2025 • 2

OPENZEKA/Qwen3-Coder-480B-A35B-Instruct-NVFP4

241B • Updated Dec 25, 2025 • 16

Shifusen/Llama-3.3-70B-Instruct-abliterated-NVFP4-modelopt

36B • Updated Dec 18, 2025 • 29

taharmasmaliyev07/Mistral-7B-v0.1-fp8

7B • Updated Dec 18, 2025 • 2

taharmasmaliyev07/Llama-3.1-8B-fp8

8B • Updated Dec 18, 2025 • 1

taharmasmaliyev07/gemma-2-9b-it-fp8

9B • Updated Dec 18, 2025 • 2

cybermotaz/qwen3-vl-2b-thinking-nvfp4-w4a16

Image-Text-to-Text • 2B • Updated Dec 18, 2025 • 6 • 1

cybermotaz/qwen3-vl-4b-thinking-nvfp4-w4a16

Image-Text-to-Text • 3B • Updated Dec 18, 2025 • 343 • 1

cybermotaz/qwen3-vl-8b-thinking-nvfp4-w4a16

Image-Text-to-Text • 5B • Updated Dec 18, 2025 • 35 • 2

CedricHwang/qwen2.5-0.5b-modelopt-fp8-pc-pt

Text Generation • 0.5B • Updated Dec 19, 2025 • 41

CedricHwang/qwen2.5-0.5b-modelopt-fp8-pb-wo

0.5B • Updated Dec 19, 2025 • 28

stepnoy/gpt-oss-120b-NVFP4

117B • Updated Dec 20, 2025 • 54

baseten-admin/glm-4.7-fp4

183B • Updated Dec 24, 2025 • 951

Salyut1/GLM-4.7-NVFP4

Text Generation • 177B • Updated Dec 23, 2025 • 497 • 16

ericlewis/functiongemma-270m-it-nvfp4

0.2B • Updated Dec 23, 2025 • 3

cybermotaz/Qwen3-VL-32B-Instruct-NVFP4

Image-Text-to-Text • 18B • Updated Dec 24, 2025 • 566

baseten-admin/glm-4.7-fp4-fp4kv

177B • Updated Dec 25, 2025 • 2

Tengyunw/GLM-4.7-NVFP4

Text Generation • 177B • Updated May 22 • 75 • 6

lukealonso/MiniMax-M2.1-NVFP4

115B • Updated Feb 14 • 258 • 24

nvidia/Qwen3-235B-A22B-Thinking-2507-NVFP4

Text Generation • 120B • Updated Jan 30 • 1.37k • 8

nvidia/Qwen3-235B-A22B-Instruct-2507-NVFP4

Text Generation • 120B • Updated Jan 30 • 4.61k • 10