Models: 274

Active filters: llm-compressor

vistralis/Qwen3-4B-FP8

Text Generation • 4B • Updated Feb 7 • 8

vistralis/Qwen3-4B-INT8

Text Generation • 4B • Updated Feb 7 • 7

vistralis/Qwen3-8B-INT8

Text Generation • 8B • Updated Feb 7 • 4

Sophia-AI/Qwen3-4B-Instruct-2507-AWQ-W4A16

Text Generation • 4B • Updated Feb 12 • 291

ludovicoYIN/MiniMax-M2-BF16-W4A16

Text Generation • 34B • Updated Feb 13 • 5

mratsim/MiniMax-M2.5-BF16-INT4-AWQ

Text Generation • 39B • Updated Feb 17 • 35k • 39

ekurtic/granite-4.0-h-small-FP8-block

Text Generation • 32B • Updated Feb 14 • 2

ekurtic/granite-4.0-h-small-FP8-dynamic

Text Generation • 32B • Updated Feb 14 • 1

kaitchup/Nanbeige4.1-3B-MXFP4

2B • Updated Feb 16 • 7 • 1

kaitchup/Nanbeige4.1-3B-NVFP4

3B • Updated Feb 17 • 7

kaitchup/Nanbeige4.1-3B-FP8-Dynamic

4B • Updated Feb 17 • 7

kaitchup/Nanbeige4.1-3B-AWQ-W4A16-ASYM

4B • Updated Feb 17 • 11 • 1

mratsim/MiniMax-M2.5-FP8-INT4-AWQ

Text Generation • 39B • Updated Feb 17 • 2.4k • 21

RedHatAI/Qwen3.5-397B-A17B-FP8-dynamic

Image-Text-to-Text • 397B • Updated Apr 28 • 1.21k • 5

prithivMLmods/ZwZ-8B-FP8

Image-Text-to-Text • 9B • Updated Feb 19 • 3 • 1

prithivMLmods/chandra-FP8-Latest

Image-Text-to-Text • 9B • Updated Feb 19 • 96 • 1

RedHatAI/Qwen3-Coder-Next-NVFP4

Text Generation • Updated Apr 28 • 25.5k • 30

prithivMLmods/FireRed-Image-Edit-1.0-FP8

Image-to-Image • Updated Feb 21 • 5

JongYeop/Qwen2.5-VL-3B-Instruct-FP8-W8A8-LM-Only

Image-Text-to-Text • 4B • Updated Feb 22 • 2

JongYeop/Qwen2.5-VL-3B-Instruct-FP4-W4A4-LM-Only

Image-Text-to-Text • 3B • Updated Feb 22 • 22

JongYeop/Qwen2.5-VL-7B-Instruct-FP8-W8A8-LM-Only

Image-Text-to-Text • 8B • Updated Feb 22 • 2

JongYeop/Qwen2.5-VL-7B-Instruct-FP4-W4A4-LM-Only

Image-Text-to-Text • 5B • Updated Feb 22 • 2

mradermacher/chandra-FP8-Latest-GGUF

8B • Updated Feb 24 • 61

mradermacher/chandra-FP8-Latest-i1-GGUF

8B • Updated Feb 24 • 51 • 1

Sehyo/Qwen3.5-122B-A10B-NVFP4

Image-Text-to-Text • 71B • Updated Mar 2 • 186k • 67

Sehyo/Qwen3.5-35B-A3B-NVFP4

Updated Mar 2 • 5.01k • 41

apolo13x/Qwen3.5-35B-A3B-NVFP4

Image-Text-to-Text • Updated Feb 27 • 16.9k • 15

apolo13x/Qwen3.5-35B-A3B-quantized.w4a16

Image-Text-to-Text • 35B • Updated Feb 27 • 1.32k • 2

apolo13x/Qwen3.5-27B-NVFP4

Image-Text-to-Text • 17B • Updated Feb 27 • 17.4k • 39

apolo13x/Qwen3.5-27B-quantized.w4a16

Image-Text-to-Text • 28B • Updated Feb 27 • 484 • 3