Models: 222

Active filters: vLLM

QuantTrio/Qwen3-Coder-30B-A3B-Instruct-GPTQ-Int8

Text Generation • 31B • Updated Sep 5, 2025 • 774 • 8

QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ

Text Generation • 31B • Updated Sep 5, 2025 • 396k • 8

EliovpAI/Qwen3-14B-FP8-KV

Text Generation • 15B • Updated Aug 1, 2025 • 3 • 2

QuantTrio/GLM-4.5V-AWQ

Image-Text-to-Text • 17B • Updated Aug 25, 2025 • 1.5k • 19

QuantTrio/Seed-OSS-36B-Instruct-AWQ

Text Generation • 36B • Updated Sep 15, 2025 • 458 • 8

QuantTrio/Seed-OSS-36B-Instruct-GPTQ-Int8

Text Generation • 36B • Updated Sep 15, 2025 • 103 • 4

QuantTrio/Seed-OSS-36B-Instruct-GPTQ-Int4

Text Generation • 36B • Updated Sep 15, 2025 • 13 • 5

QuantTrio/Seed-OSS-36B-Instruct-GPTQ-Int3

Text Generation • 34B • Updated Sep 15, 2025 • 8 • 3

amakhov/tiny-random-llama

Text Generation • 4.18M • Updated Aug 21, 2025 • 68

QuantTrio/KAT-V1-40B-AWQ

Text Generation • 41B • Updated Sep 5, 2025 • 4 • 2

QuantTrio/DeepSeek-V3.1-AWQ

Text Generation • 684B • Updated May 8 • 306 • 5

QuantTrio/DeepSeek-V3.1-AWQ-Fp16Mix

Text Generation • 684B • Updated Aug 27, 2025 • 15 • 1

QuantTrio/DeepSeek-V3.1-AWQ-Lite

Text Generation • 684B • Updated May 8 • 282 • 3

JunHowie/Qwen3-4B-Instruct-2507-GPTQ-Int4

Text Generation • 4B • Updated Sep 4, 2025 • 2.75k • 4

JunHowie/Qwen3-4B-Instruct-2507-GPTQ-Int8

Text Generation • 4B • Updated Sep 4, 2025 • 368

JunHowie/Qwen3-4B-Thinking-2507-GPTQ-Int4

Text Generation • 4B • Updated Sep 4, 2025 • 137 • 1

JunHowie/Qwen3-4B-Thinking-2507-GPTQ-Int8

Text Generation • 4B • Updated Sep 4, 2025 • 12 • 2

JunHowie/Qwen3-30B-A3B-Instruct-2507-GPTQ-Int4

Text Generation • 31B • Updated Sep 8, 2025 • 1.86k

JunHowie/Qwen3-30B-A3B-Instruct-2507-GPTQ-Int8

Text Generation • 31B • Updated Sep 8, 2025 • 6

JunHowie/Qwen3-30B-A3B-Thinking-2507-GPTQ-Int4

Text Generation • 31B • Updated Sep 8, 2025 • 144

JunHowie/Qwen2-7B-Instruct-GPTQ-Int4

Text Generation • 8B • Updated Sep 3, 2025 • 4.27k

JunHowie/Qwen2-7B-Instruct-GPTQ-Int8

Text Generation • 8B • Updated Sep 3, 2025

EliovpAI/Deepseek-R1-0528-Qwen3-8B-FP8-KV

Text Generation • 8B • Updated Sep 18, 2025 • 65

JunHowie/Qwen3-30B-A3B-Thinking-2507-GPTQ-Int8

Text Generation • 31B • Updated Sep 8, 2025 • 5

JunHowie/Seed-OSS-36B-Instruct-GPTQ-Int4

Text Generation • 36B • Updated Sep 15, 2025 • 2

JunHowie/Seed-OSS-36B-Instruct-GPTQ-Int8

Text Generation • 36B • Updated Sep 15, 2025

QuantTrio/Qwen3-VL-235B-A22B-Instruct-AWQ

Text Generation • 236B • Updated Oct 8, 2025 • 3.13k • 13

QuantTrio/Qwen3-VL-235B-A22B-Instruct-FP8

Text Generation • Updated Oct 8, 2025 • 137

QuantTrio/Qwen3-VL-235B-A22B-Thinking-AWQ

Text Generation • 236B • Updated Oct 8, 2025 • 1.4k • 8

QuantTrio/Qwen3-VL-235B-A22B-Thinking-FP8

Text Generation • 236B • Updated Oct 8, 2025 • 30