Models

1,832

Active filters: quantization

LargitData/gemma-4-31b-it-fp8

Text Generation • 31B • Updated Apr 6 • 4.14k • 2

lew96123/qwen3.5-0.8b-custom-packed-turboquant_mse-true-uniform-4bit

Image-Text-to-Text • Updated Apr 7 • 3 • 1

lew96123/qwen3.5-0.8b-custom-packed-turboquant_mse-true-uniform-2bit

Image-Text-to-Text • Updated Apr 7 • 4

MoringLabs/Qwen3.5-122B-A10B-MLX-4.7bit-VL

Image-Text-to-Text • 20B • Updated Apr 6 • 62

RthItalia/NanoLLM-Qwen2.5-3B-v3.1

Text Generation • Updated May 4 • 44

lew96123/qwen3.5-0.8b-custom-packed-turboquant_mse-true-uniform-1bit

Image-Text-to-Text • Updated Apr 7 • 4

sowilow/gpt-oss-20b-DGX-Spark-GGUF

Text Generation • 21B • Updated Apr 13 • 61

Funkylazer/dm-qwen3.5-27b-b1p-knowledgecorridor

Text Generation • 27B • Updated Apr 7 • 3

Abhinav-Anand/Two-And-A-Half-Qwen

Text Generation • 0.5B • Updated Apr 7 • 674

MilyaShams/Qwen3-1.7B-PTQ_W8A8

2B • Updated Apr 7 • 2

MilyaShams/Qwen3-1.7B-PTQ_W8A16

2B • Updated Apr 7 • 2

MilyaShams/Qwen3-1.7B-PTQ_W4A16

2B • Updated Apr 7 • 2

MilyaShams/Qwen3-1.7B-PTQ_W8A8_ign

2B • Updated Apr 7 • 2

MilyaShams/Qwen3-1.7B-PTQ_W8A16_ign

2B • Updated Apr 7 • 2

MilyaShams/Qwen3-1.7B-PTQ_W4A16_ign

2B • Updated Apr 7 • 2

MilyaShams/Qwen3-1.7B-GPTQ_W8A8_bs64_damp0.01

2B • Updated Apr 7 • 2

Neural-ICE/Gemma-4-31B-IT-NVFP4

Text Generation • 21B • Updated Apr 8 • 8 • 1

MilyaShams/Qwen3-1.7B-GPTQ_W8A8_bs64_damp0.05

2B • Updated Apr 7 • 1

MilyaShams/Qwen3-1.7B-GPTQ_W8A8_bs128_damp0.01

2B • Updated Apr 7 • 1

ebircak/gemma-4-31B-it-4bit-W4A16-AWQ

Text Generation • 32B • Updated Apr 7 • 26.6k • 3

cloud19/gemma-4-26B-A4B-it-heretic-FP8-Static

Image-Text-to-Text • 26B • Updated Apr 8 • 248k • 4

MilyaShams/Qwen3-1.7B-GPTQ_W8A8_bs128_damp0.05

2B • Updated Apr 7 • 2

MilyaShams/Qwen3-1.7B-GPTQ_W8A16_bs64_damp0.01

2B • Updated Apr 7 • 2

MilyaShams/Qwen3-1.7B-GPTQ_W8A16_bs64_damp0.05

2B • Updated Apr 7 • 1

MilyaShams/Qwen3-1.7B-GPTQ_W8A16_bs128_damp0.01

2B • Updated Apr 7 • 1

MilyaShams/Qwen3-1.7B-GPTQ_W8A16_bs128_damp0.05

2B • Updated Apr 7 • 2

MilyaShams/Qwen3-1.7B-GPTQ_W4A16_bs64_damp0.01

2B • Updated Apr 7 • 3

MilyaShams/Qwen3-1.7B-GPTQ_W4A16_bs64_damp0.05

2B • Updated Apr 7 • 2

IvanSmit05/gemma-4-31B-it-heretic-MLX-4bit

Image-Text-to-Text • 5B • Updated Apr 7 • 105 • 1

MilyaShams/Qwen3-1.7B-GPTQ_W4A16_bs128_damp0.01

2B • Updated Apr 7 • 2