Active filters:
int4
RiverkanIT/Ling-mini-2.0-Quantized • Text Generation • Updated • 2
ForeseeLab/foreseeai-qwen3-4b-iot-int4 • Text Generation • 4B • Updated • 2 • 1
ISTA-DASLab/Llama-3.1-8B-Instruct-MR-GPTQ-nvfp • Image-Text-to-Text • 5B • Updated • 22
ISTA-DASLab/Llama-3.1-8B-Instruct-MR-GPTQ-mxfp • Image-Text-to-Text • 5B • Updated • 31
huawei-csl/Qwen3-1.7B-4bit-SINQ • Text Generation • 1B • Updated • 11 • 5
huawei-csl/Qwen3-1.7B-4bit-ASINQ • Text Generation • 1B • Updated • 9 • 5
huawei-csl/Qwen3-32B-4bit-SINQ • Text Generation • 18B • Updated • 11 • 7
huawei-csl/Qwen3-14B-4bit-SINQ • Text Generation • 9B • Updated • 10 • 5
huawei-csl/Qwen3-14B-4bit-ASINQ • Text Generation • 9B • Updated • 12 • 6
huawei-csl/Qwen3-32B-4bit-ASINQ • Text Generation • 18B • Updated • 12 • 8
ModelCloud/GLM-4.6-GPTQMODEL-W4A16-v1 • Text Generation • 357B • Updated • 3
ModelCloud/GLM-4.6-GPTQMODEL-W4A16-v2 • Text Generation • 357B • Updated • 3 • 1
PangaiaSoftware/YanoljaNEXT-Rosetta-4B-onnx • Translation • Updated • 2 • 2
RedHatAI/NVIDIA-Nemotron-Nano-9B-v2-quantized.w4a16 • Text Generation • 2B • Updated • 177 • 5
ModelCloud/GLM-4.6-REAP-268B-A32B-GPTQMODEL-W4A16 • Text Generation • 269B • Updated • 48 • 2
AhtnaGlen/phi-4-mini-instruct-int4-sym-npu-ov • Text Generation • Updated • 9
tencent/DeepSeek-V3.1-Terminus-W4AFP8 • Text Generation • 349B • Updated • 1.16k • 15
ModelCloud/MiniMax-M2-GPTQMODEL-W4A16 • Text Generation • 229B • Updated • 58 • 3
ModelCloud/Marin-32B-Base-GPTQMODEL-W4A16 • Text Generation • 33B • Updated • 7 • 1
ModelCloud/Marin-32B-Base-GPTQMODEL-AWQ-W4A16 • Text Generation • 33B • Updated • 5 • 1
huawei-csl/Apertus-8B-2509-4bit-SINQ • Text Generation • 5B • Updated • 9 • 2
huawei-csl/Apertus-8B-2509-4bit-ASINQ • Text Generation • 5B • Updated • 14 • 2
ModelCloud/Granite-4.0-H-1B-GPTQMODEL-W4A16 • Text Generation • 1B • Updated • 3
ModelCloud/Granite-4.0-H-350M-GPTQMODEL-W4A16 • Text Generation • 0.3B • Updated • 22
ModelCloud/Brumby-14B-Base-GPTQMODEL-W4A16 • Text Generation • 15B • Updated • 3
ModelCloud/Brumby-14B-Base-GPTQMODEL-W4A16-v2 • Text Generation • 15B • Updated • 3
SherlockID365/Qwen3-VL-8B-Instruct-quantized.w4a16 • Image-Text-to-Text • 3B • Updated • 68 • 1
Ishant86/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-compressed-tensors-int4 • 6B • Updated • 2
zandzpider/Qwen3-30B-A3B-abliterated-erotic-autoround-int4 • 0.6B • Updated • 7
ikarius/Granite-3.2-8b-instruct-Abliterated-gs128-GPTQ-INT4 • Text Generation • 8B • Updated • 15 • 1