Models: 1,830

Active filters: quantization

tonera/Qwen3-text-Nunchaku

Updated Apr 9 • 228 • 6

caiovicentino1/Qwen3.5-9B-EOQ-v3

Text Generation • 5B • Updated Apr 6 • 21 • 1

caiovicentino1/Qwen3.5-9B-HLWQ-Engine-v4

Text Generation • 7B • Updated Apr 13 • 6

YTan2000/Qwen3.5-27B-TQ3_1S

Image-Text-to-Text • 27B • Updated Apr 23 • 127 • 38

arjunbroepic/edgy-commenter-GGUF

Text Generation • 0.8B • Updated Mar 30 • 5 • 1

caiovicentino1/Qwen3.5-9B-HLWQ-MLX-4bit

Text Generation • 1B • Updated Apr 13 • 68 • 5

viberobin/Wan2.2-TI2V-5B-VedioQuant

Text-to-Video • Updated Mar 31

gam30/nepali-automatic-speech-recognition

Automatic Speech Recognition • Updated Apr 5 • 40

YakovSava/RouWei-08-gguf

3B • Updated Apr 1 • 12 • 1

HyX3/MiniCPM-V-4-OpenVINO-INT4

Image-to-Text • Updated Apr 2 • 2

YTan2000/Qwen3.5-27B-TQ3_4S

Image-Text-to-Text • 27B • Updated Apr 23 • 103 • 18

Octen/Octen-Embedding-4B-INT8

Sentence Similarity • 4B • Updated Apr 2 • 2.07k • 5

YTan2000/Qwopus3.5-27B-v3-TQ3_4S

Image-Text-to-Text • 27B • Updated Apr 23 • 81 • 32

vibegavin/HY-WorldPlay-FP8

Text-to-Video • Updated Apr 2 • 1

caiovicentino1/polarquant-skills

Updated Apr 10 • 1

Lorbus/Qwopus3.5-27B-v3-int4-autoround

6B • Updated Apr 3 • 197 • 4

cs2764/DeepSeek-V3.2_dq4-mlx

Text Generation • 672B • Updated Apr 3 • 130

jjjunho/gemma-4-26B-A4B-it-GGUF

Text Generation • 25B • Updated Apr 3 • 14

Ankushbl6/Qwopus3.5-27B-v3-TQ3_4S

Text Generation • 27B • Updated Apr 3 • 8 • 1

coverblew/llamita.cpp

Text Generation • Updated Apr 3 • 1

deepak-p-yadav/sarvam-30b-IQ2_M-indic

Text Generation • Updated Apr 5

rockerritesh/EUPE-ONNX

Image Feature Extraction • Updated Apr 4 • 1

ebircak/gemma-4-31B-it-4bit-W4A16-GPTQ

Text Generation • 32B • Updated Apr 5 • 6.41k • 3

benjiaiplayground/HY-OmniWeaving-FP8

Updated Apr 5 • 4

ebircak/gemma-4-31B-it-4bit-NVFP4A16-GPTQ

Text Generation • 18B • Updated Apr 5 • 3.06k • 2

MoringLabs/Qwen3.5-122B-A10B-MLX-3.7bit-VL-v2

Image-Text-to-Text • 17B • Updated Apr 6 • 377

MoringLabs/MiniMax-M2.5-REAP-139B-A10B-MLX-3.7bit

Text Generation • 139B • Updated Apr 5 • 68

daksh-neo/qwen3.5-1bit-quantization-study

Text Generation • Updated Apr 10 • 2

LargitData/gemma-4-26b-a4b-it-fp8

Text Generation • 26B • Updated Apr 7 • 151k

LargitData/gemma-4-31b-it-fp8

Text Generation • 31B • Updated Apr 6 • 4.25k • 2