Models: 1,821

Active filters: quantization

HangGuo/QWen2.5-32B-FlatQuant-OBR-GPTQ-W4A4KV4S50

Text Generation • Updated Sep 26, 2025 • 12

swayamsingal/tencent-Hunyuan-MT-7B-light-nanoquant-light

8B • Updated Sep 3, 2025 • 3 • 1

swayamsingal/tencent-Hunyuan-MT-7B-medium-nanoquant-medium

8B • Updated Sep 3, 2025 • 3

llmat/Apertus-8B-Instruct-2509-NVFP4

Text Generation • 5B • Updated Sep 3, 2025 • 3 • 1

continuedev/instinct-GGUF

8B • Updated Sep 4, 2025 • 117 • 8

HangGuo/QWen2.5-1.5B-FlatQuant-OBR-GPTQ-W4A8KV16S50

Text Generation • Updated Sep 26, 2025 • 10

HangGuo/QWen2.5-3B-FlatQuant-OBR-GPTQ-W4A8KV16S50

Text Generation • Updated Sep 26, 2025 • 10

HangGuo/QWen2.5-3B-FlatQuant-OBR-GPTQ-W4A4KV4S50

Text Generation • Updated Sep 26, 2025 • 13

groxaxo/Qwen3-30B-A3B-Thinking-2507-Deepseek-v3.1-Distill-INT8-A8W16

8B • Updated Sep 6, 2025 • 2 • 1

SutanRifkyt/komodo7b-sunda-lemess-gguf

Text Generation • 7B • Updated Sep 6, 2025

HangGuo/QWen2.5-1.5B-FlatQuant-OBR-GPTQ-W4A4KV4S50

Text Generation • Updated Sep 26, 2025 • 4

skatzR/USER-BGE-M3-ONNX-INT8

Updated Sep 12, 2025 • 123 • 1

NangWeiLun/MiMo-VL-7B-SFT-2508-bnb-4bit-fp4

Image-Text-to-Text • 8B • Updated Sep 9, 2025 • 2

bluejude10/Bllossom-3B-DTRO-3LINE-POWER-Q6

Text Generation • 3B • Updated Feb 2 • 8

NangWeiLun/MiMo-VL-7B-RL-2508-bnb-4bit-fp4

Image-Text-to-Text • 8B • Updated Sep 9, 2025 • 174

rpanchum/lcm-sdxl-ov-fp16-quant_unet

Text-to-Image • Updated Sep 10, 2025

2imi9/Qwen3-1.7b-gptq-int4

Text Generation • 2B • Updated Sep 12, 2025 • 2

bluejude10/kanana-1.5-2.1b-DTRO-3LINE-POWER-q4-k-m

Text Generation • 2B • Updated Feb 2

SutanRifkyt/komodo7b-sunda-lemes-gguf

Text Generation • 7B • Updated Sep 23, 2025 • 5

RiverkanIT/Ling-mini-2.0-Quantized

Text Generation • Updated Sep 17, 2025 • 2

aghatage/SFR-Embedding-2_R-4bit-NF4

Feature Extraction • 7B • Updated Sep 23, 2025 • 44

ShahzebKhoso/Qwen3Guard-Gen-8B-GGUF

8B • Updated Sep 24, 2025 • 132 • 1

Sunbird/Sunflower-14B-FP8

Text Generation • 15B • Updated Oct 9, 2025

Sunbird/Sunflower-14B-FP4A16

Text Generation • 9B • Updated Oct 9, 2025 • 1

Sunbird/Sunflower-32B-FP8

Text Generation • 33B • Updated Oct 9, 2025

Sunbird/Sunflower-32B-FP4A16

Text Generation • 19B • Updated Oct 9, 2025

Lerelou/Brains4b.q4_k_m-GGUF

4B • Updated Oct 21, 2025 • 1

SandLogicTechnologies/Qwen3-4B-Thinking-2507-GGUF

Text Generation • 4B • Updated Sep 29, 2025 • 9

ShahzebKhoso/Qwen3-4B-SafeRL-GGUF

4B • Updated Oct 1, 2025 • 140

Sunbird/Sunflower-32B-W8A8

Text Generation • 33B • Updated Oct 9, 2025 • 2