Models: 954

Active filters: 4bit

MidnightPhreaker/KAT-Dev-72B-Exp-GPTQ-INT4-gs32-0.01

75B • Updated Oct 22, 2025 • 1 • 1

MidnightPhreaker/KAT-Dev-72B-Exp-GPTQ-INT4-gs32

75B • Updated Oct 22, 2025

MidnightPhreaker/KAT-Dev-72B-Exp-GPTQ-INT4-gs128

12B • Updated Oct 22, 2025 • 2

MidnightPhreaker/GLM-4.5-Air-REAP-82B-A12B-GPTQ-INT4-gs32

14B • Updated Oct 22, 2025 • 7 • 6

sweatSmile/Gemma-2-2B-MedicalQA-Assistant

Text Generation • 3B • Updated Oct 26, 2025 • 2 • 1

ModelCloud/GLM-4.6-REAP-268B-A32B-GPTQMODEL-W4A16

Text Generation • 269B • Updated Oct 28, 2025 • 1 • 2

mradermacher/Gemma-2-2B-MedicalQA-Assistant-GGUF

3B • Updated Oct 27, 2025 • 44

toddie314/toddric_v2_merged

Text Generation • 8B • Updated Oct 29, 2025 • 5

sanchezalonsodavid17/DeepSeek-OCR-MBQ-Quantized-v1

Image-Text-to-Text • 3B • Updated Nov 6, 2025 • 72 • 6

ModelCloud/MiniMax-M2-GPTQMODEL-W4A16

Text Generation • 229B • Updated Oct 28, 2025 • 12 • 3

CHF0101/medquad-lora-r4

Updated Nov 2, 2025 • 3

CHF0101/medquad-lora-r32

Updated Nov 2, 2025 • 1

ModelCloud/Marin-32B-Base-GPTQMODEL-W4A16

Text Generation • 33B • Updated Oct 29, 2025 • 5 • 1

ModelCloud/Marin-32B-Base-GPTQMODEL-AWQ-W4A16

Text Generation • 33B • Updated Oct 30, 2025 • 8 • 2

ModelCloud/Granite-4.0-H-1B-GPTQMODEL-W4A16

Text Generation • 1B • Updated Oct 31, 2025 • 9 • 1

ModelCloud/Granite-4.0-H-350M-GPTQMODEL-W4A16

Text Generation • 0.3B • Updated Oct 31, 2025 • 4 • 1

ModelCloud/Brumby-14B-Base-GPTQMODEL-W4A16

Text Generation • 15B • Updated Oct 31, 2025 • 5 • 1

ModelCloud/Brumby-14B-Base-GPTQMODEL-W4A16-v2

Text Generation • 15B • Updated Oct 31, 2025 • 5 • 1

CHF0101/medquad-lora-r4-best

Updated Nov 2, 2025

CHF0101/medquad-lora-r4-best-v2

Updated Nov 2, 2025

CHF0101/medquad-lora-r32-best-v2

Updated Nov 2, 2025

sweatSmile/Phi3-Mini-FinSight-FinancialQA

Text Generation • 4B • Updated Nov 2, 2025 • 6 • 1

ikarius/Granite-3.2-8b-instruct-Abliterated-NF4

Text Generation • 8B • Updated Nov 17, 2025 • 3 • 1

ikarius/NeuralDaredevil-8B-abliterated-NF4

Text Generation • 8B • Updated Dec 15, 2025 • 3 • 1

ikarius/Qwen2.5-Coder-14B-Instruct-Abliterated-NF4

Text Generation • 15B • Updated Nov 18, 2025 • 39 • 1

Infiniaai/teddy-3.5b

4B • Updated Nov 17, 2025 • 11

wcosmas/sbcc-qwen

Updated Nov 26, 2025

lunovian/Qwen2.5-Math-7B-Instruct-4bit

2B • Updated Nov 21, 2025 • 14

Plurigrid/DR-Tulu-8B-MLX-4bit

1B • Updated Nov 22, 2025 • 10

ujjwal52/Llama-2-7b-FLASH-UK

Text Generation • 7B • Updated Nov 22, 2025 • 3 • 1