Models: 954

Active filters: 4bit

manu02/all-MiniLM-L6-v2-bnb-4bit-nf4-dq

Text Generation • 23.1M • Updated Feb 12 • 9

manu02/Qwen3-0.6B-bnb-4bit-nf4-dq

Text Generation • 0.8B • Updated Feb 12 • 2

manu02/HY-1.8B-2Bit-bnb-4bit-nf4-dq

Text Generation • 2B • Updated Feb 12 • 4

manu02/MioTTS-2.6B-bnb-4bit-nf4-dq

Text Generation • 3B • Updated Feb 12 • 4

manu02/Qwen3-4B-Instruct-2507-bnb-4bit-nf4-dq

Text Generation • 4B • Updated Feb 12 • 1

manu02/gemma-3-4b-it-bnb-4bit-nf4-dq

Image-Text-to-Text • 4B • Updated Feb 12 • 1

vipsehgal/qwen3-8b-jee-sdpo-mlx-4bit

Text Generation • 1B • Updated Feb 13 • 10

ahoybrotherbear/MiniMax-M2.5-4bit-MLX

Text Generation • 229B • Updated Feb 13 • 22 • 1

manu02/DMind-3-mini-bnb-4bit-nf4-dq

Text Generation • 4B • Updated Feb 14 • 2

Ashley101179/medgemma-1.5-4b-it-4bit-autoround

4B • Updated Feb 14 • 1

manu02/QED-Nano-bnb-4bit-nf4-dq

Text Generation • 4B • Updated Feb 14 • 3

manu02/kani-tts-2-en-bnb-4bit-nf4-dq

Text Generation • 0.4B • Updated Feb 15 • 3 • 1

shieldstackllc/GLM-4.7-Flash-PRISM-mlx-4bit

Text Generation • 30B • Updated Feb 16 • 20

shieldstackllc/MiniMax-M2-5-REAP-39-mlx-4bit

Text Generation • 139B • Updated Feb 16 • 15

shieldstackllc/MiniMax-M2.5-REAP-29-mlx-4bit

Text Generation • 162B • Updated Feb 16 • 29

shieldstackllc/MiniMax-M2-5-REAP-19-mlx-4bit

Text Generation • Updated Feb 16 • 14

Runpod/FLUX.2-klein-4B-mflux-4bit

Text-to-Image • Updated Apr 10 • 1.69k • 8

mlx-community/Qwen3.5-397B-A17B-nvfp4

Text Generation • 396B • Updated Feb 16 • 217 • 5

manu02/kani-tts-2-pt-bnb-4bit-nf4-dq

Text Generation • 0.4B • Updated Feb 16 • 3 • 1

Al-hin/Llama-3.2-1B-Instruct-IMat-GGUF

Text Generation • 1B • Updated Feb 17 • 124

nagireddy5/medgemma-1.5-4b-lora-adapter-rank-8

Image-to-Text • Updated Feb 18

UtsuSl0th/trajectory-lora-repo-AWQ

Text Generation • 8B • Updated Feb 20 • 2

EricRollei/HunyuanImage-3-NF4-v2

Text-to-Image • 83B • Updated Feb 21 • 93 • 1

EricRollei/HunyuanImage-3.0-Instruct-NF4-v2

Text-to-Image • 83B • Updated Feb 22 • 74 • 5

EricRollei/HunyuanImage-3.0-Instruct-Distil-NF4-v2

Text-to-Image • 83B • Updated Feb 22 • 283 • 15

taniguchi-kyoichi/Qwen3-4B-Instruct-2507-ja-4bit

0.6B • Updated Feb 22 • 4

Tushe/nllb-200-600M-nf4-custom-weights-bare-metal

Translation • 0.4B • Updated Feb 24 • 2

Rarakiyo/Qwen2.5-7B-Agent-AWQ-v1

Text Generation • 8B • Updated Feb 23 • 5

UtsuSl0th/mixed-lora-1-awq

Text Generation • 8B • Updated Feb 24 • 2

gumush/turkish-finance-llama3.1-8b-mlx-4bit

Text Generation • 1B • Updated Feb 23 • 29