Models: 1,824

Active filters: quantization

Fu01978/Hunyuan-0.5B-Instruct-GGUF

Text Generation • 0.5B • Updated Jan 31 • 218 • 1

SuperPauly/Gemmasutra_LiteRT-LM

Text Generation • Updated Feb 1 • 7

Fu01978/DialoGPT-medium-GGUF

Text Generation • 0.4B • Updated Feb 11 • 83

tonera/Beyond_Reality_Zimage_v2_svdq

Text-to-Image • Updated Feb 1 • 37 • 1

ecorbari/Gemma-2b-it-Psych-GGUF

Text Generation • 3B • Updated Feb 1 • 4

JongYeop/Qwen2.5-7B-Instruct-FP8-W8A8

Text Generation • 8B • Updated Feb 2 • 42

JongYeop/Qwen2.5-7B-Instruct-FP4-W4A4

Text Generation • 5B • Updated Feb 2 • 5

ghostai1/Easter-Spirit2b

Text Generation • 2B • Updated Feb 2 • 92 • 1

ghostai1/halloween1.3b_poltergeist

Text Generation • 2B • Updated Feb 3 • 62 • 1

stepfun-ai/Step3-VL-10B-FP8

Image-Text-to-Text • Updated Feb 4 • 192 • 10

JEILDLWLRMA/Qwen3-VL-8B-Instruct-NVFP4

Image-to-Text • 6B • Updated Feb 2 • 902 • 1

cloud19/sainemo-12b-fp4-blackwell

12B • Updated Feb 2 • 2

Habibur2/Llama-3.2-1B-Instruct-GGUF

Text Generation • 1B • Updated Feb 2 • 75

Heehobino/nihao-zhenzhen-kws

Audio Classification • Updated Feb 4 • 4

ghostai1/NewYears1.5b

Text Generation • 2B • Updated Feb 4 • 19

rtj1/Qwen2.5-0.5B-AWQ-FP8-Dynamic

Text Generation • 0.6B • Updated Feb 10 • 7

rtj1/Qwen2.5-0.5B-AWQ-FP8-Block

Text Generation • 0.6B • Updated Feb 10 • 5

JongYeop/Qwen2.5-14B-Instruct-FP4-W4A4

Text Generation • 9B • Updated Feb 6 • 17

JongYeop/Qwen2.5-14B-Instruct-FP8-W8A8

Text Generation • 15B • Updated Feb 6 • 40

ghostai1/Egypt_Historical7b

Text Generation • 7B • Updated Feb 6 • 40 • 4

Geodd/GLM-4.7-Flash-FP8

Text Generation • 30B • Updated Feb 6 • 16

Geodd/GLM-4.7-Flash-W8A16

Text Generation • 0.7B • Updated Feb 7 • 19 • 1

Shifusen/Maginum-Cydoms-24B-NVFP4

Text Generation • 14B • Updated Feb 7 • 5

Silan10/flux-quanto-int8

Text-to-Image • Updated Feb 8 • 14

Vishva007/Qwen3-VL-2B-Instruct-W4A16-AutoRound

Image-Text-to-Text • 0.9B • Updated Feb 7 • 3

Vishva007/Qwen3-VL-2B-Instruct-W4A16-AutoRound-GPTQ

Image-Text-to-Text • 2B • Updated Feb 7 • 3

Vishva007/Qwen3-VL-2B-Instruct-W4A16-AutoRound-AWQ

Image-Text-to-Text • 2B • Updated Feb 7 • 26

Vishva007/Qwen3-VL-4B-Instruct-W4A16-AutoRound

Image-Text-to-Text • 1B • Updated Feb 7 • 2

Vishva007/Qwen3-VL-4B-Instruct-W4A16-AutoRound-GPTQ

Image-Text-to-Text • 4B • Updated Feb 7 • 15

Vishva007/Qwen3-VL-4B-Instruct-W4A16-AutoRound-AWQ

Image-Text-to-Text • 4B • Updated Feb 7 • 161 • 1