Models: 11,616

Active filters: quantized

nvidia/Llama-4-Scout-17B-16E-Instruct-NVFP4

56B • Updated Dec 12, 2025 • 81.5k • 32

nvidia/Llama-4-Maverick-17B-128E-Instruct-FP8

402B • Updated Jul 29, 2025 • 284 • 15

nvidia/Llama-4-Scout-17B-16E-Instruct-FP8

109B • Updated Dec 12, 2025 • 333k • 16

RedHatAI/Mistral-Small-3.1-24B-Instruct-2503-quantized.w4a16

Image-Text-to-Text • 24B • Updated Jun 9 • 32.1k • 10

bubblspace/Bubbl-P4-multimodal-instruct

6B • Updated Apr 15, 2025 • 11 • 7

RedHatAI/Mistral-Small-3.1-24B-Instruct-2503-quantized.w8a8

Image-Text-to-Text • 24B • Updated Jun 9 • 1.13k • 5

JWMD/cogito-v1-preview-qwen-32B-GPTQ-8bit-fixed-tokenizer

Text Generation • 33B • Updated Apr 19, 2025 • 10

TheMelonGod/ZR1-1.5B-exl2

Text Generation • Updated Apr 18, 2025 • 4

RedHatAI/DeepSeek-R1-quantized.w4a16

Text Generation • 676B • Updated Sep 19, 2025 • 8.61k • 7

ABDALLALSWAITI/FLUX.1-dev-ControlNet-Union-Pro-2.0-fp8

Text-to-Image • Updated Apr 19, 2025 • 3.23k • 58

phires/Llama-3.2-1B-Instruct-GGUF-rk3588-1.1.2

Text Generation • Updated Apr 20, 2025 • 5

MaziyarPanahi/GLM-4-32B-0414-GGUF

Text Generation • 33B • Updated Apr 22, 2025 • 37 • 1

MaziyarPanahi/cogito-v1-preview-llama-3B-GGUF

Text Generation • 4B • Updated Apr 21, 2025 • 204 • 1

MaziyarPanahi/cogito-v1-preview-llama-8B-GGUF

Text Generation • 8B • Updated Apr 21, 2025 • 125 • 1

MaziyarPanahi/cogito-v1-preview-llama-70B-GGUF

Text Generation • 71B • Updated Apr 22, 2025 • 41 • 1

TheMelonGod/DeepCoder-1.5B-Preview-exl2

Text Generation • Updated Apr 23, 2025 • 15

KanNaga/qwen2.5-vl-7b-instruct-text-iq1s_20250410

8B • Updated Apr 23, 2025 • 2

christopherheuer/phi3-mini-4k-instruct-pii-redactor-q4_k_m-gguf

Text Generation • 4B • Updated Apr 25, 2025 • 19

RedHatAI/Llama-4-Scout-17B-16E-Instruct-quantized.w4a16

Image-Text-to-Text • 109B • Updated Jun 9 • 4.87k • 13

steampunque/Llama-4-Scout-17B-16E-Instruct-MP-GGUF

Text Generation • 108B • Updated Feb 18 • 203 • 1

MaziyarPanahi/Qwen3-0.6B-GGUF

Text Generation • 0.8B • Updated Apr 28, 2025 • 197k • 13

MaziyarPanahi/Qwen3-1.7B-GGUF

Text Generation • 2B • Updated Apr 28, 2025 • 190k • 8

MaziyarPanahi/Qwen3-8B-GGUF

Text Generation • 8B • Updated Apr 28, 2025 • 193k • 10

MaziyarPanahi/Qwen3-4B-GGUF

Text Generation • 4B • Updated Apr 28, 2025 • 192k • 7

MaziyarPanahi/Qwen3-14B-GGUF

Text Generation • 15B • Updated Apr 28, 2025 • 196k • 11

MaziyarPanahi/Qwen3-32B-GGUF

Text Generation • 33B • Updated Apr 29, 2025 • 189k • 2

MaziyarPanahi/Qwen3-30B-A3B-GGUF

Text Generation • 31B • Updated Apr 29, 2025 • 189k • 5

NextGenC/erynn-1-774m

Text Generation • Updated Apr 30, 2025 • 2

jedisct1/MiMo-7B-RL-GGUF

8B • Updated Apr 30, 2025 • 1.24k • 29

lex-au/shuttle-3.5-Q8_0-GGUF

33B • Updated May 1, 2025 • 1