Models: 1,820

Active filters: quantization

brandonbeiler/InternVL3-8B-FP8-Dynamic

Image-Text-to-Text • 8B • Updated Jun 23, 2025 • 62 • 2

devatar/quantized_Llama-3.1-8B-Instruct

Text Generation • 8B • Updated Jun 20, 2025 • 3

gpahal/bge-m3-onnx-int8

Feature Extraction • Updated Jun 25, 2025 • 2.38k • 2

pravdin/merged-context-labs-meta-llama-Llama-3.2-3B-Instruct-FP16-unsloth-Llama-3.2-3B-Instruct-gguf

Text Generation • 3B • Updated Jun 26, 2025 • 18

pravdin/merged-Gensyn-Qwen2.5-1.5B-Instruct-Qwen-Qwen2.5-1.5B-Instruct-gguf

Text Generation • 2B • Updated Jun 27, 2025 • 27

pravdin/merged-Gensyn-Qwen2.5-1.5B-Instruct-deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B-gguf

Text Generation • 2B • Updated Jun 27, 2025 • 47

pravdin/Qwen2.5-1.5B-dare_linear-merge-gguf

Text Generation • 2B • Updated Jul 1, 2025 • 44

pravdin/meta-llama-Llam-Llama-3.2-3B-dare_linear-gguf

Text Generation • 3B • Updated Jun 28, 2025 • 9

pravdin/meta-I-Hermes-3-dare_linear-gguf

Text Generation • 3B • Updated Jun 28, 2025 • 19

pravdin/Qwen2.5-1.5B-DeepSeek-R1-dare_linear-gguf

Text Generation • 2B • Updated Jul 1, 2025 • 12

169Pi/Alpie-Core

Text Generation • Updated 8 days ago • 24 • 7

JustJaro/SmolLM-135M-FP8-Static

Image-Text-to-Text • 0.2B • Updated Jul 7, 2025 • 3

JustJaro/SmolLM-135M-FP8-Dynamic-Test

Image-Text-to-Text • 0.1B • Updated Jul 7, 2025 • 7

JustJaro/GOT-OCR-2.0-hf-FP8-Static

Image-Text-to-Text • 0.6B • Updated Jul 7, 2025 • 14

JustJaro/SmolLM-135M_gptq_g32_4bit

Text Generation • 0.1B • Updated Jul 7, 2025 • 19

ConfidentialMind/Mistral-Small-3.2-24B-Instruct-2506-GPTQ-AutoRound-TextOnly

Text Generation • 24B • Updated Jul 7, 2025 • 603

HighCWu/FLUX.1-Kontext-dev-bnb-hqq-4bit

Text-to-Image • Updated Jul 5, 2025 • 392 • 12

Compumacy/Llama-3.2-3B-Instruct-WINT8

Text Generation • 4B • Updated Jul 6, 2025 • 1

Compumacy/Meta-Llama-3-8B-Instruct-WFP8

Text Generation • 8B • Updated Jul 6, 2025 • 2

Compumacy/Meta-Llama-3-8B-Instruct-WINT8

Text Generation • 8B • Updated Jul 6, 2025 • 1

Compumacy/Phi-4-mini-instruct-WINT4

Text Generation • 4B • Updated Jul 6, 2025 • 2

Compumacy/Meta-Llama-3-8B-Instruct-WINT4

Text Generation • 8B • Updated Jul 6, 2025 • 1

Compumacy/Llama-3.2-3B-Instruct-WINT4

Text Generation • 4B • Updated Jul 6, 2025 • 1

brandonbeiler/Skywork-R1V3-38B-FP8-Dynamic

Image-Text-to-Text • 38B • Updated Jul 18, 2025 • 8 • 2

tflsxyy/DeepSeek-V2-Lite-BiMoE-w1g16

Text Generation • 2B • Updated Jul 14, 2025 • 5

codelion/Qwen3-0.6B-accuracy-recovery-lora

Text Generation • Updated Jul 13, 2025 • 4.68k • 5

Arivukkarasu/TinyLlama-1.1B-Chat-GGUF

1B • Updated Jul 15, 2025 • 16 • 1

PJEDeveloper/Mistral-Nemo-Instruct-2407-4bit-20250714_213418

13B • Updated Jul 15, 2025 • 4

Arivukkarasu/Mistral-7B-Instruct-v0.3-GGUF

7B • Updated Jul 15, 2025 • 4

PJEDeveloper/Mistral-7B-Instruct-v0.3-4bit-20250716_003938

Text Generation • 7B • Updated Jul 16, 2025 • 4