Models: 1,821

Active filters: quantization

NoorNizar/Meta-Llama-3-8B-Instruct-WFP8

Text Generation • 8B • Updated Apr 21, 2025 • 3

NoorNizar/Meta-Llama-3-8B-Instruct-WINT8

Text Generation • 8B • Updated Apr 21, 2025 • 4

btaskel/Illustrious-XL-v2.0-GGUF

Text-to-Image • 3B • Updated Apr 21, 2025 • 447 • 8

agoor97/onnx-models

Updated Apr 28, 2025

TechyCode/tinyllama-sciq-lora

Text Generation • Updated Apr 23, 2025

Sumo10/Phi-4-mini-instruct-AWQ-4bit

4B • Updated Apr 25, 2025 • 50 • 1

Sumo10/Llama-3.2-3B-Instruct-AWQ-4bit

3B • Updated Apr 25, 2025 • 14

NoorNizar/Phi-4-mini-instruct-WINT4

Text Generation • 4B • Updated May 3, 2025 • 4

NoorNizar/Meta-Llama-3-8B-Instruct-WINT4

Text Generation • 8B • Updated May 3, 2025 • 2

NoorNizar/Llama-3.2-3B-Instruct-WINT4

Text Generation • 4B • Updated May 3, 2025 • 3

mengqin1/RedidreamNSFWI1-bnb-4bit

Updated May 10, 2025 • 1

stabilityai/stable-diffusion-3.5-large-tensorrt

Text-to-Image • Updated Oct 20, 2025 • 1.89k • 70

abdou-u/MNLP_M2_quantized_model

Text Generation • 0.6B • Updated May 19, 2025 • 3

diffusers/FLUX.1-dev-bnb-4bit

Text-to-Image • Updated May 20, 2025 • 860 • 5

diffusers/FLUX.1-dev-bnb-8bit

Text-to-Image • Updated May 20, 2025 • 106 • 3

diffusers/FLUX.1-dev-torchao-int8

Text-to-Image • Updated May 20, 2025 • 15 • 5

diffusers/FLUX.1-dev-torchao-int4

Text-to-Image • Updated May 20, 2025 • 7 • 1

diffusers/FLUX.1-dev-torchao-fp8

Text-to-Image • Updated May 21, 2025 • 350 • 3

textgeflecht/Devstral-Small-2505-FP8-llmcompressor

Text Generation • 24B • Updated May 25, 2025 • 13

fukayatti0/nllb-200-distilled-600M-4bit-efqat

Translation • Updated May 28, 2025 • 4

HighCWu/FLUX.1-dev-bnb-hqq-4bit

Text-to-Image • Updated May 29, 2025 • 17 • 1

fdtn-ai/Foundation-Sec-8B-Q8_0-GGUF

Text Generation • 8B • Updated Aug 26, 2025 • 45 • 4

ConfidentialMind/InternVL3-38B-FP8-Dynamic

Image-Text-to-Text • 38B • Updated Jul 7, 2025 • 97 • 2

fdtn-ai/Foundation-Sec-8B-Q4_K_M-GGUF

Text Generation • 8B • Updated Aug 26, 2025 • 173 • 3

mr-abhisharma/AceNemotron-14B-Quantize-8bit

Text Generation • 15B • Updated Jun 2, 2025 • 3

DESUCLUB/Llama-3.1-8B-Instruct-quantized.w8a8

Text Generation • 8B • Updated Jun 2, 2025 • 6

Thomaschtl/qwen3-0.6b-qat-test

Text Generation • Updated Jun 3, 2025

Thomaschtl/qwen3-06b-qat-test

Text Generation • Updated Jun 3, 2025 • 1

abdou-u/MNLP_M3_quantized_model

Text Generation • 0.6B • Updated Jun 8, 2025 • 5

DESUCLUB/Llama-3.1-8B-Instruct-bf16-quantized.w8a8

Text Generation • 8B • Updated Jun 4, 2025 • 3