Models: 467

Active filters: int8

FriendliAI/Meta-Llama-3.1-70B-Instruct-int8

Text Generation • 71B • Updated Sep 9, 2024 • 2

RedHatAI/Qwen2.5-7B-Instruct-quantized.w8a8

Text Generation • 8B • Updated Sep 22, 2025 • 292 • 2

RedHatAI/Qwen2.5-0.5B-quantized.w8a16

Text Generation • 0.4B • Updated Nov 26, 2024

RedHatAI/Qwen2.5-1.5B-quantized.w8a16

Text Generation • 0.8B • Updated Nov 26, 2024 • 1

RedHatAI/Qwen2.5-3B-quantized.w8a16

Text Generation • 1B • Updated Nov 26, 2024 • 1

RedHatAI/Qwen2.5-7B-quantized.w8a16

Text Generation • 3B • Updated Nov 26, 2024 • 1 • 1

RedHatAI/Qwen2.5-32B-quantized.w8a16

Text Generation • 9B • Updated Nov 26, 2024 • 2

RedHatAI/Qwen2.5-72B-quantized.w8a16

Text Generation • 20B • Updated Nov 26, 2024 • 1

avans06/Meta-Llama-3.1-8B-Instruct-ct2-int8_float16

Text Generation • Updated Oct 10, 2024 • 2

avans06/Meta-Llama-3.2-8B-Instruct-ct2-int8_float16

Text Generation • Updated Oct 13, 2024 • 20

minpeter/Qwen-Qwen2.5-14B-Instruct-fmo-int8

15B • Updated Nov 8, 2024

minpeter/Qwen-Qwen2.5-32B-Instruct-fmo-int8

33B • Updated Nov 8, 2024

minpeter/anthracite-org-magnum-v4-72b-fmo-int8

73B • Updated Nov 8, 2024

SteveTran/T5-small-query-expansion-INT8

Text Generation • Updated Nov 16, 2024 • 7 • 2

McaTech/Nonet

Text Generation • 0.1B • Updated Jun 30, 2025 • 303 • 3

mradermacher/ecastera-eva-westlake-7b-spanish-GGUF

7B • Updated Dec 22, 2024 • 86

RedHatAI/Llama-3.1-Nemotron-70B-Instruct-HF-quantized.w8a8

Text Generation • 71B • Updated Jan 3, 2025 • 3

RedHatAI/QwQ-32B-Preview-quantized.w8a8

Text Generation • 33B • Updated Jan 3, 2025 • 1

NeoChen1024/Dolphin3.0-Llama3.1-8B-W8A8

8B • Updated Apr 18, 2025

NeoChen1024/dolphin-2.9.3-mistral-7B-32k-W8A8

7B • Updated Jan 6, 2025 • 3

RedHatAI/granite-3.1-8b-instruct-quantized.w8a8

Text Generation • 8B • Updated Sep 25, 2025 • 113 • 2

RedHatAI/granite-3.1-2b-instruct-quantized.w8a8

Text Generation • 3B • Updated Feb 28, 2025 • 7

RedHatAI/granite-3.1-2b-base-quantized.w8a8

Text Generation • 3B • Updated Feb 28, 2025 • 7

RedHatAI/granite-3.1-8b-base-quantized.w8a8

Text Generation • 8B • Updated Feb 28, 2025 • 80

NeoChen1024/Ministral-8B-Instruct-2410-W8A8

8B • Updated Jan 17, 2025 • 3 • 2

RedHatAI/Llama-3.3-70B-Instruct-quantized.w8a8

Text Generation • 71B • Updated Sep 22, 2025 • 1.93k • 13

RedHatAI/DeepSeek-R1-Distill-Llama-8B-quantized.w8a8

Text Generation • 8B • Updated Feb 27, 2025 • 4.78k • 2

RedHatAI/DeepSeek-R1-Distill-Llama-70B-quantized.w8a8

Text Generation • 71B • Updated Feb 27, 2025 • 298 • 2

RedHatAI/DeepSeek-R1-Distill-Qwen-14B-quantized.w8a8

Text Generation • 15B • Updated Feb 27, 2025 • 3.18k • 2

RedHatAI/DeepSeek-R1-Distill-Qwen-32B-quantized.w8a8

Text Generation • Updated Feb 27, 2025 • 121 • 13