Models: 2,655

Active filters: fp8

InfiX-ai/InfiR2-1.5B-base-FP8

2B • Updated Oct 15, 2025 • 4

InfiX-ai/InfiR2-7B-base-FP8

8B • Updated Oct 15, 2025 • 2

InfiX-ai/InfiR2-R1-7B-FP8-Preview

8B • Updated Oct 15, 2025 • 1

InfiX-ai/InfiR2-1.5B-Instruct-FP8

2B • Updated Oct 15, 2025 • 6

InfiX-ai/InfiR2-7B-Instruct-FP8

8B • Updated Oct 15, 2025 • 1

wangkanai/sdxl-fp8-loras

Text-to-Image • Updated Oct 28, 2025 • 1

wangkanai/wan21-fp8-480p

Image-to-Video • Updated Oct 28, 2025 • 1

kathywu95/deepseek-v3-small-random-fp8

14B • Updated Jan 16 • 88

Bellesteck/Apriel-1.5-15b-Thinker-FP8-W8A8

Image-Text-to-Text • 14B • Updated Oct 13, 2025 • 2

cerebras/Qwen3-Coder-REAP-363B-A35B-FP8

Text Generation • Updated Oct 14, 2025 • 39 • 16

cerebras/Qwen3-Coder-REAP-246B-A35B-FP8

Text Generation • 246B • Updated Oct 14, 2025 • 143 • 22

wangkanai/wan22-fp8-i2v-loras

Text-to-Video • Updated Oct 28, 2025 • 1

wangkanai/wan25-fp8-i2v

Image-to-Video • Updated Oct 28, 2025 • 1

unsloth/Qwen3-VL-8B-Thinking-FP8

Image-Text-to-Text • 9B • Updated Nov 24, 2025 • 298 • 2

unsloth/Qwen3-VL-4B-Instruct-FP8

Image-Text-to-Text • 5B • Updated Nov 24, 2025 • 49 • 2

unsloth/Qwen3-VL-8B-Instruct-FP8

Image-Text-to-Text • 9B • Updated Nov 24, 2025 • 925 • 7

philkuz/llama-3.3-70b-instruct-fp8

Text Generation • 71B • Updated Oct 14, 2025 • 16 • 1

theostos/LLM4Docq-annotator-fp8

33B • Updated Oct 15, 2025 • 1

theostos/babel-translate-fp8

33B • Updated Oct 15, 2025 • 4

theostos/babel-ssreflect-fp8

33B • Updated Oct 15, 2025 • 1

WenxinChen66/DeepSeek-R1-0528-Channel-INT8

Text Generation • 685B • Updated Oct 16, 2025 • 2 • 2

FlagRelease/DeepSeek-V3.2-Exp-FlagOS

685B • Updated Mar 9 • 2

fraseque/Llama-3.3-70B-FP8-Instruct-Neuron

Text Generation • 71B • Updated Nov 3, 2025 • 19

Qwen/Qwen3-VL-32B-Thinking-FP8

Image-Text-to-Text • 33B • Updated Nov 26, 2025 • 128k • 27

Qwen/Qwen3-VL-32B-Instruct-FP8

Image-Text-to-Text • 33B • Updated Oct 22, 2025 • 725k • 46

Qwen/Qwen3-VL-2B-Instruct-FP8

Image-Text-to-Text • 2B • Updated Oct 20, 2025 • 30.1k • 42

Qwen/Qwen3-VL-2B-Thinking-FP8

Image-Text-to-Text • 2B • Updated Nov 26, 2025 • 798 • 31

selimaktas/Qwen3-14B-FP8-MinMax

15B • Updated Oct 20, 2025 • 1

nm-testing/Llama-4-Scout-17B-16E-Instruct-BLOCK-FP8

Text Generation • 109B • Updated Oct 27, 2025 • 13 • 1

RedHatAI/Llama-3.3-70B-Instruct-FP8-block

Text Generation • 71B • Updated Oct 24, 2025 • 695