FP8 Models - a nm-testing Collection

FP8 Models

updated Nov 17, 2025

RedHatAI/Meta-Llama-3-8B-Instruct-FP8

Text Generation • 8B • Updated Jul 18, 2024 • 3.35k • 24
RedHatAI/Meta-Llama-3-8B-Instruct-FP8-KV

Text Generation • 8B • Updated Sep 15, 2025 • 31.9k • 10
RedHatAI/Mixtral-8x7B-Instruct-v0.1-AutoFP8

Text Generation • 47B • Updated Jul 18, 2024 • 44 • 3
RedHatAI/Meta-Llama-3-70B-Instruct-FP8

Text Generation • 71B • Updated Jul 18, 2024 • 2.31k • 13
RedHatAI/Qwen2-72B-Instruct-FP8

Text Generation • 73B • Updated Jul 18, 2024 • 1.2k • 16