Straightforward dynamic FP8 quant using llmcompressor. Nice for performance on Hopper and Blackwell GPUs.
Tested with nightly vllm on November 25-26, 2025.
- Downloads last month
- 10
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support
Model tree for ramendik/Vistral-24B-Instruct-FP8
Base model
mistralai/Mistral-Small-3.1-24B-Base-2503
Finetuned
Vikhrmodels/Vistral-24B-Instruct