Qwen3.5-27B-NVFP4

This is a quantized version of Qwen/Qwen3.5-27B using the NVFP4 quantization scheme. It supports MTP and multi-modal inputs.

Please use a nightly build of vLLM to run this model.
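A minimal sketch of trying the model on a nightly vLLM build. The wheel index URL follows vLLM's documented nightly install; the serve invocation is illustrative and assumes default flags work for this checkpoint:

```shell
# Install a nightly vLLM build (wheel index per vLLM's install docs; verify against current docs)
pip install -U vllm --pre --extra-index-url https://wheels.vllm.ai/nightly

# Serve the quantized checkpoint with an OpenAI-compatible API (illustrative; defaults assumed)
vllm serve Sehyo/Qwen3.5-27B-NVFP4
```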

Changelog

  • 02/03/2026: Initial upload.

Calibration

Creation

This model was created using vLLM's LLM Compressor. The quantization ignores the lm_head, re:.*linear_attn.*, and re:model\.visual\..* layers, preserving the vision encoder and linear-attention modules at full precision. The model was loaded with AutoModelForImageTextToText.
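The ignore list above mixes a literal module name (lm_head) with re:-prefixed regex patterns, which is LLM Compressor's convention for matching module names. A small sketch of how those patterns resolve; the example layer names are hypothetical, not taken from the actual checkpoint:

```python
import re

# Ignore patterns from this card: one literal name plus two "re:"-prefixed regexes.
ignore = ["lm_head", r"re:.*linear_attn.*", r"re:model\.visual\..*"]

def is_ignored(module_name: str) -> bool:
    """Return True if a module is excluded from quantization."""
    for pattern in ignore:
        if pattern.startswith("re:"):
            if re.match(pattern[3:], module_name):
                return True
        elif module_name == pattern:
            return True
    return False

# Hypothetical module names, for illustration only.
assert is_ignored("lm_head")                             # literal match
assert is_ignored("model.layers.0.linear_attn.in_proj")  # linear attention kept at full precision
assert is_ignored("model.visual.blocks.0.mlp.fc1")       # vision encoder kept at full precision
assert not is_ignored("model.layers.0.mlp.gate_proj")    # this layer is quantized
```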

