This is a W4A16 (4-bit weight, 16-bit activation) quantized version of Qwen/Qwen3.5-0.8B, produced with AutoRound, Intel's signed-gradient-descent-based quantization method designed for production-grade accuracy retention.
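To get a rough sense of what the W4A16 layout saves, the sketch below estimates weight-storage footprint. It is back-of-the-envelope only: the 0.8B parameter count is read off the base model name, and the overhead term assumes one 16-bit scale per group of 128 weights (symmetric quantization stores no zero-point).

```python
def w4a16_bits_per_weight(group_size: int = 128, scale_bits: int = 16) -> float:
    # 4-bit weights plus one shared 16-bit scale per group.
    # Symmetric quantization needs no zero-point, so only the scale is stored.
    return 4 + scale_bits / group_size

def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    # Total weight storage in GiB.
    return n_params * bits_per_weight / 8 / 2**30

bpw = w4a16_bits_per_weight()        # 4 + 16/128 = 4.125 bits per weight
quant = weight_memory_gib(0.8e9, bpw)
bf16 = weight_memory_gib(0.8e9, 16)  # BF16 baseline
print(f"{bpw} bits/weight: {quant:.2f} GiB vs {bf16:.2f} GiB in BF16")
```

At group size 128 the scale overhead is only 0.125 bits per weight, so the effective footprint stays close to the ideal 4x reduction over BF16.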
| Parameter | Value |
|---|---|
| Method | AutoRound (W4A16) |
| Group Size | 128 |
| Symmetric | Yes |
| Iterations | 1000 |
| Calibration Samples | 512 |
| Sequence Length | 2048 |
| Torch Compile | Enabled |
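The table above maps directly onto AutoRound's Python API. The sketch below shows how such a recipe would be expressed; it is illustrative rather than the exact command used to produce this checkpoint, and argument names should be checked against your installed auto-round version.

```python
# Recipe mirroring the table above (a sketch, not the exact production command).
RECIPE = dict(bits=4, group_size=128, sym=True, iters=1000, nsamples=512, seqlen=2048)

def quantize(base_model_id: str = "Qwen/Qwen3.5-0.8B",
             output_dir: str = "qwen3.5-0.8b-w4a16") -> None:
    # Deferred imports: auto-round and transformers are only needed to run this.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from auto_round import AutoRound

    model = AutoModelForCausalLM.from_pretrained(base_model_id, torch_dtype="auto")
    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    ar = AutoRound(model, tokenizer, **RECIPE)
    # Saves weights plus the quantization config in AutoRound format.
    ar.quantize_and_save(output_dir, format="auto_round")
```

Calibration data is drawn automatically by AutoRound when none is supplied, so the 512 samples and 2048-token sequence length above fully describe the calibration setup.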
This model is compatible with transformers and with backends that support AutoRound-format weights (e.g., vLLM, SGLang). For full model details, architecture, and capabilities, refer to the base model page.
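A minimal loading sketch with transformers follows. The repo id is a placeholder (substitute this model's actual Hub id), and the prompt and generation arguments are illustrative; AutoRound-format checkpoints load through the standard API, with the quantization config stored in the repo picked up automatically.

```python
REPO_ID = "<this-quantized-repo-id>"  # placeholder: use this model's actual Hub id

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    # Deferred import: transformers is only needed when actually generating.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
    model = AutoModelForCausalLM.from_pretrained(REPO_ID, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(out[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("Explain weight-only quantization in one sentence."))
```

For serving, the same repo id can be passed to vLLM or SGLang in the usual way, since both recognize AutoRound-format quantization configs.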