Qwen2.5-VL-7B-Instruct-INT4-quantized / quantization_config.json
INT4 quantization of Qwen/Qwen2.5-VL-7B-Instruct (LLM backbone quantized, vision encoder fp16)
544728e verified
{
  "base_model": "Qwen/Qwen2.5-VL-7B-Instruct",
  "quantization": "int4_per_group_symmetric",
  "group_size": 128,
  "bits": 4,
  "method": "static_int4_dequantized",
  "description": "INT4 per-group symmetric quantization of LLM backbone weights. Vision encoder kept in fp16. Weights stored as dequantized fp16 for maximum compatibility.",
  "quantized_layers": 197,
  "skipped_vision_layers": 162
}
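
The config describes per-group symmetric INT4 quantization with the result stored back as dequantized floating-point values. The following is a minimal sketch of what that round trip could look like for a single row of weights, not the code used to produce this checkpoint. It assumes the common convention of one scale per group equal to max|w| / 7 and integer codes clamped to [-8, 7]; the function name `fake_quantize_row` and the sample data are hypothetical. Only `group_size` and `bits` are taken from the config.

```python
import json

# Subset of the fields from quantization_config.json shown above.
CONFIG_JSON = """
{
  "quantization": "int4_per_group_symmetric",
  "group_size": 128,
  "bits": 4
}
"""


def fake_quantize_row(row, group_size, bits):
    """Quantize one row of weights group by group, then dequantize it.

    Assumption: symmetric scheme with one scale per group and no zero point.
    Returns floats, mirroring the "stored as dequantized fp16" description.
    """
    qmax = 2 ** (bits - 1) - 1  # 7 for 4 bits
    qmin = -qmax - 1            # -8 for 4 bits
    out = []
    for start in range(0, len(row), group_size):
        group = row[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Guard against an all-zero group to avoid division by zero.
        scale = max_abs / qmax if max_abs > 0 else 1.0
        for w in group:
            q = max(qmin, min(qmax, round(w / scale)))  # integer code
            out.append(q * scale)                       # dequantized value
    return out


cfg = json.loads(CONFIG_JSON)

# Hypothetical sample row: 256 values in [-0.5, 0.5], i.e. two groups of 128.
row = [((i * 37) % 101 - 50) / 100.0 for i in range(256)]
deq = fake_quantize_row(row, cfg["group_size"], cfg["bits"])

max_err = max(abs(a - b) for a, b in zip(row, deq))
print(f"groups: {len(row) // cfg['group_size']}, max abs error: {max_err:.4f}")
```

Because the values are written back as floats, each group of 128 weights contains at most 16 distinct values, but the tensor occupies the same storage as an unquantized fp16 tensor. Under this sketch's assumptions, the per-weight rounding error is bounded by half the group's scale.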