INT4 quantization of Qwen/Qwen2.5-VL-7B-Instruct (LLM backbone quantized, vision encoder fp16)

544728e verified 2 months ago

{
	"base_model": "Qwen/Qwen2.5-VL-7B-Instruct",
	"quantization": "int4_per_group_symmetric",
	"group_size": 128,
	"bits": 4,
	"method": "static_int4_dequantized",
	"description": "INT4 per-group symmetric quantization of LLM backbone weights. Vision encoder kept in fp16. Weights stored as dequantized fp16 for maximum compatibility.",
	"quantized_layers": 197,
	"skipped_vision_layers": 162
}
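
The config above names the scheme but not its details, so the following is only a minimal sketch of what "int4_per_group_symmetric" with "group_size": 128 and dequantized storage could look like. The function name `fake_quantize_row`, the clipping range [-7, 7], and the max-abs scale rule are assumptions for illustration; the repository may use a different range (for example [-8, 7]) or a different scale calculation.

```python
GROUP_SIZE = 128            # "group_size" from the config
BITS = 4                    # "bits" from the config
QMAX = 2 ** (BITS - 1) - 1  # 7; assumed symmetric integer range [-7, 7]


def fake_quantize_row(row, group_size=GROUP_SIZE, qmax=QMAX):
    """Quantize one weight row group by group, then dequantize back to floats.

    Hypothetical helper: it illustrates symmetric per-group quantization with
    dequantized output, not the code that produced this checkpoint.
    """
    if len(row) % group_size != 0:
        raise ValueError("row length must be a multiple of group_size")
    out = []
    for start in range(0, len(row), group_size):
        group = row[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # One scale per group; an all-zero group keeps scale 1.0 to avoid
        # dividing by zero (assumed convention).
        scale = max_abs / qmax if max_abs > 0 else 1.0
        for w in group:
            q = max(-qmax, min(qmax, round(w / scale)))  # integer code in [-7, 7]
            out.append(q * scale)                        # dequantized value
    return out


# Toy row of 256 weights, i.e. two groups of 128.
row = [(i % 17 - 8) / 100.0 for i in range(256)]
deq = fake_quantize_row(row)
```

Under these assumptions, each group of 128 weights ends up with at most 15 distinct values, and every weight moves by at most half of its group's scale. Since the description says the weights are stored as dequantized fp16, a checkpoint written this way would presumably be no smaller than the unquantized fp16 one; the quantization shows up only in the reduced precision of the stored values.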