---
base_model:
- playable/Playable1
---

### Quark Quantized Playable1

This is a fine-tuned and Quark-quantized version of Qwen/Qwen2.5-Coder-7B-Instruct, built with the 'iat-05-1' adapter.

### Model Details

- Base Model: Qwen/Qwen2.5-Coder-7B-Instruct
- Adapter: iat-05-1
- Quantization: Quark / UINT4 / AWQ / BFLOAT16
- Format: SafeTensors
- Perplexity Score: 10.953088
- Evaluation Dataset: wikitext-2-raw-v1

### Quark Info

The model was quantized with the following Quark configuration:

```python
Config(
    global_quant_config=QuantizationConfig(
        input_tensors=None,
        output_tensors=None,
        weight=QuantizationSpec(
            dtype=Dtype.uint4,
            observer_cls=,
            is_dynamic=False,
            qscheme=QSchemeType.per_group,
            ch_axis=-1,
            group_size=128,
            symmetric=False,
            round_method=RoundType.half_even,
            scale_type=ScaleType.float,
            scale_format=None,
            scale_calculation_mode=None,
            qat_spec=None,
            mx_element_dtype=None,
            zero_point_type=ZeroPointType.int32,
            is_scale_quant=False,
        ),
        bias=None,
        target_device=None,
    ),
    layer_type_quant_config={},
    layer_quant_config={},
    kv_cache_quant_config={},
    kv_cache_group=['*k_proj', '*v_proj'],
    min_kv_scale=0.0,
    softmax_quant_spec=None,
    exclude=['[]'],
    algo_config=[
        AWQConfig(
            name="awq",
            scaling_layers=[
                {'prev_op': 'input_layernorm',
                 'layers': ['self_attn.q_proj', 'self_attn.k_proj', 'self_attn.v_proj'],
                 'inp': 'self_attn.q_proj',
                 'module2inspect': 'self_attn'},
                {'prev_op': 'self_attn.v_proj',
                 'layers': ['self_attn.o_proj'],
                 'inp': 'self_attn.o_proj'},
                {'prev_op': 'post_attention_layernorm',
                 'layers': ['mlp.gate_proj', 'mlp.up_proj'],
                 'inp': 'mlp.gate_proj',
                 'module2inspect': 'mlp'},
                {'prev_op': 'mlp.up_proj',
                 'layers': ['mlp.down_proj'],
                 'inp': 'mlp.down_proj'},
            ],
            model_decoder_layers="model.layers",
        ),
    ],
    quant_mode=QuantizationMode.eager_mode,
    log_severity_level=1,
    version="0.10",
)
```
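### Usage

A minimal generation sketch, assuming the checkpoint is published under the repo id `playable/Playable1` and loads through the standard transformers API; a Quark UINT4/AWQ export may instead require Quark's own loading path. The repo id, prompt, and generation settings below are illustrative assumptions, not taken from this card.

```python
# Hypothetical usage sketch: repo id and generation settings are assumptions,
# not confirmed by this model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "playable/Playable1"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
# bfloat16 matches the BFLOAT16 entry under Model Details.
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Qwen2.5-Coder-Instruct bases ship a chat template, so apply it here.
messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```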
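### Reproducing the Perplexity Measurement

The card does not include the evaluation script behind the reported score, so here is a minimal sketch of the standard sliding-window perplexity recipe on wikitext-2-raw-v1. The context length (2048) and stride (512) are assumptions; the exact protocol used to produce 10.953088 may differ.

```python
# Sliding-window perplexity sketch over wikitext-2-raw-v1.
# Assumptions: repo id "playable/Playable1", context length 2048, stride 512.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "playable/Playable1"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model.eval()

# Concatenate the raw test split into one token stream.
test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
encodings = tokenizer("\n\n".join(test["text"]), return_tensors="pt")

max_length = 2048  # assumed evaluation context length
stride = 512       # assumed window stride
seq_len = encodings.input_ids.size(1)

nll_sum, n_tokens = 0.0, 0
prev_end = 0
for begin in range(0, seq_len, stride):
    end = min(begin + max_length, seq_len)
    trg_len = end - prev_end  # tokens newly scored in this window
    input_ids = encodings.input_ids[:, begin:end].to(model.device)
    target_ids = input_ids.clone()
    target_ids[:, :-trg_len] = -100  # mask context-only tokens from the loss

    with torch.no_grad():
        out = model(input_ids, labels=target_ids)

    # out.loss is the mean NLL over the scored tokens (labels are shifted
    # internally, so one label per window drops out of the loss).
    num_scored = (target_ids != -100).sum().item() - 1
    nll_sum += out.loss.item() * num_scored
    n_tokens += num_scored

    prev_end = end
    if end == seq_len:
        break

print(f"perplexity: {torch.exp(torch.tensor(nll_sum / n_tokens)).item():.6f}")
```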