---
base_model:
- playable/Playable1
---
### Quark Quantized Playable1

This is a fine-tuned and Quark-quantized version of Qwen/Qwen2.5-Coder-7B-Instruct, using the 'iat-05-1' adapter.

#### Model Details

- Base Model: Qwen/Qwen2.5-Coder-7B-Instruct
- Adapter: iat-05-1
- Quantization: Quark / UINT4 / AWQ / BFLOAT16
- Format: SafeTensors
- Perplexity Score: 10.953088
- Dataset: wikitext-2-raw-v1

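A minimal usage sketch follows. It assumes the quantized SafeTensors export in this repository (the `playable/Playable1` id is taken from the metadata above) can be loaded directly through the Hugging Face `transformers` API; depending on how the Quark/AWQ weights were serialized, the Quark runtime or an AWQ-aware backend may also be required.

```python
# Hedged sketch: the repository id, dtype, and device placement below are
# assumptions based on the details listed above, not a verified recipe.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "playable/Playable1"  # assumption: this repository

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BFLOAT16 noted above
    device_map="auto",           # requires the accelerate package
)

prompt = "Write a Python function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
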
#### Quantization Log

```text
[QUARK-INFO]: Quantizing with the quantization configuration:
Config(
    global_quant_config=QuantizationConfig(
        input_tensors=None,
        output_tensors=None,
        weight=QuantizationSpec(
            dtype=Dtype.uint4,
            observer_cls=<class 'quark.torch.quantization.observer.observer.PerGroupMinMaxObserver'>,
            is_dynamic=False,
            qscheme=QSchemeType.per_group,
            ch_axis=-1,
            group_size=128,
            symmetric=False,
            round_method=RoundType.half_even,
            scale_type=ScaleType.float,
            scale_format=None,
            scale_calculation_mode=None,
            qat_spec=None,
            mx_element_dtype=None,
            zero_point_type=ZeroPointType.int32,
            is_scale_quant=False,
        ),
        bias=None,
        target_device=None,
    ),
    layer_type_quant_config={},
    layer_quant_config={},
    kv_cache_quant_config={},
    kv_cache_group=['*k_proj', '*v_proj'],
    min_kv_scale=0.0,
    softmax_quant_spec=None,
    exclude=['[]'],
    algo_config=[
        AWQConfig(
            name="awq",
            scaling_layers=[
                {'prev_op': 'input_layernorm', 'layers': ['self_attn.q_proj', 'self_attn.k_proj', 'self_attn.v_proj'], 'inp': 'self_attn.q_proj', 'module2inspect': 'self_attn'},
                {'prev_op': 'self_attn.v_proj', 'layers': ['self_attn.o_proj'], 'inp': 'self_attn.o_proj'},
                {'prev_op': 'post_attention_layernorm', 'layers': ['mlp.gate_proj', 'mlp.up_proj'], 'inp': 'mlp.gate_proj', 'module2inspect': 'mlp'},
                {'prev_op': 'mlp.up_proj', 'layers': ['mlp.down_proj'], 'inp': 'mlp.down_proj'}
            ],
            model_decoder_layers="model.layers",
        ),
    ],
    quant_mode=QuantizationMode.eager_mode,
    log_severity_level=1,
    version="0.10",
)

[QUARK-INFO]: In-place OPs replacement start.

[QUARK-INFO]: Module exclusion from quantization summary:
| Exclude pattern | Number of modules excluded |
| []              | 0                          |
```
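For reference, the configuration dump above can be written out as Python code against Quark's configuration classes. This is only a sketch: the class names and field values mirror the printed repr, but the import paths are assumptions that may differ between Quark releases (only the observer path is printed verbatim in the log), and the entry point for actually running the quantization is not shown.

```python
# Hedged reconstruction of the logged config. Field names and values come
# from the repr above; the import paths marked "assumed" may need adjusting
# to match the installed Quark version.
from quark.torch.quantization.config.config import (  # assumed module path
    Config,
    QuantizationConfig,
    QuantizationSpec,
    AWQConfig,
)
from quark.torch.quantization.config.type import (  # assumed module path
    Dtype,
    QSchemeType,
    RoundType,
    ScaleType,
    ZeroPointType,
)
from quark.torch.quantization.observer.observer import PerGroupMinMaxObserver

# UINT4, asymmetric, per-group (group_size=128) weight-only quantization.
weight_spec = QuantizationSpec(
    dtype=Dtype.uint4,
    observer_cls=PerGroupMinMaxObserver,
    is_dynamic=False,
    qscheme=QSchemeType.per_group,
    ch_axis=-1,
    group_size=128,
    symmetric=False,
    round_method=RoundType.half_even,
    scale_type=ScaleType.float,
    zero_point_type=ZeroPointType.int32,
)

config = Config(
    global_quant_config=QuantizationConfig(weight=weight_spec),
    # AWQ smoothing over the attention and MLP projections of model.layers,
    # matching the scaling_layers listed in the log.
    algo_config=[AWQConfig(name="awq", model_decoder_layers="model.layers")],
)
```

Applying this configuration through Quark's quantizer with a small calibration set would reproduce the weight-only UINT4 AWQ setup described in Model Details; that step is omitted here because its API varies across Quark releases.
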
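The perplexity figure in Model Details can in principle be checked with a standard wikitext-2-raw-v1 evaluation loop, sketched below. The repository id, window size, and stride are assumptions, and the reported 10.953088 came from the quantization run's own evaluation, so numbers from this sketch may not match it exactly.

```python
# Hedged sketch of a wikitext-2-raw-v1 perplexity measurement with
# non-overlapping 2048-token windows. The window size is an assumption;
# it is not taken from the quantization log above.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "playable/Playable1"  # assumption: this repository
seq_len = 2048                   # assumption: evaluation window size

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model.eval()

test_text = "\n\n".join(
    load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"]
)
input_ids = tokenizer(test_text, return_tensors="pt").input_ids

nlls, n_tokens = [], 0
for start in range(0, input_ids.size(1), seq_len):
    chunk = input_ids[:, start : start + seq_len].to(model.device)
    if chunk.size(1) < 2:  # skip a trailing window too short to score
        continue
    with torch.no_grad():
        # labels=chunk makes the model return mean next-token cross-entropy
        loss = model(chunk, labels=chunk).loss
    nlls.append(loss * (chunk.size(1) - 1))  # convert back to a summed NLL
    n_tokens += chunk.size(1) - 1

print("perplexity:", torch.exp(torch.stack(nlls).sum() / n_tokens).item())
```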