iswaalex committed · Commit 9ec69e0 · verified · Parent(s): 1356f91

Create README.md
---
base_model:
- playable/Playable1
---
### Quark Quantized Playable1

This is a fine-tuned and Quark-quantized version of Qwen/Qwen2.5-Coder-7B-Instruct using the `iat-05-1` adapter.

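Since the checkpoint is exported as SafeTensors, loading it should follow the standard `transformers` pattern. The sketch below is an assumption rather than a verified snippet from this repo: the repo id comes from the front matter above, the dtype mirrors the BFLOAT16 entry in the details below, and a Quark UINT4 export may additionally require the Quark runtime to be installed.

```python
# Hedged loading sketch -- repo id taken from the front matter above; a Quark
# UINT4 checkpoint may also need the Quark runtime package installed.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "playable/Playable1"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="bfloat16",  # matches the BFLOAT16 compute dtype in the details below
    device_map="auto",
)

prompt = "Write a Python function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```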
### Model Details

- **Base Model:** Qwen/Qwen2.5-Coder-7B-Instruct
- **Adapter:** iat-05-1
- **Quantization:** Quark / UINT4 / AWQ / BFLOAT16
- **Format:** SafeTensors
- **Perplexity Score:** 10.953088
- **Dataset:** wikitext-2-raw-v1

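For reference, the perplexity score above is the exponential of the mean per-token negative log-likelihood over the eval split. The snippet below only illustrates that formula; the per-token losses in it are made up, and an actual wikitext-2-raw-v1 run averages over the whole test set.

```python
import math

# Perplexity = exp(mean negative log-likelihood per token).
# The per-token losses below are hypothetical, for illustration only;
# a real wikitext-2-raw-v1 evaluation averages over the full test split.
nlls = [2.1, 2.6, 2.3, 2.5]  # hypothetical per-token NLLs in nats
perplexity = math.exp(sum(nlls) / len(nlls))
print(round(perplexity, 3))
```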
```text
[QUARK-INFO]: Quantizing with the quantization configuration:
Config(
    global_quant_config=QuantizationConfig(
        input_tensors=None,
        output_tensors=None,
        weight=QuantizationSpec(
            dtype=Dtype.uint4,
            observer_cls=<class 'quark.torch.quantization.observer.observer.PerGroupMinMaxObserver'>,
            is_dynamic=False,
            qscheme=QSchemeType.per_group,
            ch_axis=-1,
            group_size=128,
            symmetric=False,
            round_method=RoundType.half_even,
            scale_type=ScaleType.float,
            scale_format=None,
            scale_calculation_mode=None,
            qat_spec=None,
            mx_element_dtype=None,
            zero_point_type=ZeroPointType.int32,
            is_scale_quant=False,
        ),
        bias=None,
        target_device=None,
    ),
    layer_type_quant_config={},
    layer_quant_config={},
    kv_cache_quant_config={},
    kv_cache_group=['*k_proj', '*v_proj'],
    min_kv_scale=0.0,
    softmax_quant_spec=None,
    exclude=['[]'],
    algo_config=[
        AWQConfig(
            name="awq",
            scaling_layers=[{'prev_op': 'input_layernorm', 'layers': ['self_attn.q_proj', 'self_attn.k_proj', 'self_attn.v_proj'], 'inp': 'self_attn.q_proj', 'module2inspect': 'self_attn'}, {'prev_op': 'self_attn.v_proj', 'layers': ['self_attn.o_proj'], 'inp': 'self_attn.o_proj'}, {'prev_op': 'post_attention_layernorm', 'layers': ['mlp.gate_proj', 'mlp.up_proj'], 'inp': 'mlp.gate_proj', 'module2inspect': 'mlp'}, {'prev_op': 'mlp.up_proj', 'layers': ['mlp.down_proj'], 'inp': 'mlp.down_proj'}],
            model_decoder_layers="model.layers",
        ),
    ],
    quant_mode=QuantizationMode.eager_mode,
    log_severity_level=1,
    version="0.10",
)
```

+ [QUARK-INFO]: In-place OPs replacement start.
62
+
63
+ [QUARK-INFO]: Module exclusion from quantization summary:
64
+ | Exclude pattern | Number of modules excluded |
65
+ | [] | 0 |
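The AWQConfig in the log lists, for each decoder block, which layers share a scale and where that scale can be folded (e.g. into `input_layernorm` or the preceding projection). The identity AWQ relies on is that scaling a weight matrix per input channel can be exactly compensated by inverse-scaling the activation, so scales can be searched to protect salient channels before UINT4 rounding without changing the float output. A toy illustration in pure Python (all numbers hypothetical):

```python
# AWQ's core equivalence: scaling weight input-channels by s while dividing
# the matching activation channels by s leaves the layer output unchanged.
def matvec(w, x):
    # w: rows = output features, columns = input channels
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

x = [1.0, 2.0, 3.0]
w = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.5]]
s = [4.0, 0.5, 2.0]  # hypothetical per-input-channel AWQ scales

w_scaled = [[wi * si for wi, si in zip(row, s)] for row in w]
x_scaled = [xi / si for xi, si in zip(x, s)]

y = matvec(w, x)           # original output
y_eq = matvec(w_scaled, x_scaled)  # identical output after rescaling
```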