Commit 89fa3b5 (verified) by soroushtabesh
1 parent: 5893d4a

Add storage layout details

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -46,7 +46,7 @@ standard zero-shot suite (ARC-C/E, HellaSwag, PIQA, Winogrande):
  - **Bits / weight (effective):** ≈2.13 bpp
  - **Codebook:** 2-bit symmetric scalar `{-2, -1, 0, +1} × scale`
  - **Group size:** 128
- - **Format:** [Humming](https://github.com/IST-DASLab/humming) (`quant_method: "humming"`, `b_dtype: "uint2"`)
+ - **Format:** [Humming](https://github.com/inclusionAI/humming) (`quant_method: "humming"`, `b_dtype: "uint2"`)
  - **Pipeline:** GPTQ initialization → Gumbel-Softmax refinement (Lion optimizer)
  
  ### Storage layout (why the HF UI shows I32 + BF16)
@@ -84,7 +84,7 @@ weight is `2 bits (packed) + 16 bits / 128 (group scale) ≈ 2.13 bpp`. The
  ```
  
  Loading this checkpoint requires a vLLM build with the
- [`humming`](https://github.com/IST-DASLab/humming) MoE kernel installed (see
+ [`humming`](https://github.com/inclusionAI/humming) MoE kernel installed (see
  the [GSQ repo](https://github.com/IST-DASLab/GSQ) `scripts/setup_env.sh` for
  the exact install line).
  
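
Note on the ≈2.13 bpp figure quoted in the second hunk header: it can be checked with a few lines of Python. This is only a sketch of the arithmetic stated in the README (`2 bits (packed) + 16 bits / 128 (group scale)`); the function name `effective_bits_per_weight` and its parameters are hypothetical and are not part of Humming, GSQ, or vLLM.

```python
def effective_bits_per_weight(code_bits: int, scale_bits: int, group_size: int) -> float:
    """Bits for the packed code plus the per-group scale amortized over the group.

    Hypothetical helper for illustration only; it mirrors the README arithmetic
    and does not inspect the checkpoint.
    """
    return code_bits + scale_bits / group_size


# README values: 2-bit codes, one 16-bit (BF16) scale per group of 128 weights.
bpw = effective_bits_per_weight(code_bits=2, scale_bits=16, group_size=128)
print(bpw)  # 2.125, which the README reports as ≈2.13 bpp
```

The scale contributes only 16 / 128 = 0.125 bits per weight, so the group size is what sets the overhead on top of the 2-bit codes.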