Update README.md
README.md CHANGED
@@ -10,6 +10,9 @@ This model is a mixed int4 model with group_size 128 and symmetric quantization
 Non expert layers are fallback to 8 bits. Please refer to Section Generate the model for more details.
 Please follow the license of the original model.
 
+**The `e_score_correction_bias` is stored in BF16** because, when loaded in Transformers, its dtype is automatically converted to BF16. As a result, it is difficult for us to preserve it in FP32 within our tools.
+Please use it with caution.
+
 ## How To Use
 
 ### INT4 Inference
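The BF16 note above can be verified directly after loading the checkpoint. Below is a minimal sketch, not part of this PR: it assumes the checkpoint loads through Transformers with `trust_remote_code`, and the repository ID is a placeholder, not the real model name. The parameter name `e_score_correction_bias` is taken from the README note.

```python
# Minimal sketch (assumption: the quantized checkpoint loads via Transformers).
# The repo ID below is a placeholder, not the actual model repository.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "your-org/your-int4-model",   # placeholder repository ID
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

# Transformers casts weights to the model dtype on load, which is why the
# README says e_score_correction_bias ends up in BF16 rather than FP32.
for name, tensor in model.state_dict().items():
    if "e_score_correction_bias" in name:
        print(name, tensor.dtype)
```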