Update README.md

README.md CHANGED

```diff
@@ -12,7 +12,7 @@ base_model:
 
 This model is an int2 model with group_size 64 and symmetric quantization of [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1), generated by the [intel/auto-round](https://github.com/intel/auto-round) algorithm. Some layers fall back to 4/16 bits; refer to the "Generate the model" section for details of the mixed-bit setting.
 
-Please follow the license of the original model. This model could **NOT** run on other severing
+Please follow the license of the original model. This model can **NOT** run on other serving frameworks.
 
 ## How To Use
 
@@ -20,7 +20,7 @@ Please follow the license of the original model. This model could **NOT** run on
 
 Please note that int2 **may be slower** than int4 on CUDA due to a kernel issue.
 
-**To prevent potential overflow, we recommend using the CPU version detailed in the next section.**
+**To prevent potential overflow and achieve better accuracy, we recommend using the CPU version detailed in the next section.**
 
 ~~~python
 import transformers
```
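For readers unfamiliar with the scheme the README names, here is a minimal sketch of what "symmetric quantization with group_size 64" means at int2 precision. This is illustrative only: the `quantize_symmetric` helper below is hypothetical and is not part of auto-round's API.

```python
import numpy as np

def quantize_symmetric(weights, bits=2, group_size=64):
    # Hypothetical helper: group-wise symmetric quantization. Each group
    # of `group_size` values shares one scale; int2 values occupy the
    # signed range [-2, 1], and zero maps exactly to zero (symmetric).
    qmax = 2 ** (bits - 1) - 1   # 1 for int2
    qmin = -(2 ** (bits - 1))    # -2 for int2
    groups = weights.reshape(-1, group_size)
    scale = np.abs(groups).max(axis=1, keepdims=True) / qmax
    scale[scale == 0] = 1.0      # avoid division by zero in all-zero groups
    q = np.clip(np.round(groups / scale), qmin, qmax)
    return q.astype(np.int8), scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 64)).astype(np.float32)
q, scale = quantize_symmetric(w)
dequant = q * scale              # per-element reconstruction error <= scale / 2
```

With only four representable levels per group, the scale (and hence the error bound) is set entirely by the group's largest-magnitude weight, which is why auto-round lets sensitive layers fall back to 4/16 bits.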