Improve model card: add pipeline tag, library name, update license, and paper reference
#1 opened by nielsr (HF Staff)

README.md CHANGED
@@ -1,16 +1,20 @@
 ---
-datasets:
-- NeelNanda/pile-10k
 base_model:
 - deepseek-ai/DeepSeek-R1
+datasets:
+- NeelNanda/pile-10k
+pipeline_tag: text-generation
+library_name: transformers
+license: apache-2.0
+---
 
+This model is an int2 model with group_size 64 and symmetric quantization of [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1), generated by the **SignRoundV2** algorithm described in the paper [SignRoundV2: Closing the Performance Gap in Extremely Low-Bit Post-Training Quantization for LLMs](https://huggingface.co/papers/2512.04746).
 
-
----
+For more details on the AutoRound project and its implementation, see the [GitHub repository](https://github.com/intel/auto-round).
 
 ## Model Details
 
-
+Some layers fall back to 4/16 bits. Refer to the "Generate the model" section for more details on the mixed-bit settings.
 
 Please follow the license of the original model. This model can **NOT** run on other serving frameworks.
 
@@ -439,6 +443,13 @@ The license on this model does not constitute legal advice. We are not responsib
 
 ## Cite
 
-
+```bibtex
+@article{cheng2025signroundv2,
+  title={SignRoundV2: Closing the Performance Gap in Extremely Low-Bit Post-Training Quantization for LLMs},
+  author={Cheng, Wenhua and Zhang, Weiwei and Guo, Heng and Shen, Haihao},
+  journal={arXiv preprint arXiv:2512.04746},
+  year={2025}
+}
+```
 
-[arxiv](https://arxiv.org/abs/
+[arxiv](https://arxiv.org/abs/2512.04746)
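
The new `pipeline_tag: text-generation` and `library_name: transformers` metadata point at the standard transformers loading path. Below is a minimal sketch of that usage, assuming the int2 checkpoint loads through `AutoModelForCausalLM` with the `auto-round` package installed; the repo id is a placeholder, not this repository's actual id, and the card's note that the model cannot run on other serving frameworks still applies:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder id: replace with this repository's actual model id.
model_id = "path/to/DeepSeek-R1-int2-sym"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",       # spread the (still very large) model across available devices
    torch_dtype="auto",      # keep the dtypes stored in the quantized checkpoint
    trust_remote_code=True,  # DeepSeek-R1 ships custom modeling code
)

prompt = "Explain int2 weight-only quantization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```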
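
For the mixed-bit note added under "Model Details", here is a hedged sketch of how such a recipe is typically expressed with AutoRound. The layer names are illustrative, and argument names such as `layer_config` and `quantize_and_save` follow recent auto-round releases and may differ across versions; the exact recipe used for this checkpoint is the one in the card's "Generate the model" section:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound  # assumes the auto-round package is installed

model_name = "deepseek-ai/DeepSeek-R1"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Illustrative per-layer overrides: most weights go to int2 (group_size 64, symmetric),
# while a few sensitive layers stay at 4 or 16 bits. These are example names,
# not the exact fallback list used for this checkpoint.
layer_config = {
    "lm_head": {"bits": 16},
    "model.layers.0.self_attn.q_proj": {"bits": 4},
}

autoround = AutoRound(
    model,
    tokenizer,
    bits=2,          # int2 weight-only quantization
    group_size=64,   # as stated in the model card
    sym=True,        # symmetric quantization
    layer_config=layer_config,
)
autoround.quantize_and_save("DeepSeek-R1-int2-sym")
```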
|