Add pipeline tag, library name, and base_model metadata #1
opened by nielsr (HF Staff)

README.md (CHANGED)
@@ -4,7 +4,11 @@ tags:
 - 3-bit
 - Quantization
 - Pseudo-Quantization
+pipeline_tag: text-generation
+library_name: transformers
+base_model: deepseek-ai/DeepSeek-R1-Distill-Llama-70B
 ---
+
 # QuantLRM-R1-Llama-70B-3-bit
 
 3-bit quantized `DeepSeek-R1-Distill-Llama-70B` based on [QuantLRM](https://www.arxiv.org/abs/2602.02581), a state-of-the-art quantization method of large reasoning models via fine-tuning signals
@@ -15,15 +19,12 @@ This is the pseudo-quantized model (weights are dequantized back to full-precisi
 
 ### Model Description
 
-<!-- Provide a longer summary of what this model is. -->
-
 
 - **Developed by:** Nan Zhang (njz5124@psu.edu)
 - **Model type:** 3-bit pseudo-quantized version of `DeepSeek-R1-Distill-Llama-70B`
 
 ### Model Sources
 
-<!-- Provide the basic links for the model. -->
 
 - **Repository:** https://github.com/psunlpgroup/QuantLRM
 - **Paper:** https://www.arxiv.org/abs/2602.02581
@@ -31,7 +32,6 @@ This is the pseudo-quantized model (weights are dequantized back to full-precisi
 
 ## Uses
 
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 
 This model is designed to be used with `vLLM` due to its inference optimization. Please use the tokenizer of `deepseek-ai/DeepSeek-R1-Distill-Llama-70B`.
 
@@ -49,7 +49,6 @@ This model achieves 2.12% improvement (based on average scores of various reason
 
 ## Citation
 
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
 
 **BibTeX:**
 
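The README describes this checkpoint as "pseudo-quantized": weights are snapped to a 3-bit grid and then dequantized back to full precision, so the file ships as ordinary float tensors. The sketch below illustrates that round-trip with plain round-to-nearest absmax quantization in NumPy. It is only an illustration of the pseudo-quantization concept, not the QuantLRM method itself (which chooses quantization settings via fine-tuning signals, per the paper); the function name and the per-tensor scale are my own choices.

```python
import numpy as np

def pseudo_quantize_3bit(w: np.ndarray) -> np.ndarray:
    """Round-to-nearest 3-bit pseudo-quantization of a weight tensor.

    Weights are mapped onto a symmetric 3-bit integer grid (at most
    2**3 = 8 levels) and immediately dequantized, so the result is a
    full-precision tensor that only takes 3-bit-representable values.
    NOTE: plain RTN for illustration, not QuantLRM's actual procedure.
    """
    bits = 3
    qmax = 2 ** (bits - 1) - 1                      # 3 -> codes in [-4, 3]
    max_abs = float(np.abs(w).max())
    scale = max_abs / qmax if max_abs > 0 else 1.0  # per-tensor absmax scale
    codes = np.clip(np.round(w / scale), -(qmax + 1), qmax)  # integer codes
    return codes * scale                            # dequantize back

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)
w_dq = pseudo_quantize_3bit(w)
```

With a per-tensor scale the round-trip error of each weight is bounded by half a quantization step (`scale / 2`), which is why pseudo-quantized checkpoints can stand in for the real low-bit deployment when measuring accuracy.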
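The Uses section says to run the model with `vLLM` and to use the tokenizer of `deepseek-ai/DeepSeek-R1-Distill-Llama-70B`. A minimal offline-inference sketch of that setup is below; the local checkpoint path is a placeholder for wherever this repo is downloaded, and the sampling settings are illustrative, not recommendations from the card. (Serving a 70B model this way needs substantial GPU memory.)

```python
from vllm import LLM, SamplingParams

# Load this pseudo-quantized checkpoint, but take the tokenizer from the
# base model, as the model card instructs.
llm = LLM(
    model="./QuantLRM-R1-Llama-70B-3-bit",  # placeholder: local path or hub id
    tokenizer="deepseek-ai/DeepSeek-R1-Distill-Llama-70B",
)

params = SamplingParams(temperature=0.6, max_tokens=1024)  # illustrative values
outputs = llm.generate(["How many primes are there below 100?"], params)
print(outputs[0].outputs[0].text)
```

Passing `tokenizer=` explicitly avoids vLLM falling back to any tokenizer files bundled with the quantized repo.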