Commit 68beec3 (verified) · committed by guanwenyu1995 · 1 parent: a415ed2

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -12,7 +12,7 @@ library_name: transformers

  <p align="center">
  <a href="https://github.com/OpenBMB/MiniCPM/" target="_blank">GitHub Repo</a> |
- <a href="TODO_TECHNICAL_REPORT_LINK" target="_blank">Technical Report</a>
+ <a href="https://github.com/OpenBMB/MiniCPM/blob/main/docs/BitCPM_CANN.pdf" target="_blank">Technical Report</a>
  </p>
  <p align="center">
  👋 Join us on <a href="https://discord.gg/3cGQn9b3YM" target="_blank">Discord</a> and <a href="https://github.com/OpenBMB/MiniCPM/blob/main/assets/wechat.jpg" target="_blank">WeChat</a>
@@ -108,7 +108,7 @@ The converted model can then be loaded for inference in the same way as [openbmb

  BitCPM4-CANN uses a ternary quantizer that maps each weight group to {-1, 0, 1} scaled by a group-wise factor, trained with Straight-Through Estimator (STE) for gradient flow. The unquantized checkpoint preserves the full-precision latent weights alongside the quantizer parameters, allowing the model to continue learning under quantization constraints during fine-tuning.

- For full technical details, please refer to our [Technical Report](TODO_TECHNICAL_REPORT_LINK).
+ For full technical details, please refer to our [Technical Report](https://github.com/OpenBMB/MiniCPM/blob/main/docs/BitCPM_CANN.pdf).

  ## Statement
  - As a language model, BitCPM4-CANN generates content by learning from a vast amount of text.
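
The README paragraph in the second hunk describes a group-wise ternary quantizer trained with a Straight-Through Estimator. A minimal sketch of that idea follows, in plain Python. The README does not say how the group scale is computed, so the absmean rule used here (scale = mean of absolute values) is an assumption, and the function names `ternary_quantize_group`, `fake_quantize`, and `ste_backward` are hypothetical and not taken from the BitCPM4-CANN code.

```python
from typing import List, Tuple


def ternary_quantize_group(weights: List[float]) -> Tuple[List[int], float]:
    """Map one weight group to codes in {-1, 0, 1} plus a single scale.

    Assumption: absmean scaling. The source only says "group-wise factor".
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    if scale == 0.0:
        return [0] * len(weights), 0.0
    # Round to the nearest integer, then clip into the ternary range.
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return codes, scale


def fake_quantize(weights: List[float], group_size: int) -> List[float]:
    """Forward pass: quantize each group, then rescale back to floats."""
    out: List[float] = []
    for start in range(0, len(weights), group_size):
        codes, scale = ternary_quantize_group(weights[start:start + group_size])
        out.extend(c * scale for c in codes)
    return out


def ste_backward(grad_wrt_quantized: List[float]) -> List[float]:
    """Backward pass under STE: rounding is treated as the identity,
    so the gradient reaches the full-precision latent weights unchanged."""
    return list(grad_wrt_quantized)


codes, scale = ternary_quantize_group([1.0, -0.25, -2.0, 0.75])
print(codes, scale)
```

In a real training loop the same effect would normally come from an autograd framework rather than a hand-written backward function. The sketch only shows why keeping the latent weights in the checkpoint matters: the gradient updates them, while the forward pass sees only the ternary values.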