guanwenyu1995 committed
Commit f784def · verified · 1 Parent(s): 19c76bb

Update README naming from BitCPM4 to BitCPM
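
A rename like this one can be reproduced mechanically. The snippet below is a minimal sketch, not the script used for this commit: the helper name `rename_bitcpm` and both regular expressions are assumptions made for illustration. It rewrites the `BitCPM4` model prefix and the `bitcpm4cann` BibTeX key while leaving `MiniCPM4` untouched, which matches the changes visible in the diff.

```python
import re

# Hypothetical helper (not part of the commit): apply the BitCPM4 -> BitCPM
# rename to a block of README text.
_MODEL_NAME = re.compile(r"\bBitCPM4\b")    # model prefix, e.g. BitCPM4-CANN-1B
_BIBTEX_KEY = re.compile(r"\bbitcpm4cann\b")  # lowercase citation key


def rename_bitcpm(text: str) -> str:
    """Return text with the old naming replaced by the new naming."""
    text = _MODEL_NAME.sub("BitCPM", text)
    return _BIBTEX_KEY.sub("bitcpmcann", text)


if __name__ == "__main__":
    sample = "path = 'openbmb/BitCPM4-CANN-1B'  # compared against MiniCPM4"
    print(rename_bitcpm(sample))
```

The word-boundary anchors keep the pattern from touching `MiniCPM4`, which shares the `CPM4` suffix but is a different model family and is left unchanged in this commit.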

Files changed (1)
  1. README.md +15 -15
README.md CHANGED
@@ -20,9 +20,9 @@ library_name: transformers
 
  ## Introduction
 
- BitCPM4-CANN is the first end-to-end 1.58-bit (ternary) large language model training system natively built on Huawei Ascend NPU. The system integrates quantization-aware training (QAT) into the Megatron-LM framework with MindSpeed acceleration, covering the full training stack from custom ternary operators to distributed parallel training on Ascend 910B.
+ BitCPM-CANN is the first end-to-end 1.58-bit (ternary) large language model training system natively built on Huawei Ascend NPU. The system integrates quantization-aware training (QAT) into the Megatron-LM framework with MindSpeed acceleration, covering the full training stack from custom ternary operators to distributed parallel training on Ascend 910B.
 
- We train a family of four models—BitCPM4-CANN-0.5B/1B/3B/8B—and evaluate them against their full-precision MiniCPM4 counterparts across 11 benchmarks. The 1B/3B/8B models retain **95.7%–97.2%** of full-precision performance, while enabling approximately **6× memory reduction** at inference time. QAT introduces only **5% training throughput overhead** (148 vs. 155 TFLOP/s per NPU).
+ We train a family of four models—BitCPM-CANN-0.5B/1B/3B/8B—and evaluate them against their full-precision MiniCPM4 counterparts across 11 benchmarks. The 1B/3B/8B models retain **95.7%–97.2%** of full-precision performance, while enabling approximately **6× memory reduction** at inference time. QAT introduces only **5% training throughput overhead** (148 vs. 155 TFLOP/s per NPU).
 
  ### Key Features
 
@@ -35,27 +35,27 @@ We train a family of four models—BitCPM4-CANN-0.5B/1B/3B/8B—and evaluate the
 
  > The models in this repository are in **pseudo-quantized (fake quantization) format**. This means the weights are stored in standard floating-point format with ternary values already applied during training. You can load and run inference with these models **exactly the same way as full-precision models**—no special quantization libraries or custom kernels are required.
 
- ## BitCPM4-CANN Model Family
+ ## BitCPM-CANN Model Family
 
  | Model | HuggingFace | GGUF |
  |-------|-------------|------|
- | BitCPM4-CANN-0.5B | [openbmb/BitCPM4-CANN-0.5B](https://huggingface.co/openbmb/BitCPM4-CANN-0.5B) | [openbmb/BitCPM4-CANN-0.5B-gguf](https://huggingface.co/openbmb/BitCPM4-CANN-0.5B-gguf) |
- | BitCPM4-CANN-1B | [openbmb/BitCPM4-CANN-1B](https://huggingface.co/openbmb/BitCPM4-CANN-1B) | [openbmb/BitCPM4-CANN-1B-gguf](https://huggingface.co/openbmb/BitCPM4-CANN-1B-gguf) |
- | BitCPM4-CANN-3B | [openbmb/BitCPM4-CANN-3B](https://huggingface.co/openbmb/BitCPM4-CANN-3B) | [openbmb/BitCPM4-CANN-3B-gguf](https://huggingface.co/openbmb/BitCPM4-CANN-3B-gguf) |
- | BitCPM4-CANN-8B | [openbmb/BitCPM4-CANN-8B](https://huggingface.co/openbmb/BitCPM4-CANN-8B) | [openbmb/BitCPM4-CANN-8B-gguf](https://huggingface.co/openbmb/BitCPM4-CANN-8B-gguf) |
+ | BitCPM-CANN-0.5B | [openbmb/BitCPM-CANN-0.5B](https://huggingface.co/openbmb/BitCPM-CANN-0.5B) | [openbmb/BitCPM-CANN-0.5B-gguf](https://huggingface.co/openbmb/BitCPM-CANN-0.5B-gguf) |
+ | BitCPM-CANN-1B | [openbmb/BitCPM-CANN-1B](https://huggingface.co/openbmb/BitCPM-CANN-1B) | [openbmb/BitCPM-CANN-1B-gguf](https://huggingface.co/openbmb/BitCPM-CANN-1B-gguf) |
+ | BitCPM-CANN-3B | [openbmb/BitCPM-CANN-3B](https://huggingface.co/openbmb/BitCPM-CANN-3B) | [openbmb/BitCPM-CANN-3B-gguf](https://huggingface.co/openbmb/BitCPM-CANN-3B-gguf) |
+ | BitCPM-CANN-8B | [openbmb/BitCPM-CANN-8B](https://huggingface.co/openbmb/BitCPM-CANN-8B) | [openbmb/BitCPM-CANN-8B-gguf](https://huggingface.co/openbmb/BitCPM-CANN-8B-gguf) |
 
  ## Usage
 
  ### Inference with Transformers
 
- Since BitCPM4-CANN models are in pseudo-quantized format, you can use them exactly like standard full-precision models:
+ Since BitCPM-CANN models are in pseudo-quantized format, you can use them exactly like standard full-precision models:
 
  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer
  import torch
  torch.manual_seed(0)
 
- path = 'openbmb/BitCPM4-CANN-1B'
+ path = 'openbmb/BitCPM-CANN-1B'
  device = "cuda"
  tokenizer = AutoTokenizer.from_pretrained(path)
  model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16, device_map=device, trust_remote_code=True)
@@ -93,7 +93,7 @@ print(responds)
 
  ### Main Results
 
- BitCPM4-CANN models are evaluated against their full-precision MiniCPM4 counterparts across 11 benchmarks spanning commonsense reasoning, domain knowledge, and mathematics & reasoning.
+ BitCPM-CANN models are evaluated against their full-precision MiniCPM4 counterparts across 11 benchmarks spanning commonsense reasoning, domain knowledge, and mathematics & reasoning.
 
  | Task | 8B FP | 8B Ternary | 3B FP | 3B Ternary | 1B FP | 1B Ternary | 0.5B FP | 0.5B Ternary |
  |------|-------|------------|-------|------------|-------|------------|---------|--------------|
@@ -141,19 +141,19 @@ The system is built as a four-layer vertical stack on Ascend NPU:
  For full technical details, please refer to our [Technical Report](https://github.com/OpenBMB/MiniCPM/blob/main/docs/BitCPM_CANN.pdf).
 
  ## Statement
- - As a language model, BitCPM4-CANN generates content by learning from a vast amount of text.
+ - As a language model, BitCPM-CANN generates content by learning from a vast amount of text.
  - However, it does not possess the ability to comprehend or express personal opinions or value judgments.
- - Any content generated by BitCPM4-CANN does not represent the viewpoints or positions of the model developers.
- - Therefore, when using content generated by BitCPM4-CANN, users should take full responsibility for evaluating and verifying it on their own.
+ - Any content generated by BitCPM-CANN does not represent the viewpoints or positions of the model developers.
+ - Therefore, when using content generated by BitCPM-CANN, users should take full responsibility for evaluating and verifying it on their own.
 
  ## LICENSE
- - This repository and BitCPM4-CANN models are released under the [Apache-2.0](https://github.com/OpenBMB/MiniCPM/blob/main/LICENSE) License.
+ - This repository and BitCPM-CANN models are released under the [Apache-2.0](https://github.com/OpenBMB/MiniCPM/blob/main/LICENSE) License.
 
  ## Citation
  - Please cite our technical report if you find our work valuable.
 
  ```bibtex
- @article{bitcpm4cann,
+ @article{bitcpmcann,
  title={{BitCPM-CANN}: Native 1.58-Bit Large Language Model Training on Ascend NPU},
  author={BitCPM Team},
  year={2026}