DarkSca committed on
Commit 3685eee · verified · 1 Parent(s): 20090b4

Upload folder using huggingface_hub
.trillim-quantize-complete ADDED
@@ -0,0 +1 @@
+ ready
README.md CHANGED
@@ -1,38 +1,62 @@
  ---
  license: mit
  ---
 
- This is a reproduction of the <a href="https://arxiv.org/abs/2402.17764"> BitNet b1.58</a> paper. The models are trained with <a href="https://github.com/togethercomputer/RedPajama-Data">RedPajama dataset</a> for 100B tokens. The hypers, as well as two-stage LR and weight decay, are implemented as suggested in their following <a href="https://github.com/microsoft/unilm/blob/master/bitnet/The-Era-of-1-bit-LLMs__Training_Tips_Code_FAQ.pdf">paper</a>. All models are open-source in the <a href="https://huggingface.co/1bitLLM">repo</a>. We will train larger models and/or more tokens when resource is available.
-
- ## Results
- PPL and zero-shot accuracy:
- | Models | PPL| ARCe| ARCc| HS | BQ | OQ | PQ | WGe | Avg
- |-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
- | FP16 700M (reported) | 12.33 | 54.7 | 23.0 | 37.0 | 60.0 | 20.2 | 68.9 | 54.8 | 45.5 |
- | BitNet b1.58 700M (reported) | 12.87 | 51.8 | 21.4 | 35.1 | 58.2 | 20.0 | 68.1 | 55.2 | 44.3 |
- | BitNet b1.58 700M (reproduced) | 12.78 | 51.4 | 21.8 | 35.0 | 59.6 | 20.6 | 67.5 | 55.4 | 44.5 |
- | FP16 1.3B (reported) | 11.25 | 56.9 | 23.5 | 38.5 | 59.1 | 21.6 | 70.0 | 53.9 | 46.2
- | BitNet b1.58 1.3B (reported) | 11.29 | 54.9 | 24.2 | 37.7 | 56.7 | 19.6 | 68.8 | 55.8 | 45.4 |
- | BitNet b1.58 1.3B (reproduced) | 11.19 | 55.8 | 23.7 | 37.6 | 59.0 | 20.2 | 69.2 | 56.0 | 45.9
- | FP16 3B (reported) | 10.04 | 62.1 | 25.6 | 43.3 | 61.8 | 24.6 | 72.1 | 58.2 | 49.7
- | BitNet b1.58 3B (reported) | 9.91 | 61.4 | 28.3 | 42.9 | 61.5 | 26.6 | 71.5 | 59.3 | 50.2
- | BitNet b1.58 3B (reproduced) | 9.88 | 60.9 | 28.0 | 42.3 | 58.3 | 26.0 | 71.4 | 60.3 | 49.6 |
-
- The differences between the reported numbers and the reproduced results are possibly variances from the training data processing, seeds, or other random factors.
-
- ## Evaluation
- The evaluation pipelines are from the paper authors. Here is the commands to run the evaluation:
- ```
- pip install lm-eval==0.3.0
- ```
- ```
- python eval_ppl.py --hf_path 1bitLLM/bitnet_b1_58-3B --seqlen 2048
- ```
  ```
- python eval_task.py --hf_path 1bitLLM/bitnet_b1_58-3B \
- --batch_size 1 \
- --tasks \
- --output_path result.json \
- --num_fewshot 0 \
- --ctx_size 2048
  ```
  ---
  license: mit
+ tags:
+ - bitnet
+ - ternary
+ - trillim
+ - cpu-inference
+ base_model: 1bitLLM/bitnet_b1_58-3B
  ---
+ # BitNet-3B-TRNQ
 
+ Ternary-quantized version of [1bitLLM/bitnet_b1_58-3B](https://huggingface.co/1bitLLM/bitnet_b1_58-3B), packaged for the [Trillim DarkNet](https://huggingface.co/Trillim) inference engine.
+
+ This model runs entirely on CPU — no GPU required.
+
+ ## Model Details
+
+ | | |
+ |---|---|
+ | **Architecture** | BitNet (BitnetForCausalLM) |
+ | **Parameters** | ~3B |
+ | **Hidden size** | 3200 |
+ | **Layers** | 26 |
+ | **Attention heads** | 32 |
+ | **Context length** | 2048 |
+ | **Quantization** | Ternary ({-1, 0, 1}) |
+ | **Source model** | [1bitLLM/bitnet_b1_58-3B](https://huggingface.co/1bitLLM/bitnet_b1_58-3B) |
+ | **License** | MIT |
+
+ ## Usage
+
+ ```bash
+ pip install trillim
+ trillim pull Trillim/BitNet-3B-TRNQ
+ trillim serve Trillim/BitNet-3B-TRNQ
  ```
+
+ This starts an OpenAI-compatible API server at `http://127.0.0.1:8000`.
+
+ For interactive CLI chat:
+
+ ```bash
+ trillim chat Trillim/BitNet-3B-TRNQ
  ```
+
+ ## What's in this repo
+
+ | File | Description |
+ |---|---|
+ | `qmodel.tensors` | Ternary-quantized weights in Trillim format |
+ | `rope.cache` | Precomputed RoPE embeddings |
+ | `config.json` | Model configuration |
+ | `tokenizer.json` | Tokenizer |
+ | `tokenizer_config.json` | Tokenizer configuration |
+ | `tokenizer.model` | SentencePiece model |
+ | `tokenization_bitnet.py` | Custom tokenizer class |
+ | `trillim_config.json` | Trillim metadata |
+
+ ## License
+
+ This model is released under the [MIT License](https://opensource.org/licenses/MIT), following the license of the source model.
+
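Once `trillim serve` is running, the OpenAI-compatible endpoint described in the README above can be queried from any HTTP client. A minimal stdlib-only sketch, assuming the server at `http://127.0.0.1:8000` exposes the standard OpenAI `/v1/chat/completions` route (the exact route and schema are assumptions; this diff only states that the API is OpenAI-compatible):

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # default address stated in the README


def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request.

    The /v1/chat/completions route is the standard OpenAI path; whether
    Trillim uses exactly this path is an assumption.
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_chat_request("Trillim/BitNet-3B-TRNQ", "Say hello.")
    # Requires `trillim serve` to be running locally.
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
        print(reply["choices"][0]["message"]["content"])
```

Any OpenAI SDK pointed at `base_url="http://127.0.0.1:8000/v1"` should work the same way, which is the usual benefit of exposing an OpenAI-compatible server.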
qmodel.tensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:13eb6f30ab7a76ad419e13afeb7c7592975f7dd03e0e08fe9b842427a4f963bb
+ oid sha256:45fabbe4e311d12d7dd0e98329bdf949884a645d1727f4518d9a6c45ca2a4d77
  size 1015142080
rope.cache CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:3f6ac5fde5685ed746ae0b8e22acad235a22ab518d5112c6174b2b63cb430c7c
+ oid sha256:6969d480c4dda9d5fa6091c45c5d213df44ee39fda73a177fbc6f545796cecd6
  size 1638412
tokenizer_config.json CHANGED
@@ -58,5 +58,11 @@
  "spaces_between_special_tokens": false,
  "tokenizer_class": "BitnetTokenizer",
  "unk_token": "<unk>",
- "use_default_system_prompt": false
+ "use_default_system_prompt": false,
+ "auto_map": {
+   "AutoTokenizer": [
+     "tokenization_bitnet.BitnetTokenizer",
+     null
+   ]
+ }
  }
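The `auto_map` entry added to `tokenizer_config.json` is what lets `transformers` load the custom tokenizer bundled in this repo: each reference is a `module.ClassName` string resolved against a `.py` file shipped alongside the model (the second list slot is the fast-tokenizer class, `null` here because none is provided). A small sketch of how that reference maps to a file and class:

```python
import json

# The auto_map fragment added in this commit.
tokenizer_config = json.loads("""
{
  "auto_map": {
    "AutoTokenizer": [
      "tokenization_bitnet.BitnetTokenizer",
      null
    ]
  }
}
""")


def resolve_auto_map(entry: str) -> tuple:
    """Split a 'module.ClassName' auto_map reference into the repo file
    that must ship with the model and the class defined inside it."""
    module, class_name = entry.rsplit(".", 1)
    return (module + ".py", class_name)


slow, fast = tokenizer_config["auto_map"]["AutoTokenizer"]
print(resolve_auto_map(slow))  # → ('tokenization_bitnet.py', 'BitnetTokenizer')
print(fast)                    # → None: no fast tokenizer is provided
```

This is why the repo bundles `tokenization_bitnet.py`, and why loading the tokenizer through `AutoTokenizer.from_pretrained` requires `trust_remote_code=True`.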
trillim_config.json CHANGED
@@ -1,6 +1,6 @@
  {
- "trillim_version": "0.3.0",
- "format_version": 3,
+ "trillim_version": "0.6.0",
+ "format_version": 4,
  "type": "model",
  "quantization": "ternary",
  "source_model": "1bitLLM/bitnet_b1_58-3B",
@@ -9,5 +9,6 @@
  "x86_64",
  "aarch64"
  ],
- "base_model_config_hash": "be323c0873e9bcd2e636aaf4caae13ff89954e17e8d2e8712ccb5256c1d150dd"
+ "base_model_config_hash": "db910c219c28fd9387eeae01a9ef81292b09247b5e5a805f567971a785fab3fd",
+ "remote_code": true
  }
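The bump from `format_version` 3 to 4 suggests that older Trillim runtimes should refuse this package rather than misread it. A hypothetical compatibility check (the field names come from `trillim_config.json` above; the version-gating policy and the `check_package` helper are assumptions, not Trillim's actual code):

```python
import json

# Assumed: the set of on-disk format versions a given runtime can read.
SUPPORTED_FORMAT_VERSIONS = {4}

# The fields from trillim_config.json relevant to the check
# (other fields, e.g. the architecture list, are omitted here).
config = json.loads("""
{
  "trillim_version": "0.6.0",
  "format_version": 4,
  "type": "model",
  "quantization": "ternary",
  "source_model": "1bitLLM/bitnet_b1_58-3B",
  "remote_code": true
}
""")


def check_package(cfg: dict) -> None:
    """Reject packages whose on-disk format this runtime cannot read."""
    version = cfg.get("format_version")
    if version not in SUPPORTED_FORMAT_VERSIONS:
        raise ValueError("unsupported format_version: %r" % version)
    if cfg.get("remote_code"):
        # The package ships custom tokenizer code; the caller must opt in
        # to executing it, mirroring transformers' trust_remote_code flag.
        print("note: package requires executing bundled tokenizer code")


check_package(config)
```

Under this reading, the new `remote_code: true` flag mirrors the `auto_map` entry in `tokenizer_config.json`: it marks that the package carries executable tokenizer code.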