DarkSca committed on
Commit 3685eee · verified · 1 Parent(s): 20090b4

Upload folder using huggingface_hub
.trillim-quantize-complete ADDED
@@ -0,0 +1 @@
+ ready
README.md CHANGED
@@ -1,38 +1,62 @@
  ---
  license: mit
  ---
 
- This is a reproduction of the <a href="https://arxiv.org/abs/2402.17764"> BitNet b1.58</a> paper. The models are trained with <a href="https://github.com/togethercomputer/RedPajama-Data">RedPajama dataset</a> for 100B tokens. The hypers, as well as two-stage LR and weight decay, are implemented as suggested in their following <a href="https://github.com/microsoft/unilm/blob/master/bitnet/The-Era-of-1-bit-LLMs__Training_Tips_Code_FAQ.pdf">paper</a>. All models are open-source in the <a href="https://huggingface.co/1bitLLM">repo</a>. We will train larger models and/or more tokens when resource is available.
-
- ## Results
- PPL and zero-shot accuracy:
- | Models | PPL| ARCe| ARCc| HS | BQ | OQ | PQ | WGe | Avg
- |-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
- | FP16 700M (reported) | 12.33 | 54.7 | 23.0 | 37.0 | 60.0 | 20.2 | 68.9 | 54.8 | 45.5 |
- | BitNet b1.58 700M (reported) | 12.87 | 51.8 | 21.4 | 35.1 | 58.2 | 20.0 | 68.1 | 55.2 | 44.3 |
- | BitNet b1.58 700M (reproduced) | 12.78 | 51.4 | 21.8 | 35.0 | 59.6 | 20.6 | 67.5 | 55.4 | 44.5 |
- | FP16 1.3B (reported) | 11.25 | 56.9 | 23.5 | 38.5 | 59.1 | 21.6 | 70.0 | 53.9 | 46.2
- | BitNet b1.58 1.3B (reported) | 11.29 | 54.9 | 24.2 | 37.7 | 56.7 | 19.6 | 68.8 | 55.8 | 45.4 |
- | BitNet b1.58 1.3B (reproduced) | 11.19 | 55.8 | 23.7 | 37.6 | 59.0 | 20.2 | 69.2 | 56.0 | 45.9
- | FP16 3B (reported) | 10.04 | 62.1 | 25.6 | 43.3 | 61.8 | 24.6 | 72.1 | 58.2 | 49.7
- | BitNet b1.58 3B (reported) | 9.91 | 61.4 | 28.3 | 42.9 | 61.5 | 26.6 | 71.5 | 59.3 | 50.2
- | BitNet b1.58 3B (reproduced) | 9.88 | 60.9 | 28.0 | 42.3 | 58.3 | 26.0 | 71.4 | 60.3 | 49.6 |
-
- The differences between the reported numbers and the reproduced results are possibly variances from the training data processing, seeds, or other random factors.
-
- ## Evaluation
- The evaluation pipelines are from the paper authors. Here is the commands to run the evaluation:
- ```
- pip install lm-eval==0.3.0
- ```
- ```
- python eval_ppl.py --hf_path 1bitLLM/bitnet_b1_58-3B --seqlen 2048
- ```
  ```
- python eval_task.py --hf_path 1bitLLM/bitnet_b1_58-3B \
- --batch_size 1 \
- --tasks \
- --output_path result.json \
- --num_fewshot 0 \
- --ctx_size 2048
  ```
  ---
  license: mit
+ tags:
+ - bitnet
+ - ternary
+ - trillim
+ - cpu-inference
+ base_model: 1bitLLM/bitnet_b1_58-3B
  ---
+ # BitNet-3B-TRNQ
 
+ Ternary-quantized version of [1bitLLM/bitnet_b1_58-3B](https://huggingface.co/1bitLLM/bitnet_b1_58-3B), packaged for the [Trillim DarkNet](https://huggingface.co/Trillim) inference engine.
+
+ This model runs entirely on CPU — no GPU required.
+
+ ## Model Details
+
+ | | |
+ |---|---|
+ | **Architecture** | BitNet (BitnetForCausalLM) |
+ | **Parameters** | ~3B |
+ | **Hidden size** | 3200 |
+ | **Layers** | 26 |
+ | **Attention heads** | 32 |
+ | **Context length** | 2048 |
+ | **Quantization** | Ternary ({-1, 0, 1}) |
+ | **Source model** | [1bitLLM/bitnet_b1_58-3B](https://huggingface.co/1bitLLM/bitnet_b1_58-3B) |
+ | **License** | MIT |
+
+ ## Usage
+
+ ```bash
+ pip install trillim
+ trillim pull Trillim/BitNet-3B-TRNQ
+ trillim serve Trillim/BitNet-3B-TRNQ
  ```
+
+ This starts an OpenAI-compatible API server at `http://127.0.0.1:8000`.
+
+ For interactive CLI chat:
+
+ ```bash
+ trillim chat Trillim/BitNet-3B-TRNQ
  ```
+
+ ## What's in this repo
+
+ | File | Description |
+ |---|---|
+ | `qmodel.tensors` | Ternary-quantized weights in Trillim format |
+ | `rope.cache` | Precomputed RoPE embeddings |
+ | `config.json` | Model configuration |
+ | `tokenizer.json` | Tokenizer |
+ | `tokenizer_config.json` | Tokenizer configuration |
+ | `tokenizer.model` | SentencePiece model |
+ | `tokenization_bitnet.py` | Custom tokenizer class |
+ | `trillim_config.json` | Trillim metadata |
+
+ ## License
+
+ This model is released under the [MIT License](https://opensource.org/licenses/MIT), following the license of the source model.
+
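Once `trillim serve` is running, the OpenAI-compatible endpoint described in the README above can be queried from any HTTP client. A minimal stdlib-only sketch, assuming the server at `http://127.0.0.1:8000` exposes the standard OpenAI `/v1/chat/completions` route (the exact route and schema are assumptions; this diff only states that the API is OpenAI-compatible):

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # default address stated in the README


def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request.

    The /v1/chat/completions route is the standard OpenAI path; whether
    Trillim uses exactly this path is an assumption.
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_chat_request("Trillim/BitNet-3B-TRNQ", "Say hello.")
    # Requires `trillim serve` to be running locally.
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
        print(reply["choices"][0]["message"]["content"])
```

Any OpenAI SDK pointed at `base_url="http://127.0.0.1:8000/v1"` should work the same way, which is the usual benefit of exposing an OpenAI-compatible server.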
qmodel.tensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:13eb6f30ab7a76ad419e13afeb7c7592975f7dd03e0e08fe9b842427a4f963bb
+ oid sha256:45fabbe4e311d12d7dd0e98329bdf949884a645d1727f4518d9a6c45ca2a4d77
  size 1015142080
rope.cache CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:3f6ac5fde5685ed746ae0b8e22acad235a22ab518d5112c6174b2b63cb430c7c
+ oid sha256:6969d480c4dda9d5fa6091c45c5d213df44ee39fda73a177fbc6f545796cecd6
  size 1638412
tokenizer_config.json CHANGED
@@ -58,5 +58,11 @@
  "spaces_between_special_tokens": false,
  "tokenizer_class": "BitnetTokenizer",
  "unk_token": "<unk>",
- "use_default_system_prompt": false
+ "use_default_system_prompt": false,
+ "auto_map": {
+   "AutoTokenizer": [
+     "tokenization_bitnet.BitnetTokenizer",
+     null
+   ]
+ }
  }
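The `auto_map` entry added to `tokenizer_config.json` is what lets `transformers` load the custom tokenizer bundled in this repo: each reference is a `module.ClassName` string resolved against a `.py` file shipped alongside the model (the second list slot is the fast-tokenizer class, `null` here because none is provided). A small sketch of how that reference maps to a file and class:

```python
import json

# The auto_map fragment added in this commit.
tokenizer_config = json.loads("""
{
  "auto_map": {
    "AutoTokenizer": [
      "tokenization_bitnet.BitnetTokenizer",
      null
    ]
  }
}
""")


def resolve_auto_map(entry: str) -> tuple:
    """Split a 'module.ClassName' auto_map reference into the repo file
    that must ship with the model and the class defined inside it."""
    module, class_name = entry.rsplit(".", 1)
    return (module + ".py", class_name)


slow, fast = tokenizer_config["auto_map"]["AutoTokenizer"]
print(resolve_auto_map(slow))  # → ('tokenization_bitnet.py', 'BitnetTokenizer')
print(fast)                    # → None: no fast tokenizer is provided
```

This is why the repo bundles `tokenization_bitnet.py`, and why loading the tokenizer through `AutoTokenizer.from_pretrained` requires `trust_remote_code=True`.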
trillim_config.json CHANGED
@@ -1,6 +1,6 @@
  {
- "trillim_version": "0.3.0",
- "format_version": 3,
+ "trillim_version": "0.6.0",
+ "format_version": 4,
  "type": "model",
  "quantization": "ternary",
  "source_model": "1bitLLM/bitnet_b1_58-3B",
@@ -9,5 +9,6 @@
  "x86_64",
  "aarch64"
  ],
- "base_model_config_hash": "be323c0873e9bcd2e636aaf4caae13ff89954e17e8d2e8712ccb5256c1d150dd"
+ "base_model_config_hash": "db910c219c28fd9387eeae01a9ef81292b09247b5e5a805f567971a785fab3fd",
+ "remote_code": true
  }
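The bump from `format_version` 3 to 4 suggests that older Trillim runtimes should refuse this package rather than misread it. A hypothetical compatibility check (the field names come from `trillim_config.json` above; the version-gating policy and the `check_package` helper are assumptions, not Trillim's actual code):

```python
import json

# Assumed: the set of on-disk format versions a given runtime can read.
SUPPORTED_FORMAT_VERSIONS = {4}

# The fields from trillim_config.json relevant to the check
# (other fields, e.g. the architecture list, are omitted here).
config = json.loads("""
{
  "trillim_version": "0.6.0",
  "format_version": 4,
  "type": "model",
  "quantization": "ternary",
  "source_model": "1bitLLM/bitnet_b1_58-3B",
  "remote_code": true
}
""")


def check_package(cfg: dict) -> None:
    """Reject packages whose on-disk format this runtime cannot read."""
    version = cfg.get("format_version")
    if version not in SUPPORTED_FORMAT_VERSIONS:
        raise ValueError("unsupported format_version: %r" % version)
    if cfg.get("remote_code"):
        # The package ships custom tokenizer code; the caller must opt in
        # to executing it, mirroring transformers' trust_remote_code flag.
        print("note: package requires executing bundled tokenizer code")


check_package(config)
```

Under this reading, the new `remote_code: true` flag mirrors the `auto_map` entry in `tokenizer_config.json`: it marks that the package carries executable tokenizer code.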