jeffasante committed
Commit 226a053 · verified · 1 Parent(s): 373f4f2

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -48,3 +48,4 @@ gemma-4-E2B-it-int4-aggr-v5/gemma-4-E2B-it-int4-aggr-v5.cellmd filter=lfs diff=l
  gemma-4-E2B-it-int4-aggr-v5/tokenizer.json filter=lfs diff=lfs merge=lfs -text
  gemma-4-E2B-it-int4-aggr-v2/gemma-4-E2B-it-int4-aggr-v2.cellmd filter=lfs diff=lfs merge=lfs -text
  gemma-4-E2B-it-int4-aggr-v2/gemma-4-E2B-it-int4-aggr-v2.tokenizer.json filter=lfs diff=lfs merge=lfs -text
+ Bonsai-1.7B_v2/Bonsai-1.7B_v2.cellm filter=lfs diff=lfs merge=lfs -text
Bonsai-1.7B_v2/Bonsai-1.7B_v2.cellm ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:131e932a24e8e75f9baa6aa572d1ccb2bde442da0fa68d688df67f231a01ae69
+ size 242149312
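Reviewer note: the three added lines are a Git LFS pointer, not the model weights themselves. A minimal sketch of how a downloaded `Bonsai-1.7B_v2.cellm` could be checked against this pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and written for this note; they are not part of Git LFS, `huggingface_hub`, or cellm.

```python
import hashlib
from pathlib import Path

# Pointer text copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:131e932a24e8e75f9baa6aa572d1ccb2bde442da0fa68d688df67f231a01ae69
size 242149312
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of an LFS pointer into a dictionary."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    file_path = Path(path)
    # Cheap check first: a size mismatch means the file cannot match.
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Hash in 1 MiB chunks so the whole file is never held in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

Assuming the file has been downloaded to the working directory, `matches_pointer("Bonsai-1.7B_v2.cellm", parse_lfs_pointer(POINTER))` should return `True` for an intact copy.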
Bonsai-1.7B_v2/README.md ADDED
@@ -0,0 +1,32 @@
+ # Bonsai 1.7B (1-Bit Quantized)
+
+ Bonsai 1.7B is an experimental 1-bit quantized Large Language Model. It uses a specialized `Q1_0_g128` format that achieves approximately 1.125 bits per parameter.
+
+ ## Model Details
+
+ - **Parameters**: 1.7 Billion
+ - **Format**: `.cellm` (Cellm binary format)
+ - **Quantization**: 1-bit sign-magnitude with 16-bit group scales (g128)
+ - **Size**: 231 MB
+ - **Base Architecture**: Qwen2-style Transformer
+
+ ## Usage in Cellm
+
+ To run inference using the Cellm CLI:
+
+ ```bash
+ ./target/release/infer \
+ --model Bonsai-1.7B_v2.cellm \
+ --tokenizer tokenizer.json \
+ --prompt "What is sycophancy?" \
+ --backend metal \
+ --gen 100
+ ```
+
+ ## Performance Note
+
+ This model is optimized for extremely low-memory environments. At 231MB, it can run on devices with very limited RAM. While the quantization is aggressive, it maintains coherent English generation for simple prompts.
+
+ ## Implementation Analysis
+
+ For a detailed technical breakdown of how the 1-bit quantization works and how it was implemented in cellm, see the [Bonsai 1-Bit Analysis](https://github.com/jeffasante/cellm/blob/main/docs/bonsai_1bit_analysis.md).
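Reviewer note: the README's figure of roughly 1.125 bits per parameter can be reproduced with simple arithmetic. The sketch below assumes that each group of 128 weights stores one sign bit per weight plus a single 16-bit scale, which is one reading of "1-bit sign-magnitude with 16-bit group scales (g128)". The actual on-disk layout of `Q1_0_g128` is not shown in this commit, and the function names are hypothetical.

```python
GROUP_SIZE = 128   # weights per group ("g128" in the README)
SIGN_BITS = 1      # assumed: one sign bit stored per weight
SCALE_BITS = 16    # assumed: one 16-bit scale stored per group


def bits_per_weight(group_size=GROUP_SIZE, sign_bits=SIGN_BITS, scale_bits=SCALE_BITS):
    """Average storage cost per weight, with the group scale amortised."""
    return (group_size * sign_bits + scale_bits) / group_size


def estimated_bytes(n_params, bpw):
    """Rough payload size for n_params weights at bpw bits each."""
    return n_params * bpw / 8


bpw = bits_per_weight()               # (128 + 16) / 128 = 1.125
nominal = estimated_bytes(1.7e9, bpw)  # uses the nominal 1.7 billion parameter count
actual = 242_149_312                  # size from the Git LFS pointer in this commit

print(f"bits per weight:  {bpw}")
print(f"nominal estimate: {nominal / 1e6:.1f} MB")
print(f"LFS pointer size: {actual / 1e6:.1f} MB = {actual / 2**20:.1f} MiB")
```

Under these assumptions the estimate is about 239 MB, close to the 242,149,312 bytes recorded in the LFS pointer. The gap is not explained by this commit; it may reflect file headers, tensors stored at higher precision, or a parameter count that is not exactly 1.7 billion. The README's "231 MB" matches the pointer size when expressed in MiB (about 230.9 MiB).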
Bonsai-1.7B_v2/tokenizer.json ADDED
The diff for this file is too large to render.