0sparsh2 committed on
Commit 4831dd7 · verified · 1 Parent(s): ab0e169

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +58 -0
README.md ADDED
@@ -0,0 +1,58 @@
+ ---
+ language:
+ - en
+ license: mit
+ tags:
+ - bitnet
+ - 1.58-bit
+ - ternary
+ - tinystories
+ - edge-device
+ datasets:
+ - roneneldan/TinyStories
+ ---
+
+ # BitNet-TinyStories-V2 (3.9 MB)
+
+ This is an ultra-compressed **1.58-bit** language model trained from scratch on the `TinyStories` dataset.
+
+ It implements the **BitNet b1.58** architecture, in which the weights of all internal Linear layers are quantized to ternary values (`-1`, `0`, `1`). This version uses **weight tying**, which lets it keep a 12-layer architecture while staying under a 4 MB footprint.
+
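To illustrate what quantizing a layer to ternary values involves, here is a minimal sketch of the absmean scheme described for BitNet b1.58: divide each weight by the mean absolute weight, round, and clip to `[-1, 1]`. The function name `ternary_quantize` and the plain-list input are assumptions made for illustration. This README does not show how `BitGPT` quantizes its layers, so its actual implementation may differ.

```python
def ternary_quantize(weights, eps=1e-8):
    """Absmean quantization of a non-empty flat list of floats to {-1, 0, 1}.

    Returns (ternary_values, scale) so that scale * q roughly approximates w.
    Hypothetical helper for illustration; not taken from the BitGPT code.
    """
    # Scale is the mean absolute value of the weights.
    scale = sum(abs(w) for w in weights) / len(weights)
    ternary = []
    for w in weights:
        q = round(w / (scale + eps))        # nearest integer
        ternary.append(max(-1, min(1, q)))  # clip to the ternary range
    return ternary, scale


weights = [0.9, -0.05, -1.4, 0.3, 0.02, -0.6]
q, scale = ternary_quantize(weights)
print(q)  # [1, 0, -1, 1, 0, -1]
```

Small weights collapse to `0` and large ones saturate at `-1` or `1`, so each weight carries one of three states.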
+ ## Model Details
+ - **Architecture:** BitNet b1.58
+ - **Parameters:** ~21 million
+ - **Layers:** 12 (tied)
+ - **Precision:** 1.58-bit (ternary) for internal weights
+ - **File Size:** 3.96 MB
+ - **Tokenizer:** `arnir0/Tiny-LLM` SentencePiece (vocabulary size 32,000)
+ - **Dataset:** `roneneldan/TinyStories`
+ - **Validation Perplexity:** 23.7
+
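The "1.58-bit" figure is the information content of a three-state value, log2(3) ≈ 1.585 bits. One common way to get close to that on disk is to pack five ternary values into a byte (3^5 = 243 fits in 256), which costs 1.6 bits per weight. The sketch below shows that packing. The helper names are hypothetical, and the storage format of this model's checkpoint is not described in this README, so this is not necessarily how the file is stored.

```python
import math


def pack_trits(trits):
    """Pack ternary values (-1, 0, 1) five to a byte as base-3 digits."""
    out = bytearray()
    for i in range(0, len(trits), 5):
        value = 0
        # The first value of each group becomes the least significant digit.
        for t in reversed(trits[i:i + 5]):
            value = value * 3 + (t + 1)  # map -1, 0, 1 to 0, 1, 2
        out.append(value)                # at most 242, so it fits in a byte
    return bytes(out)


def unpack_trits(data, count):
    """Inverse of pack_trits; count drops padding from the last byte."""
    trits = []
    for byte in data:
        for _ in range(5):
            trits.append(byte % 3 - 1)
            byte //= 3
    return trits[:count]


trits = [1, 0, -1, -1, 1, 0, 1]
packed = pack_trits(trits)
print(len(packed), unpack_trits(packed, len(trits)) == trits)  # 2 True
print(round(math.log2(3), 3))  # 1.585
```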
+ ## Usage
+
+ Because this model uses a custom ternary architecture, it cannot be loaded with the standard Hugging Face `AutoModel` classes. Use the `BitGPT` class instead, imported from `bitnet_test` in the example below.
+
+ ```python
+ import torch
+ from transformers import AutoTokenizer
+ from bitnet_test import BitGPT
+
+ # 1. Load Tokenizer
+ tokenizer = AutoTokenizer.from_pretrained("arnir0/Tiny-LLM")
+
+ # 2. Initialize Model
+ model = BitGPT(vocab_size=len(tokenizer), embed_dim=256, num_layers=12, num_heads=4, tie_weights=True)
+
+ # 3. Load 1.58-bit Weights
+ model.load_state_dict(torch.load("bitnet_tied.pt", map_location="cpu"))
+ model.eval()
+
+ # 4. Generate Text
+ prompt = "Once upon a time, there was a tiny cat named"
+ input_ids = tokenizer.encode(prompt, return_tensors="pt")
+
+ # ... Run standard auto-regressive generation loop
+ ```
+
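The usage snippet leaves the generation loop elided. As a rough sketch of what such a loop usually looks like, the function below repeatedly asks for next-token logits, picks a token (greedy when `temperature=0`, sampled otherwise), and appends it. The `generate` function and the `next_token_logits` callable are hypothetical names introduced for this example, and a toy function stands in for the model so the block runs without `BitGPT`.

```python
import math
import random


def generate(next_token_logits, token_ids, max_new_tokens=50,
             temperature=1.0, eos_id=None, rng=None):
    """Autoregressively extend token_ids and return the full id list.

    next_token_logits: callable taking the ids so far and returning one
    logit per vocabulary entry for the next position (assumed interface).
    """
    rng = rng or random.Random()
    ids = list(token_ids)
    for _ in range(max_new_tokens):
        logits = next_token_logits(ids)
        if temperature == 0:
            # Greedy decoding: take the highest-scoring token.
            next_id = max(range(len(logits)), key=logits.__getitem__)
        else:
            # Softmax with temperature, shifted by the max for stability.
            peak = max(logits)
            weights = [math.exp((x - peak) / temperature) for x in logits]
            next_id = rng.choices(range(len(logits)), weights=weights)[0]
        ids.append(next_id)
        if eos_id is not None and next_id == eos_id:
            break
    return ids


# Toy stand-in for a model: always prefers (last id + 1) mod 5.
def toy_logits(ids):
    return [5.0 if i == (ids[-1] + 1) % 5 else 0.0 for i in range(5)]


print(generate(toy_logits, [0], max_new_tokens=6, temperature=0))
# [0, 1, 2, 3, 4, 0, 1]
```

To drive this with the real model, `next_token_logits` would need to wrap `BitGPT`. If its forward pass returns logits shaped `(batch, sequence, vocab)`, which this README does not state, something like `lambda ids: model(torch.tensor([ids]))[0, -1].tolist()` run under `torch.no_grad()` could work, with `tokenizer.decode(ids)` turning the result back into text.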
+ ## Intended Use
+ This model is intended purely as a research demonstration of the viability of 1.58-bit LLMs on edge devices. Because it was trained exclusively on the TinyStories dataset, it cannot perform complex reasoning, answer factual questions, or follow instructions. It generates only children's stories.