Rahulwale12 committed on
Commit f1413cd · verified · 1 Parent(s): 8edaeeb

Add base CPU-optimized SLM model


Base 3.7M parameter CPU-optimized language model ready for fine-tuning

Files changed (5)
  1. .gitattributes +2 -35
  2. README.md +90 -3
  3. config.json +15 -0
  4. pytorch_model.bin +3 -0
  5. tokenizer.json +0 -0
.gitattributes CHANGED
@@ -1,35 +1,2 @@
- *.7z filter=lfs diff=lfs merge=lfs -text
- *.arrow filter=lfs diff=lfs merge=lfs -text
- *.bin filter=lfs diff=lfs merge=lfs -text
- *.bz2 filter=lfs diff=lfs merge=lfs -text
- *.ckpt filter=lfs diff=lfs merge=lfs -text
- *.ftz filter=lfs diff=lfs merge=lfs -text
- *.gz filter=lfs diff=lfs merge=lfs -text
- *.h5 filter=lfs diff=lfs merge=lfs -text
- *.joblib filter=lfs diff=lfs merge=lfs -text
- *.lfs.* filter=lfs diff=lfs merge=lfs -text
- *.mlmodel filter=lfs diff=lfs merge=lfs -text
- *.model filter=lfs diff=lfs merge=lfs -text
- *.msgpack filter=lfs diff=lfs merge=lfs -text
- *.npy filter=lfs diff=lfs merge=lfs -text
- *.npz filter=lfs diff=lfs merge=lfs -text
- *.onnx filter=lfs diff=lfs merge=lfs -text
- *.ot filter=lfs diff=lfs merge=lfs -text
- *.parquet filter=lfs diff=lfs merge=lfs -text
- *.pb filter=lfs diff=lfs merge=lfs -text
- *.pickle filter=lfs diff=lfs merge=lfs -text
- *.pkl filter=lfs diff=lfs merge=lfs -text
- *.pt filter=lfs diff=lfs merge=lfs -text
- *.pth filter=lfs diff=lfs merge=lfs -text
- *.rar filter=lfs diff=lfs merge=lfs -text
- *.safetensors filter=lfs diff=lfs merge=lfs -text
- saved_model/**/* filter=lfs diff=lfs merge=lfs -text
- *.tar.* filter=lfs diff=lfs merge=lfs -text
- *.tar filter=lfs diff=lfs merge=lfs -text
- *.tflite filter=lfs diff=lfs merge=lfs -text
- *.tgz filter=lfs diff=lfs merge=lfs -text
- *.wasm filter=lfs diff=lfs merge=lfs -text
- *.xz filter=lfs diff=lfs merge=lfs -text
- *.zip filter=lfs diff=lfs merge=lfs -text
- *.zst filter=lfs diff=lfs merge=lfs -text
- *tfevents* filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,90 @@
- ---
- license: unknown
- ---
+ # Base Small Language Model (SLM)
+
+ ## 🚀 CPU-First Base Language Model
+
+ This is the **base model** before fine-tuning: a fast, CPU-optimized Small Language Model foundation.
+
+ ### ⚡ Performance Highlights
+ - **164 tokens/sec** on CPU (base performance)
+ - **45.2MB model size** (base model)
+ - **3.7M parameters**
+ - **General language understanding** (pre-fine-tuning)
+
+ ### 🎯 Training Speed
+ - **28 minutes** for base training (4 epochs)
+ - **Fast convergence** with an efficient architecture
+ - **Ready for fine-tuning** on any domain
+
+ ### 🔧 Technical Specs
+ - **Architecture:** Transformer-lite with RMSNorm, SwiGLU, and rotary position embeddings
+ - **Optimization:** CPU-first, with memory mapping and efficient batching
+ - **Framework:** PyTorch (CPU optimized)
+ - **Training:** Trained on conversational data
+
+ ### 📱 Deployment Ready
+ - **CPU optimized:** No GPU required
+ - **Fast startup:** Near-instant model loading
+ - **Low memory:** Efficient memory usage
+ - **Fine-tuning ready:** A solid base for domain adaptation
+
+ ## Usage
+
+ ### Load and Use Base Model
+
+ ```python
+ import torch
+ import sys
+ sys.path.append('src')
+ from model import create_model_from_config
+ from tokenizer import BPETokenizer
+
+ # Load model checkpoint on CPU
+ checkpoint = torch.load("checkpoints/model_latest.pt", map_location='cpu')
+ config = checkpoint['config']
+ model = create_model_from_config(config)
+ model.load_state_dict(checkpoint['model_state_dict'])
+
+ # Load tokenizer
+ tokenizer = BPETokenizer()
+ tokenizer.load("data/tokenizer.json")
+
+ # Greedy generation
+ prompt = "Hello, how are you?"
+ input_ids = tokenizer.encode(prompt, add_special_tokens=True)
+ input_ids = torch.tensor([input_ids], dtype=torch.long)
+
+ model.eval()
+ with torch.no_grad():
+     for _ in range(20):
+         logits = model(input_ids)[0, -1, :]           # logits for the last position
+         next_token = torch.argmax(logits, dim=-1).view(1, 1)
+         input_ids = torch.cat([input_ids, next_token], dim=1)
+
+ response = tokenizer.decode(input_ids[0].tolist(), skip_special_tokens=True)
+ print(response)
+ ```
+
+ ### Fine-tune on Your Data
+
+ ```bash
+ # Use this base model for fine-tuning
+ python finetune_qa.py --base_model checkpoints/model_latest.pt --conversations your_data.json
+ ```
+
+ ## Model Details
+
+ - **Base Model:** Trained on conversational data
+ - **Architecture:** Transformer-lite with modern optimizations
+ - **Size:** 45.2MB (base model)
+ - **License:** MIT
+
+ ## Performance
+
+ | Metric | Value |
+ |--------|-------|
+ | Speed | 164 tokens/sec |
+ | Size | 45.2MB |
+ | Parameters | 3.7M |
+ | Training time | 28 minutes |
+
+ This base model provides a solid foundation for fine-tuning on specific domains or tasks.
config.json ADDED
@@ -0,0 +1,15 @@
+ {
+   "model_type": "transformer_lite",
+   "architectures": [
+     "TransformerLite"
+   ],
+   "vocab_size": 12288,
+   "hidden_size": 128,
+   "num_hidden_layers": 2,
+   "num_attention_heads": 4,
+   "intermediate_size": 512,
+   "max_position_embeddings": 64,
+   "model_format": "base",
+   "framework": "pytorch",
+   "device": "cpu"
+ }
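As a sanity check, the ~3.7M parameter figure in the README can be roughly reproduced from these hyperparameters. This sketch assumes untied input/output embeddings, a SwiGLU MLP (gate/up/down projections), two RMSNorms per layer plus a final norm, and no attention biases — the exact layer layout is an assumption based on the README's architecture notes, not confirmed by the config:

```python
# Rough parameter-count estimate from config.json hyperparameters.
# Assumptions: untied embedding + LM head, SwiGLU MLP, RMSNorm
# (one weight vector per norm), no biases in attention projections.
vocab, d, layers, d_ff = 12288, 128, 2, 512

embed = vocab * d          # token embedding
lm_head = vocab * d        # output projection (assumed untied)
attn = 4 * d * d           # q, k, v, o projections
mlp = 3 * d * d_ff         # SwiGLU: gate, up, down
norms = 2 * d              # two RMSNorm weight vectors per layer
per_layer = attn + mlp + norms
total = embed + lm_head + layers * per_layer + d  # + final norm
print(f"{total / 1e6:.2f}M parameters")  # prints: 3.67M parameters
```

Under these assumptions the total lands at about 3.67M, consistent with the advertised 3.7M.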
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:05d26d5c2c64675cfcd3093a61f6568d33b15ba6432490ee6d2b6819a5b9359d
+ size 45224313
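The pointer above records the blob's SHA-256 and byte size rather than the weights themselves. After fetching the real file (e.g. with `git lfs pull`), the download can be sanity-checked against the pointer. A minimal sketch — the function name and paths are illustrative, not part of this repo:

```python
import hashlib
import os

def verify_lfs_pointer(pointer_text: str, blob_path: str) -> bool:
    """Check a downloaded blob against a Git LFS pointer file's oid/size."""
    # Pointer lines look like "key value"; parse them into a dict
    fields = dict(line.split(" ", 1) for line in pointer_text.strip().splitlines())
    expected_oid = fields["oid"].split(":", 1)[1]   # "sha256:<hex>" -> "<hex>"
    expected_size = int(fields["size"])
    if os.path.getsize(blob_path) != expected_size:
        return False
    h = hashlib.sha256()
    with open(blob_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
            h.update(chunk)
    return h.hexdigest() == expected_oid
```

For this commit, passing the three pointer lines above and the downloaded `pytorch_model.bin` path should return `True` only if the 45,224,313-byte blob hashes to the recorded oid.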
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff