# Base Small Language Model (SLM)

## 🚀 CPU-First Base Language Model

This is the **base model** before fine-tuning: a fast, CPU-optimized Small Language Model foundation.

### ⚡ Performance Highlights

- **164 tokens/sec** on CPU (base model)
- **45.2MB model size** (base model)
- **3.7M parameters**
- **General language understanding** (pre-fine-tuning)

### 🎯 Training Speed

- **28 minutes** for base training (4 epochs)
- **Fast convergence** with an efficient architecture
- **Ready for fine-tuning** on any domain

### 🔧 Technical Specs

- **Architecture:** Transformer-lite with RMSNorm, SwiGLU, Rotary embeddings
- **Optimization:** CPU-first with memory mapping and efficient batching
- **Framework:** PyTorch (CPU optimized)
- **Training:** Trained on conversational data

### 📱 Deployment Ready

- **CPU optimized:** No GPU required
- **Fast startup:** Instant model loading
- **Low memory:** Efficient memory usage
- **Fine-tuning ready:** A base for domain adaptation

## Usage

### Load and Use Base Model

```python
import sys

import torch

# Make the repository's modules importable
sys.path.append('src')
from model import create_model_from_config
from tokenizer import BPETokenizer

# Load model
checkpoint = torch.load("checkpoints/model_latest.pt", map_location='cpu')
config = checkpoint['config']
model = create_model_from_config(config)
model.load_state_dict(checkpoint['model_state_dict'])

# Load tokenizer
tokenizer = BPETokenizer()
tokenizer.load("data/tokenizer.json")

# Generate
prompt = "Hello, how are you?"
input_ids = tokenizer.encode(prompt, add_special_tokens=True)
input_ids = torch.tensor([input_ids], dtype=torch.long)  # shape: (1, seq_len)

model.eval()
with torch.no_grad():
    # Greedy decoding: append the most likely next token, 20 times
    for _ in range(20):
        logits = model(input_ids)[0, -1, :]  # logits for the last position
        next_token = torch.argmax(logits, dim=-1).view(1, 1)
        input_ids = torch.cat([input_ids, next_token], dim=1)

response = tokenizer.decode(input_ids[0].tolist(), skip_special_tokens=True)
print(response)
```

### Fine-tune on Your Data

```bash
# Use this base model for fine-tuning
python finetune_qa.py --base_model checkpoints/model_latest.pt --conversations your_data.json
```

## Model Details

- **Base Model:** Trained on conversational data
- **Architecture:** Transformer-lite with modern optimizations
- **Size:** 45.2MB (base model)
- **License:** MIT

## Performance

| Metric | Value |
|--------|-------|
| Speed | 164 tokens/sec |
| Size | 45.2MB |
| Parameters | 3.7M |
| Training Time | 28 minutes |

This base model is intended as a foundation for fine-tuning on specific domains or tasks.
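
### Measuring Speed Yourself

The tokens/sec figure will vary with hardware, so it can be useful to measure it locally. The following is a minimal sketch, not the benchmark used for the numbers in this README: `measure_tokens_per_sec` and `generate_one_token` are hypothetical names introduced here, and the helper assumes you pass in a callable that produces exactly one token per call (for example, one iteration of the greedy loop from the Usage section).

```python
import time
from typing import Callable


def measure_tokens_per_sec(generate_one_token: Callable[[], None],
                           num_tokens: int = 100,
                           warmup: int = 5) -> float:
    """Time `num_tokens` calls to a single-token generation step.

    `generate_one_token` is a hypothetical callable (an assumption of this
    sketch) that produces exactly one token each time it is called.
    """
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")

    # Warm-up calls are excluded from timing so one-off setup costs
    # do not skew the measurement.
    for _ in range(warmup):
        generate_one_token()

    start = time.perf_counter()
    for _ in range(num_tokens):
        generate_one_token()
    elapsed = time.perf_counter() - start

    # Guard against a zero interval on very coarse clocks.
    return num_tokens / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Stand-in step doing a fixed amount of work; swap in a real model step.
    def dummy_step() -> None:
        sum(range(1000))

    print(f"{measure_tokens_per_sec(dummy_step):.1f} tokens/sec")
```

Note that the greedy loop in the Usage section re-runs the model on the full sequence at every step, so the per-token time grows as the sequence gets longer. Report the prompt length and number of generated tokens alongside any measurement.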