# QuantMobileLLM – Lightweight GPT-Style Language Model
MobileLLM is a **lightweight GPT-style language model** designed for efficient, fast inference in small deployment environments.
It's trained on **FineWeb-MINI** and optimized with **modern attention techniques**.
---
## Model Highlights
- **Architecture**: Decoder-only GPT-style transformer
- **Parameters**: ~17M (6 layers, 8 heads, 256 embedding dim)
- **Context Length**: 512 tokens
- **Vocabulary Size**: 50,304 tokens
- **Precision**: Supports both `fp16` and `bf16`
- **Optimized for**: Small GPUs and mobile inference
---
## Architecture Details
| **Component** | **Value** |
|--------------------|-----------|
| Layers | 6 |
| Attention Heads | 8 |
| KV Heads | 4 |
| Embedding Dim | 256 |
| Context Length | 512 |
| Vocab Size | 50,304 |
| Attention Type | Grouped-Query Attention (4 KV heads) |
| Norm Type | RMSNorm |
| Position Encoding | Rotary Position Embeddings (RoPE) |
| FFN Activation | SwiGLU (`silu`) |
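The ~17M figure can be sanity-checked against the table above. The sketch below assumes tied input/output embeddings, a SwiGLU hidden size of about 8/3 · d_model, and no biases; none of these are confirmed by the repo, so treat it as a back-of-the-envelope estimate only:

```python
# Rough parameter-count check from the architecture table.
# Assumptions (not confirmed by the repo): tied embeddings,
# SwiGLU hidden size ~ 8/3 * d_model, no bias terms.
vocab, d, layers, heads, kv_heads = 50_304, 256, 6, 8, 4
head_dim = d // heads                 # 32
ffn_hidden = int(8 / 3 * d)           # ~682 for SwiGLU

embed = vocab * d                                        # token embedding (tied LM head)
attn = d * d + 2 * d * (kv_heads * head_dim) + d * d     # Q, K, V, O projections
ffn = 3 * d * ffn_hidden                                 # gate, up, down matrices
norms = 2 * d                                            # two RMSNorms per block
per_layer = attn + ffn + norms

total = embed + layers * per_layer + d                   # + final RMSNorm
print(f"~{total / 1e6:.1f}M parameters")                 # lands near the ~17M above
```

Under these assumptions the total comes out around 17.2M, consistent with the ~17M headline number.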
### Key Optimizations
- **RMSNorm** – improves training stability over LayerNorm.
- **Grouped-Query Attention** – 4 KV heads shared across 8 query heads shrink the KV cache, lowering the memory footprint.
- **Rotary Embeddings (RoPE)** – better handling of long context windows.
- **`safetensors` checkpoints** – faster and safer loading.
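RMSNorm itself is only a few lines. A minimal NumPy sketch for illustration (the model presumably implements this as a module with a learned gain; the names here are illustrative):

```python
import numpy as np

def rms_norm(x: np.ndarray, g: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """RMSNorm: rescale by the root-mean-square over the last axis.

    Unlike LayerNorm there is no mean subtraction and no bias, which
    saves a reduction and tends to behave well in fp16/bf16 training.
    """
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * g

x = np.random.randn(2, 256).astype(np.float32)   # a batch of 256-dim activations
g = np.ones(256, dtype=np.float32)               # learned gain, initialized to 1
y = rms_norm(x, g)
```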
---
## Training Setup
| **Property** | **Value** |
|------------------------|-----------|
| Dataset | [FineWeb-MINI](https://huggingface.co/datasets/AryanNsc/FineWeb-Mini) |
| Tokens Trained | ~100M |
| Optimizer | AdamW |
| Learning Rate | 6e-4 (cosine decay) |
| Warmup Steps | 100 |
| Batch Size | 64 × 2 grad accum |
| Effective Batch Size | 128 |
| Mixed Precision | `fp16` / `bf16` (auto-detect) |
| Distributed Training | DDP |
| Logging | Weights & Biases (`wandb`) |
| Checkpoint Format | `.safetensors` |
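The schedule in the table (6e-4 peak, 100 warmup steps, cosine decay) can be sketched as follows. The total step count and minimum LR below are illustrative assumptions, not values from the repo:

```python
import math

def lr_at(step: int, *, peak: float = 6e-4, warmup: int = 100,
          total: int = 10_000, min_lr: float = 6e-5) -> float:
    """Linear warmup to `peak`, then cosine decay to `min_lr`."""
    if step < warmup:
        return peak * (step + 1) / warmup                 # linear warmup
    progress = (step - warmup) / max(1, total - warmup)   # 0 -> 1 after warmup
    return min_lr + 0.5 * (peak - min_lr) * (1 + math.cos(math.pi * progress))

# LR climbs to the peak at the end of warmup, then decays smoothly toward min_lr.
print(lr_at(0), lr_at(99), lr_at(9_999))
```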
---
## Model Checkpoints
| **Step** | **Filename** | **Format** |
|----------|------------|------------|
| Final | `mobile_llm_final.safetensors` | safetensors |
| Intermediate | `checkpoints/mobile_llm_step_<step>.safetensors` | safetensors |
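One reason `safetensors` loading is fast and safe: the file layout is simply an 8-byte little-endian header length followed by a JSON header of tensor names, dtypes, shapes, and byte offsets, so metadata can be inspected without touching the weights. A stdlib-only sketch (the tensor name and demo file are hypothetical, not taken from the checkpoint):

```python
import json
import struct

def read_safetensors_header(path: str) -> dict:
    """Return the JSON header of a .safetensors file without reading weights."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))   # little-endian u64
        return json.loads(f.read(header_len))

# Write a tiny valid file to demonstrate the layout: header length, JSON
# header, then the raw tensor bytes (here, 32 zero bytes for a 4x2 fp32 tensor).
header = {"tok_emb.weight": {"dtype": "F32", "shape": [4, 2],
                             "data_offsets": [0, 32]}}
blob = json.dumps(header).encode()
with open("demo.safetensors", "wb") as f:
    f.write(struct.pack("<Q", len(blob)) + blob + bytes(32))

meta = read_safetensors_header("demo.safetensors")
print(meta["tok_emb.weight"]["shape"])   # [4, 2]
```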
---
## Roadmap
- [x] Train **MobileLLM** on **FineWeb-MINI**
- [x] Add **grouped-query attention**
- [x] Export **safetensors** checkpoints
- [ ] Quantized **int8** & **int4** inference
- [ ] Expand training on **FineWeb-1B**
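The planned int8 path amounts to something like symmetric per-tensor weight quantization. A minimal NumPy sketch of the idea, not the project's actual implementation:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w is approximated by q * scale."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)   # stand-in for a weight matrix
q, scale = quantize_int8(w)
err = float(np.abs(dequantize(q, scale) - w).max())
print(f"max abs error: {err:.6f} (bounded by scale/2 = {scale / 2:.6f})")
```

Weights shrink 4x versus fp32 (2x versus fp16), and the worst-case rounding error per weight is half the quantization step.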
---
## License
This model is licensed under the [MIT License](LICENSE).
---
## Links
- **GitHub** – [MobileLLM training code](https://github.com/Guney-olu/Quantgpt)