# QuantMobileLLM: Lightweight GPT-Style Language Model

MobileLLM is a **lightweight GPT-style language model** designed for efficiency, fast inference, and small deployment environments.
It is trained on **FineWeb-MINI** and optimized with **modern attention techniques**.
---
## 🚀 Model Highlights
- **Architecture**: Decoder-only GPT-style transformer
- **Parameters**: ~17M (6 layers, 8 heads, 256 embedding dim)
- **Context Length**: 512 tokens
- **Vocabulary Size**: 50,304 tokens
- **Precision**: Supports both `fp16` and `bf16`
- **Optimized for**: Small GPUs and mobile inference
---
## 🧠 Architecture Details
| **Component** | **Value** |
|--------------------|-----------|
| Layers | 6 |
| Attention Heads | 8 |
| KV Heads | 4 |
| Embedding Dim | 256 |
| Context Length | 512 |
| Vocab Size | 50,304 |
| Attention Type | Grouped-Query Attention (8 query heads, 4 KV heads) |
| Norm Type | RMSNorm |
| Position Encoding | Rotary Position Embeddings (RoPE) |
| FFN Activation | SwiGLU (`silu`) |
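The table above can be captured as a small configuration object. This is an illustrative sketch, not the repo's actual config class (field names are assumptions); the per-head dimension follows directly from the table:

```python
from dataclasses import dataclass

# Illustrative config mirroring the architecture table; the training
# repo may use different field names.
@dataclass
class MobileLLMConfig:
    n_layers: int = 6
    n_heads: int = 8          # query heads
    n_kv_heads: int = 4       # shared key/value heads
    d_model: int = 256
    context_length: int = 512
    vocab_size: int = 50304

cfg = MobileLLMConfig()
head_dim = cfg.d_model // cfg.n_heads  # 256 / 8 = 32 dims per head
```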
### 🔹 Key Optimizations

- **RMSNorm** → improves training stability over LayerNorm.
- **Grouped-query attention** → 4 KV heads shared across 8 query heads halve the KV-cache size, lowering the memory footprint.
- **Rotary Embeddings (RoPE)** → better handling of long context windows.
- **`safetensors` checkpoints** → faster and safer loading.
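Two of these optimizations can be sketched in a few lines of pure Python (illustrative only, not the repo's implementation): RMSNorm rescales by the root-mean-square without subtracting a mean, and caching only the shared KV heads shrinks the KV cache in proportion to the head count.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    # RMSNorm: rescale by the reciprocal RMS; unlike LayerNorm there is
    # no mean subtraction and no bias term.
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for w, v in zip(weight, x)]

def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_elem=2):
    # K and V are each cached per layer, hence the factor of 2.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem

# With this model's shapes (6 layers, head_dim 32, 512-token context),
# 4 KV heads instead of 8 halves the cache.
full = kv_cache_bytes(layers=6, kv_heads=8, head_dim=32, seq_len=512)
gqa  = kv_cache_bytes(layers=6, kv_heads=4, head_dim=32, seq_len=512)
```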
---
## 📊 Training Setup
| **Property** | **Value** |
|------------------------|-----------|
| Dataset | [FineWeb-MINI](https://huggingface.co/datasets/AryanNsc/FineWeb-Mini) |
| Tokens Trained | ~100M |
| Optimizer | AdamW |
| Learning Rate | 6e-4 (cosine decay) |
| Warmup Steps | 100 |
| Batch Size | 64 × 2 (grad accum) |
| Effective Batch Size | 128 |
| Mixed Precision | `fp16` / `bf16` (auto-detect) |
| Distributed Training | DDP |
| Logging | Weights & Biases (`wandb`) |
| Checkpoint Format | `.safetensors` |
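The schedule in the table (linear warmup, then cosine decay) can be sketched as a plain function. Only `max_lr=6e-4` and `warmup=100` come from the table; `total_steps` and `min_lr` below are illustrative assumptions:

```python
import math

def lr_at_step(step, max_lr=6e-4, warmup=100, total_steps=10_000, min_lr=6e-5):
    # Linear warmup for the first `warmup` steps, then cosine decay.
    # total_steps and min_lr are assumptions; the training table only
    # fixes max_lr=6e-4 and warmup=100.
    if step < warmup:
        return max_lr * (step + 1) / warmup
    t = (step - warmup) / max(1, total_steps - warmup)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * t))
```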
---
## 🧩 Model Checkpoints
| **Step** | **Filename** | **Format** |
|----------|------------|------------|
| Final | `mobile_llm_final.safetensors` | safetensors |
| Intermediate | `checkpoints/mobile_llm_step_<step>.safetensors` | safetensors |
---
## 🔮 Roadmap
- [x] Train **MobileLLM** on **FineWeb-MINI**
- [x] Add **multi-query attention**
- [x] Export **safetensors** checkpoints
- [ ] Quantized **int8** & **int4** inference
- [ ] Expand training on **FineWeb-1B**
---
## 📜 License
This model is licensed under the [MIT License](LICENSE).
---
## ๐ŸŒ Links
- **Github** โ†’ [MobileLLM training code](https://github.com/Guney-olu/Quantgpt)