# QuantMobileLLM: Lightweight GPT-Style Language Model

MobileLLM is a **lightweight GPT-style language model** designed for efficiency, fast inference, and small deployment environments.
It is trained on **FineWeb-MINI** and optimized with **modern attention techniques**.

---

## 🚀 Model Highlights
- **Architecture**: Decoder-only GPT-style transformer
- **Parameters**: ~17M (6 layers, 8 heads, 256 embedding dim)
- **Context Length**: 512 tokens
- **Vocabulary Size**: 50,304 tokens
- **Precision**: Supports both `fp16` and `bf16`
- **Optimized for**: Small GPUs and mobile inference

---

## 🧠 Architecture Details

| **Component**      | **Value** |
|--------------------|-----------|
| Layers             | 6 |
| Attention Heads    | 8 |
| KV Heads           | 4 |
| Embedding Dim      | 256 |
| Context Length     | 512 |
| Vocab Size         | 50,304 |
| Attention Type     | Grouped-Query Attention (GQA) |
| Norm Type          | RMSNorm |
| Position Encoding  | Rotary Position Embeddings (RoPE) |
| FFN Activation     | SwiGLU (`silu`) |
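
Collected into code, the table corresponds to a configuration roughly like the following. This is a minimal sketch: the field names (and the `norm_eps` default) are illustrative assumptions, not necessarily the training repo's actual config class.

```python
from dataclasses import dataclass

@dataclass
class MobileLLMConfig:
    # Values taken from the architecture table above; field names are illustrative.
    n_layers: int = 6           # decoder blocks
    n_heads: int = 8            # query heads per attention layer
    n_kv_heads: int = 4         # shared K/V heads (grouped-query attention)
    d_model: int = 256          # embedding dimension (head_dim = 256 / 8 = 32)
    context_length: int = 512   # maximum sequence length
    vocab_size: int = 50_304    # appears to be GPT-2's 50,257 padded to a multiple of 64
    norm_eps: float = 1e-6      # RMSNorm epsilon (assumed, not stated in the card)
```

Padding the vocabulary up to a multiple of 64 is a common GPU-throughput trick; that this is the reason for 50,304 here is an assumption.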

### 🔹 Key Optimizations
- **RMSNorm** → Improves training stability over LayerNorm (minimal sketch below).
- **Grouped-Query Attention** → 4 KV heads serve the 8 query heads, halving the KV cache: with head dim 32 in fp16, a full 512-token context needs ~1.5 MiB across the 6 layers instead of ~3 MiB.
- **Rotary Embeddings (RoPE)** → Better handling of long context windows.
- **`safetensors` checkpoints** → Faster & safer loading.
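
For reference, a minimal RMSNorm module in PyTorch; this is an illustrative implementation of the technique, not necessarily the repo's own code.

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root-mean-square norm: rescale by the RMS of the features with a
    learned gain. Unlike LayerNorm, no mean-centering and no bias."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # rms(x) = sqrt(mean(x^2)); normalize, then apply the learned gain.
        inv_rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * inv_rms)
```

Dropping the mean subtraction and bias saves a little compute per layer and, in practice, tends to be at least as stable as LayerNorm.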

---

## 📊 Training Setup

| **Property**           | **Value** |
|------------------------|-----------|
| Dataset                | [FineWeb-MINI](https://huggingface.co/datasets/AryanNsc/FineWeb-Mini) |
| Tokens Trained         | ~100M |
| Optimizer              | AdamW |
| Learning Rate          | 6e-4 (cosine decay) |
| Warmup Steps           | 100 |
| Batch Size             | 64 × 2 gradient-accumulation steps |
| Effective Batch Size   | 128 |
| Mixed Precision        | `fp16` / `bf16` (auto-detect) |
| Distributed Training   | DDP |
| Logging                | Weights & Biases (`wandb`) |
| Checkpoint Format      | `.safetensors` |
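
For concreteness, a minimal sketch of how the optimizer, warmup, and cosine schedule from this table could be wired up in PyTorch. The `Linear` model is a stand-in for the real module, and `max_steps` is only estimated from the table (~100M tokens ÷ (128 sequences × 512 tokens) ≈ 1,500 optimizer steps); none of this is the repo's actual training loop.

```python
import math
import torch

model = torch.nn.Linear(256, 50_304)  # stand-in for the real MobileLLM module

peak_lr, warmup_steps = 6e-4, 100
max_steps = 1_500  # ~100M tokens / (128 seqs x 512 tokens per step), rounded

optimizer = torch.optim.AdamW(model.parameters(), lr=peak_lr)

def lr_lambda(step: int) -> float:
    # Linear warmup over the first 100 steps...
    if step < warmup_steps:
        return step / warmup_steps
    # ...then cosine decay from the peak learning rate down to zero.
    progress = (step - warmup_steps) / (max_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

# "auto-detect" mixed precision: prefer bf16 where the GPU supports it.
use_bf16 = torch.cuda.is_available() and torch.cuda.is_bf16_supported()
amp_dtype = torch.bfloat16 if use_bf16 else torch.float16
```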

---

## 🧩 Model Checkpoints

| **Step** | **Filename** | **Format** |
|----------|------------|------------|
| Final    | `mobile_llm_final.safetensors` | safetensors |
| Intermediate | `checkpoints/mobile_llm_step_<step>.safetensors` | safetensors |
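
Checkpoints load with the `safetensors` library; the model class itself lives in the training repo, so instantiation is left as a hypothetical comment.

```python
from safetensors.torch import load_file

# Read the weights into an ordinary {name: tensor} state dict.
state_dict = load_file("mobile_llm_final.safetensors")

# Inspect what was saved.
for name, tensor in state_dict.items():
    print(f"{name}: {tuple(tensor.shape)} {tensor.dtype}")

# Hypothetical: build the model from the repo's own class, then load weights.
# model = MobileLLM(MobileLLMConfig())
# model.load_state_dict(state_dict)
```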

---

## 🔮 Roadmap
- [x] Train **MobileLLM** on **FineWeb-MINI**
- [x] Add **grouped-query attention**
- [x] Export **safetensors** checkpoints
- [ ] Quantized **int8** & **int4** inference
- [ ] Expand training on **FineWeb-1B**

---

## 📜 License
This model is licensed under the [MIT License](LICENSE).

---

## 🌐 Links
- **GitHub** → [MobileLLM training code](https://github.com/Guney-olu/Quantgpt)