# QuantMobileLLM: Lightweight GPT-Style Language Model

MobileLLM is a **lightweight GPT-style language model** designed for efficient, fast inference in small deployment environments. It is trained on **FineWeb-MINI** and optimized with modern attention and normalization techniques (grouped-query attention, RoPE, RMSNorm).

---

## Model Highlights

- **Architecture**: Decoder-only GPT-style transformer
- **Parameters**: ~17M (6 layers, 8 heads, 256 embedding dim)
- **Context Length**: 512 tokens
- **Vocabulary Size**: 50,304 tokens
- **Precision**: Supports both `fp16` and `bf16`
- **Optimized for**: Small GPUs and mobile inference

---

## Architecture Details

| **Component**     | **Value** |
|-------------------|-----------|
| Layers            | 6 |
| Attention Heads   | 8 |
| KV Heads          | 4 |
| Embedding Dim     | 256 |
| Context Length    | 512 |
| Vocab Size        | 50,304 |
| Attention Type    | Grouped-Query Attention |
| Norm Type         | RMSNorm |
| Position Encoding | Rotary Position Embeddings (RoPE) |
| FFN Activation    | SwiGLU (`silu`) |

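In code, the table above corresponds to a configuration object along the following lines. This is a minimal sketch: the field names (`n_layer`, `n_kv_head`, etc.) are illustrative assumptions, not necessarily the repo's actual API.

```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    # Hypothetical config mirroring the architecture table; field names
    # are assumptions, not the training repo's actual interface.
    n_layer: int = 6
    n_head: int = 8
    n_kv_head: int = 4        # grouped-query attention: 2 query heads per KV head
    n_embd: int = 256
    block_size: int = 512     # context length in tokens
    vocab_size: int = 50_304
    norm: str = "rmsnorm"
    rope: bool = True         # rotary position embeddings
    activation: str = "silu"  # SwiGLU feed-forward
```
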
### Key Optimizations

- **RMSNorm**: improves training stability over LayerNorm (see the sketch after this list).
- **Grouped-Query Attention**: 4 KV heads serve 8 query heads, shrinking the KV cache and the inference memory footprint.
- **Rotary Embeddings (RoPE)**: better handling of long context windows.
- **`safetensors` checkpoints**: faster and safer loading than pickle-based formats.

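For reference, RMSNorm is small enough to show in full. A minimal PyTorch sketch of the standard formulation (not copied from this repo):

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Normalizes by the root mean square of the activations; unlike
    LayerNorm, there is no mean subtraction and no bias term."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scale each vector by 1 / RMS over the embedding dimension.
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)
```
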
---

## Training Setup

| **Property**         | **Value** |
|----------------------|-----------|
| Dataset              | [FineWeb-MINI](https://huggingface.co/datasets/AryanNsc/FineWeb-Mini) |
| Tokens Trained       | ~100M |
| Optimizer            | AdamW |
| Learning Rate        | 6e-4 (cosine decay) |
| Warmup Steps         | 100 |
| Batch Size           | 64 (with 2 gradient-accumulation steps) |
| Effective Batch Size | 128 |
| Mixed Precision      | `fp16` / `bf16` (auto-detected) |
| Distributed Training | DDP |
| Logging              | Weights & Biases (`wandb`) |
| Checkpoint Format    | `.safetensors` |

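A sketch of the schedule the table describes: linear warmup for 100 steps, then cosine decay from the 6e-4 peak. The total step count and the decay floor are assumptions (the table does not specify them), and the precision auto-detection shown is one common way to implement the `fp16`/`bf16` row:

```python
import math
import torch

MAX_LR, WARMUP = 6e-4, 100        # from the table above
MAX_STEPS, MIN_LR = 10_000, 6e-5  # assumed: not specified in the table

def lr_at(step: int) -> float:
    """Linear warmup to MAX_LR, then cosine decay down to MIN_LR."""
    if step < WARMUP:
        return MAX_LR * (step + 1) / WARMUP
    progress = (step - WARMUP) / max(1, MAX_STEPS - WARMUP)
    return MIN_LR + 0.5 * (MAX_LR - MIN_LR) * (1.0 + math.cos(math.pi * progress))

# Prefer bf16 where the GPU supports it, otherwise fall back to fp16.
dtype = (torch.bfloat16
         if torch.cuda.is_available() and torch.cuda.is_bf16_supported()
         else torch.float16)

model = torch.nn.Linear(8, 8)  # stand-in module for the transformer
optimizer = torch.optim.AdamW(model.parameters(), lr=MAX_LR)
for step in range(200):
    for group in optimizer.param_groups:
        group["lr"] = lr_at(step)
    # ... forward / backward / optimizer.step() would go here ...
```
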
---

## Model Checkpoints

| **Step**     | **Filename**                                     | **Format** |
|--------------|--------------------------------------------------|------------|
| Final        | `mobile_llm_final.safetensors`                   | safetensors |
| Intermediate | `checkpoints/mobile_llm_step_<step>.safetensors` | safetensors |

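Checkpoints are plain `safetensors` state dicts, so they can be inspected or loaded with the `safetensors` library. The model class name below is a placeholder, not necessarily the repo's actual class:

```python
from safetensors.torch import load_file

# Read the final checkpoint into an ordinary PyTorch state dict.
state_dict = load_file("mobile_llm_final.safetensors", device="cpu")

# Peek at a few tensors to confirm shapes match the architecture table.
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))

# To load into a model, build it with the matching config first, e.g.:
# model = MobileLLM(ModelConfig())  # placeholder class name
# model.load_state_dict(state_dict)
```
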
---

## Roadmap

- [x] Train **MobileLLM** on **FineWeb-MINI**
- [x] Add **grouped-query attention**
- [x] Export **safetensors** checkpoints
- [ ] Quantized **int8** & **int4** inference
- [ ] Expand training to **FineWeb-1B**

---

## License

This model is licensed under the [MIT License](LICENSE).

---

## Links

- **GitHub**: [MobileLLM training code](https://github.com/Guney-olu/Quantgpt)