# QuantMobileLLM: Lightweight GPT-Style Language Model

MobileLLM is a lightweight GPT-style language model designed for efficiency, fast inference, and small deployment environments. It's trained on FineWeb-MINI and optimized with modern attention techniques.
## Model Highlights
- Architecture: Decoder-only GPT-style transformer
- Parameters: ~17M (6 layers, 8 heads, 256 embedding dim)
- Context Length: 512 tokens
- Vocabulary Size: 50,304 tokens
- Precision: Supports both `fp16` and `bf16`
- Optimized for: Small GPUs, mobile inference
## Architecture Details
| Component | Value |
|---|---|
| Layers | 6 |
| Attention Heads | 8 |
| KV Heads | 4 |
| Embedding Dim | 256 |
| Context Length | 512 |
| Vocab Size | 50,304 |
| Attention Type | Multi-Query Attention |
| Norm Type | RMSNorm |
| Position Encoding | Rotary Position Embeddings (RoPE) |
| FFN Activation | SwiGLU (silu) |
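As a sanity check, the ~17M parameter figure can be reproduced from the table above. The SwiGLU hidden dimension is not listed in this card, so the ~(8/3)·d_model sizing common in LLaMA-style models is assumed here, along with tied input/output embeddings and biasless linear layers:

```python
# Rough parameter count for the architecture above.
# Assumptions (not stated in the card): SwiGLU hidden dim ~ (8/3) * d_model,
# tied input/output embeddings, no biases in linear layers.
d_model, n_layers = 256, 6
n_heads, n_kv_heads = 8, 4
vocab = 50_304
head_dim = d_model // n_heads                   # 32

embed = vocab * d_model                         # token embeddings (tied with LM head)

# Attention: full-size Q and O projections, smaller K and V projections (4 KV heads)
attn = 2 * (d_model * d_model) + 2 * (d_model * n_kv_heads * head_dim)

# SwiGLU FFN: gate, up, and down projections
ffn_hidden = int(8 * d_model / 3)               # assumed, ~682
ffn = 3 * d_model * ffn_hidden

total = embed + n_layers * (attn + ffn)
print(f"~{total / 1e6:.1f}M parameters")        # lands near the card's ~17M
```

Note that the token embedding table dominates: at this scale it accounts for roughly three quarters of all parameters.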
## Key Optimizations
- RMSNorm – improves training stability over LayerNorm.
- Multi-Query Attention – reduces KV-cache size → lower memory footprint.
- Rotary Embeddings (RoPE) – better handling of long context windows.
- `safetensors` checkpoints – faster & safer loading.
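The KV-cache saving from 4 KV heads (versus one per query head) is easy to quantify. The sketch below assumes an fp16 cache (2 bytes per element) over the full 512-token context:

```python
# KV cache: 2 tensors (K and V) per layer, each of shape [kv_heads, seq_len, head_dim].
seq_len, n_layers = 512, 6
n_heads, n_kv_heads, head_dim = 8, 4, 32
bytes_per_elem = 2                              # fp16

def kv_cache_bytes(kv_heads: int) -> int:
    """Total cache size in bytes for a full context window."""
    return 2 * n_layers * kv_heads * seq_len * head_dim * bytes_per_elem

full_mha = kv_cache_bytes(n_heads)              # one KV head per query head
reduced = kv_cache_bytes(n_kv_heads)            # this model's 4 KV heads
print(full_mha // 1024, "KiB ->", reduced // 1024, "KiB")
```

Halving the KV heads halves the cache: from 3 MiB to 1.5 MiB at full context here, a saving that matters far more on larger models and longer contexts.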
## Training Setup
| Property | Value |
|---|---|
| Dataset | FineWeb-MINI |
| Tokens Trained | ~100M |
| Optimizer | AdamW |
| Learning Rate | 6e-4 (cosine decay) |
| Warmup Steps | 100 |
| Batch Size | 64 × 2 grad accum |
| Effective Batch Size | 128 |
| Mixed Precision | fp16 / bf16 (auto-detect) |
| Distributed Training | DDP |
| Logging | Weights & Biases (wandb) |
| Checkpoint Format | .safetensors |
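The learning-rate schedule (linear warmup over 100 steps, then cosine decay from 6e-4) can be sketched as below. The total step count and the floor learning rate are illustrative assumptions, since the card does not list them:

```python
import math

PEAK_LR, WARMUP = 6e-4, 100                     # from the table above
MIN_LR = 6e-5                                   # assumed floor (peak / 10), not from the card
TOTAL_STEPS = 10_000                            # illustrative, not from the card

def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay to MIN_LR."""
    if step < WARMUP:
        return PEAK_LR * (step + 1) / WARMUP
    progress = (step - WARMUP) / (TOTAL_STEPS - WARMUP)
    return MIN_LR + 0.5 * (PEAK_LR - MIN_LR) * (1 + math.cos(math.pi * progress))

print(lr_at(0), lr_at(WARMUP - 1), lr_at(TOTAL_STEPS))
```

A typical loop would call `lr_at(step)` each iteration and assign it to every AdamW parameter group before `optimizer.step()`.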
## Model Checkpoints
| Step | Filename | Format |
|---|---|---|
| Final | `mobile_llm_final.safetensors` | safetensors |
| Intermediate | `checkpoints/mobile_llm_step_<step>.safetensors` | safetensors |
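Part of why `safetensors` loading is fast and safe is the format itself: an 8-byte little-endian header length, a JSON header describing each tensor, then raw data, with no pickled code to execute. The sketch below builds and parses a minimal payload using only the standard library; the tensor name `tok_emb.weight` is made up for illustration:

```python
import json
import struct

# Build a minimal .safetensors payload by hand:
# u64 header length + JSON header + raw tensor data.
header = {"tok_emb.weight": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, 16]}}
header_bytes = json.dumps(header).encode()
blob = struct.pack("<Q", len(header_bytes)) + header_bytes + b"\x00" * 16  # 2*2 fp32 = 16 bytes

# Parse it back: read the header length, then the JSON header.
# Tensor shapes and offsets are known before touching any data.
(n,) = struct.unpack("<Q", blob[:8])
parsed = json.loads(blob[8 : 8 + n])
print(parsed["tok_emb.weight"]["shape"])        # [2, 2]
```

In practice you would load a real checkpoint with `safetensors.torch.load_file("mobile_llm_final.safetensors")`, which memory-maps the data section directly.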
## Roadmap
- Train MobileLLM on FineWeb-MINI
- Add multi-query attention
- Export safetensors checkpoints
- Quantized int8 & int4 inference
- Expand training on FineWeb-1B
## License

This model is licensed under the MIT License.
## Links

- GitHub – MobileLLM training code