---
title: Tiny-LLM Text Generator
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
python_version: "3.11"
app_file: app.py
pinned: false
license: apache-2.0
---

# Tiny-LLM Text Generator

A **54 million parameter** language model trained **from scratch** on Wikipedia.

## About

This Space demonstrates that meaningful language models can be trained on consumer hardware with modest compute budgets.

## Architecture

| Component | Value |
|-----------|-------|
| Parameters | 54.93M |
| Layers | 12 |
| Hidden Size | 512 |
| Attention Heads | 8 |
| Intermediate (FFN) | 1408 |
| Vocab Size | 32,000 |
| Max Sequence Length | 512 |
| Position Encoding | RoPE |
| Normalization | RMSNorm |
| Activation | SwiGLU |

## Training

- **Training Steps**: 50,000
- **Tokens**: ~100M
- **Hardware**: NVIDIA RTX 5090 (32GB)
- **Training Time**: ~3 hours

## Model

[jonmabe/tiny-llm-54m](https://huggingface.co/jonmabe/tiny-llm-54m)

## Limitations

- Small model size limits knowledge and capabilities
- Trained only on Wikipedia, so domain coverage is limited
- May generate factually incorrect information
- Not instruction-tuned
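## Parameter Count

The 54.93M figure in the architecture table can be reproduced from the other rows. Below is a back-of-the-envelope sketch; it assumes tied input/output embeddings, bias-free projections, two RMSNorms per layer plus one final RMSNorm, and a gate/up/down SwiGLU FFN (details not stated in the table, so treat them as assumptions):

```python
# Back-of-the-envelope parameter count from the architecture table.
# Assumptions (not stated in the model card): tied input/output
# embeddings, no biases, gate/up/down SwiGLU FFN, two RMSNorms per
# layer plus a final one; RoPE adds no learned parameters.

vocab_size = 32_000
hidden = 512
layers = 12
ffn = 1408

embedding = vocab_size * hidden                  # shared with the LM head (tied)
attention = 4 * hidden * hidden                  # Q, K, V, O projections
swiglu_ffn = 3 * hidden * ffn                    # gate, up, and down matrices
norms = 2 * hidden                               # two RMSNorm weight vectors

per_layer = attention + swiglu_ffn + norms
total = embedding + layers * per_layer + hidden  # + final RMSNorm

print(f"{total / 1e6:.2f}M parameters")          # matches the 54.93M in the table
```

Under these assumptions the count lands exactly on the advertised figure, which suggests the model ties its embedding and output weights.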