---
title: Tiny-LLM Text Generator
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
python_version: "3.11"
app_file: app.py
pinned: false
license: apache-2.0
---
# Tiny-LLM Text Generator
A **54 million parameter** language model trained **from scratch** on Wikipedia.
## About
This Space demonstrates that a meaningful language model can be trained from scratch on consumer hardware with a modest compute budget.
## Architecture
| Component | Value |
|-----------|-------|
| Parameters | 54.93M |
| Layers | 12 |
| Hidden Size | 512 |
| Attention Heads | 8 |
| Intermediate (FFN) | 1408 |
| Vocab Size | 32,000 |
| Max Sequence Length | 512 |
| Position Encoding | RoPE |
| Normalization | RMSNorm |
| Activation | SwiGLU |
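The 54.93M figure in the table can be reproduced from the other rows. The sketch below assumes tied input/output embeddings, bias-free projections, and two RMSNorm weight vectors per layer (a common layout for this style of architecture, but an assumption, not something the table states):

```python
# Estimate the parameter count from the architecture table above.
# Assumptions: tied embeddings, no biases, 2 RMSNorms per layer + 1 final.
vocab, hidden, layers, ffn = 32_000, 512, 12, 1408

embed = vocab * hidden                 # token embeddings (tied with LM head)
attn = 4 * hidden * hidden             # Q, K, V, O projections
swiglu = 3 * hidden * ffn              # gate, up, and down projections
norms = 2 * hidden                     # two RMSNorm weight vectors per layer
per_layer = attn + swiglu + norms

total = embed + layers * per_layer + hidden  # + final RMSNorm
print(f"{total / 1e6:.2f}M")  # → 54.93M
```

Note that RoPE adds no learned parameters, which is why it does not appear in the count.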
## Training
- **Training Steps**: 50,000
- **Tokens**: ~100M
- **Hardware**: NVIDIA RTX 5090 (32GB)
- **Training Time**: ~3 hours
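A quick sanity check on those totals (the per-step batch shape is an assumption; the README only gives aggregate numbers):

```python
# ~100M tokens over 50,000 steps works out to ~2,000 tokens per step,
# i.e. roughly four sequences at the 512-token context length
# (hypothetical batch shape, not stated above).
tokens, steps = 100_000_000, 50_000
tokens_per_step = tokens // steps
print(tokens_per_step)  # → 2000
```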
## Model
[jonmabe/tiny-llm-54m](https://huggingface.co/jonmabe/tiny-llm-54m)
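If the checkpoint follows the standard Hugging Face causal-LM layout (an assumption; check the model card for the exact loading code), generation might look like:

```python
# Hedged sketch: assumes the checkpoint loads through the standard
# transformers AutoModelForCausalLM / AutoTokenizer interface.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "jonmabe/tiny-llm-54m"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo)

inputs = tokenizer("The history of computing", return_tensors="pt")
outputs = model.generate(
    **inputs, max_new_tokens=64, do_sample=True, temperature=0.8
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the model is not instruction-tuned, prompt it with text to continue rather than with questions or commands.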
## Limitations
- Small model size limits knowledge and reasoning ability
- Trained only on Wikipedia, so domain coverage is narrow
- May generate plausible-sounding but factually incorrect text
- Not instruction-tuned: it continues text rather than following commands