---
title: Tiny-LLM Text Generator
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
python_version: '3.11'
app_file: app.py
pinned: false
license: apache-2.0
---
# Tiny-LLM Text Generator

A 54-million-parameter language model trained from scratch on Wikipedia.
## About

This Space demonstrates that a meaningful language model can be trained from scratch on consumer hardware with a modest compute budget.
## Architecture
| Component | Value |
|---|---|
| Parameters | 54.93M |
| Layers | 12 |
| Hidden Size | 512 |
| Attention Heads | 8 |
| Intermediate (FFN) | 1408 |
| Vocab Size | 32,000 |
| Max Sequence Length | 512 |
| Position Encoding | RoPE |
| Normalization | RMSNorm |
| Activation | SwiGLU |
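The headline 54.93M figure can be reproduced from the table above. This is a sketch that assumes tied input/output embeddings and standard LLaMA-style blocks (four attention projections, a three-matrix SwiGLU FFN, and two RMSNorm vectors per layer); the actual model code may differ in detail:

```python
# Rough parameter count from the architecture table.
# Assumption: tied embeddings and LLaMA-style transformer blocks.
hidden, layers, ffn, vocab = 512, 12, 1408, 32000

embed = vocab * hidden       # token embeddings (tied with the LM head)
attn = 4 * hidden * hidden   # Q, K, V, O projections per layer
swiglu = 3 * hidden * ffn    # gate, up, down projections per layer
norms = 2 * hidden           # two RMSNorm weight vectors per layer

total = embed + layers * (attn + swiglu + norms) + hidden  # + final RMSNorm
print(f"{total / 1e6:.2f}M parameters")  # -> 54.93M parameters
```

That these assumptions land exactly on 54.93M suggests the embedding table (about 16.4M parameters) accounts for roughly 30% of the model.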
## Training
- Training Steps: 50,000
- Tokens: ~100M
- Hardware: NVIDIA RTX 5090 (32GB)
- Training Time: ~3 hours
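From the figures above, the implied training throughput is easy to back out. This is a rough estimate only, since the exact batch size and token counts are approximate:

```python
# Back-of-envelope throughput from the stated training numbers.
tokens = 100e6  # ~100M tokens total
steps = 50_000
hours = 3       # ~3 hours wall clock

tokens_per_step = tokens / steps          # ~2000, i.e. ~4 sequences of 512 tokens
tokens_per_sec = tokens / (hours * 3600)  # ~9.3k tokens/s sustained
print(f"{tokens_per_step:.0f} tokens/step, {tokens_per_sec:,.0f} tokens/s")
```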
## Model
## Limitations
- Small model size limits knowledge and capabilities
- Trained only on Wikipedia, so domain coverage is narrow
- May generate factually incorrect information
- Not instruction-tuned