---
title: Tiny-LLM Text Generator
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
python_version: '3.11'
app_file: app.py
pinned: false
license: apache-2.0
---

# Tiny-LLM Text Generator

A 54 million parameter language model trained from scratch on Wikipedia.

## About

This Space demonstrates that a meaningful language model can be trained from scratch on consumer hardware with a modest compute budget.

## Architecture

| Component | Value |
|---|---|
| Parameters | 54.93M |
| Layers | 12 |
| Hidden Size | 512 |
| Attention Heads | 8 |
| Intermediate (FFN) Size | 1408 |
| Vocab Size | 32,000 |
| Max Sequence Length | 512 |
| Position Encoding | RoPE |
| Normalization | RMSNorm |
| Activation | SwiGLU |
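The 54.93M figure can be reproduced from the table above, assuming a standard Llama-style layout: tied input/output embeddings, no bias terms, SwiGLU gate/up/down projections, and RMSNorm weight vectors before attention and the FFN plus a final norm. These layout details are assumptions for the sketch, not stated in this README.

```python
# Rough parameter count for the architecture in the table above.
# Assumes: tied embedding/LM head, no biases, SwiGLU FFN with
# gate/up/down projections, and RMSNorm before attention and FFN
# in each layer plus one final norm (Llama-style layout).
vocab, hidden, layers, ffn = 32_000, 512, 12, 1408

embeddings = vocab * hidden       # token embeddings (tied with LM head)
attention  = 4 * hidden * hidden  # Q, K, V, O projections
swiglu_ffn = 3 * hidden * ffn     # gate, up, down projections
norms      = 2 * hidden           # pre-attention + pre-FFN RMSNorm weights

per_layer = attention + swiglu_ffn + norms
total = embeddings + layers * per_layer + hidden  # + final RMSNorm

print(f"{total / 1e6:.2f}M")  # → 54.93M
```

Under these assumptions the count lands on 54,931,968 parameters, matching the 54.93M in the table.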

## Training

- **Training steps:** 50,000
- **Tokens:** ~100M
- **Hardware:** NVIDIA RTX 5090 (32 GB)
- **Training time:** ~3 hours
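These numbers are mutually consistent: at the 512-token maximum sequence length, ~100M tokens over 50,000 steps implies an effective batch of roughly four full-length sequences per step. The batch size here is an inference from the token budget, not a stated training setting.

```python
# Back-of-the-envelope check on the training numbers above.
steps, seq_len = 50_000, 512
batch = 4  # hypothetical effective batch size implied by the token budget

tokens = steps * batch * seq_len
print(f"{tokens / 1e6:.1f}M tokens")  # → 102.4M tokens (≈ the ~100M reported)
```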

## Model

[jonmabe/tiny-llm-54m](https://huggingface.co/jonmabe/tiny-llm-54m)

## Limitations

- Small model size limits knowledge and capabilities
- Trained only on Wikipedia, so domain coverage is limited
- May generate factually incorrect information
- Not instruction-tuned