---
title: Tiny-LLM Text Generator
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
python_version: '3.11'
app_file: app.py
pinned: false
license: apache-2.0
---
# Tiny-LLM Text Generator

A 54-million-parameter language model trained from scratch on Wikipedia.
## About

This Space demonstrates that a meaningful language model can be trained from scratch on consumer hardware with a modest compute budget.
## Architecture
| Component | Value |
|---|---|
| Parameters | 54.93M |
| Layers | 12 |
| Hidden Size | 512 |
| Attention Heads | 8 |
| Intermediate (FFN) | 1408 |
| Vocab Size | 32,000 |
| Max Sequence Length | 512 |
| Position Encoding | RoPE |
| Normalization | RMSNorm |
| Activation | SwiGLU |
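The headline 54.93M figure can be reproduced from the table above. This is a sketch that assumes tied input/output embeddings and standard LLaMA-style blocks (four attention projections, a three-matrix SwiGLU FFN, and two RMSNorm vectors per layer); the actual model code may differ in detail:

```python
# Rough parameter count from the architecture table.
# Assumption: tied embeddings and LLaMA-style transformer blocks.
hidden, layers, ffn, vocab = 512, 12, 1408, 32000

embed = vocab * hidden       # token embeddings (tied with the LM head)
attn = 4 * hidden * hidden   # Q, K, V, O projections per layer
swiglu = 3 * hidden * ffn    # gate, up, down projections per layer
norms = 2 * hidden           # two RMSNorm weight vectors per layer

total = embed + layers * (attn + swiglu + norms) + hidden  # + final RMSNorm
print(f"{total / 1e6:.2f}M parameters")  # -> 54.93M parameters
```

That these assumptions land exactly on 54.93M suggests the embedding table (about 16.4M parameters) accounts for roughly 30% of the model.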
## Training
- Training Steps: 50,000
- Tokens: ~100M
- Hardware: NVIDIA RTX 5090 (32GB)
- Training Time: ~3 hours
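From the figures above, the implied training throughput is easy to back out. This is a rough estimate only, since the exact batch size and token counts are approximate:

```python
# Back-of-envelope throughput from the stated training numbers.
tokens = 100e6  # ~100M tokens total
steps = 50_000
hours = 3       # ~3 hours wall clock

tokens_per_step = tokens / steps          # ~2000, i.e. ~4 sequences of 512 tokens
tokens_per_sec = tokens / (hours * 3600)  # ~9.3k tokens/s sustained
print(f"{tokens_per_step:.0f} tokens/step, {tokens_per_sec:,.0f} tokens/s")
```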
## Model
## Limitations
- Small model size limits knowledge and capabilities
- Trained only on Wikipedia, so domain coverage is narrow
- May generate factually incorrect information
- Not instruction-tuned