---
title: Tiny-LLM Text Generator
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
python_version: "3.11"
app_file: app.py
pinned: false
license: apache-2.0
---
# Tiny-LLM Text Generator

A **54 million parameter** language model trained **from scratch** on Wikipedia.

## About
This Space demonstrates that a small but coherent language model can be trained from scratch on a single consumer GPU with a modest compute budget.
## Architecture

| Component | Value |
|-----------|-------|
| Parameters | 54.93M |
| Layers | 12 |
| Hidden Size | 512 |
| Attention Heads | 8 |
| Intermediate (FFN) Size | 1408 |
| Vocabulary Size | 32,000 |
| Max Sequence Length | 512 |
| Position Encoding | RoPE |
| Normalization | RMSNorm |
| Activation | SwiGLU |
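
The combination of RoPE, RMSNorm, and SwiGLU is the standard Llama-style decoder recipe. As a minimal sketch (assuming the checkpoint follows that layout, which is not confirmed by this README), the table above maps onto a `transformers` `LlamaConfig` roughly like this:

```python
from transformers import LlamaConfig

# Hypothetical mapping of the architecture table onto a Llama-style config;
# the actual training code may differ in details such as norm epsilon.
config = LlamaConfig(
    vocab_size=32_000,
    hidden_size=512,
    num_hidden_layers=12,
    num_attention_heads=8,
    intermediate_size=1408,        # SwiGLU FFN width
    max_position_embeddings=512,   # max sequence length
    tie_word_embeddings=True,      # ~55M params only if embeddings are tied
)
```

With tied input/output embeddings this comes out to roughly 55M parameters (16.4M for the embedding table plus about 3.2M per layer across 12 layers), consistent with the 54.93M figure above.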
## Training

- **Training Steps**: 50,000
- **Tokens**: ~100M
- **Hardware**: NVIDIA RTX 5090 (32 GB)
- **Training Time**: ~3 hours
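
As a back-of-the-envelope check (the actual batch size and gradient accumulation are not stated), these numbers imply about 2,000 tokens per optimizer step:

```python
# Rough throughput implied by the figures above; batch size is an inference.
tokens, steps, seq_len = 100_000_000, 50_000, 512
tokens_per_step = tokens / steps           # 2,000 tokens per step
seqs_per_step = tokens_per_step / seq_len  # ~3.9 sequences of length 512
print(f"{tokens_per_step:.0f} tokens/step ≈ {seqs_per_step:.1f} seqs/step")
```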
## Model

[jonmabe/tiny-llm-54m](https://huggingface.co/jonmabe/tiny-llm-54m)
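
A minimal sketch for trying the model outside this Space, assuming the hub repo ships a tokenizer and a `transformers`-compatible checkpoint (not verified here):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "jonmabe/tiny-llm-54m"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo)

# Base-model usage: give it a prefix to continue, not an instruction.
inputs = tokenizer("The history of the Roman Empire", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```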
## Limitations

- Small model size limits knowledge and capabilities
- Trained only on Wikipedia, so domain coverage is limited
- May generate factually incorrect information
- Not instruction-tuned: it continues text rather than following instructions