---
title: Tiny-LLM Text Generator
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
python_version: "3.11"
app_file: app.py
pinned: false
license: apache-2.0
---

# Tiny-LLM Text Generator

A **54-million-parameter** language model trained **from scratch** on Wikipedia.

## About

This Space demonstrates that a meaningful language model can be trained from scratch on consumer hardware with a modest compute budget!
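
The Space wraps the model in a Gradio interface. Below is a minimal sketch of what `app.py` might look like; the function names, widgets, and sampling defaults here are illustrative assumptions, not the Space's confirmed implementation.

```python
import gradio as gr
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "jonmabe/tiny-llm-54m"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
model.eval()

def generate(prompt: str, max_new_tokens: int, temperature: float) -> str:
    # Tokenize the prompt and sample a continuation.
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=int(max_new_tokens),
            do_sample=True,
            temperature=temperature,
            top_p=0.9,
        )
    return tokenizer.decode(output[0], skip_special_tokens=True)

demo = gr.Interface(
    fn=generate,
    inputs=[
        gr.Textbox(label="Prompt"),
        gr.Slider(16, 256, value=128, step=16, label="Max new tokens"),
        gr.Slider(0.1, 1.5, value=0.8, step=0.1, label="Temperature"),
    ],
    outputs=gr.Textbox(label="Completion"),
    title="Tiny-LLM Text Generator",
)

if __name__ == "__main__":
    demo.launch()
```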

## Architecture

| Component | Value |
|-----------|-------|
| Parameters | 54.93M |
| Layers | 12 |
| Hidden Size | 512 |
| Attention Heads | 8 |
| FFN Intermediate Size | 1,408 |
| Vocab Size | 32,000 |
| Max Sequence Length | 512 |
| Position Encoding | RoPE |
| Normalization | RMSNorm |
| Activation | SwiGLU |
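
These hyperparameters match a LLaMA-style decoder (RoPE, RMSNorm, SwiGLU). Assuming the checkpoint follows that layout, the table maps onto a standard `transformers` config roughly as follows; the choice of `LlamaConfig` is an assumption, not confirmed by the repo.

```python
from transformers import LlamaConfig

# Sketch only: the table above expressed as a LLaMA-style config,
# which uses RoPE, RMSNorm, and a SwiGLU feed-forward by default.
config = LlamaConfig(
    vocab_size=32_000,
    hidden_size=512,
    intermediate_size=1408,
    num_hidden_layers=12,
    num_attention_heads=8,
    max_position_embeddings=512,
    hidden_act="silu",  # SiLU gate -> SwiGLU FFN
)
```

With tied input/output embeddings, this layout reproduces the parameter count in the table: 32,000 × 512 ≈ 16.38M for the embedding, plus 12 × (4 × 512² attention + 3 × 512 × 1408 SwiGLU) ≈ 38.54M for the blocks, plus a handful of RMSNorm weights, for about 54.93M in total.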

## Training

- **Training Steps**: 50,000
- **Tokens**: ~100M
- **Hardware**: NVIDIA RTX 5090 (32GB)
- **Training Time**: ~3 hours
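
Those numbers work out to roughly 100M ÷ 50,000 ≈ 2,000 tokens per optimizer step, i.e. about four 512-token sequences per step (assuming full-length sequence packing and counting any gradient-accumulation micro-batches together).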

## Model

[jonmabe/tiny-llm-54m](https://huggingface.co/jonmabe/tiny-llm-54m)
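
To run the checkpoint outside this Space, a standard `transformers` pipeline should work, assuming the repository ships a compatible config and tokenizer (a sketch, not verified against the repo):

```python
from transformers import pipeline

# Load the checkpoint with the standard text-generation pipeline.
generator = pipeline("text-generation", model="jonmabe/tiny-llm-54m")
print(generator("The history of Rome begins", max_new_tokens=64)[0]["generated_text"])
```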

## Limitations

- Small model size limits knowledge and capabilities
- Trained only on Wikipedia, so domain coverage is limited
- May generate factually incorrect information
- Not instruction-tuned