---
license: mit
language:
- en
pipeline_tag: text-generation
tags:
- transformer
- gpt2
- nano
- experimental
- tiny
- text-generation
- whirlwindai
new_version: WhirlwindAI/AtomZephyr
---
The entire model is smaller than some README files.

---
# The World's Tiniest Transformer
Big language models chase billions of parameters.
NanoZephyr went the opposite direction.
With only **372 parameters**, this tiny GPT-2-style model explores just how absurdly small a language model can become while still technically generating text.

It's not smart.
It was never supposed to be.

---
# Why?
Sometimes research starts with one simple question.
> **"How small can we make a Transformer before it becomes completely ridiculous?"**

NanoZephyr is our answer.
- Built for fun.
- Built for curiosity.
- Built to make AI researchers laugh.
---
# Specifications
| Property | Value |
|:---------|:-----:|
| Parameters | **372** |
| Architecture | GPT-2-style |
| Layers | 1 |
| Attention Heads | 1 |
| Embedding Size | 4 |
| Feed-Forward Size | 8 |
| Vocabulary | 32 tokens |
| Context Length | 16 tokens |
| Model Size | ~15 KB |
| Training | CPU |
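
The 372 figure can be reproduced by hand from the table above. The sketch below assumes the standard GPT-2 layout, which this card does not spell out: learned token and position embeddings, biased linear layers, two LayerNorms per block plus a final one, and an output head that shares weights with the token embedding. The function name `gpt2_param_breakdown` is made up for illustration. Under those assumptions the components sum to exactly 372, or 1,488 bytes of raw weights if they are stored as float32 (also an assumption).

```python
def gpt2_param_breakdown(vocab, context, d_model, d_ff, n_layers):
    """Return a dict mapping each component to its parameter count.

    Assumes the standard GPT-2 layout with a tied output head, so the
    head adds no parameters. The number of attention heads does not
    change the count; it only splits the same projection matrices.
    """
    per_layer = {
        "ln_1": 2 * d_model,                              # scale + shift
        "attn_qkv": d_model * 3 * d_model + 3 * d_model,  # fused Q, K, V projection
        "attn_out": d_model * d_model + d_model,          # attention output projection
        "ln_2": 2 * d_model,                              # scale + shift
        "mlp_in": d_model * d_ff + d_ff,                  # expand to feed-forward size
        "mlp_out": d_ff * d_model + d_model,              # project back to d_model
    }
    breakdown = {
        "token_embedding": vocab * d_model,
        "position_embedding": context * d_model,
        "final_layer_norm": 2 * d_model,
    }
    for name, count in per_layer.items():
        breakdown[name] = count * n_layers
    return breakdown


# Values taken from the Specifications table.
breakdown = gpt2_param_breakdown(vocab=32, context=16, d_model=4, d_ff=8, n_layers=1)
total = sum(breakdown.values())

for name, count in breakdown.items():
    print(f"{name:20s}{count:4d}")
print(f"{'total':20s}{total:4d}")          # 372
print(f"raw float32 bytes: {total * 4}")   # 1488
```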
---
# Example
```python
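from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and the 372-parameter model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("WhirlwindAI/NanoZephyr")
model = AutoModelForCausalLM.from_pretrained("WhirlwindAI/NanoZephyr")

prompt = "The future of AI"
inputs = tokenizer(prompt, return_tensors="pt")

# max_length counts the prompt tokens too, so 16 matches the context length.
# A temperature above 1 flattens the sampling distribution even further.
outputs = model.generate(
    **inputs,
    max_length=16,
    do_sample=True,
    temperature=2.0,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))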
```
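The `temperature=2.0` argument in the call above divides the model's logits by 2 before the softmax, which pushes the token probabilities toward uniform. The sketch below shows that effect in plain Python. The logits are hypothetical values chosen for illustration, not numbers produced by NanoZephyr, and `softmax_with_temperature` is an illustrative helper rather than part of any library.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for four tokens (assumption, for illustration only).
logits = [2.0, 1.0, 0.0, -1.0]

# Higher temperatures spread probability mass more evenly across tokens.
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```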
Example output:
```text
The future of AI vxbzq rpfm lo...
```
Beautiful.

---
# Live System Status
```text
┌───────────────────────────────┐
│ NanoZephyr Boot Sequence      │
├───────────────────────────────┤
│ Parameters   : 372            │
│ GPU Usage    : Basically none │
│ Intelligence : █░░░░░░░░░   3%│
│ Confidence   : ██████████ 100%│
│ Randomness   : ██████████  MAX│
│ Status       : ONLINE         │
└───────────────────────────────┘
```
---
# Performance
| Benchmark | Result |
|:----------|-------:|
| Common Sense | 0.01 |
| Mathematics | 0.00 |
| Philosophy | ??? |
| Gibberish | 100.00 |
| Comedy | ∞ |
---
# Sample Outputs
```text
Input:
hello
Output:
helloclvtdzng o
```
```text
Input:
ROMEO:
Output:
ufbgdyo zia
```
```text
Input:
Once upon a time...
Output:
qxwwbbvh zjv
```
Every generation is a surprise.
Sometimes even to the model.

---
# Intended Uses
- ✅ Learning how Transformers work
- ✅ Educational demos
- ✅ Parameter-count experiments
- ✅ AI memes
- ✅ Making your 70B model feel better

---
# Not Intended For
- ❌ Homework
- ❌ Medical advice
- ❌ Programming
- ❌ Legal documents
- ❌ Anything requiring intelligence

---
# Awards
- 🥇 Smallest Transformer (probably)
- 🥇 Highest Gibberish Density
- 🥇 Lowest Storage Requirement
- 🥇 Fastest CPU Training
- 🏆 Most Honest AI Model

---
# License
MIT

Feel free to use it, modify it, laugh at it, or make it even smaller.

---