---
license: mit
language:
- en
pipeline_tag: text-generation
tags:
- transformer
- gpt2
- nano
- experimental
- tiny
- text-generation
- whirlwindai
new_version: WhirlwindAI/AtomZephyr
---

_The entire model is smaller than some README files._

---

# The World's Tiniest Transformer

Big language models chase billions of parameters.

NanoZephyr went the opposite direction.

With only **372 parameters**, this tiny GPT-2 style model explores just how absurdly small a language model can become while still technically generating text.

It's not smart. It was never supposed to be.

---

# Why?

Sometimes research starts with one simple question.

> **"How small can we make a Transformer before it becomes completely ridiculous?"**

NanoZephyr is our answer.

- Built for fun.
- Built for curiosity.
- Built to make AI researchers laugh.

---

# Specifications

| Property | Value |
|:---------|:-----:|
| Parameters | **372** |
| Architecture | GPT-2 Style |
| Layers | 1 |
| Attention Heads | 1 |
| Embedding Size | 4 |
| Feed Forward | 8 |
| Vocabulary | 32 Tokens |
| Context Length | 16 |
| Model Size | ~15 KB |
| Training | CPU |
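
For the curious, the 372 figure can be reproduced by hand from the table above. The sketch below is only a back-of-the-envelope count, and it assumes a standard GPT-2 layout that this card does not spell out: learned position embeddings, biases on every linear layer, two layer norms per block plus a final one, and an output head that shares weights with the token embeddings. The helper names `layer_norm` and `linear` are made up for illustration.

```python
# Back-of-the-envelope parameter count from the specification table.
# Assumed (not stated in the card): standard GPT-2 layout with biases,
# learned position embeddings, and an output head tied to the token embeddings.
vocab_size = 32
context_length = 16
d_model = 4      # embedding size
d_ff = 8         # feed-forward width
n_layers = 1


def layer_norm(d):
    """Parameters in a layer norm: one weight and one bias per feature."""
    return 2 * d


def linear(d_in, d_out):
    """Parameters in a linear layer: weight matrix plus bias vector."""
    return d_in * d_out + d_out


# Token embeddings + position embeddings.
embeddings = vocab_size * d_model + context_length * d_model

# Fused query/key/value projection, then the attention output projection.
attention = linear(d_model, 3 * d_model) + linear(d_model, d_model)

# Two-layer feed-forward network.
mlp = linear(d_model, d_ff) + linear(d_ff, d_model)

block = layer_norm(d_model) + attention + layer_norm(d_model) + mlp

# Final layer norm after the last block; the tied head adds nothing.
total = embeddings + n_layers * block + layer_norm(d_model)

print(f"embeddings={embeddings} block={block} total={total}")
```

Under those assumptions the embeddings account for 192 parameters, more than half of the model, and the single Transformer block for 172.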

---

# Example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the model from the Hub.
tokenizer = AutoTokenizer.from_pretrained(
    "WhirlwindAI/NanoZephyr"
)

model = AutoModelForCausalLM.from_pretrained(
    "WhirlwindAI/NanoZephyr"
)

prompt = "The future of AI"

inputs = tokenizer(prompt, return_tensors="pt")

# max_length counts the prompt tokens too and matches the 16-token context.
outputs = model.generate(
    **inputs,
    max_length=16,
    do_sample=True,
    temperature=2.0
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Example output:

```text
The future of AI vxbzq rpfm lo...
```

Beautiful.

---

# Live System Status

```text
┌───────────────────────────────┐
│   NanoZephyr Boot Sequence    │
├───────────────────────────────┤
│ Parameters   : 372            │
│ GPU Usage    : Basically none │
│ Intelligence : █░░░░░░░░ 3%   │
│ Confidence   : ██████████ 100%│
│ Randomness   : ██████████ MAX │
│ Status       : ONLINE         │
└───────────────────────────────┘
```

---

# Performance

| Benchmark | Result |
|:----------|-------:|
| Common Sense | 0.01 |
| Mathematics | 0.00 |
| Philosophy | ??? |
| Gibberish | 100.00 |
| Comedy | ∞ |

---

# Sample Outputs

```text
Input: hello
Output: helloclvtdzng o
```

```text
Input: ROMEO:
Output: ufbgdyo zia
```

```text
Input: Once upon a time...
Output: qxwwbbvh zjv
```

Every generation is a surprise.

Sometimes even to the model.

---

# Intended Uses

✅ Learning how Transformers work

✅ Educational demos

✅ Parameter-count experiments

✅ AI memes

✅ Making your 70B model feel better

---

# Not Intended For

❌ Homework

❌ Medical advice

❌ Programming

❌ Legal documents

❌ Anything requiring intelligence

---

# Awards

🥇 Smallest Transformer (probably)

🥇 Highest Gibberish Density

🥇 Lowest Storage Requirement

🥇 Fastest CPU Training

🏆 Most Honest AI Model

---

# License

MIT

Feel free to use it, modify it, laugh at it, or make it even smaller.

---