The Idea

TinyZephyr wasn't tiny enough.

NanoZephyr wasn't tiny enough.

AtomZephyr still had too many parameters.

So we continued removing neurons until the model reached a point where modern physics politely asked us to stop.

We ignored it.

SubatomZephyr is the result.

Why?

Most research asks:

"How can we make models smarter?"

We asked:

"How many parameters can we delete before Git starts feeling sorry for us?"

This repository is the answer.

Specifications

Property	Value
Parameters	21
Architecture	GPT-2
Layers	1
Attention Heads	1
Embedding Size	1
Context Length	1
Vocabulary	2 Tokens
Disk Size	<2 KB
Training Time	~20 Seconds
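
For the skeptics, here is a rough sketch of where 21 could come from. It assumes the standard GPT-2 layout (biased linear layers, two layer norms per block plus a final one, output head tied to the token embedding) and an MLP hidden width of 1. That last value is our assumption for illustration; the table above does not state it, and the function name `gpt2_param_count` is made up for this example.

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer, n_inner,
                     tied_lm_head=True):
    """Count trainable parameters in a standard GPT-2 style decoder."""
    token_emb = vocab_size * n_embd                 # token embedding table
    pos_emb = n_positions * n_embd                  # learned position embeddings
    layer_norm = 2 * n_embd                         # weight + bias
    attn_qkv = n_embd * 3 * n_embd + 3 * n_embd     # fused Q, K, V projection
    attn_out = n_embd * n_embd + n_embd             # attention output projection
    mlp_in = n_embd * n_inner + n_inner             # first MLP layer
    mlp_out = n_inner * n_embd + n_embd             # second MLP layer
    per_block = 2 * layer_norm + attn_qkv + attn_out + mlp_in + mlp_out
    final_norm = layer_norm
    # A tied head reuses the token embedding, so it adds nothing.
    lm_head = 0 if tied_lm_head else vocab_size * n_embd
    return token_emb + pos_emb + n_layer * per_block + final_norm + lm_head


# Values from the Specifications table; n_inner=1 is an assumption.
total = gpt2_param_count(vocab_size=2, n_positions=1, n_embd=1,
                         n_layer=1, n_inner=1)
print(total)
```

The number of attention heads does not appear in the formula because heads only split the embedding dimension. With the usual GPT-2 default of an MLP four times the embedding width, the same specs would give 30 parameters instead, which is why the sketch assumes a width of 1.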

Performance

Task	Result
Copy "a"	✅
Copy "b"	✅
Understand Humans	❌
Understand Itself	❌
Break Records	✅

Quick Start

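from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the 21-parameter model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("WhirlwindAI/SubatomZephyr")
model = AutoModelForCausalLM.from_pretrained("WhirlwindAI/SubatomZephyr")

prompt = "a"

# Tokenize the prompt into PyTorch tensors.
inputs = tokenizer(prompt, return_tensors="pt")

# max_length counts the prompt tokens as well as the generated ones.
# do_sample=True draws the next token at random; temperature=2.0 flattens the distribution.
outputs = model.generate(
    **inputs,
    max_length=2,
    do_sample=True,
    temperature=2.0,
)

# Decode the generated token ids back into text.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))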

Output

Peak artificial intelligence.
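
The Quick Start samples at temperature 2.0, so it may help to see what that does to a two-token vocabulary. This is a minimal sketch of temperature-scaled softmax. The logits below are hypothetical, since this README does not publish the model's actual values, and `two_token_probs` is a name invented for the example.

```python
import math


def two_token_probs(logit_a, logit_b, temperature=1.0):
    """Softmax over a two-token vocabulary after dividing logits by temperature."""
    scaled = [logit_a / temperature, logit_b / temperature]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits: the real model's values are not given in this README.
for t in (1.0, 2.0):
    p_a, p_b = two_token_probs(10.0, 0.0, temperature=t)
    print(f"temperature={t}: P(a)={p_a:.4f}, P(b)={p_b:.4f}")
```

A higher temperature moves the two probabilities closer together, so a sampled run is only as reliable as the gap between the logits is wide.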

Example Conversation

User

Tell me a story.

SubatomZephyr

Oscar-worthy.

Scientific Explanation

SubatomZephyr doesn't generate language.

It doesn't reason.

It doesn't predict.

It doesn't even pretend anymore.

It has mastered exactly one skill:

Input:

a

Output:

a

100% accuracy.

Zero creativity.

Perfect confidence.

World Records

🥇 Smallest Generative Transformer

🏆 Highest Accuracy On The Letter "a"

🥈 Lowest Grocery Bill (21 Parameters)

🥉 First Model Smaller Than Most README Files

🎖️ Certified Quantum Intelligence™

Benchmarks

MMLU          : 💀

HumanEval     : 😂

TruthfulQA    : 🤨

Binary Copy   : 🏆 100%

Entertainment : ⭐⭐⭐⭐⭐
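
If you want to check the Binary Copy score yourself, something like the following would do. It is a sketch, not our evaluation harness: `copy_accuracy` and `stand_in_model` are hypothetical names, and the stand-in is a plain Python function rather than SubatomZephyr itself. To test the real model, pass in a function that wraps the Quick Start code and returns the single generated token.

```python
def copy_accuracy(generate_one, tokens=("a", "b")):
    """Fraction of tokens for which the model returns its input unchanged."""
    correct = sum(1 for tok in tokens if generate_one(tok) == tok)
    return correct / len(tokens)


def stand_in_model(token):
    """Placeholder with the behavior the README claims: echo the input."""
    return token


score = copy_accuracy(stand_in_model)
print(f"Binary Copy: {score:.0%}")
```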

Frequently Asked Questions

Is this useful?

No.

Is this funny?

Hopefully.

Why does it exist?

Curiosity.

Can it beat GPT-4?

Only if the task is copying the letter "a".

What's next?

QuarkZephyr.

Probably.

Fun Facts

Smaller than many favicon files.
Downloads before you click download.
Has fewer parameters than this README has paragraphs.
The tokenizer is more complicated than the model.
Uses more electricity displaying this README than running inference.

Limitations

SubatomZephyr should not be used for:

Chatbots
Coding
Translation
Math
Science
Existing in production

It excels primarily at making ML engineers laugh.

License

MIT

If you somehow improve this model...

please tell us.

We're genuinely curious.

Built by WhirlwindAI

"When there are no parameters left to remove... remove expectations instead."
