---
license: mit
language:
- en
pipeline_tag: text-generation
tags:
- transformer
- gpt2
- nano
- experimental
- tiny
- text-generation
- whirlwindai
new_version: WhirlwindAI/AtomZephyr
---
<div align="center">
<img src="https://readme-typing-svg.demolab.com?font=Space+Grotesk&weight=700&size=26&duration=2200&pause=1200&color=F97316&center=true&vCenter=true&width=760&lines=NanoZephyr;372+Parameters.;Tiny+Enough+to+Fit+Everywhere.;Powered+by+Questionable+Decisions." />
<br><br>
<img src="https://capsule-render.vercel.app/api?type=venom&height=200&text=NanoZephyr&animation=fadeIn&color=0:F857A6,100:FF5858"/>
<br>
<img src="https://img.shields.io/badge/Parameters-372-F97316?style=for-the-badge">
<img src="https://img.shields.io/badge/Architecture-GPT--2-F59E0B?style=for-the-badge">
<img src="https://img.shields.io/badge/Status-Experimental-FACC15?style=for-the-badge">
<img src="https://img.shields.io/badge/Humor-Included-EAB308?style=for-the-badge">
<sub><b>The entire model is smaller than some README files.</b></sub>
</div>
---
<div align="center">
# The World's Tiniest Transformer
Big language models chase billions of parameters.
NanoZephyr went the opposite direction.
With only **372 parameters**, this tiny GPT-2 style model explores just how absurdly small a language model can become while still technically generating text.
It's not smart.
It was never supposed to be.
</div>
---
# Why?
Sometimes research starts with one simple question.
> **"How small can we make a Transformer before it becomes completely ridiculous?"**
NanoZephyr is our answer.
- Built for fun.
- Built for curiosity.
- Built to make AI researchers laugh.
---
# Specifications
<div align="center">
| Property | Value |
|:---------|:-----:|
| Parameters | **372** |
| Architecture | GPT-2 Style |
| Layers | 1 |
| Attention Heads | 1 |
| Embedding Size | 4 |
| Feed Forward | 8 |
| Vocabulary | 32 Tokens |
| Context Length | 16 |
| Model Size | ~15 KB |
| Training | CPU |
</div>
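The 372 figure can be reproduced from the table above, assuming a standard GPT-2 layout: learned token and position embeddings, a bias on every linear layer, two layer norms per block plus a final one, and an output head that shares the token embedding matrix. Those layout details are assumptions on our part, not something this card states, and the function and argument names are just illustrative.

```python
def gpt2_param_count(vocab, n_ctx, d_model, d_ff, n_layer):
    """Count parameters for an assumed standard GPT-2 layout with a tied output head."""
    # Token embeddings plus learned position embeddings.
    embeddings = vocab * d_model + n_ctx * d_model
    # Each layer norm has a weight and a bias vector.
    layer_norm = 2 * d_model
    # Attention: fused QKV projection, then the output projection (both with biases).
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # Feed-forward: expand to d_ff, then project back to d_model (both with biases).
    mlp = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    block = 2 * layer_norm + attention + mlp
    # The final layer norm is added once; the tied output head adds nothing.
    return embeddings + n_layer * block + layer_norm


print(gpt2_param_count(vocab=32, n_ctx=16, d_model=4, d_ff=8, n_layer=1))  # 372
```

The number of attention heads does not appear in the formula: splitting the same projection into more heads changes how it is used, not how many weights it has.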
---
# Example
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the 372-parameter model from the Hub.
tokenizer = AutoTokenizer.from_pretrained("WhirlwindAI/NanoZephyr")
model = AutoModelForCausalLM.from_pretrained("WhirlwindAI/NanoZephyr")

prompt = "The future of AI"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample with a high temperature; max_length counts the prompt tokens too.
outputs = model.generate(
    **inputs,
    max_length=16,
    do_sample=True,
    temperature=2.0,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
Example output:
```text
The future of AI vxbzq rpfm lo...
```
Beautiful.
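The `temperature=2.0` setting in the example does a lot of the heavy lifting here. As a rough sketch of why: sampling divides the logits by the temperature before the softmax, so a value above 1 flattens the distribution and makes unlikely tokens more likely. The logits below are made up for illustration and are not taken from NanoZephyr; the helper name is ours.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for a four-token vocabulary (not from the real model).
logits = [2.0, 1.0, 0.0, -1.0]

for t in (1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the higher temperature the top token's probability shrinks and the tail grows, which is exactly what you want from a model that specializes in gibberish.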
---
# Live System Status
```text
┌───────────────────────────────┐
│ NanoZephyr Boot Sequence      │
├───────────────────────────────┤
│ Parameters   : 372            │
│ GPU Usage    : Basically none │
│ Intelligence : █░░░░░░░░ 3%   │
│ Confidence   : ██████████ 100%│
│ Randomness   : ██████████ MAX │
│ Status       : ONLINE         │
└───────────────────────────────┘
```
---
# Performance
<div align="center">
| Benchmark | Result |
|:----------|-------:|
| Common Sense | 0.01 |
| Mathematics | 0.00 |
| Philosophy | ??? |
| Gibberish | 100.00 |
| Comedy | ∞ |
</div>
---
# Sample Outputs
```text
Input:
hello
Output:
helloclvtdzng o
```
```text
Input:
ROMEO:
Output:
ufbgdyo zia
```
```text
Input:
Once upon a time...
Output:
qxwwbbvh zjv
```
Every generation is a surprise.
Sometimes even to the model.
---
# Intended Uses
✅ Learning how Transformers work
✅ Educational demos
✅ Parameter-count experiments
✅ AI memes
✅ Making your 70B model feel better
---
# Not Intended For
❌ Homework
❌ Medical advice
❌ Programming
❌ Legal documents
❌ Anything requiring intelligence
---
# Awards
🥇 Smallest Transformer (probably)
🥇 Highest Gibberish Density
🥇 Lowest Storage Requirement
🥇 Fastest CPU Training
🏆 Most Honest AI Model
---
# License
MIT
Feel free to use it, modify it, laugh at it, or make it even smaller.
---
<div align="center">
<img src="https://readme-typing-svg.demolab.com?font=Fira+Code&weight=600&size=17&duration=1800&pause=1100&color=F97316&center=true&vCenter=true&width=700&lines=loading+372+parameters...;thinking...;still+thinking...;generated+gibberish.;mission+complete." />
<br><br>
<img src="https://img.shields.io/badge/Built%20by-WhirlwindAI-F97316?style=for-the-badge">
<img src="https://img.shields.io/badge/Open-Research-F59E0B?style=for-the-badge">
<img src="https://img.shields.io/badge/Experimental-AI-FACC15?style=for-the-badge">
<br><br>
<img src="https://capsule-render.vercel.app/api?type=waving&height=220&text=372%20PARAMETERS&fontSize=50&animation=twinkling&fontColor=ffffff&color=0:FB923C,50:FACC15,100:FDE68A"/>
</div>