| --- |
| license: mit |
| language: |
| - en |
| pipeline_tag: text-generation |
| tags: |
| - gpt2 |
| - tiny |
| - transformer |
| - experimental |
| - humor |
| - text-generation |
| - whirlwindai |
| new_version: WhirlwindAI/NanoZephyr |
| --- |
| |
| <div align="center"> |
|
|
<img src="https://readme-typing-svg.demolab.com?font=Space+Grotesk&weight=700&size=26&duration=1800&pause=900&color=A855F7&center=true&vCenter=true&width=820&lines=TinyZephyr;1%2C272+Parameters.;Technically+an+LLM.;Powered+by+Questionable+Engineering.;Runs+Before+You+Finish+Blinking." />
|
|
| <br> |
|
|
| <img src="https://img.shields.io/badge/Parameters-1,272-8B5CF6?style=for-the-badge"> |
| <img src="https://img.shields.io/badge/Architecture-GPT--2-A855F7?style=for-the-badge"> |
| <img src="https://img.shields.io/badge/Status-Experimental-7C3AED?style=for-the-badge"> |
| <img src="https://img.shields.io/badge/Humor-Included-6366F1?style=for-the-badge"> |
|
|
| <br><br> |
|
|
| <img src="https://capsule-render.vercel.app/api?type=rounded&height=190&text=TinyZephyr&fontColor=ffffff&fontSize=48&animation=twinkling&color=0:8B5CF6,50:A855F7,100:2563EB"/> |
|
|
| </div> |
|
|
| --- |
|
|
| # The Idea |
|
|
| <div align="center"> |
|
|
| <table width="92%"> |
| <tr> |
| <td align="center"> |
|
|
| ## Bigger isn't always better. |
|
|
| TinyZephyr explores the opposite extreme. |
|
|
| Instead of billions of parameters, this model asks one very important question: |
|
|
| **How ridiculously small can a transformer become before it completely loses its mind?** |
|
|
| The answer is... surprisingly entertaining. |
|
|
| </td> |
| </tr> |
| </table> |
|
|
| </div> |
|
|
| --- |
|
|
| # Why? |
|
|
| Nobody needed this. |
|
|
| Nobody requested it. |
|
|
| Nobody funded it. |
|
|
| Yet somehow... |
|
|
| **TinyZephyr exists.** |
|
|
| It was built purely as an experiment to explore the lower limits of transformer architectures while proving that even microscopic language models deserve beautiful documentation. |
|
|
| --- |
|
|
| # Specifications |
|
|
| | Property | Value | |
| |-----------|-------| |
| | Parameters | **1,272** | |
| | Architecture | GPT-2 | |
| | Layers | 1 | |
| | Attention Heads | 1 | |
| | Embedding Size | 8 | |
| | Context Length | 32 | |
| | Vocabulary | 50 Tokens | |
| | Model Size | ~25 KB | |
| | Training Time | ~4 Minutes (CPU) | |
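
The numbers in the table can be checked by hand. The sketch below counts the parameters of a standard GPT-2 layout with tied input and output embeddings. The MLP inner width is not listed in the table, so `d_inner=16` is an assumption: it is the value that reproduces 1,272, whereas GPT-2's default of four times the embedding size would give 1,544.

```python
def gpt2_param_count(vocab, n_ctx, d_model, n_layer, d_inner):
    """Count parameters in a GPT-2 style model with tied embeddings."""
    token_emb = vocab * d_model                      # wte (shared with the LM head)
    pos_emb = n_ctx * d_model                        # wpe
    layer_norm = 2 * d_model                         # weight + bias
    attn_qkv = d_model * 3 * d_model + 3 * d_model   # fused Q, K, V projection
    attn_out = d_model * d_model + d_model           # attention output projection
    mlp_in = d_model * d_inner + d_inner             # MLP up-projection
    mlp_out = d_inner * d_model + d_model            # MLP down-projection
    per_layer = 2 * layer_norm + attn_qkv + attn_out + mlp_in + mlp_out
    return token_emb + pos_emb + n_layer * per_layer + layer_norm  # + final LayerNorm


# d_inner=16 is an assumption; the other values come from the table above.
total = gpt2_param_count(vocab=50, n_ctx=32, d_model=8, n_layer=1, d_inner=16)
print(total)  # 1272
```

Under this assumption the embeddings alone account for 656 of the 1,272 parameters, so just over half the model is a lookup table.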
|
|
| --- |
|
|
| # Benchmark |
|
|
| | Task | Result | |
| |------|--------| |
| Write Python | ❌ |
| Solve Math | ❌ |
| Explain Physics | ❌ |
| Generate Gibberish | ✅ |
| Exist | ✅ |
|
|
| --- |
|
|
| # Quick Start |
|
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Download the tokenizer and weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("WhirlwindAI/TinyZephyr")
model = AutoModelForCausalLM.from_pretrained("WhirlwindAI/TinyZephyr")

prompt = "The meaning of life is"

inputs = tokenizer(prompt, return_tensors="pt")

# Sample instead of decoding greedily; a high temperature adds extra randomness.
outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=1.6,
    max_length=32,  # prompt + generated tokens; matches the 32-token context
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
|
|
| Possible output: |
|
|
| ``` |
| The meaning of life is... |
| xqw fjczqnv lpoqa yv |
| ``` |
|
|
| Beautiful. |
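
Part of that beauty comes from `temperature=1.6` in the Quick Start call. Temperature divides the logits before the softmax, so values above 1 flatten the distribution and make unlikely tokens more likely. Here is a minimal sketch of the idea in plain Python. The logits are made up for illustration and do not come from TinyZephyr, and `softmax_with_temperature` is a hypothetical helper, not a `transformers` function.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # made-up scores for three tokens

p_default = softmax_with_temperature(logits, 1.0)
p_hot = softmax_with_temperature(logits, 1.6)

# The top token loses probability mass and the weakest token gains some.
print("top token:", p_default[0], "->", p_hot[0])
print("weakest token:", p_default[2], "->", p_hot[2])
```

Lowering `temperature` toward 1.0 or below should make the output less random, though with 1,272 parameters "less random" is a relative term.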
|
|
| --- |
|
|
| # Example Conversation |
|
|
| **You** |
|
|
| > Write a poem about space. |
|
|
| **TinyZephyr** |
|
|
| > moon potato quantum fish sandwich |
|
|
| Mission accomplished. |
|
|
| --- |
|
|
| # System Requirements |
|
|
| CPU |
|
|
| > Yes. |
|
|
| GPU |
|
|
| > Optional. |
|
|
| RAM |
|
|
| > If your browser opens, you're probably fine. |
|
|
| Storage |
|
|
| > Less than most PNG files. |
|
|
| --- |
|
|
| # Frequently Asked Questions |
|
|
| ### Is this useful? |
|
|
| Not particularly. |
|
|
| ### Is this serious research? |
|
|
| Surprisingly... yes. |
|
|
| ### Can it replace ChatGPT? |
|
|
| Only if your expectations are extremely flexible. |
|
|
| ### Why did you build this? |
|
|
| Curiosity. |
|
|
| And because somebody had to. |
|
|
| --- |
|
|
| # Awards |
|
|
🏆 Fastest Model To Finish Inference


🥇 Most Parameters Removed Without Deleting Everything


🥈 Best Random Sentence Generator


🥉 Self-Proclaimed Champion of Tiny AI
|
|
| --- |
|
|
| # Limitations |
|
|
| TinyZephyr was never trained to be helpful. |
|
|
| It doesn't know facts. |
|
|
| It doesn't reason. |
|
|
| It doesn't write code. |
|
|
| It mostly produces beautifully random nonsense. |
|
|
| And that's exactly what it was designed to do. |
|
|
| --- |
|
|
| # License |
|
|
| MIT |
|
|
| Use it. |
|
|
| Benchmark it. |
|
|
| Laugh at it. |
|
|
| Make it even smaller. |
|
|
| --- |
|
|
| <div align="center"> |
|
|
| ### Built by WhirlwindAI |
|
|
| *Sometimes the best experiments begin with terrible ideas.* |
|
|
| <br> |
|
|
<img src="https://capsule-render.vercel.app/api?type=waving&height=120&section=footer&color=0:8B5CF6,100:2563EB"/>
|
|
| </div> |