	---
	license: mit
	language:
	- en
	pipeline_tag: text-generation
	tags:
	- transformer
	- gpt2
	- tiny
	- atom
	- experimental
	- humor
	- whirlwindai
	new_version: WhirlwindAI/SubatomZephyr
	---

	<div align="center">

	<img src="https://readme-typing-svg.demolab.com?font=Space+Grotesk&weight=700&size=27&duration=1900&pause=900&color=22C55E&center=true&vCenter=true&width=850&lines=AtomZephyr;27+Parameters.;Almost+an+LLM.;Mostly+a+Science+Experiment.;Powered+by+Pure+Curiosity." />

	<br>

	<img src="https://img.shields.io/badge/Parameters-27-22C55E?style=for-the-badge">
	<img src="https://img.shields.io/badge/Architecture-GPT--2-10B981?style=for-the-badge">
	<img src="https://img.shields.io/badge/Status-Experimental-14B8A6?style=for-the-badge">
	<img src="https://img.shields.io/badge/Braincells-27-06B6D4?style=for-the-badge">

	<br><br>

	<img src="https://capsule-render.vercel.app/api?type=soft&height=190&text=AtomZephyr&fontColor=ffffff&fontSize=48&animation=blinking&color=0:22C55E,100:06B6D4"/>

	</div>

	---

	# The Idea

	<div align="center">

	<table width="92%">

	<tr>

	<td align="center">

	## What if a transformer became... microscopic?

	AtomZephyr explores one of the smallest practical transformer architectures ever built.

	Not because anyone asked for it.

	Because someone eventually had to answer the question:

	"How absurdly small can an AI become before it forgets how to AI?"

	Turns out...

	27 parameters is still technically enough.

	</td>

	</tr>

	</table>

	</div>

	---

	# Why?

	Most AI models compete by getting bigger.

	AtomZephyr competes by removing parameters until people start questioning whether it's still a neural network.

	Every parameter had to earn its place.

	Most didn't.

	---

	# Specifications

	| Property | Value |
	|----------|-------|
	| Parameters | 27 |
	| Architecture | GPT-2 |
	| Layers | 1 |
	| Attention Heads | 1 |
	| Embedding Size | 1 |
	| FFN Size | 1 |
	| Context Length | 4 |
	| Vocabulary | 5 Tokens |
	| Model Size | <5 KB |
	| Training Time | ~6 Seconds (CPU) |
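
For the skeptics, the 27 can be checked by hand. The sketch below is not taken from the training code; it assumes the standard GPT-2 layout (token and position embeddings, biases on every projection, two layer norms per block plus a final one, and an output head tied to the token embeddings). The helper name `gpt2_param_count` is made up for illustration.

```python
# Hypothetical helper: counts parameters for a GPT-2-style model.
# Assumes the standard GPT-2 layout with a tied output head.
def gpt2_param_count(vocab, n_ctx, d_model, d_ffn, n_layers):
    # Token embedding table plus learned position embeddings.
    embeddings = vocab * d_model + n_ctx * d_model
    # Each layer norm has a scale and a shift vector.
    layer_norm = 2 * d_model
    # Fused Q/K/V projection plus the attention output projection.
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # Feed-forward network: up projection and down projection.
    ffn = (d_model * d_ffn + d_ffn) + (d_ffn * d_model + d_model)
    block = layer_norm + attention + layer_norm + ffn
    # The final layer norm comes after the last block.
    return embeddings + n_layers * block + layer_norm

# Values from the Specifications table.
total = gpt2_param_count(vocab=5, n_ctx=4, d_model=1, d_ffn=1, n_layers=1)
print(total)      # 27
print(total * 4)  # 108 bytes of raw float32 weights
```

The head count never appears in the formula: splitting attention into heads reshapes the same weights and adds none. The raw weights would come to about 108 bytes in float32, so most of the "<5 KB" is presumably file format overhead.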

	---

	# Performance

	| Test | Result |
	|------|--------|
	| Understand English | ❌ |
	| Write Code | ❌ |
	| Solve Math | ❌ |
	| Generate "abba" | ✅ |
	| Break Expectations | ✅ |

	---

	# Quick Start

	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM

	tokenizer = AutoTokenizer.from_pretrained("WhirlwindAI/AtomZephyr")
	model = AutoModelForCausalLM.from_pretrained("WhirlwindAI/AtomZephyr")

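	prompt = "a"

	# Tokenize the prompt into PyTorch tensors.
	inputs = tokenizer(prompt, return_tensors="pt")

	# Sample with a high temperature; max_length=4 matches the 4-token context.
	outputs = model.generate(
	    **inputs,
	    do_sample=True,
	    temperature=1.7,
	    max_length=4,
	)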

	print(tokenizer.decode(outputs[0], skip_special_tokens=True))
	```

	Possible output

	```
	abaa
	```

	Groundbreaking.
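
Why `temperature=1.7`? When sampling, the logits are divided by the temperature before the softmax, so values above 1 flatten the distribution and give the less likely tokens a chance. Here is a minimal sketch of that effect in plain Python. The logits are invented for illustration and are not AtomZephyr's real outputs, and `softmax_with_temperature` is a hypothetical helper, not a library function.

```python
import math

# Hypothetical helper: softmax over logits scaled by a temperature.
def softmax_with_temperature(logits, temperature):
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up logits for a 5-token vocabulary (an assumption, not real model output).
logits = [2.0, 1.0, 0.0, -1.0, -2.0]

sharp = softmax_with_temperature(logits, 1.0)
flat = softmax_with_temperature(logits, 1.7)

# The top token loses probability mass at the higher temperature.
print(max(sharp) > max(flat))  # True
```

With only five tokens to choose from, a flatter distribution is about the only source of variety available.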

	---

	# Example Conversation

	User

	> Tell me a joke.

	AtomZephyr

	```
	abba
	```

	Technically...

	that's an answer.

	---

	# Scientific Achievement

	Removing parameters is easy.

	Keeping a transformer alive afterwards...

	isn't.

	AtomZephyr exists purely to explore the absolute lower limits of transformer architectures while remaining a real, trainable language model.

	Whether it's useful is a completely different discussion.

	---

	# Awards

	🥇 Smallest Model That Still Has Self-Respect

	🏆 Best Binary Poetry Generator

	🥈 Most Efficient Waste Of Six Seconds

	🎖️ Official Representative Of Tiny AI

	---

	# Limitations

	AtomZephyr should not be used for:

	- Programming
	- Translation
	- Question Answering
	- Homework
	- Anything important

	It performs significantly better when asked to do absolutely nothing useful.

	---

	# Fun Facts

	- Fits inside most PNG images.
	- Smaller than many neural network tutorials.
	- Downloads faster than this README loads.
	- Has fewer parameters than some calculator manuals have pages.

	---

	# License

	MIT

	Take it apart.

	Make it smaller.

	Break another record.

	---

	<div align="center">

	### Built by WhirlwindAI

	"Sometimes progress isn't measured in billions... it's measured in what you can remove."

	<br>

	<img src="https://capsule-render.vercel.app/api?type=waving&height=120&section=footer&color=0:22C55E,100:06B6D4"/>

	</div>