---
license: mit
language:
- en
tags:
- text-generation
- story-generation
- tiny-model
- efficient-attention
- unified-attention
library_name: pytorch
pipeline_tag: text-generation
model-index:
- name: yocto
  results:
  - task:
      type: text-generation
    dataset:
      name: TinyStories
      type: roneneldan/TinyStories
    metrics:
    - name: Perplexity
      type: perplexity
      value: 9.58
---

# YOCTO — World's Smallest Language Model

<p align="center">
  <img src="https://img.shields.io/badge/Parameters-484K-blue" alt="Parameters">
  <img src="https://img.shields.io/badge/Size-946KB-green" alt="Size">
  <img src="https://img.shields.io/badge/Speed-81%20tok%2Fs-orange" alt="Speed">
  <img src="https://img.shields.io/badge/Perplexity-9.58-purple" alt="Perplexity">
</p>

Yocto is a 484K-parameter language model that tells children's stories. It achieves **9.58 perplexity** on TinyStories, matching models 2-4× larger.

## Key Innovation: Unified Attention

Standard transformers use three separate projections (Q, K, V). Yocto uses **one unified projection** whose output is split into [seeking|offering|content] bands:

```
Standard: Q = W_Q·x, K = W_K·x, V = W_V·x   [3d² params]
Unified:  u = W·x → [seeking|offering|content]   [d² params]
```

**Result:** 67% fewer attention parameters, with better perplexity.

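The parameter saving above can be made concrete with a short PyTorch sketch. The dimensions follow the model's d = 72 setup, and the band split is assumed to be three equal slices of the unified output; how Yocto actually consumes the seeking/offering/content bands inside the attention score is not shown here, so treat this as an illustration of the projection structure, not the model's implementation.

```python
import torch
import torch.nn as nn

d = 72  # Yocto's embedding dim

# Standard attention: three separate projection matrices (3 * d^2 weights)
standard = nn.ModuleDict({name: nn.Linear(d, d, bias=False) for name in ("q", "k", "v")})

# Unified attention (sketch): one projection whose output is split into
# [seeking | offering | content] bands (d^2 weights total)
unified = nn.Linear(d, d, bias=False)

x = torch.randn(1, 10, d)  # (batch, seq, dim)
seeking, offering, content = unified(x).split(d // 3, dim=-1)  # three 24-dim bands

std_params = sum(p.numel() for p in standard.parameters())  # 3 * 72 * 72 = 15552
uni_params = sum(p.numel() for p in unified.parameters())   # 72 * 72 = 5184
print(f"{1 - uni_params / std_params:.0%} fewer attention-projection params")
```

With d = 72 this gives 15,552 vs 5,184 projection weights per layer, which is where the 67% reduction comes from.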
## Quick Start

```python
import torch
from huggingface_hub import hf_hub_download

# Download the model weights and tokenizer
model_path = hf_hub_download(repo_id="reinforceai/yocto", filename="model.pt")
tokenizer_path = hf_hub_download(repo_id="reinforceai/yocto", filename="tokenizer.json")

# Load and generate (see GitHub for full code)
```

## Performance

| Metric | Value |
|--------|-------|
| Parameters | 484,272 |
| Size (fp16) | 946 KB |
| Attention parameter share | 5.7% |
| Perplexity (TinyStories) | 9.58 |
| Speed (CPU) | 81 tok/s |

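The fp16 size follows directly from the parameter count, at 2 bytes per parameter; a quick arithmetic check:

```python
params = 484_272
fp16_kib = params * 2 / 1024  # 2 bytes per fp16 parameter
print(round(fp16_kib))        # 946
```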
## Example Output

**Prompt:** "Once upon a time"

> Once upon a time, there was a little girl named Lily. She loved to play with her toys all day long. One day, she found a shiny thing on the shelf. The little girl said, "Look, mommy, look!" Her mommy explained that it's very cool, so Lily and her mommy went to the store to buy some tasty food.

## Architecture

| Component | Value |
|-----------|-------|
| Embedding dim | 72 |
| Layers | 4 |
| Attention heads | 3 |
| FFN dim | 288 |
| Vocab size | 4,000 |
| Context length | 512 |

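For reference, the table translates into a configuration along these lines. The field names are hypothetical (the actual code on GitHub may use different ones); the values are the document's:

```python
# Hypothetical config mirroring the Architecture table above
config = {
    "d_model": 72,      # embedding dim
    "n_layers": 4,
    "n_heads": 3,
    "d_ffn": 288,       # 4 * d_model
    "vocab_size": 4000,
    "max_seq_len": 512,
}

head_dim = config["d_model"] // config["n_heads"]
print(head_dim)  # 24 dims per attention head
```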
## Links

- 🌐 **Website:** [reinforceai.com/yocto](https://www.reinforceai.com/yocto)
- 💻 **GitHub:** [github.com/reinforceai/yocto](https://github.com/reinforceai/yocto)
- 📄 **Paper:** [Attention Fields: Unified Projections for Efficient Language Models](https://github.com/reinforceai/yocto/blob/main/ATTENTION_FIELDS.md)

## Citation

```bibtex
@misc{deshwal2026yocto,
  title={Attention Fields: Unified Projections for Efficient Language Models},
  author={Deshwal, Viraj},
  year={2026},
  howpublished={\url{https://github.com/reinforceai/yocto}},
  url={https://www.reinforceai.com/yocto}
}
```

## License

MIT