---
license: mit
language:
  - en
tags:
  - text-generation
  - story-generation
  - tiny-model
  - efficient-attention
  - unified-attention
library_name: pytorch
pipeline_tag: text-generation
model-index:
  - name: yocto
    results:
      - task:
          type: text-generation
        dataset:
          name: TinyStories
          type: roneneldan/TinyStories
        metrics:
          - name: Perplexity
            type: perplexity
            value: 9.58
---

# YOCTO — World's Smallest Language Model


Yocto is a 484K-parameter language model that writes children's stories. It achieves a perplexity of 9.58 on TinyStories, matching models 2-4× its size.

## Key Innovation: Unified Attention

Standard transformers use three separate projection matrices (Q, K, V). Yocto uses a single unified projection whose output is split into [seeking|offering|content] bands:

```
Standard:  Q = W_Q·x,  K = W_K·x,  V = W_V·x    [3d² params]
Unified:   u = W·x → [seeking|offering|content]  [d² params]
```

Result: 67% fewer attention parameters, better perplexity.
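The idea above can be sketched in PyTorch as a single `d×d` linear layer whose output is split into three equal bands that play the roles of Q, K, and V. This is an illustrative sketch, not Yocto's actual code: the band widths, head layout, and output projection here are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UnifiedAttention(nn.Module):
    """Sketch of unified attention: one d*d projection split into
    [seeking | offering | content] bands (layout is assumed)."""

    def __init__(self, d_model: int = 72, n_heads: int = 3):
        super().__init__()
        assert d_model % 3 == 0, "d_model must split into three equal bands"
        self.band = d_model // 3                 # width of each band (24 for d=72)
        assert self.band % n_heads == 0
        self.n_heads = n_heads
        self.unified = nn.Linear(d_model, d_model, bias=False)  # d^2 params vs 3d^2
        self.out = nn.Linear(self.band, d_model, bias=False)    # content band -> model dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, _ = x.shape
        u = self.unified(x)                       # (B, T, d): one matmul, not three
        q, k, v = u.split(self.band, dim=-1)      # three (B, T, d/3) bands

        def heads(t):  # (B, T, band) -> (B, n_heads, T, band/n_heads)
            return t.view(B, T, self.n_heads, -1).transpose(1, 2)

        y = F.scaled_dot_product_attention(heads(q), heads(k), heads(v), is_causal=True)
        y = y.transpose(1, 2).reshape(B, T, self.band)
        return self.out(y)
```

With `d_model=72`, the unified projection holds 72² = 5,184 parameters versus 3 × 5,184 for separate Q/K/V matrices, which is where the "67% fewer" figure comes from.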

## Quick Start

```python
import torch
from huggingface_hub import hf_hub_download

# Download model
model_path = hf_hub_download(repo_id="Reinforce-ai/yocto", filename="model.pt")
tokenizer_path = hf_hub_download(repo_id="Reinforce-ai/yocto", filename="tokenizer.json")

# Load and generate (see GitHub for full code)
```
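The full loading code lives on GitHub; as a sketch of the generation step, here is a generic sampling loop that assumes the loaded model maps `(B, T)` token ids to `(B, T, vocab)` logits. The interface, temperature, and defaults are assumptions, not Yocto's documented API.

```python
import torch

@torch.no_grad()
def generate(model, input_ids, max_new_tokens=50, temperature=0.8, eos_id=None):
    """Autoregressive sampling from any callable mapping
    (B, T) token ids -> (B, T, vocab) logits (assumed interface)."""
    for _ in range(max_new_tokens):
        logits = model(input_ids)[:, -1, :] / temperature  # logits for next token
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)  # sample one token
        input_ids = torch.cat([input_ids, next_id], dim=-1)
        if eos_id is not None and next_id.item() == eos_id:
            break
    return input_ids
```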

## Performance

| Metric | Value |
|---|---|
| Parameters | 484,272 |
| Size (fp16) | 946 KB |
| Attention share | 5.7% |
| Perplexity (TinyStories) | 9.58 |
| Speed (CPU) | 700+ tok/s |
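The fp16 size follows directly from the parameter count, as a quick sanity check:

```python
params = 484_272
fp16_bytes = params * 2            # two bytes per parameter at half precision
size_kb = fp16_bytes / 1024        # 968,544 bytes -> ~946 KB
print(f"{size_kb:.0f} KB")
```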

## Example Output

**Prompt:** "Once upon a time"

> Once upon a time, there was a little girl named Lily. She loved to play with her toys all day long. One day, she found a shiny thing on the shelf. The little girl said, "Look, mommy, look!" Her mommy explained that it's very cool, so Lily and her mommy went to the store to buy some tasty food.

## Architecture

| Component | Value |
|---|---|
| Embedding dim | 72 |
| Layers | 4 |
| Attention heads | 3 |
| FFN dim | 288 |
| Vocab size | 4,000 |
| Context length | 512 |
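A back-of-envelope count from this configuration lands close to the reported 484,272 parameters. The breakdown below assumes tied input/output embeddings, bias-free projections, a content-band output projection, and it omits normalization parameters; all of these are assumptions, not the published architecture details.

```python
d, layers = 72, 4
ffn, vocab = 288, 4000
band = d // 3                        # one band each for seeking/offering/content

embed = vocab * d                    # token embedding (assumed tied with LM head)
attn  = layers * (d * d + band * d)  # unified projection + content-band out proj (assumed)
ffn_p = layers * (2 * d * ffn)       # up + down projections, no bias (assumed)

total = embed + attn + ffn_p
print(total, f"attention share: {attn / total:.1%}")
# within ~1% of the reported 484,272; attention share matches the reported 5.7%
```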

## Live Demo

Try Yocto in your browser: HuggingFace Space

## Links

## Citation

```bibtex
@misc{deshwal2026yocto,
  title={Attention Fields: Unified Projections for Efficient Language Models},
  author={Deshwal, Viraj},
  year={2026},
  url={https://www.reinforceai.com/yocto},
  howpublished={\url{https://github.com/reinforceai/yocto}}
}
```

## License

MIT