# Inside Tiny Aya: Sparse Autoencoders for Multilingual Interpretability
This repository contains Sparse Autoencoders (SAEs) trained on all four Tiny Aya regional variants, used to study how multilingual language models represent 70+ languages internally.
## Models
| SAE | Base Model | Focus Languages |
|---|---|---|
| tiny-aya-global/layer_28 | CohereLabs/tiny-aya-global | All 70+ languages |
| tiny-aya-fire/layer_28 | CohereLabs/tiny-aya-fire | South Asian languages |
| tiny-aya-earth/layer_28 | CohereLabs/tiny-aya-earth | African + West Asian languages |
| tiny-aya-water/layer_28 | CohereLabs/tiny-aya-water | Asia-Pacific + European languages |
## SAE Details
- Architecture: BatchTopK (auto-converted to JumpReLU for inference)
- Input dimension: 2,048 (Tiny Aya hidden size)
- SAE width: 16,384 (8× expansion)
- k: 64 active features per token
- Hook point: `model.layers.28` (global attention layer in the final third)
- Training data: balanced CulturaX subset (~1M tokens per language, 61 languages)
- Training tokens: ~41M
- Framework: SAELens v6
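
To feed the SAE, you first need hidden states from the hook point above. Below is a minimal sketch for capturing them with a plain PyTorch forward hook, assuming the base model loads through `transformers`' `AutoModelForCausalLM` and exposes its decoder blocks at `model.model.layers` (a common llama-style layout; this card does not confirm either detail):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the Tiny Aya checkpoints load as causal LMs via transformers.
model_id = "CohereLabs/tiny-aya-global"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="cuda"
)

captured = {}

def capture_hook(module, args, output):
    # Decoder blocks typically return a tuple; hidden states are element 0.
    captured["h"] = output[0] if isinstance(output, tuple) else output

# Hook point from this card: model.layers.28
handle = model.model.layers[28].register_forward_hook(capture_hook)

inputs = tokenizer("नमस्ते दुनिया", return_tensors="pt").to(model.device)  # "Hello world" (Hindi)
with torch.no_grad():
    model(**inputs)
handle.remove()

hidden_states = captured["h"]  # [batch, seq, 2048]
```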
## Usage
```python
from sae_lens import SAE

# Load any variant
sae = SAE.from_pretrained(
    release="Farseen0/tiny-aya-saes",
    sae_id="tiny-aya-global/layer_28",
    device="cuda",
)

# Or load from disk after downloading
sae = SAE.load_from_disk("tiny-aya-global/layer_28", device="cuda")

# Encode activations
features = sae.encode(hidden_states)  # [batch, seq, 16384]

# Decode back
reconstructed = sae.decode(features)  # [batch, seq, 2048]
```
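
Once you have activations, a quick sanity check is to measure sparsity and reconstruction quality. This is an illustrative snippet built only on the `encode`/`decode` calls above, not part of the official API:

```python
import torch

# Match dtypes first (SAEs are commonly kept in float32; an assumption here)
hidden_states = hidden_states.float()

with torch.no_grad():
    features = sae.encode(hidden_states)       # [batch, seq, 16384]
    reconstructed = sae.decode(features)       # [batch, seq, 2048]

# L0: mean number of active features per token (should sit near k = 64)
l0 = (features != 0).float().sum(dim=-1).mean()

# Fraction of variance unexplained by the reconstruction (lower is better)
mse = (reconstructed - hidden_states).pow(2).sum()
var = (hidden_states - hidden_states.mean(dim=(0, 1))).pow(2).sum()
fvu = mse / var

print(f"L0 = {l0.item():.1f}, FVU = {fvu.item():.3f}")
```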
## Research Questions
- What fraction of SAE features are language-specific vs universal vs script-specific?
- Do regional variants create new features or redistribute existing ones?
- Is there a correlation between dedicated feature count and generation quality?
- Can steering language-specific features improve low-resource generation? (see the sketch below)
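
For the last question, a standard intervention is to add a scaled decoder direction for a chosen feature back into the residual stream at the hook point. A minimal sketch, assuming the decoder weights are exposed as `sae.W_dec` (the SAELens convention) and reusing `model` and `inputs` from the capture example above; the feature index and scale are hypothetical placeholders you would tune empirically:

```python
FEATURE_IDX = 1234  # hypothetical language-specific feature
SCALE = 4.0         # steering strength, tuned empirically

steer_dir = sae.W_dec[FEATURE_IDX]  # [2048] decoder direction for this feature

def steering_hook(module, args, output):
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + SCALE * steer_dir.to(hidden.device, hidden.dtype)
    # Returning a value from a forward hook replaces the module's output.
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

# Steer at the same layer the SAE was trained on, then generate as usual.
handle = model.model.layers[28].register_forward_hook(steering_hook)
output_ids = model.generate(**inputs, max_new_tokens=50)
handle.remove()
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```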
## Project
Part of Expedition Tiny Aya 2026 by Cohere Labs.
Team: Farseen Shaikh, Matthew Nguyen, Tra My (Chiffon) Nguyen
Code: github.com/mychiffonn/inside-tiny-aya
## Citation
```bibtex
@misc{shaikh2026insidetinyaya,
  title={Inside Tiny Aya: Mapping Multilingual Representations with Sparse Autoencoders},
  author={Shaikh, Farseen and Nguyen, Matthew and Nguyen, Tra My},
  year={2026},
  url={https://huggingface.co/Farseen0/tiny-aya-saes}
}
```