---
tags:
- SAELens
- sparse-autoencoder
- mechanistic-interpretability
- multilingual
- cohere
license: apache-2.0
language:
- multilingual
---
# Inside Tiny Aya: Sparse Autoencoders for Multilingual Interpretability
Sparse Autoencoders (SAEs) trained on all four [Tiny Aya](https://cohere.com/research/papers/tiny-aya) regional variants to study how multilingual language models represent 70+ languages internally.
## Models
| SAE | Base Model | Focus Languages |
|-----|-----------|-----------------|
| `tiny-aya-global/layer_28` | [CohereLabs/tiny-aya-global](https://huggingface.co/CohereLabs/tiny-aya-global) | All 70+ languages |
| `tiny-aya-fire/layer_28` | [CohereLabs/tiny-aya-fire](https://huggingface.co/CohereLabs/tiny-aya-fire) | South Asian languages |
| `tiny-aya-earth/layer_28` | [CohereLabs/tiny-aya-earth](https://huggingface.co/CohereLabs/tiny-aya-earth) | African + West Asian languages |
| `tiny-aya-water/layer_28` | [CohereLabs/tiny-aya-water](https://huggingface.co/CohereLabs/tiny-aya-water) | Asia-Pacific + European languages |
## SAE Details
- **Architecture:** BatchTopK (auto-converted to JumpReLU for inference)
- **Input dimension:** 2,048 (Tiny Aya hidden size)
- **SAE width:** 16,384 (8× expansion)
- **k:** 64 active features per token
- **Hook point:** `model.layers.28` (global attention layer in final third)
- **Training data:** Balanced CulturaX subset (~1M tokens per language, 61 languages)
- **Training tokens:** ~41M
- **Framework:** [SAELens v6](https://github.com/decoderesearch/SAELens)
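To make the architecture bullet concrete: BatchTopK keeps the `k × batch` largest pre-activations across the whole batch during training, and is converted to a JumpReLU (per-feature thresholds) for inference. Below is a minimal sketch of both encoders; the function names and the use of a plain `ReLU` pre-activation are illustrative assumptions, not the exact SAELens implementation.

```python
import torch

def batchtopk_encode(x, W_enc, b_enc, k):
    """Training-time BatchTopK: keep the k * batch largest activations
    across the whole batch, so individual tokens may use more or fewer
    than k features. (Sketch; not the exact SAELens code.)"""
    pre = torch.relu(x @ W_enc + b_enc)        # [batch, n_features]
    n_keep = k * x.shape[0]
    threshold = pre.flatten().topk(n_keep).values.min()
    return pre * (pre >= threshold)

def jumprelu_encode(x, W_enc, b_enc, theta):
    """Inference-time JumpReLU equivalent: a fixed per-feature
    threshold theta replaces the batch-dependent cutoff."""
    pre = torch.relu(x @ W_enc + b_enc)
    return pre * (pre > theta)
```

The conversion matters because BatchTopK's cutoff depends on the other tokens in the batch; a per-feature threshold makes single-sequence inference deterministic.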
## Usage
```python
from sae_lens import SAE

# Load any variant
sae = SAE.from_pretrained(
    release="Farseen0/tiny-aya-saes",
    sae_id="tiny-aya-global/layer_28",
    device="cuda",
)

# Or load from disk after downloading
sae = SAE.load_from_disk("tiny-aya-global/layer_28", device="cuda")

# Encode activations
features = sae.encode(hidden_states)  # [batch, seq, 16384]

# Decode back
reconstructed = sae.decode(features)  # [batch, seq, 2048]
```
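The snippet above assumes you already have `hidden_states` from `model.layers.28`. One way to obtain them is a PyTorch forward hook; here is a small, self-contained helper demonstrated on a toy model (the Tiny Aya attribute path in the comment is an assumption; check the module names of the loaded model).

```python
import torch
from torch import nn

def capture_layer_output(model, layer, *args, **kwargs):
    """Run `model` forward and return the output of `layer` via a hook."""
    captured = {}

    def hook(module, inputs, output):
        # Transformer decoder layers often return a tuple; keep hidden states
        captured["out"] = output[0] if isinstance(output, tuple) else output

    handle = layer.register_forward_hook(hook)
    try:
        model(*args, **kwargs)
    finally:
        handle.remove()
    return captured["out"]

# For Tiny Aya, something like (attribute path is an assumption):
#   hidden_states = capture_layer_output(model, model.model.layers[28], input_ids)
#   features = sae.encode(hidden_states)
```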
## Research Questions
1. What fraction of SAE features are language-specific vs universal vs script-specific?
2. Do regional variants create new features or redistribute existing ones?
3. Is there a correlation between dedicated feature count and generation quality?
4. Can steering language-specific features improve low-resource generation?
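As a rough illustration of how question 1 could be operationalized, one can tabulate each feature's firing rate per language and classify features by how concentrated that rate is in a single language. The function and both thresholds below are hypothetical, chosen only for the sketch; they are not values from this project.

```python
import numpy as np

def classify_features(act_rates, specific_thresh=0.6, universal_thresh=0.9):
    """act_rates: [n_languages, n_features] array giving, per language,
    the fraction of tokens on which each feature fires.
    Thresholds are illustrative, not from this project."""
    total = act_rates.sum(axis=0) + 1e-9
    # How concentrated is a feature's activity in its single strongest language?
    share = act_rates.max(axis=0) / total
    # In what fraction of languages does the feature fire at all?
    coverage = (act_rates > 0).mean(axis=0)
    return np.where(
        share > specific_thresh, "language-specific",
        np.where(coverage >= universal_thresh, "universal", "other"),
    )
```

Script-specific features would need an extra grouping step (pooling languages by script before computing `share`), which this sketch omits.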
## Project
Part of [Expedition Tiny Aya 2026](https://www.notion.so/cohereai/Expedition-Tiny-Aya-2f04398375db804c93c4c9f5fbb94833) by Cohere Labs.
**Team:** Farseen Shaikh, Matthew Nguyen, Tra My (Chiffon) Nguyen
**Code:** [github.com/mychiffonn/inside-tiny-aya](https://github.com/mychiffonn/inside-tiny-aya)
## Citation
```bibtex
@misc{shaikh2026insidetinyaya,
  title={Inside Tiny Aya: Mapping Multilingual Representations with Sparse Autoencoders},
  author={Shaikh, Farseen and Nguyen, Matthew and Nguyen, Tra My},
  year={2026},
  url={https://huggingface.co/Farseen0/tiny-aya-saes}
}
```