---
license: apache-2.0
tags:
- mechanistic-interpretability
- gpt2
- probing
- logit-lens
- activation-patching
language:
- en
---
# The Champollion Protocol
Deciphering a language model's internal representations using the same method
Champollion used to decipher Egyptian hieroglyphs in 1822.
**Author:** Fabrice Fils-Aimé
## Method
1. **Rosetta Stone** — Factual prompts with known answers, verified against the model
2. **Cartouches** — Hidden-state extraction at every layer
3. **Cross-comparison** — Logit lens to track prediction emergence
4. **Partial alphabet** — Linear probes (5-fold CV) to decode layer-wise information
5. **Coptic validation** — Held-out test set to verify generalization
6. **Mixed system** — Activation patching to identify causal vs structural layers
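
As an illustration of step 3, here is a minimal logit-lens sketch in NumPy. It is not the author's implementation: the function names (`layer_norm`, `logit_lens`, `target_rank`) are hypothetical. It assumes the per-layer hidden states at the final token position have already been extracted (step 2), and that the model's final layer norm parameters and unembedding matrix are available as plain arrays.

```python
import numpy as np


def layer_norm(x, gain=None, bias=None, eps=1e-5):
    """Normalize the last axis, then apply optional learned gain and bias."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    out = (x - mean) / np.sqrt(var + eps)
    if gain is not None:
        out = out * gain
    if bias is not None:
        out = out + bias
    return out


def logit_lens(hidden_states, unembed, gain=None, bias=None):
    """Project every layer's hidden state into vocabulary space.

    hidden_states: (n_layers + 1, d_model), one vector per layer,
                   taken at the final token position (assumed layout).
    unembed:       (vocab_size, d_model) unembedding matrix.
    Returns logits of shape (n_layers + 1, vocab_size).
    """
    return layer_norm(hidden_states, gain, bias) @ unembed.T


def target_rank(logits, target_id):
    """Rank of the target token at each layer (0 = top prediction)."""
    target_logit = logits[:, [target_id]]
    # Count how many vocabulary entries score strictly higher than the target.
    return (logits > target_logit).sum(axis=-1)
```

Calling `target_rank(logit_lens(h, W_U), answer_id)` on the hidden states of one factual prompt gives one rank per layer; the layer where the rank first reaches 0 is where the known answer becomes the model's top prediction under this lens.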
## Citation
```bibtex
@misc{filsaime2026champollion,
  title={The Champollion Protocol: Deciphering LLM Internal Representations},
  author={Fils-Aim{\'e}, Fabrice},
  year={2026},
  url={https://huggingface.co/fabthebest/champollion-protocol}
}
```