---
license: apache-2.0
tags:
- mechanistic-interpretability
- gpt2
- probing
- logit-lens
- activation-patching
language:
- en
---
# The Champollion Protocol

Deciphering a language model's internal representations using the same method Champollion used to decipher Egyptian hieroglyphs in 1822.

**Author:** Fabrice Fils-Aimé

## Method

1. **Rosetta Stone** — Factual prompts with known answers, verified against the model
2. **Cartouches** — Hidden-state extraction at every layer
3. **Cross-comparison** — Logit lens to track prediction emergence
4. **Partial alphabet** — Linear probes (5-fold CV) to decode layer-wise information
5. **Coptic validation** — Held-out test set to verify generalization
6. **Mixed system** — Activation patching to identify causal vs structural layers
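
Steps 4 and 5 can be illustrated with a minimal sketch. It is not the code behind this protocol: the activations are synthetic (random noise plus a label direction whose strength grows with depth), the probe is a closed-form ridge classifier chosen so the example needs only NumPy, and every function name and hyperparameter here is hypothetical. It shows the shape of the procedure: per layer, score a linear probe with 5-fold cross-validation, then refit and check it on a held-out test set.

```python
import numpy as np

def make_synthetic_activations(n_layers=4, n_examples=200, d=16, seed=0):
    """Stand-in for real hidden states: noise plus a label direction
    whose strength grows with layer index (layer 0 carries no signal)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n_examples)
    direction = rng.normal(size=d)
    direction /= np.linalg.norm(direction)
    layers = []
    for layer in range(n_layers):
        strength = 1.5 * layer
        noise = rng.normal(size=(n_examples, d))
        layers.append(noise + strength * np.outer(2 * y - 1, direction))
    return np.stack(layers), y

def fit_linear_probe(X, y, l2=1e-2):
    """Ridge regression onto +/-1 targets (assumed probe type, not the source's)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    targets = 2.0 * y - 1.0
    A = Xb.T @ Xb + l2 * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ targets)

def probe_accuracy(w, X, y):
    """Fraction of examples whose sign(Xw) matches the binary label."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    preds = (Xb @ w > 0).astype(int)
    return float((preds == y).mean())

def cross_validate(X, y, k=5, seed=0):
    """Mean validation accuracy over k shuffled folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w = fit_linear_probe(X[train], y[train])
        scores.append(probe_accuracy(w, X[val], y[val]))
    return float(np.mean(scores))

acts, labels = make_synthetic_activations()

# Reserve the first 20% of examples as the held-out test set (step 5);
# the rest is used for cross-validation and the final fit (step 4).
n_test = len(labels) // 5
test_idx = np.arange(n_test)
dev_idx = np.arange(n_test, len(labels))

results = []
for layer, X in enumerate(acts):
    cv_acc = cross_validate(X[dev_idx], labels[dev_idx])
    w = fit_linear_probe(X[dev_idx], labels[dev_idx])
    test_acc = probe_accuracy(w, X[test_idx], labels[test_idx])
    results.append((layer, cv_acc, test_acc))
    print(f"layer {layer}: cv={cv_acc:.2f} held-out={test_acc:.2f}")
```

With a real model, `acts` would instead hold the per-layer hidden states from step 2, for example the `hidden_states` tuple that Hugging Face Transformers returns when a model is called with `output_hidden_states=True`, reduced to one vector per prompt. Which token position to probe and which label to decode are choices the sketch leaves open.
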

## Citation

```bibtex
@misc{filsaime2026champollion,
  title={The Champollion Protocol: Deciphering LLM Internal Representations},
  author={Fils-Aim\'e, Fabrice},
  year={2026},
  url={https://huggingface.co/fabthebest/champollion-protocol}
}
```