metadata
license: apache-2.0
tags:
  - mechanistic-interpretability
  - gpt2
  - probing
  - logit-lens
  - activation-patching
language:
  - en

The Champollion Protocol

Deciphering a language model's internal representations with the method Champollion applied to Egyptian hieroglyphs in 1822.

Author: Fabrice Fils-Aimé

Method

  1. Rosetta Stone — Factual prompts with known answers, verified against the model
  2. Cartouches — Hidden-state extraction at every layer
  3. Cross-comparison — Logit lens to track prediction emergence
  4. Partial alphabet — Linear probes (5-fold CV) to decode layer-wise information
  5. Coptic validation — Held-out test set to verify generalization
  6. Mixed system — Activation patching to identify causal vs structural layers
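
Steps 4 and 5 can be illustrated with a minimal sketch, assuming nothing about the author's actual implementation. It fits a least-squares linear probe per layer, scores it with 5-fold cross-validation, and then checks it on a held-out split. The activations are synthetic stand-ins rather than real GPT-2 hidden states, and every name here (make_synthetic_activations, fit_probe, probe_accuracy, cross_val_accuracy) is hypothetical, as are the layer count, dimensions, and split sizes.

```python
import numpy as np

def make_synthetic_activations(n_layers=4, n_samples=200, dim=16, seed=0):
    """Synthetic stand-in for per-layer hidden states (an assumption, not real model data).
    The label signal grows with depth; layer 0 carries none."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n_samples)
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)
    layers = []
    for layer in range(n_layers):
        strength = 1.5 * layer
        noise = rng.normal(size=(n_samples, dim))
        signal = np.outer(2 * labels - 1, direction) * strength
        layers.append(noise + signal)
    return np.stack(layers), labels

def fit_probe(X, y):
    """Least-squares linear probe with a bias term; targets are mapped to -1/+1."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(Xb, 2.0 * y - 1.0, rcond=None)
    return w

def probe_accuracy(w, X, y):
    """Fraction of samples whose probe output sign matches the binary label."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return float(np.mean((Xb @ w > 0) == (y == 1)))

def cross_val_accuracy(X, y, k=5, seed=0):
    """Mean validation accuracy over k shuffled folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)
    scores = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w = fit_probe(X[train], y[train])
        scores.append(probe_accuracy(w, X[val], y[val]))
    return float(np.mean(scores))

acts, labels = make_synthetic_activations()
n_dev = 150  # first 150 samples for cross-validation, the rest held out
results = []
for layer in range(acts.shape[0]):
    X_dev, y_dev = acts[layer, :n_dev], labels[:n_dev]
    X_test, y_test = acts[layer, n_dev:], labels[n_dev:]
    cv = cross_val_accuracy(X_dev, y_dev)
    held_out = probe_accuracy(fit_probe(X_dev, y_dev), X_test, y_test)
    results.append((layer, cv, held_out))
    print(f"layer {layer}: cv={cv:.2f} held_out={held_out:.2f}")
```

With real hidden states, the acts array would be replaced by the per-layer activations extracted in step 2. Keeping the held-out split out of cross-validation entirely is what lets it serve as an independent check on generalization.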

Citation

@misc{filsaime2026champollion,
  title={The Champollion Protocol: Deciphering LLM Internal Representations},
  author={Fils-Aim\'e, Fabrice},
  year={2026},
  url={https://huggingface.co/fabthebest/champollion-protocol}
}