	---
	license: apache-2.0
	tags:
	- mechanistic-interpretability
	- gpt2
	- probing
	- logit-lens
	- activation-patching
	language:
	- en
	---

	# The Champollion Protocol

	Deciphering a language model's internal representations with a six-step method
	modeled on the one Jean-François Champollion used to decipher Egyptian hieroglyphs in 1822.

	Author: Fabrice Fils-Aimé

	## Method

	1. Rosetta Stone — Factual prompts with known answers, verified against the model
	2. Cartouches — Hidden-state extraction at every layer
	3. Cross-comparison — Logit lens to track prediction emergence
	4. Partial alphabet — Linear probes (5-fold CV) to decode layer-wise information
	5. Coptic validation — Held-out test set to verify generalization
	6. Mixed system — Activation patching to identify causal vs structural layers
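
Steps 4 and 5 can be illustrated without downloading a model. The sketch below is a minimal, standard-library-only example under stated assumptions: the "hidden states" are synthetic Gaussian vectors rather than GPT-2 activations, the per-layer signal strengths are invented, and a nearest-centroid classifier stands in for the linear probe (the source does not say which probe was used). All function names are hypothetical. It scores each toy layer with 5-fold cross-validation, then checks the probe on a held-out split that the cross-validation never saw.

```python
import random

DIM = 8  # assumed toy hidden size, far smaller than a real model's


def make_layer_data(n, signal, seed):
    """Synthetic stand-in for one layer's hidden states with binary labels."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        y = i % 2
        x = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
        x[0] += signal if y == 1 else -signal  # label shifts one coordinate
        data.append((x, y))
    rng.shuffle(data)
    return data


def fit_centroid_probe(train):
    """Nearest-centroid probe, written as a linear rule: predict 1 if w.x + b > 0."""
    sums = {0: [0.0] * DIM, 1: [0.0] * DIM}
    counts = {0: 0, 1: 0}
    for x, y in train:
        counts[y] += 1
        for j, v in enumerate(x):
            sums[y][j] += v
    mu0 = [s / counts[0] for s in sums[0]]
    mu1 = [s / counts[1] for s in sums[1]]
    w = [a - b for a, b in zip(mu1, mu0)]
    b = -0.5 * sum(wi * (a + c) for wi, a, c in zip(w, mu1, mu0))
    return w, b


def accuracy(probe, data):
    w, b = probe
    hits = sum(
        (sum(wi * xi for wi, xi in zip(w, x)) + b > 0) == (y == 1)
        for x, y in data
    )
    return hits / len(data)


def cross_val_accuracy(data, k=5):
    """Mean accuracy over k folds; each fold is held out once."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        scores.append(accuracy(fit_centroid_probe(train), folds[i]))
    return sum(scores) / k


# Hypothetical layers: the label becomes linearly decodable only in later ones.
layer_signal = {0: 0.0, 1: 1.0, 2: 4.0}
results = {}
for layer, signal in layer_signal.items():
    data = make_layer_data(200, signal, seed=layer)
    dev, held_out = data[:160], data[160:]  # held-out split is never used in CV
    cv_acc = cross_val_accuracy(dev, k=5)
    test_acc = accuracy(fit_centroid_probe(dev), held_out)
    results[layer] = (cv_acc, test_acc)
    print(f"layer {layer}: cv={cv_acc:.2f} held_out={test_acc:.2f}")
```

With real activations, `make_layer_data` would be replaced by the hidden states extracted in step 2. A large gap between the cross-validated score and the held-out score would suggest the probe is not generalizing.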

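Step 6 can be sketched in the same spirit. The example below is a toy illustration of activation patching, not the author's implementation: the "model" is three hand-written layers acting on a two-position residual stream (a subject position and a last-token position), and every name and number in it is an assumption made for illustration. It caches activations from a clean run, re-runs a corrupted input while overwriting one (layer, position) activation with its clean value, and reports how much of the clean output is recovered.

```python
# Toy residual stream: state[0] = subject position, state[1] = last-token position.

def layer_lookup(state):
    """Layer 0: enrich the subject position (stand-in for a fact lookup)."""
    s = list(state)
    s[0] = 2.0 * s[0]
    return s


def layer_move(state):
    """Layer 1: copy subject information to the last position (stand-in for attention)."""
    s = list(state)
    s[1] = s[1] + s[0]
    return s


def layer_readout(state):
    """Layer 2: transform the last position before the output is read."""
    s = list(state)
    s[1] = 3.0 * s[1]
    return s


LAYERS = [layer_lookup, layer_move, layer_readout]


def run(inputs, patch=None):
    """Run the toy model. patch = (layer_index, position, value) overwrites
    one activation right after that layer. Returns (output, per-layer cache)."""
    state = list(inputs)
    cache = []
    for i, layer in enumerate(LAYERS):
        state = layer(state)
        if patch is not None and patch[0] == i:
            state[patch[1]] = patch[2]
        cache.append(list(state))
    return state[1], cache


clean_input = [3.0, 1.0]    # hypothetical prompt with the correct subject
corrupt_input = [5.0, 1.0]  # same prompt with the subject swapped

clean_out, clean_cache = run(clean_input)
corrupt_out, _ = run(corrupt_input)

# Recovery: 1.0 means the patch fully restores the clean output, 0.0 means no effect.
recovery = {}
for layer in range(len(LAYERS)):
    for pos in (0, 1):
        patched_out, _ = run(corrupt_input, patch=(layer, pos, clean_cache[layer][pos]))
        recovery[(layer, pos)] = (corrupt_out - patched_out) / (corrupt_out - clean_out)

for key in sorted(recovery):
    print(f"layer {key[0]}, position {key[1]}: recovery={recovery[key]:.2f}")
```

In this toy, patching the subject position only helps before the move layer, and patching the last position only helps after it. In a real model the same loop would typically be run with forward hooks on the model's layers and a metric such as the logit difference of the correct answer.
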
	## Citation

	```bibtex
	@misc{filsaime2026champollion,
	  title={The Champollion Protocol: Deciphering {LLM} Internal Representations},
	  author={Fils-Aim{\'e}, Fabrice},
	  year={2026},
	  url={https://huggingface.co/fabthebest/champollion-protocol}
	}
	```