Transluce/features_explain_llama3.1_8b_llama3.1_8b_instruct

Text Generation

Add pipeline tag and GitHub link

#1

by nielsr HF Staff - opened Jan 1

base: refs/heads/main ← from: refs/pr/1

Files changed (1)

README.md +9 -7

@@ -1,18 +1,20 @@
 ---
-license: mit
-language:
-- en
 base_model:
 - meta-llama/Llama-3.1-8B-Instruct
+language:
+- en
+license: mit
+pipeline_tag: text-generation
 ---
 # Model Card
-This is a Llama-3.1-8B-Instruct model fine-tuned to explain continuous features from Llama-3.1-8B.
-This model was trained to map SAE features from Llama-3.1-8B's residual stream to their explanations derived from Neuronpedia.
-It generalizes to explaining any arbitrary continuous feature from Llama-3.1-8B's residual stream.
-See [paper](https://arxiv.org/abs/2511.08579) for more details.
+This is a Llama-3.1-8B-Instruct model fine-tuned to explain continuous features from Llama-3.1-8B, as described in the paper [Training Language Models to Explain Their Own Computations](https://arxiv.org/abs/2511.08579).
+This model was trained to map SAE features from Llama-3.1-8B's residual stream to their explanations derived from Neuronpedia. It generalizes to explaining any arbitrary continuous feature from Llama-3.1-8B's residual stream.
+- **Repository:** [https://github.com/TransluceAI/introspective-interp](https://github.com/TransluceAI/introspective-interp)
+- **Paper:** [https://arxiv.org/abs/2511.08579](https://arxiv.org/abs/2511.08579)
 ## Usage