diff-interpretation-tuning/loras

Diff Interpretation Tuning

ttw committed on Oct 11, 2025

Commit 15a213b · verified · 1 Parent(s): bba5437

Update README.md

Files changed (1)

README.md +1 -1

@@ -8,7 +8,7 @@ datasets:
 - diff-interpretation-tuning/finetuning-data
 ---
-# Diff Interpretation Tuning
+# Diff Interpretation Tuning: Weight Diffs and Adapters
 This repository contains the weight diffs and DIT adapters used in the paper [Learning to Interpret Weight Differences in Language Models (Goel et al. 2025)](https://arxiv.org/abs/2510.05092).
 This paper introduces *Diff Interpretation Tuning*, a method that trains a LoRA adapter than can be applied to a model to get it to describe its own finetuning induced modifications.