---
title: README
colorFrom: gray
colorTo: yellow
sdk: static
pinned: false
license: mit
---

# Diff Interpretation Tuning

This organization hosts the weight diffs, DIT adapters, and finetuning data used in the paper Learning to Interpret Weight Differences in Language Models (Goel et al., 2025). The paper introduces Diff Interpretation Tuning, a method that trains a LoRA adapter that can be applied to a model to get it to describe its own finetuning-induced modifications.

In addition to the `loras` and `finetuning-data` repositories hosted here, you can also check out our Google Colab demo notebook and our GitHub repository.