Understanding and Enforcing Weight Disentanglement in Task Arithmetic — LoRA-ATT Checkpoints

[CVPR 2026] Official LoRA-ATT checkpoints for the paper "Understanding and Enforcing Weight Disentanglement in Task Arithmetic".

[Paper]   [Code]   [OrthoReg repo]


🎯 Abstract

Task arithmetic provides an efficient, training-free way to edit pre-trained models, yet lacks a fundamental theoretical explanation for its success. The existing concept of "weight disentanglement" describes the ideal outcome of non-interfering task composition but does not reveal its underlying cause. Crucially, what intrinsic properties of the pre-trained model ($\theta_0$) or the task vectors ($\tau_t$) enable this disentanglement remains underexplored. In this paper, we introduce Task-Feature Specialization (TFS), a model's ability to allocate distinct internal features to different tasks, as the fundamental principle. We first prove that TFS is a sufficient condition for weight disentanglement. More importantly, we find that TFS also gives rise to an observable geometric consequence: weight vector orthogonality. This positions TFS as the common cause of both the desired functional outcome (disentanglement) and a measurable geometric property (orthogonality). This relationship provides the key insight for our method: since the abstract TFS property is intractable to enforce directly, we can instead promote weight disentanglement by shaping its concrete geometric consequence, orthogonality. We therefore propose OrthoReg, a simple and effective regularization method that actively enforces an internal orthogonal structure on the weight updates ($\Delta W$) that constitute $\tau_t$ during fine-tuning, and we theoretically prove that OrthoReg promotes disentanglement. Extensive experiments demonstrate that OrthoReg consistently and significantly enhances the performance of various task arithmetic methods.

✨ Key Contributions

  • πŸ“ Theory: We identify TFS as a sufficient condition for weight disentanglement, and WVO as its geometric consequence, providing the first principled explanation for task arithmetic.
  • πŸ”§ Method (OrthoReg): A simple regularization term added to the fine-tuning loss that enforces column-wise orthogonality on Ξ”W, for which we prove theoretical efficacy.
  • πŸ”— Connection to TTA: We show that OrthoReg and Tangent Task Arithmetic (TTA) share the same underlying mechanism (i.e. inter-task vector orthogonality), but OrthoReg achieves this more efficiently.
  • πŸ“Š Experiments: Consistent and significant improvements over Non-linear FT, TTA, ATT-FT, LoRA-ATT across ViT-B-32, ViT-B-16, and ViT-L-14.

The OrthoReg Loss on LoRA-ATT

The OrthoReg loss is applied to the equivalent dense weight update implied by each LoRA module:

$$\Delta W = \frac{\alpha}{r} \cdot B A$$

$$\mathcal{L}_{\text{ortho}} = \sum_l \left\| \left(\Delta W^{(l)}\right)^\top \Delta W^{(l)} - I \right\|_F^2$$

The total training loss is:

$$\mathcal{L} = \mathcal{L}_{\text{task}} + \lambda \cdot \mathcal{L}_{\text{ortho}}$$
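As a concrete illustration, the penalty can be computed directly from the LoRA factors of one module. The sketch below is ours, not the repository's code; the function name and NumPy implementation are illustrative.

```python
import numpy as np

def orthoreg_penalty(B, A, alpha=8.0, r=8):
    """OrthoReg penalty for a single LoRA module (illustrative sketch).

    B has shape (d_out, r) and A has shape (r, d_in); the penalty is
    computed on the equivalent dense update dW = (alpha / r) * B @ A.
    """
    dW = (alpha / r) * (B @ A)        # equivalent dense weight update
    gram = dW.T @ dW                  # Gram matrix of the columns of dW
    eye = np.eye(gram.shape[0])
    return np.sum((gram - eye) ** 2)  # squared Frobenius norm of (G - I)
```

The total loss would then be the task loss plus `lambda` times the sum of this penalty over all LoRA modules; the penalty is zero exactly when the columns of the dense update are orthonormal.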


πŸ“ Checkpoint Structure

This repository contains fine-tuned LoRA-ATT checkpoints for ViT-B-32, ViT-B-16, and ViT-L-14 on all 8 tasks, covering the following finetuning modes:

| Directory | Mode | Description |
|---|---|---|
| `loraatt_1e-03_{model}/` | `loraatt` | LoRA-ATT baseline (attention-only LoRA fine-tuning) |
| `loraatt_ortho_1e-03_lambda{λ}_{model}/` | `loraatt_ortho` | LoRA-ATT + OrthoReg |

Lambda values used per model:

| Model | `--ortho-lambda` |
|---|---|
| ViT-B-32 | 10.0 |
| ViT-B-16 | 10.0 |
| ViT-L-14 | 1.0 |

Each mode directory is organized by dataset:

```
{mode}_{lr}_{model}/
├── head_CarsVal.pt           # linear classification head
├── head_DTDVal.pt
├── head_EuroSATVal.pt
├── head_GTSRBVal.pt
├── head_MNISTVal.pt
├── head_RESISC45Val.pt
├── head_SUN397Val.pt
├── head_SVHNVal.pt
├── {mode}_ft_accuracies.json # single-task accuracy results
├── {mode}_additions.json     # task addition results
├── CarsVal/
│   ├── {mode}_finetuned.pt   # fine-tuned model weights (merged task vector + θ₀)
│   └── {mode}_zeroshot.pt    # zero-shot reference weights
├── DTDVal/
...
└── SVHNVal/
```

All checkpoints use seed=1993, lr=1e-3, lora-rank=8, and lora-alpha=8.0 to match the paper's reported results.
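For scripting against this layout, the directory names can be assembled from the hyperparameters. The helper below is a minimal illustrative sketch; the function name and the exact formatting of the lambda value in the directory name are our assumptions, not part of the codebase.

```python
def checkpoint_dir(mode, model, lr="1e-03", ortho_lambda=None):
    """Build a checkpoint directory name following the layout above.

    `mode` is "loraatt" or "loraatt_ortho"; for the ortho variant the
    lambda value (10.0 for ViT-B-32/B-16, 1.0 for ViT-L-14) is part of
    the directory name.
    """
    if mode == "loraatt_ortho":
        return f"loraatt_ortho_{lr}_lambda{ortho_lambda}_{model}/"
    return f"loraatt_{lr}_{model}/"

# e.g. the fine-tuned MNIST weights for the ViT-B-32 OrthoReg variant
# would live under:
#   checkpoint_dir("loraatt_ortho", "ViT-B-32", ortho_lambda=10.0)
#     + "MNISTVal/loraatt_ortho_finetuned.pt"
```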


🚀 Usage

Step 1 — Clone this repository

```bash
git lfs install
git clone https://huggingface.co/RL-MIND/OrthoReg_lora_checkpoints
```

Place the cloned folder as checkpoints_1993/ inside your code directory:

```bash
mv OrthoReg_lora_checkpoints orthoreg_lora/checkpoints_1993
```

Step 2 — Install the codebase

```bash
git clone https://github.com/lshangge/OrthoReg_lora orthoreg_lora
cd orthoreg_lora
conda env create -f environment.yml
conda activate tta_peft
export PYTHONPATH="$PYTHONPATH:$PWD"
```

Step 3 — Run evaluation

Generate zero-shot accuracies (required once before task addition/negation):

```bash
python src/eval_single_task.py \
    --model ViT-B-32 \
    --finetuning-mode none \
    --data-location /path/to/datasets/
```

Evaluate single-task accuracy:

```bash
python src/eval_single_task.py \
    --model ViT-B-32 \
    --finetuning-mode loraatt \
    --lora-rank 8 \
    --lora-alpha 8.0 \
    --lr 1e-3 \
    --seed 1993 \
    --data-location /path/to/datasets/
```

Evaluate task addition:

```bash
python src/eval_task_addition.py \
    --model ViT-B-32 \
    --finetuning-mode loraatt \
    --lora-rank 8 \
    --lora-alpha 8.0 \
    --lr 1e-3 \
    --seed 1993 \
    --data-location /path/to/datasets/
```

Evaluate task negation:

```bash
python src/eval_task_negation.py \
    --model ViT-B-32 \
    --finetuning-mode loraatt \
    --lora-rank 8 \
    --lora-alpha 8.0 \
    --lr 1e-3 \
    --seed 1993 \
    --data-location /path/to/datasets/
```

To evaluate the OrthoReg variant, replace `--finetuning-mode loraatt` with `--finetuning-mode loraatt_ortho` and add `--ortho-lambda 10.0` (or `1.0` for ViT-L-14).

Argument reference

| Argument | Value for these checkpoints |
|---|---|
| `--seed` | 1993 |
| `--lr` | 1e-3 |
| `--lora-rank` | 8 |
| `--lora-alpha` | 8.0 |
| `--ortho-lambda` | 0 for `loraatt`; with `loraatt_ortho`, 10.0 for ViT-B-32/B-16 and 1.0 for ViT-L-14 |
| `--finetuning-mode` | `loraatt` or `loraatt_ortho` |

📦 Datasets

We evaluate on 8 image classification benchmarks: Cars · DTD · EuroSAT · GTSRB · MNIST · RESISC45 · SUN397 · SVHN

For dataset preparation, follow the instructions in the TTA repository.


πŸ“ Citation

If you find this work useful, please cite:

@inproceedings{liu2026orthoreg,
  title     = {Understanding and Enforcing Weight Disentanglement in Task Arithmetic},
  author    = {Liu, Shangge and Yin, Yuehan and Wang, Lei and Fan, Qi and
               Shi, Yinghuan and Li, Wenbin and Gao, Yang and Tao, Dacheng},
  booktitle = {CVPR},
  year      = {2026}
}

📬 Acknowledgements

This codebase is built on top of Task Arithmetic, Tangent Task Arithmetic, and Attention-Only Fine-tuning. We thank the authors for releasing their code.
