Understanding and Enforcing Weight Disentanglement in Task Arithmetic – LoRA-ATT Checkpoints
[CVPR 2026] Official LoRA-ATT checkpoints for the paper "Understanding and Enforcing Weight Disentanglement in Task Arithmetic".
[Paper] [Code] [OrthoReg repo]
Abstract
Task arithmetic provides an efficient, training-free way to edit pre-trained models, yet lacks a fundamental theoretical explanation for its success. The existing concept of "weight disentanglement" describes the ideal outcome of non-interfering task composition but does not reveal its underlying cause. Crucially, which intrinsic properties of the pre-trained model ($\theta_0$) or the task vectors ($\tau_t$) enable this disentanglement remains underexplored. In this paper, we introduce Task-Feature Specialization (TFS), a model's ability to allocate distinct internal features to different tasks, as the fundamental principle. We first prove that TFS is a sufficient condition for weight disentanglement. More importantly, we find that TFS also gives rise to an observable geometric consequence: weight vector orthogonality. This positions TFS as the common cause of both the desired functional outcome (disentanglement) and a measurable geometric property (orthogonality). This relationship provides the key insight for our method: since the abstract TFS property is intractable to enforce directly, we can instead promote weight disentanglement by shaping its concrete geometric consequence, orthogonality. We therefore propose OrthoReg, a simple and effective regularization method that actively enforces an internal orthogonal structure on the weight updates ($\Delta W$) that constitute $\tau_t$ during fine-tuning, and we theoretically prove that OrthoReg promotes disentanglement. Extensive experiments demonstrate that OrthoReg consistently and significantly enhances the performance of various task arithmetic methods.
Key Contributions
- Theory: We identify TFS as a sufficient condition for weight disentanglement, and weight vector orthogonality (WVO) as its geometric consequence, providing the first principled explanation for task arithmetic.
- Method (OrthoReg): A simple regularization term added to the fine-tuning loss that enforces column-wise orthogonality on ΔW, with a proof of its theoretical efficacy.
- Connection to TTA: We show that OrthoReg and Tangent Task Arithmetic (TTA) share the same underlying mechanism (inter-task vector orthogonality), but OrthoReg achieves it more efficiently.
- Experiments: Consistent and significant improvements over Non-linear FT, TTA, ATT-FT, and LoRA-ATT across ViT-B-32, ViT-B-16, and ViT-L-14.
The OrthoReg Loss on LoRA-ATT
The OrthoReg loss is applied to the equivalent dense weight update implied by each LoRA module, and it is added to the standard task loss to form the total training objective (see the paper for the exact formulation).
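The exact loss terms are given in the paper; as an illustrative sketch only (the function names and the off-diagonal Gram penalty below are assumptions, not the repo's API), a column-wise orthogonality regularizer on the dense update ΔW = (α/r)·BA implied by a LoRA pair could look like:

```python
import numpy as np

def lora_delta_w(A, B, alpha=8.0, rank=8):
    # Equivalent dense update implied by a LoRA pair: dW = (alpha / rank) * B @ A
    return (alpha / rank) * (B @ A)

def ortho_reg(delta_w):
    # Penalize the off-diagonal entries of the column Gram matrix dW^T dW,
    # pushing the columns of dW toward mutual orthogonality.
    gram = delta_w.T @ delta_w
    off_diag = gram - np.diag(np.diag(gram))
    return float(np.sum(off_diag ** 2))

# Toy example: rank-8 LoRA factors for a 64x64 attention weight.
rng = np.random.default_rng(0)
A = rng.normal(size=(8, 64))
B = rng.normal(size=(64, 8))
penalty = ortho_reg(lora_delta_w(A, B))

# Total loss (sketch): task cross-entropy + lambda * penalty,
# with lambda = 10.0 for ViT-B-32/B-16 and 1.0 for ViT-L-14.
```

A matrix with mutually orthogonal columns (e.g. the identity) incurs zero penalty, so the regularizer only suppresses inter-column interference rather than shrinking the update itself.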
Checkpoint Structure
This repository contains fine-tuned LoRA-ATT checkpoints for ViT-B-32, ViT-B-16, and ViT-L-14 on all 8 tasks, covering the following fine-tuning modes:
| Directory | Mode | Description |
|---|---|---|
| `loraatt_1e-03_{model}/` | `loraatt` | LoRA-ATT baseline (attention-only LoRA fine-tuning) |
| `loraatt_ortho_1e-03_lambda{λ}_{model}/` | `loraatt_ortho` | LoRA-ATT + OrthoReg |
Lambda values used per model:
| Model | --ortho-lambda |
|---|---|
| ViT-B-32 | 10.0 |
| ViT-B-16 | 10.0 |
| ViT-L-14 | 1.0 |
Each mode directory is organized by dataset:
{mode}_{lr}_{model}/
├── head_CarsVal.pt             # linear classification head
├── head_DTDVal.pt
├── head_EuroSATVal.pt
├── head_GTSRBVal.pt
├── head_MNISTVal.pt
├── head_RESISC45Val.pt
├── head_SUN397Val.pt
├── head_SVHNVal.pt
├── {mode}_ft_accuracies.json   # single-task accuracy results
├── {mode}_additions.json       # task addition results
├── CarsVal/
│   ├── {mode}_finetuned.pt     # fine-tuned model weights (merged task vector + θ₀)
│   └── {mode}_zeroshot.pt      # zero-shot reference weights
├── DTDVal/
...
└── SVHNVal/
All checkpoints use seed=1993, lr=1e-3, lora-rank=8, and lora-alpha=8.0 to match the paper's reported results.
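As a sketch of how these files combine in task arithmetic (the helper names are hypothetical, not the repo's API, and plain floats stand in for the PyTorch tensors stored in the `.pt` state dicts), the task vector for each dataset is the difference between the fine-tuned and zero-shot weights, and task addition sums the scaled vectors back onto θ₀:

```python
# Illustrative only: real checkpoints are state dicts loaded via torch.load;
# plain floats stand in for tensors to keep the sketch self-contained.

def task_vector(finetuned, zeroshot):
    # tau_t = theta_t - theta_0, computed per parameter
    return {k: finetuned[k] - zeroshot[k] for k in zeroshot}

def task_addition(zeroshot, task_vectors, scaling=0.3):
    # theta = theta_0 + scaling * sum_t tau_t
    merged = dict(zeroshot)
    for tau in task_vectors:
        for k, v in tau.items():
            merged[k] += scaling * v
    return merged

theta0 = {"w": 1.0}                  # {mode}_zeroshot.pt
theta_mnist = {"w": 1.4}             # MNISTVal/{mode}_finetuned.pt
theta_svhn = {"w": 0.8}              # SVHNVal/{mode}_finetuned.pt
taus = [task_vector(theta_mnist, theta0), task_vector(theta_svhn, theta0)]
theta_merged = task_addition(theta0, taus)
```

The scaling coefficient (0.3 here) is an arbitrary placeholder; in practice it is tuned on held-out validation accuracy, as in the evaluation scripts below.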
Usage
Step 1 – Clone this repository
git lfs install
git clone https://huggingface.co/RL-MIND/OrthoReg_lora_checkpoints
Place the cloned folder as checkpoints_1993/ inside your code directory:
mv OrthoReg_lora_checkpoints/* orthoreg_lora/checkpoints_1993/
Step 2 – Install the codebase
git clone https://github.com/lshangge/OrthoReg_lora
cd orthoreg_lora
conda env create -f environment.yml
conda activate tta_peft
export PYTHONPATH="$PYTHONPATH:$PWD"
Step 3 – Run evaluation
Generate zero-shot accuracies (required once before task addition/negation):
python src/eval_single_task.py \
--model ViT-B-32 \
--finetuning-mode none \
--data-location /path/to/datasets/
Evaluate single-task accuracy:
python src/eval_single_task.py \
--model ViT-B-32 \
--finetuning-mode loraatt \
--lora-rank 8 \
--lora-alpha 8.0 \
--lr 1e-3 \
--seed 1993 \
--data-location /path/to/datasets/
Evaluate task addition:
python src/eval_task_addition.py \
--model ViT-B-32 \
--finetuning-mode loraatt \
--lora-rank 8 \
--lora-alpha 8.0 \
--lr 1e-3 \
--seed 1993 \
--data-location /path/to/datasets/
Evaluate task negation:
python src/eval_task_negation.py \
--model ViT-B-32 \
--finetuning-mode loraatt \
--lora-rank 8 \
--lora-alpha 8.0 \
--lr 1e-3 \
--seed 1993 \
--data-location /path/to/datasets/
To evaluate the OrthoReg variant, replace --finetuning-mode loraatt with --finetuning-mode loraatt_ortho and add --ortho-lambda 10.0 (or 1.0 for ViT-L-14).
Argument reference
| Argument | Value for these checkpoints |
|---|---|
| `--seed` | `1993` |
| `--lr` | `1e-3` |
| `--lora-rank` | `8` |
| `--lora-alpha` | `8.0` |
| `--ortho-lambda` | `0` for `loraatt`; `10.0` (ViT-B-32/B-16) or `1.0` (ViT-L-14) for `loraatt_ortho` |
| `--finetuning-mode` | `loraatt` or `loraatt_ortho` |
Datasets
We evaluate on 8 image classification benchmarks: Cars · DTD · EuroSAT · GTSRB · MNIST · RESISC45 · SUN397 · SVHN
For dataset preparation, follow the instructions in the TTA repository.
Citation
If you find this work useful, please cite:
@inproceedings{liu2026orthoreg,
title = {Understanding and Enforcing Weight Disentanglement in Task Arithmetic},
author = {Liu, Shangge and Yin, Yuehan and Wang, Lei and Fan, Qi and
Shi, Yinghuan and Li, Wenbin and Gao, Yang and Tao, Dacheng},
booktitle = {CVPR},
year = {2026}
}
Acknowledgements
This codebase is built on top of Task Arithmetic, Tangent Task Arithmetic, and Attention-Only Fine-tuning. We thank the authors for releasing their code.