---
license: apache-2.0
library_name: transformers
pipeline_tag: image-classification
---
|
|
|
|
|
# Model Merging with Functional Dual Anchors
|
|
|
|
|
This repository is the official PyTorch implementation of the paper "[Model Merging with Functional Dual Anchors](https://huggingface.co/papers/2510.21223)" by Kexuan Shi, Yandong Wen, and Weiyang Liu.
|
|
|
|
|
**Functional Dual Anchors (FDAs)** are a novel framework for efficiently integrating knowledge from multiple fine-tuned checkpoints of a shared foundation model. Unlike existing methods that operate in the parameter space, FDAs model knowledge in the input-representation space: they are synthetic inputs whose induced gradients align with the task vectors, capturing task-specific functional shifts relative to the pre-trained model. This perspective bridges joint multi-task training and post-hoc merging, offering both robustness and flexibility across various tasks, including vision, natural language processing, and natural language generation.
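To make the "induced gradients align with task vectors" idea concrete, here is a toy sketch of our own (a linear model with squared loss, not the paper's actual construction): because the gradient of the loss at the pre-trained weights is analytic in this setting, a synthetic input/target pair can be chosen whose induced gradient points exactly along the task vector.

```python
import numpy as np

# Toy illustration (not the paper's code): for squared loss
# L(theta) = (theta @ x - y)**2, the gradient is 2*(theta @ x - y)*x,
# so a synthetic pair (x, y) can be picked so that the gradient induced
# at the pre-trained weights aligns with the task vector tau.
rng = np.random.default_rng(0)
theta_pre = rng.normal(size=4)              # toy pre-trained weights
theta_ft = theta_pre + rng.normal(size=4)   # toy fine-tuned weights
tau = theta_ft - theta_pre                  # task vector

x = -tau / np.linalg.norm(tau)              # synthetic input along -tau
y = theta_pre @ x + 0.5                     # target chosen so the residual is -0.5

grad = 2 * (theta_pre @ x - y) * x          # induced gradient at theta_pre
cos = grad @ tau / (np.linalg.norm(grad) * np.linalg.norm(tau))
print(round(cos, 6))                        # → 1.0 (perfect alignment)
```

In the paper's setting the gradient of a deep network is not analytic, so the synthetic inputs are optimized rather than solved for in closed form; the toy case above only shows why aligning gradients with the task vector is a well-posed target.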
|
|
|
|
|
<p align="center">
<img src="https://github.com/Sphere-AI-Lab/fda/raw/main/docs/assets/framework_trajectory.png" width="90%" />
</p>
|
|
|
|
|
You can find more details on the [project page](https://spherelab.ai/fda/) and in the [official GitHub repository](https://github.com/Sphere-AI-Lab/fda/tree/main).
|
|
|
|
|
## 🚀 Quick Start
|
|
|
|
|
The official GitHub repository provides detailed instructions for setting up the environment, downloading checkpoints and their corresponding FDAs, and running the adaptation and construction scripts.
|
|
|
|
|
For vision, NLP, and NLG tasks, the framework builds on base models such as `RoBERTa` and `Llama-2` from Hugging Face.
|
|
|
|
|
### Checkpoints and Corresponding FDAs
|
|
|
|
|
The checkpoints for vision, NLP, and NLG tasks and their corresponding FDAs are available for download via the [official GitHub repository](https://github.com/Sphere-AI-Lab/fda/tree/main). Specifically, the vision and NLU FDAs are hosted on Hugging Face: [fda_for_vision](https://huggingface.co/datasets/SphereLab/FDA_for_Vision) and [fda_for_nlu](https://huggingface.co/datasets/SphereLab/FDA_for_NLU/tree/main).
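If convenient, the hosted FDA datasets can also be fetched programmatically with `huggingface_hub` (a sketch; `snapshot_download` needs network access, and the `README.md` filename below is only illustrative):

```python
from huggingface_hub import hf_hub_url, snapshot_download

# Build the resolve URL for one file in the vision FDA dataset repo
# (offline operation; "README.md" is an illustrative filename).
url = hf_hub_url(repo_id="SphereLab/FDA_for_Vision",
                 filename="README.md", repo_type="dataset")
print(url)

# Download the full dataset snapshot into the local HF cache (network required):
# local_dir = snapshot_download(repo_id="SphereLab/FDA_for_Vision",
#                               repo_type="dataset")
```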
|
|
|
|
|
### Environment
|
|
|
|
|
For vision and NLP tasks, the environment can be installed with:

```bash
cd FDA/Vision   # or: cd FDA/NLU
# Create the conda environment
conda env create -f environment.yaml
# Activate it
conda activate fda
```
|
|
For NLG tasks, use `NLG/environment.yaml` in the same way.
|
|
|
|
|
### Adapt with FDAs
|
|
|
|
|
Follow the path comments in `adapt.py`, replace them with the paths to your local checkpoints and FDAs, and then run the following commands to reproduce the FDA adaptation results:

```bash
cd FDA/Vision   # or: cd FDA/NLU, cd FDA/NLG
sh adapt.sh
```
|
|
|
|
|
For NLG tasks, split the model first:

```bash
cd FDA/NLG
python split_model.py
```
|
|
|
|
|
### Construct FDAs
|
|
|
|
|
To construct FDAs for your own fine-tuned checkpoints, follow the path comments in `construct_fda.py` and replace them with the paths to your checkpoints. Then run:

```bash
sh construct.sh
```
|
|
|
|
|
## Citation

If you find this work useful, please consider citing:
|
|
|
|
|
```bibtex
@article{shi2025modelmergingfunctionaldual,
  title         = {Model Merging with Functional Dual Anchors},
  author        = {Shi, Kexuan and Wen, Yandong and Liu, Weiyang},
  year          = {2025},
  journal       = {arXiv preprint arXiv:2510.21223},
  archivePrefix = {arXiv},
  primaryClass  = {cs.LG},
  url           = {https://arxiv.org/abs/2510.21223}
}
```