license: apache-2.0
library_name: transformers
pipeline_tag: image-classification

Model Merging with Functional Dual Anchors

This repository is the official PyTorch implementation of the paper "Model Merging with Functional Dual Anchors", by Kexuan Shi, Yandong Wen, Weiyang Liu.

The paper proposes Functional Dual Anchors (FDAs), a framework for efficiently integrating knowledge from multiple fine-tuned checkpoints of a shared foundation model. Unlike existing methods, which operate in the parameter space, FDAs model knowledge in the input-representation space: they are synthetic inputs whose induced gradients align with task vectors, capturing task-specific functional shifts relative to the pre-trained model. This perspective bridges joint multi-task training and post-hoc merging, and offers robustness and flexibility across vision, natural language processing, and natural language generation tasks.
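To make "synthetic inputs whose induced gradients align with task vectors" concrete, here is a toy NumPy sketch. It is not the repository's construct_fda.py. The single linear layer, the squared-output-distance loss, the cosine alignment measure, and every name in the code are assumptions chosen for illustration. Under those assumptions the descent direction induced by an input x at the pre-trained weights is (tau @ x) x^T, so the top right singular vector of the task vector tau is the best-aligned single input.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy "checkpoints": a single linear layer y = W @ x.
W_pre = rng.normal(size=(3, 4))                # pre-trained weights
W_ft = W_pre + 0.1 * rng.normal(size=(3, 4))   # fine-tuned weights
tau = W_ft - W_pre                             # task vector


def induced_descent_direction(x):
    """Negative gradient w.r.t. W, at W = W_pre, of the assumed loss
    0.5 * ||W @ x - W_ft @ x||^2."""
    residual = W_pre @ x - W_ft @ x            # equals -(tau @ x)
    grad = np.outer(residual, x)               # dLoss/dW evaluated at W_pre
    return -grad


def alignment(x):
    """Cosine similarity between the induced descent direction and tau."""
    d = induced_descent_direction(x)
    return float(np.sum(d * tau) / (np.linalg.norm(d) * np.linalg.norm(tau)))


# Compare a random input with the top right singular vector of tau, which
# maximizes this alignment among single inputs in the linear toy case.
x_random = rng.normal(size=4)
_, singular_values, Vt = np.linalg.svd(tau)
x_anchor = Vt[0]

print(f"random input : {alignment(x_random):.3f}")
print(f"anchor input : {alignment(x_anchor):.3f}")
```

For deep networks no such closed form exists, so an input of this kind would have to be found by iterative optimization. The sketch only illustrates the alignment objective on a case small enough to check by hand.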

You can find more details on the project page and in the official GitHub repository.

🚀 Quick Start

The official GitHub repository provides detailed instructions for setting up the environment, downloading checkpoints and corresponding FDAs, and running adaptation/construction scripts.

Across vision, NLP, and NLG tasks, the framework uses base models from Hugging Face, such as RoBERTa and Llama-2.

Checkpoints and Corresponding FDAs

The checkpoints for vision, NLP, and NLG tasks and their corresponding FDAs are available for download via the official GitHub repository. Specifically, vision and NLU FDAs are hosted on Hugging Face: fda_for_vision and fda_for_nlu.

Environment

For vision and NLP tasks, install the environment as follows:

cd FDA/Vision  # or: cd FDA/NLU
# Create conda environment
conda env create -f environment.yaml
# Activate environment
conda activate fda

For NLG tasks, use NLG/environment.yaml instead.

Adapt by FDAs

Follow the path comments in adapt.py and replace the placeholder paths with the paths to your local checkpoints and FDAs. Then run the following commands to reproduce the FDA adaptation results:

cd FDA/Vision  # or: cd FDA/NLU, or: cd FDA/NLG
sh adapt.sh

For NLG models, split the model before running adapt.sh:

cd FDA/NLG
python split_model.py

Construct FDAs

To construct FDAs for your own fine-tuned checkpoints, follow the path comments in construct_fda.py and replace them with the paths to your fine-tuned checkpoints. Then run:

sh construct.sh

Citation

If you find this work useful, please consider citing:

@article{shi2025modelmergingfunctionaldual,
  title     = {Model Merging with Functional Dual Anchors},
  author    = {Shi, Kexuan and Wen, Yandong and Liu, Weiyang},
  year      = {2025},
  journal   = {arXiv preprint arXiv:2510.21223},
  archivePrefix = {arXiv},
  primaryClass  = {cs.LG},
  url       = {https://arxiv.org/abs/2510.21223}
}