Add model card for Model Merging with Functional Dual Anchors
This PR adds a comprehensive model card for "Model Merging with Functional Dual Anchors".
It includes:
- A link to the paper: [Model Merging with Functional Dual Anchors](https://huggingface.co/papers/2510.21223).
- The appropriate `license` (Apache 2.0).
- The `library_name` (transformers), as the methodology is compatible with models from this library (e.g., RoBERTa, Llama-2).
- The `pipeline_tag` (image-classification), reflecting its application to vision tasks.
- Links to the project page and the GitHub repository for further details and usage.
- A "Quick Start" section with code snippets, directly extracted from the GitHub README, providing clear instructions for environment setup, model adaptation, and FDA construction.
Please review and merge if everything looks good!
@@ -0,0 +1,75 @@
---
license: apache-2.0
library_name: transformers
pipeline_tag: image-classification
---

# Model Merging with Functional Dual Anchors

This repository is the official PyTorch implementation of the paper "[Model Merging with Functional Dual Anchors](https://huggingface.co/papers/2510.21223)" by Kexuan Shi, Yandong Wen, and Weiyang Liu.

**Functional Dual Anchors (FDAs)** are a framework for efficiently integrating knowledge from multiple fine-tuned checkpoints of a shared foundation model. Unlike existing methods that operate in the parameter space, FDAs model knowledge in the input-representation space: they are synthetic inputs whose induced gradients align with the task vectors, capturing task-specific functional shifts relative to the pre-trained model. This perspective bridges joint multi-task training and post-hoc merging, offering both robustness and flexibility across a range of tasks, including vision, natural language processing, and natural language generation.
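To make the mechanism concrete, here is a rough conceptual sketch of FDA construction: synthetic inputs are optimized so that the gradient they induce at the pre-trained weights points along the task vector (the parameter difference between the fine-tuned and pre-trained checkpoints). This is an illustration only, not the repository's implementation; the helper name, surrogate objective, and hyperparameters are assumptions, and `construct_fda.py` in the official repository is the authoritative procedure.

```python
# Conceptual sketch only -- illustrative assumptions, not the official FDA construction code.
import torch
import torch.nn.functional as F

def construct_fda_sketch(pretrained_model, finetuned_model,
                         num_anchors=16, input_shape=(3, 224, 224),
                         steps=500, lr=0.1):
    # Task vector: parameter shift from the pre-trained to the fine-tuned checkpoint.
    task_vector = torch.cat([
        (p_ft - p_pre).detach().flatten()
        for p_pre, p_ft in zip(pretrained_model.parameters(),
                               finetuned_model.parameters())
    ])

    # The synthetic inputs (anchors) are the optimization variables.
    anchors = torch.randn(num_anchors, *input_shape, requires_grad=True)
    optimizer = torch.optim.Adam([anchors], lr=lr)

    for _ in range(steps):
        optimizer.zero_grad()
        # Surrogate objective evaluated at the pre-trained weights; any
        # differentiable scalar of the outputs would do for this illustration.
        surrogate = pretrained_model(anchors).pow(2).mean()
        grads = torch.autograd.grad(surrogate,
                                    list(pretrained_model.parameters()),
                                    create_graph=True)
        induced = torch.cat([g.flatten() for g in grads])
        # Encourage the induced descent direction to align with the task vector.
        loss = 1.0 - F.cosine_similarity(-induced, task_vector, dim=0)
        loss.backward()
        optimizer.step()

    return anchors.detach()
```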
<p align="center">
  <img src="https://github.com/Sphere-AI-Lab/fda/raw/main/docs/assets/framework_trajectory.png" width="90%" />
</p>

You can find more details on the [project page](https://spherelab.ai/fda/) and in the [official GitHub repository](https://github.com/Sphere-AI-Lab/fda/tree/main).

## 🚀 Quick Start

The official GitHub repository provides detailed instructions for setting up the environment, downloading checkpoints and corresponding FDAs, and running adaptation/construction scripts.

For vision, NLP, and NLG tasks, the framework leverages base models such as `RoBERTa` and `Llama-2` from Hugging Face.
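The base checkpoints can be loaded with the standard `transformers` APIs. The model IDs below are common public ones and may differ from those used in the paper's experiments:

```python
from transformers import (AutoModelForCausalLM,
                          AutoModelForSequenceClassification, AutoTokenizer)

# RoBERTa backbone for the NLU experiments (example model ID)
nlu_tokenizer = AutoTokenizer.from_pretrained("roberta-base")
nlu_model = AutoModelForSequenceClassification.from_pretrained("roberta-base")

# Llama-2 backbone for the NLG experiments (gated repo; accept the license on the Hub first)
nlg_tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
nlg_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
```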
### Checkpoints and Corresponding FDAs

The checkpoints for vision, NLP, and NLG tasks and their corresponding FDAs are available for download via the [official GitHub repository](https://github.com/Sphere-AI-Lab/fda/tree/main). Specifically, the vision and NLU FDAs are hosted on Hugging Face: [FDA_for_Vision](https://huggingface.co/datasets/SphereLab/FDA_for_Vision) and [FDA_for_NLU](https://huggingface.co/datasets/SphereLab/FDA_for_NLU/tree/main).
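For example, the released FDAs can be fetched programmatically with `huggingface_hub` (the repository IDs correspond to the datasets linked above):

```python
from huggingface_hub import snapshot_download

# Download the released FDAs from the Hugging Face Hub.
vision_fda_dir = snapshot_download(repo_id="SphereLab/FDA_for_Vision", repo_type="dataset")
nlu_fda_dir = snapshot_download(repo_id="SphereLab/FDA_for_NLU", repo_type="dataset")
print(vision_fda_dir, nlu_fda_dir)
```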
### Environment

For vision and NLP tasks, the environment can be installed with:

```bash
cd FDA/Vision   # or: cd FDA/NLU
# Create the conda environment
conda env create -f environment.yaml
# Activate the environment
conda activate fda
```

For NLG tasks, please use `NLG/environment.yaml` instead.
### Adapt by FDAs

Follow the path comments in `adapt.py`, replace them with the paths to your local checkpoints and FDAs, and then run the following commands to reproduce the FDA adaptation results:

```bash
cd FDA/Vision   # or: cd FDA/NLU, cd FDA/NLG
sh adapt.sh
```
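For intuition, the adaptation step can be read as treating the FDAs as a small synthetic dataset and taking a few gradient steps on the pre-trained (or merged) model with them. The sketch below only illustrates that reading; the loop structure, loss callable, and optimizer are assumptions, and `adapt.py`/`adapt.sh` implement the actual procedure.

```python
import torch

def adapt_with_fdas_sketch(model, fda_batches, fda_loss_fn, steps=100, lr=1e-4):
    # Hypothetical helper: a few gradient steps driven by the synthetic FDA
    # inputs instead of real task data. `fda_batches` is a list of FDA inputs
    # and `fda_loss_fn(model, batch)` returns a scalar objective -- both are
    # placeholders for whatever the official scripts actually use.
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    for step in range(steps):
        batch = fda_batches[step % len(fda_batches)]
        optimizer.zero_grad()
        loss = fda_loss_fn(model, batch)
        loss.backward()
        optimizer.step()
    return model
```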
For models in NLG tasks, please split the model first:

```bash
cd FDA/NLG
python split_model.py
```
### Construct FDAs

To construct FDAs for your own fine-tuned checkpoints, follow the path comments in `construct_fda.py`, replace them with the paths to your fine-tuned checkpoints, and then run:

```bash
sh construct.sh
```
## Citation

If you find this work useful, please consider citing:

```bibtex
@article{shi2025modelmergingfunctionaldual,
  title         = {Model Merging with Functional Dual Anchors},
  author        = {Shi, Kexuan and Wen, Yandong and Liu, Weiyang},
  year          = {2025},
  journal       = {arXiv preprint arXiv:2510.21223},
  archivePrefix = {arXiv},
  primaryClass  = {cs.LG},
  url           = {https://arxiv.org/abs/2510.21223}
}
```