	---
	license: mit
	datasets:
	- ILSVRC/imagenet-1k
	---

	# SAK

	This repository hosts the checkpoints for our ICLR 2025 paper, "Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning".

	## Model Details

	### Model Description

	- Developed by: Yuxiang Lu, Shengcao Cao, Yu-Xiong Wang
	- License: MIT

	### Model Sources

	- Repository: https://github.com/innovator-zero/SAK
	- Paper [OpenReview]: https://openreview.net/forum?id=eePww5u7J3
	- Paper [arXiv]: https://arxiv.org/abs/2410.14633
	- Project Page: https://innovator-zero.github.io/SAK/

	## Uses

	This repository currently provides only the pre-trained model checkpoints. For detailed usage instructions, please refer to our [GitHub repository](https://github.com/innovator-zero/SAK).

	The available checkpoints are listed below.

	Stage 1

	| Teachers                | Student backbone | Checkpoint |
	| ----------------------- | ---------------- | ---------- |
	| DINOv2-B, CLIP-B, SAM-B | ViT-S            | [BS_s1.pth](https://huggingface.co/yxlu0/SAK/blob/main/BS_s1.pth) |
	| DINOv2-B, CLIP-B, SAM-B | ViT-B            | [BB_s1.pth](https://huggingface.co/yxlu0/SAK/blob/main/BB_s1.pth) |
	| DINOv2-L, CLIP-L, SAM-L | ViT-B            | [LB_s1.pth](https://huggingface.co/yxlu0/SAK/blob/main/LB_s1.pth) |
	| DINOv2-L, CLIP-L, SAM-L | ViT-L            | [LL_s1.pth](https://huggingface.co/yxlu0/SAK/blob/main/LL_s1.pth) |

	Stage 2

	We provide two example checkpoints obtained after Stage 2 training, both initialized from the Stage 1 checkpoint `BB_s1.pth`:

	- PASCAL-Context: [BB_s2_pascal.pth](https://huggingface.co/yxlu0/SAK/blob/main/BB_s2_pascal.pth)
	- NYUD-v2: [BB_s2_nyud.pth](https://huggingface.co/yxlu0/SAK/blob/main/BB_s2_nyud.pth)
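
The checkpoint links above point to Hugging Face `blob` pages, which show a file viewer rather than the raw file. The sketch below shows one way to turn a checkpoint name into a direct-download URL using the Hub's `resolve` path. The helper names (`checkpoint_url`, `download_checkpoint`) and the dictionary layout are illustrative assumptions and are not part of the SAK codebase. The file names are copied from the lists above.

```python
from urllib.request import urlretrieve

REPO_ID = "yxlu0/SAK"

# (teacher size, student backbone) -> Stage 1 checkpoint file, as listed above.
STAGE1_CHECKPOINTS = {
    ("B", "ViT-S"): "BS_s1.pth",
    ("B", "ViT-B"): "BB_s1.pth",
    ("L", "ViT-B"): "LB_s1.pth",
    ("L", "ViT-L"): "LL_s1.pth",
}

# Dataset -> example Stage 2 checkpoint file (both start from BB_s1.pth).
STAGE2_CHECKPOINTS = {
    "PASCAL-Context": "BB_s2_pascal.pth",
    "NYUD-v2": "BB_s2_nyud.pth",
}


def checkpoint_url(filename, revision="main"):
    """Build the raw-file URL; `resolve` serves the file, `blob` serves a web page."""
    return f"https://huggingface.co/{REPO_ID}/resolve/{revision}/{filename}"


def download_checkpoint(filename, dest=None):
    """Hypothetical helper: fetch one checkpoint into `dest` (default: the file name)."""
    dest = dest or filename
    urlretrieve(checkpoint_url(filename), dest)
    return dest


if __name__ == "__main__":
    # Print the URL only; call download_checkpoint(name) to actually fetch the file.
    name = STAGE1_CHECKPOINTS[("B", "ViT-B")]
    print(checkpoint_url(name))
```

The downloaded `.pth` files are presumably standard PyTorch checkpoints. Their key layout is not documented in this card, so the loading code in the GitHub repository remains the reference for building the model and restoring weights.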

	## Citation

	```bibtex
	@inproceedings{lu2025swiss,
	title={Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning},
	author={Yuxiang Lu and Shengcao Cao and Yu-Xiong Wang},
	booktitle={The Thirteenth International Conference on Learning Representations},
	year={2025}
	}
	```