---
license: mit
pipeline_tag: image-classification
---
# CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction
This repository contains the model weights for the paper *CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction*.
CARE (deCoupled duAl-interactive lineaR attEntion) is a novel linear attention mechanism designed to unleash the power of linear attention for resource-constrained mobile devices. It utilizes an asymmetrical feature decoupling strategy to manage local inductive bias and long-range dependencies, alongside a dual interaction module to facilitate communication across features and layers.
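At the core of CARE is linear attention, which avoids the quadratic cost of softmax attention by applying a kernel feature map to queries and keys and exploiting associativity of matrix products. The sketch below illustrates the generic linear-attention computation only, not CARE's decoupled dual-interaction design; the `elu(x) + 1` feature map is a common illustrative choice, not necessarily the one used in the paper.

```python
import numpy as np

def linear_attention(q, k, v, eps=1e-6):
    """Generic linear attention: O(N * d^2) instead of O(N^2 * d).

    q, k: (N, d) queries/keys; v: (N, d_v) values.
    The feature map phi(x) = elu(x) + 1 keeps entries positive; this is an
    illustrative choice, not necessarily the map used by CARE.
    """
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1
    q, k = phi(q), phi(k)
    # Associativity: compute K^T V once (d x d_v), then multiply by each query,
    # so cost is linear in sequence length N.
    kv = k.T @ v                                  # (d, d_v)
    z = q @ k.sum(axis=0, keepdims=True).T        # (N, 1) normalizer
    return (q @ kv) / (z + eps)

rng = np.random.default_rng(0)
N, d = 8, 4
q, k, v = (rng.normal(size=(N, d)) for _ in range(3))
out = linear_attention(q, k, v)
print(out.shape)  # (8, 4)
```

Because `K^T V` is a fixed-size `d x d_v` matrix regardless of sequence length, memory and compute grow linearly with the number of tokens, which is what makes this family of attention attractive for mobile deployment.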
## Performance
CARE Transformer variants achieve strong accuracy–efficiency trade-offs on the ImageNet-1K dataset:
| Method | Type | GMACs | Params (M) | Top-1 Acc (%) |
|---|---|---|---|---|
| CARE-S0 | LA+CONV | 0.7 | 7.3 | 78.4 |
| CARE-S1 | LA+CONV | 1.0 | 9.6 | 80.1 |
| CARE-S2 | LA+CONV | 1.9 | 19.5 | 82.1 |
## Resources
- Paper: https://arxiv.org/abs/2411.16170
- Official GitHub Repository: https://github.com/zhouyuan888888/CARE-Transformer
## Citation
If you find this work useful, please cite:
```bibtex
@inproceedings{zhou2025care,
  title={CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction},
  author={Zhou, Yuan and Xu, Qingshan and Cui, Jiequan and Zhou, Junbao and Zhang, Jing and Hong, Richang and Zhang, Hanwang},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={20135--20145},
  year={2025}
}
```