---
license: mit
pipeline_tag: image-classification
---

# CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction

This repository contains the model weights for the paper *CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction*.

CARE (deCoupled duAl-interactive lineaR attEntion) is a linear attention mechanism designed to make linear attention practical on resource-constrained mobile devices. It uses an asymmetrical feature decoupling strategy to handle local inductive bias and long-range dependencies separately, together with a dual interaction module that facilitates communication across features and layers.
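The decoupling idea can be sketched in PyTorch as follows. This is an illustrative approximation, not the official CARE implementation: the module name, the 50/50 channel split, the depthwise-conv local branch, the `elu(x) + 1` kernel feature map, and the single fusion layer standing in for the dual interaction module are all assumptions made for the sketch.

```python
import torch
import torch.nn as nn


class DecoupledLinearAttention(nn.Module):
    """Illustrative sketch of asymmetric feature decoupling (NOT the
    official CARE code): one channel group goes through a depthwise conv
    (local inductive bias), the other through O(N) linear attention
    (long-range dependencies), and a fusion layer mixes the two."""

    def __init__(self, dim, local_ratio=0.5, heads=4):
        super().__init__()
        self.local_dim = int(dim * local_ratio)       # channels for the conv branch
        self.global_dim = dim - self.local_dim        # channels for linear attention
        # Local branch: depthwise 3x3 conv captures neighborhood structure.
        self.local = nn.Conv2d(self.local_dim, self.local_dim, 3,
                               padding=1, groups=self.local_dim)
        # Global branch: multi-head linear attention.
        self.heads = heads
        self.qkv = nn.Linear(self.global_dim, self.global_dim * 3, bias=False)
        # Stand-in for the dual interaction module: a single mixing layer.
        self.fuse = nn.Linear(dim, dim)

    def linear_attn(self, x):
        # x: (B, N, C_g). Kernel feature map phi(x) = elu(x) + 1 keeps
        # scores positive so attention can be computed in linear time.
        B, N, C = x.shape
        h = self.heads
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(B, N, h, C // h).transpose(1, 2)   # (B, h, N, C/h)
        k = k.view(B, N, h, C // h).transpose(1, 2)
        v = v.view(B, N, h, C // h).transpose(1, 2)
        q = torch.nn.functional.elu(q) + 1
        k = torch.nn.functional.elu(k) + 1
        # O(N): aggregate k^T v once, then multiply by q (no N x N matrix).
        kv = torch.einsum('bhnc,bhnd->bhcd', k, v)
        z = 1.0 / (torch.einsum('bhnc,bhc->bhn', q, k.sum(dim=2)) + 1e-6)
        out = torch.einsum('bhnc,bhcd,bhn->bhnd', q, kv, z)
        return out.transpose(1, 2).reshape(B, N, C)

    def forward(self, x):
        # x: (B, C, H, W)
        B, C, H, W = x.shape
        xl, xg = x.split([self.local_dim, self.global_dim], dim=1)
        xl = self.local(xl)                                  # local branch
        xg = xg.flatten(2).transpose(1, 2)                   # (B, N, C_g)
        xg = self.linear_attn(xg)
        xg = xg.transpose(1, 2).view(B, self.global_dim, H, W)
        out = torch.cat([xl, xg], dim=1)                     # re-join branches
        out = self.fuse(out.flatten(2).transpose(1, 2))      # cross-branch mixing
        return out.transpose(1, 2).view(B, C, H, W)


# Example: the block preserves the (B, C, H, W) shape of its input.
block = DecoupledLinearAttention(dim=32)
y = block(torch.randn(2, 32, 8, 8))
print(y.shape)  # torch.Size([2, 32, 8, 8])
```

Because the key-value aggregation is computed before multiplying by the queries, cost grows linearly with the number of tokens N rather than quadratically, which is the property that makes this family of attention mechanisms attractive on mobile hardware.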

## Performance

The CARE Transformer achieves high efficiency and accuracy on the ImageNet-1K dataset:

| Method  | Type    | GMACs | Params (M) | Top-1 Acc (%) |
|---------|---------|-------|------------|---------------|
| CARE-S0 | LA+CONV | 0.7   | 7.3        | 78.4          |
| CARE-S1 | LA+CONV | 1.0   | 9.6        | 80.1          |
| CARE-S2 | LA+CONV | 1.9   | 19.5       | 82.1          |

## Resources

## Citation

If you find this work useful, please cite:

```bibtex
@inproceedings{zhou2025care,
  title={CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction},
  author={Zhou, Yuan and Xu, Qingshan and Cui, Jiequan and Zhou, Junbao and Zhang, Jing and Hong, Richang and Zhang, Hanwang},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={20135--20145},
  year={2025}
}
```