Add pipeline tag and improve model card
#1
by nielsr (HF Staff), opened
README.md CHANGED

@@ -1,10 +1,39 @@
 ---
 license: mit
+pipeline_tag: image-classification
 ---
 
 # CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction
 
-
+This repository contains the model weights for the paper [CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction](https://huggingface.co/papers/2411.16170).
 
+CARE (de**C**oupled du**A**l-interactive linea**R** att**E**ntion) is a linear attention mechanism designed to make linear attention practical on resource-constrained mobile devices. It uses an asymmetrical feature decoupling strategy to handle local inductive bias and long-range dependencies separately, together with a dual interaction module that promotes communication across features and layers.
 
-
+## Performance
+
+The CARE Transformer family achieves a strong accuracy-efficiency trade-off on ImageNet-1K:
+
+| Method | Type | GMACs | Params (M) | Top-1 Acc (%) |
+| :---: | :---: | :---: | :--------: | :-----------: |
+| CARE-S0 | LA+CONV | 0.7 | 7.3 | 78.4 |
+| CARE-S1 | LA+CONV | 1.0 | 9.6 | 80.1 |
+| CARE-S2 | LA+CONV | 1.9 | 19.5 | 82.1 |
+
+## Resources
+
+- **Paper:** [https://arxiv.org/abs/2411.16170](https://arxiv.org/abs/2411.16170)
+- **Official GitHub Repository:** [https://github.com/zhouyuan888888/CARE-Transformer](https://github.com/zhouyuan888888/CARE-Transformer)
+
+## Citation
+
+If you find this work useful, please cite:
+
+```bibtex
+@inproceedings{zhou2025care,
+  title={CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction},
+  author={Zhou, Yuan and Xu, Qingshan and Cui, Jiequan and Zhou, Junbao and Zhang, Jing and Hong, Richang and Zhang, Hanwang},
+  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
+  pages={20135--20145},
+  year={2025}
+}
+```
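As context for why the model card emphasizes linear attention for mobile devices: kernelized linear attention replaces `softmax(QK^T)V` (quadratic in sequence length `n`) with `phi(Q)(phi(K)^T V)` (linear in `n`). The sketch below is a generic NumPy illustration of that idea, not the CARE implementation; the `elu(x)+1` feature map and the shapes are assumptions chosen for the demo.

```python
import numpy as np

def linear_attention(Q, K, V, eps=1e-6):
    """Kernelized attention: phi(Q) @ (phi(K)^T V), linear in sequence length n."""
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1: keeps features positive
    Qf, Kf = phi(Q), phi(K)
    kv = Kf.T @ V                    # (d, d_v) summary whose size is independent of n
    z = Qf @ Kf.sum(axis=0)          # (n,) per-query normalizer
    return (Qf @ kv) / (z[:, None] + eps)

rng = np.random.default_rng(0)
n, d = 16, 8
Q, K, V = rng.standard_normal((3, n, d))
out = linear_attention(Q, K, V)
print(out.shape)  # (16, 8)
```

Because `phi(K)^T V` is computed once and reused for every query, cost grows as O(n·d·d_v) rather than the O(n²·d) of standard attention, which is the property that makes such designs attractive on mobile hardware.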