Improve model card for TAPFormer

#1 by nielsr HF Staff - opened
Files changed (1)
  1. README.md +47 -4
README.md CHANGED
@@ -1,10 +1,53 @@
  ---
  license: mit
  ---
- TAPFormer: Robust Arbitrary Point Tracking via Transient Asynchronous Fusion of Frames and Events (CVPR 2026)

- https://arxiv.org/abs/2603.04989

- Project website: tapformer.github.io

- The github repo: https://github.com/ljx1002/TAPFormer
  ---
  license: mit
+ pipeline_tag: other
+ tags:
+ - computer-vision
+ - point-tracking
+ - event-camera
+ - multimodal-fusion
  ---
+
+ # TAPFormer: Robust Arbitrary Point Tracking via Transient Asynchronous Fusion of Frames and Events
+
+ This repository contains the weights for **TAPFormer**, presented at CVPR 2026.
+
+ [**Paper (arXiv)**](https://arxiv.org/abs/2603.04989) | [**Project Page**](https://tapformer.github.io/) | [**GitHub Repository**](https://github.com/ljx1002/TAPFormer)
+
+ ## Introduction
+
+ Tracking any point (TAP) is a fundamental yet challenging task in computer vision, requiring both high precision and long-term motion reasoning. **TAPFormer** is a transformer-based framework that performs asynchronous, temporally consistent fusion of frames and events for robust, high-frequency tracking of arbitrary points.
+
+ Our key innovation is a **Transient Asynchronous Fusion (TAF)** mechanism, which explicitly models the temporal evolution between discrete frames through continuous event updates, bridging the gap between low-rate frames and high-rate events. In addition, a **Cross-modal Locally Weighted Fusion (CLWF)** module adaptively adjusts spatial attention according to modality reliability, yielding stable and discriminative features even under motion blur or low light.
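An event camera emits a sparse stream of `(timestamp, x, y, polarity)` tuples rather than dense images. As an illustrative sketch only — the exact input format TAPFormer expects is defined in the official repository, and this is not necessarily the authors' preprocessing — a common way to turn such a stream into a fixed-size tensor is a voxel grid with bilinear weighting along the temporal axis:

```python
import numpy as np

def events_to_voxel_grid(events, num_bins, height, width):
    """Accumulate an event stream into a spatio-temporal voxel grid.

    events: (N, 4) array of [t, x, y, p] rows, sorted by timestamp t,
            with polarity p in {-1, +1}.
    Returns a (num_bins, height, width) float32 grid.
    """
    grid = np.zeros((num_bins, height, width), dtype=np.float32)
    if len(events) == 0:
        return grid
    t = events[:, 0]
    x = events[:, 1].astype(int)
    y = events[:, 2].astype(int)
    p = events[:, 3]
    # Normalize timestamps to the continuous bin axis [0, num_bins - 1].
    t_norm = (num_bins - 1) * (t - t[0]) / max(t[-1] - t[0], 1e-9)
    left = np.floor(t_norm).astype(int)
    right = np.clip(left + 1, 0, num_bins - 1)
    w_right = t_norm - left
    # Each event splits its polarity between its two nearest temporal bins.
    np.add.at(grid, (left, y, x), p * (1.0 - w_right))
    np.add.at(grid, (right, y, x), p * w_right)
    return grid
```

Because each event is shared between neighbouring bins in proportion to its sub-bin timestamp, the grid preserves fine temporal structure — the kind of high-rate signal that a fusion module like TAF combines with low-rate frames.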
+
+ ## Key Features
+
+ - **Asynchronous Fusion**: The first framework to explicitly model temporal continuity between frames and events, via TAF.
+ - **Modality Reliability**: CLWF adaptively handles challenging conditions such as motion blur and low illumination.
+ - **SOTA Performance**: Reduces average pixel error by 28.2% on real-world benchmarks.
+
+ ## Installation and Usage
+
+ For detailed instructions on environment setup, data preparation, and running the evaluation scripts, please refer to the [official GitHub repository](https://github.com/ljx1002/TAPFormer).
+
+ ## Citation
+
+ If you find this work useful in your research, please cite:
+
+ ```bibtex
+ @article{liu2026tapformer,
+   title={TAPFormer: Robust Arbitrary Point Tracking via Transient Asynchronous Fusion of Frames and Events},
+   author={Liu, Jiaxiong and Tan, Zhen and Zhang, Jinpu and Zhou, Yi and Shen, Hui and Chen, Xieyuanli and Hu, Dewen},
+   journal={arXiv preprint arXiv:2603.04989},
+   year={2026}
+ }
+
+ @inproceedings{liu2025tracking,
+   title={Tracking any point with frame-event fusion network at high frame rate},
+   author={Liu, Jiaxiong and Wang, Bo and Tan, Zhen and Zhang, Jinpu and Shen, Hui and Hu, Dewen},
+   booktitle={2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
+   pages={18834--18840},
+   year={2025},
+   organization={IEEE}
+ }
+ ```